User talk:Robert Ullmann/2008b

Re-run Variations?

Robert, my friend. You have, as always, been doing wonderful work here, and I thank you for it. Could you do me the favor of re-running User:Robert Ullmann/Variations - and if possible separate out the redirect as you did in the last run, and list the words with existing appendices separate from those without? Cheers! bd2412 T 04:06, 16 July 2008 (UTC)

I can look at it when I am home in Nairobi. But we haven't had an XML dump in a long time now. It got stuck on 1/7 when we were almost at the front of the queue and he restarted all the dumps and now we are very near the end. Must poke him about this. The existing code doesn't "know" whether the appendices exist, just links to them. (And they aren't in the dump I usually use; but I should get the fuller one next time anyway.) Robert Ullmann 07:38, 16 July 2008 (UTC)
Thanks - it's no problem if the list can not be sorted by what appendices exist, I can do that myself in a few minutes. My main concern is finding all the new entries that have been created which ought to be appendicized! bd2412 T 15:13, 16 July 2008 (UTC)

{{context|frequently}}

Hi,

  1. May I call you "Robert"? (I've seen you refer to yourself as "Ullmann", and I don't want to presume.)
  2. {{context|frequently}} renders as (frequently) — i.e., with a stray space — I guess because it's usually followed by another context label, so we infer an underscore. However, at [[irreplaceably]] and [[irreplaceability]] it comes out looking silly. Should I just use {{i}} instead, or is there a better way?

Thanks in advance!

RuakhTALK 02:36, 19 July 2008 (UTC)

"Robert" is just fine. I and others use "Ullmann" at various times because it is a lot less ambiguous. (There are a few Ullmanns around, but not nearly as many as Roberts.) And in UK/Commonwealth usage (e.g. East Africa) it is perfectly common to call someone by family name (i.e. without "Mr." or "Mzee" etc) and can be quite familiar. Either way. Just not "Bob" ;-)
Seems to me that in this case {{qualifier|frequently}} is better. Those adjective templates with the implicit space cause various bits of trouble; I was copying a feature from the previous implementation of cattag/context that perhaps I should have thought through more thoroughly. Robert Ullmann 03:29, 19 July 2008 (UTC)
Fixed this in {{context new}}, will be fixed when that is installed. Just ignore the spaces for now. (Although I put context new in {frequently} to test.) Robert Ullmann 04:18, 19 July 2008 (UTC)
Also allows {{context|frequently|,|something|lang=und}} to work properly: (frequently ,, something) which was the original intent. Robert Ullmann 04:18, 19 July 2008 (UTC)
Awesome, thanks! I was not expecting all of {{context}} to be magically improved by the time I got up in the morning. :-P —RuakhTALK 15:08, 19 July 2008 (UTC)

Translations to be checked of tower

Hi, Robert, long time no see. I'm User:KYPark refusing log-in. In a sense I've often missed you though we shared something unpleasant. Now I'm not debating that way, but just answering your message.

First, have a look at the silly "Translations to be checked" of voyeur about which I recently discussed. This is the worst, under which there are so many bad instances.

I do agree that a word may make more than one sense. The tower, however, makes practically one sense allowing for easy one-to-one matching. Strictly, I hesitate to match it with Korean (, tap, “tower”), which is obviously the best candidate but of very different history and connotation, as you may notice from my recent edit on it. Then, translations should remain either too hard or necessary evil.

Much if not most of Wiktionary edits may do with or without strict scrutiny, not to mention Translations. Everything is to be checked, whether included in "Translations" or "Translations to be checked" that may damage the reliability of right translations. Sorry to have talked too much. --59.5.239.90 13:54, 19 July 2008 (UTC)

Biblical names

I'm confused as to why you're making changes like this: [1]. People's names are capitalized in English ;) --EncycloPetey 19:24, 19 July 2008 (UTC)

This is the CS deletion process fixing links in transwikis that are almost always WP capitalization. I rv these as I monitor what it is doing. Robert Ullmann 22:09, 19 July 2008 (UTC)

Template:fr-noun

Hi. You seem to have been involved in Template:fr-noun. Could you add functionality such that uncountable nouns or plurals that the author doesn't know can be dealt with? For example, I suspect that galimatias is uncountable, but I don't actually know, and the current template leaves me no wiggle room. Let me know what you think/do. Thanks! -Oreo Priest talk 03:12, 20 July 2008 (UTC)

You could always use {{infl|fr|noun|g=?}} (where '?' stands for the gender 'm' or 'f'). That's probably the easiest solution for the time being. (I accidentally saw this comment. Sorry for anticipating a decision of yours.) -- Gauss 20:22, 23 July 2008 (UTC)

Wiktionary:Grease_pit#Bot_requests

Would you be willing to take a look at this? I don't know if anyone else really knows what to do with it. I certainly don't. -Atelaes λάλει ἐμοί 05:47, 21 July 2008 (UTC)

It is more that I find people with perfectly good accounts that refuse to log in fairly annoying. Robert Ullmann 17:30, 24 July 2008 (UTC)
Perhaps I'm missing something, but I see no evidence that that editor has an account (though obviously, even if (s)he doesn't, (s)he could easily create one). —RuakhTALK 23:37, 24 July 2008 (UTC)
Should we adopt an explicit policy that bots may only be run by users with accounts? --EncycloPetey 23:42, 24 July 2008 (UTC)
I think that's reasonable, but in this case it's a bot being requested by an anonymous editor, which I think is fine (especially since the editor has been editing, with a stable IP address, at a decent rate, for more than five months). —RuakhTALK 00:12, 25 July 2008 (UTC)
Wait a second, you are misinterpreting! I mean that I personally am not inclined to pay attention to requests etc from people who refuse to use their login (and yes, this user has a login, I'll not ID it because he/she may be using the IP to not be the same user—though you would think that a different login would be better!). I, like everyone else here except the WMF paid staff, work on what I want to. (Well, also things I may really not want to at a given time, but have taken responsibility for ;-)
As to policy, of course IP-anons can request what they like, suggest what they like, etc. In the case of running a bot, the bot itself has to be an account, and one would think it is reasonable that the bot-runner be required to have an account. (Note that the first item required by the bot policy for a request is "Your user name" ;-) Robert Ullmann 11:12, 25 July 2008 (UTC)
Re: "I mean that I personally am not inclined to pay attention to [] ": Yeah, I got that, don't worry. :-)   —RuakhTALK 11:36, 25 July 2008 (UTC)

Thanks Robert

I was just coming to that realization about a header. It looks like the whole issue may end up moot. Although I'm continuing to work on cleaning up the entries already made in wikisaurus and adding synonyms where I can, it looks like we're heading a different way in meanings of synonyms. (more along the lines of each synonym having meaning tacked to it.) You wouldn't know anything about mouseovers, would you? Amina (sack36) 12:49, 24 July 2008 (UTC)

Thanks

Sorry about that. I'll look through the Appendix article and see if there is still anything worthwhile to merge. The documentation at meta wasn't too clear on the specifics per project and the transwiki log documentation here did not make that explicitly clear (if it did and I missed it I apologize again). In the case that the "new" article is a duplication of the appendix, you can probably delete that new transwikied page. I don't know the deletion procedures here but if you have a "requested by author" reason I'll assent to that. Protonk 17:19, 24 July 2008 (UTC)

Multiple languages codes

Hey, thought I'd run a question by you. What's your view on having 2 language code templates for the same language (e.g. {{es}} and {{spa}})? It does allow greater facility to editors in cases where the code is expanded to the language name, such as with {{term|...|lang=}}. However, if the template uses the code verbatim, not expanding it, such as in {{pejorative|lang=}}, the entry will get categorized wrong. Should we allow both, and then have a bot rename alternative codes to primary codes in template invocations that use the codes verbatim? Or should we encourage editors to standardize on one particular code? Thanks. --Bequw¢τ 21:20, 24 July 2008 (UTC)

Or all templates that don't expand language codes could first pass the code through a {{standard lang}} that could have the necessary alternate to primary language code mappings (e.g. spa → es). Just another thought. --Bequw¢τ 20:30, 25 July 2008 (UTC)
I just made up a quick list of all the macrolanguages (as determined by SIL) we're currently using on Wiktionary at User:Bequw/alpha-3#Macrolanguages (some 639-1 codes and some 639-3). I don't know if we should be using the ~35 or so, but it's interesting to see them. --Bequw¢τ 08:30, 27 July 2008 (UTC)
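
As a rough illustration of the alternate-to-primary mapping suggested above for {{standard lang}}, here is a minimal Python sketch. It is not an existing template or bot: the function name and table are hypothetical, and only the spa/es and hat/ht pairs come from this discussion.

```python
# Hypothetical alternate-to-primary table. Only these two pairs appear in
# the discussion; a real table would have to be compiled separately.
ALTERNATE_TO_PRIMARY = {
    "spa": "es",
    "hat": "ht",
}


def standard_lang(code):
    """Return the primary code for a language code.

    Codes that have no entry in the table are returned unchanged, so
    primary codes and codes without a duplicate pass straight through.
    """
    code = code.strip()
    return ALTERNATE_TO_PRIMARY.get(code, code)
```

A template that uses the code verbatim (such as {{pejorative|lang=}}) would then categorize on standard_lang(code) rather than on the raw parameter, so {{spa}} and {{es}} would land in the same category.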

I would never have created the 3-letter code templates for languages with two letter codes; creating them was, IMHO, a very bad idea.

As to the list of "macrolanguages", in all the cases there I am familiar with, there is a standard language, and the others are dialect or variant. (zh being an extreme case; but note that there is "Standard Chinese", which we call "Mandarin" ;-). In other words, in each case the "macrolanguage" (IMHO a bogus concept) "contains" a language of (usually) the same name which is the language, the others are variants.

If one really were to apply this concept, one would have to consider English to be a "macrolanguage", but that would hardly mean there wasn't an "individual" language called English. (The U.S. dialect being a horse of a different colour. ;-) Robert Ullmann 14:16, 12 August 2008 (UTC)

True, Arabic has "Standard Arabic" for example. I'm not an expert on them all to say more than that. As for duplicate language codes, I cleaned up a couple cases where both were in use (ht/hat, lwi/lw) shifting them to the two-letter. The only one left is 'fiu-liv' and 'liv'. Most cats in Category:Livonian language use 'fiu-liv:' but there is Category:liv:Uralic derivations. Do you think we should use the WMF code or the ISO code in this case? I'd prefer the ISO, but maybe as a WMF we should be biased the other way :)--Bequw¢τ 10:47, 18 August 2008 (UTC)
PS - Most of the 3-letter duplicate codes aren't used (and only about half of the possible ones exist). What would you say if I cleaned them out and then nominated them for deletion? That would make things much easier (unless there's history on this issue which I don't know about). --Bequw¢τ 10:51, 18 August 2008 (UTC)
"fiu-liv" was invented by User:Flying Saucer (aka WF), is not used by WMF. (hence the ??? that was there); we should use the ISO code of course. I do think that we should lose the redundant 3-letter codes. Do let's move carefully, there are a bunch of side effects (on iwikis and such). Robert Ullmann 11:58, 18 August 2008 (UTC)
Ah.. now I see. I replaced all 'fiu-liv' references with 'liv'. I looked at the duplicate codes and found that when they exist, the 3-letter was rarely used. I found all categories that used the duplicate 3-letter codes and moved them all to 2-letter codes. Then I changed all references to the 3-letter templates to the 2-letter ones (except for lat "Latin" which is fairly common). There could be some uses still of duplicate 3-letter codes that I missed (eg someone manually categorizing an entry to a cat that hasn't been created yet). I didn't do any automatic find-and-replace and I didn't touch those codes in Wiktionary:Language code extensions, so are there other "iwiki" issues to deal with? Or can we post a note on BP and maybe see if the community wants to get rid of them? Thanks for your help. --Bequw¢τ 09:00, 19 August 2008 (UTC)

Wiktionary:Grease_pit#Wikisaurus_Template_Naming

Would you be willing to take a look at this discussion? I figured you might have a number of useful thoughts on the matter. Many thanks. -Atelaes λάλει ἐμοί 03:25, 26 July 2008 (UTC)

Wiktionary:Language code extensions

Hi Robert, at Wiktionary talk:Languages without ISO codes there is a proposal that you may or may not be aware of to merge Wiktionary:Language code extensions, which you created, into Wiktionary:Languages without ISO codes. Your comments would be very welcome. Thryduulf 10:59, 27 July 2008 (UTC)

Wiktionary:Language code extensions is a list of the extensions in current use, decided by WMF. I created it because the other was already adrift. (and now is moreso: Proto languages don't meet CFI and are out, most of the others we do not need any codes for at all) Ety languages do not and should not get template codes. The only things I see there that we should add now to Wiktionary:Language code extensions at present is Jèrriais. The rest is all talk page material that might (and mostly not) be relevant in the future. Robert Ullmann 20:29, 27 July 2008 (UTC)

Standardizing dialects

I was thinking of proposing a format for standardizing the treatment of dialects, using a system similar to that in place for the treatment of languages (e.g. {{grc}}, {{lang:grc}}, etc.). Basically, a dialect template would consist of a series of information concerning the dialect, such as what its full name is (for use outside of parent language entries), its short name (for use inside parent language entries), where the Wikt entry is, where the pedia article is, etc. I've set up a demo at {{grc-ion}}, which would be the dialect template for the Ionic dialect of Ancient Greek. I was thinking that perhaps we could program certain templates to read such dialect templates (notably {{etyl}}, but perhaps also {{context}}, and maybe even {{infl}})? So....maybe an editor could type in {{etyl|grc|la|dial=ion}} (if a Latin word came from a specifically Ionic word), and the output would be [[w:Ionic Greek|Ionic Greek]] [[Category:la:Ancient Greek derivations]]. However, I've never had to have a template call a specific piece of info from another template, and am not sure if it's even possible...... Would I have to create {{grc-ion-name}}, {{grc-ion-short}}, etc.? In any case, I thought I'd run this thing past you for technical advice, and you and EP for practical advice before I present it to the community as a whole. I figure the discussion will be a bit more focused if some of the details are ironed out ahead of time (my assumption is that it will be a messy enough convo as is). I've put {{etyl}}'s code into {{grc-test}} and will be putting it into User:Atelaes/Sandbox shortly, so I can monkey with it. You are, of course, quite welcome to also monkey, if you have ideas. Many thanks. -Atelaes λάλει ἐμοί 20:13, 27 July 2008 (UTC)

We've already covered this with regional/dialect context templates; trying to overload "etyl", which you note already is not working because (languages used in etys) != (languages used in entries) in various serious ways. I will look at this much more tomorrow. (Is late here, and I am on an early schedule right now ;-) Robert Ullmann 20:29, 27 July 2008 (UTC)
I can't find it right now, but I know I was in a conversation about ISO naming standards in the past year. It may even have been Robert who directed me to the web address where there are ISO instructions for constructing personal-use codes for dialects. But, I'm not sure we'd want to use this beyond the naming of audio files. MW already has some hyphenated language names as a result of them not having ISO codes (yet), and this could throw a spanner in the works. --EncycloPetey 20:32, 27 July 2008 (UTC)
I don't know if this is the discussion you're referring to, but you and I discussed this back in February, at Wiktionary talk:About Latin#{{la-x-new}}. Going by Special:SiteMatrix and Wiktionary:Language code extensions, it looks like WMF's hyphenated language codes wouldn't conflict with a standard approach: it uses a standard approach, but without the -x- of private use codes. —RuakhTALK 23:00, 27 July 2008 (UTC)
Yes, that's it. Thanks. --EncycloPetey 23:19, 27 July 2008 (UTC)
It appears that at least some of them do conform, such as be-x-old. And, perhaps I should note that I have absolutely no problem with grc-x-ion. It could become tricky, as we would probably want to do a combination of types of dialect differentiation. So, for example, there is fr-CA (Canadian French), which is a spatial distinction, which is probably the type we will most often make with modern languages. However, for Latin, the divisions we want to make are largely temporal (Classical, Medieval, New). With Ancient Greek, there's a bit of both (Ionic vs Attic is regional, but Attic vs Koine vs Byzantine is largely temporal). I guess I could use grc-TU35, since İzmir Province most closely lines up with where Ionic was spoken, but that seems a bit overly pedantic. Ultimately, I think these rules are primarily written for living languages. So, perhaps we could use standard codes such as fr-CA when practical, and use ISO-x-whatever when not. Does that work for you folks? -Atelaes λάλει ἐμοί 01:14, 28 July 2008 (UTC)
Re: "Ultimately, I think these rules are primarily written for living languages": I think you're right, but they do attempt to cover historical variants. For example, frm-1606nict is registered for "Late Middle French (to 1606)", i.e. "16th century French as in Jean Nicot, 'Thresor de la langue francoyse', 1606, but also including some French similar to that of Rabelais"; fr-1694acad is registered for "Early Modern French", i.e. "17th century French, as catalogued in the "Dictionnaire de l'académie françoise", 4eme ed. 1694; frequently includes elements of Middle French, as this is a transitional period"; and el-polyton is registered for "Polytonic Greek", which I believe is a temporal classification; and so on. If coverage is spotty (and it is), it's because we haven't been registering our needs with the IANA. :-)   —RuakhTALK 02:06, 28 July 2008 (UTC)
Interesting. I was not aware of such specificities. If you know of a place where dead language variants might be found, I'd be quite interested to read it. My guess is that el-polyton is for modern Greek (i.e. after 1453) which used the full complement of diacritics (three types of accent, breathing marks, etc.). The current standard did not become officially accepted until 1976, even though it was in use for at least a century and a half before that. But, as I said before, for specifications which have no precedent, I am quite open to whatever naming scheme people think will be the most orthodox, as long as we can make the distinctions where we need them. I suppose this conversation has gotten to the point where we should move it off Robert's talk page and on to the BP. I guess I was sort of hoping that the technical details could be ironed out before presenting to the community at large, as there are enough issues without that. Then again, perhaps constantly flashing him with "You've got new messages!" might be just the requisite impetus to get the technical issues resolved in a timely fashion.  :) -Atelaes λάλει ἐμοί 02:23, 28 July 2008 (UTC)
Naw. The flashing messages just means someone (the Madison North/UWisc student, or the 14-year old from Toronto, etc) have once again demonstrated their immaturity. See my WP talk page, which is the same as Dec 2007, the entire 2008 history is vandalism and reverts. I sometimes just click on (open in new tab) and immediately close the tab w/o even looking (:-) Robert Ullmann 08:36, 28 July 2008 (UTC)
Oh, sorry, I should have given the link: <http://www.iana.org/assignments/language-subtag-registry>. As you can see, there really aren't very many temporal-variant subtags, but I really think the standards bodies/registration authorities/whatever would be open to more. —RuakhTALK 11:40, 28 July 2008 (UTC)
One thought that occurs to me: Whatever we decide to do from a technical perspective, we shouldn't rely on abbreviations based solely on English names for the dialects, as we would like them to be useful and understandable to others. So for New Latin, la-new is a bad idea, and even la-neo might be a problem since "neo" is actually a Greek prefix, and I don't know that it's used generally to describe Neo-Latin outside of English texts. Perhaps la-nov would be best, with "nov" short for novus. We may end up using an English-based abbreviation in some of these cases, but we should keep this issue in mind when making selections, should we choose to do so. I do agree with Atelaes that we need to be very aware of whether we are making spatial or temporal distinctions. For French, it's more logical to limit it to spatial, because Old French is treated as a separate language. For Latin, the subdivisions must be temporal, as it is not subdivided at all temporally as a language, and regional dialects are usually ignored. --EncycloPetey 23:10, 28 July 2008 (UTC)
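
To make the code shapes discussed in this thread concrete (fr-CA, be-x-old, grc-x-ion, frm-1606nict, el-polyton), here is a small Python sketch that splits such a tag into its parts. It is a deliberately simplified assumption-laden model, not a conforming language-tag parser: it ignores script and extension subtags, does no lookup in the IANA registry, and the function name is invented for illustration.

```python
def split_tag(tag):
    """Split a hyphenated language tag into rough components.

    Simplified rules (assumptions, not the full standard):
    - the first subtag is the language;
    - everything after a lone "x" is private use;
    - a 2-letter or 3-digit subtag is treated as a region;
    - any other subtag is treated as a variant.
    """
    parts = tag.split("-")
    result = {
        "language": parts[0].lower(),
        "region": None,
        "variants": [],
        "private": [],
    }
    rest = [p.lower() for p in parts[1:]]
    if "x" in rest:
        i = rest.index("x")
        result["private"] = rest[i + 1:]
        rest = rest[:i]
    for p in rest:
        is_region = (len(p) == 2 and p.isalpha()) or (len(p) == 3 and p.isdigit())
        if is_region:
            result["region"] = p.upper()
        else:
            result["variants"].append(p)
    return result
```

Under these rules fr-CA comes out as a spatial (region) distinction, frm-1606nict and el-polyton as registered variants, and grc-x-ion as a private-use code, which matches the spatial versus temporal split raised above.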

RE: TV Characters

Oopsy, I haven't thought of that. Oh, by the way, I'm still 12.--124.104.47.5 12:50, 29 July 2008 (UTC)

deletion of wrong-case entries

Thanks for taking care of deleting wrong-case redirects. Is What's your phone number? a good candidate for that, though? We usually keep redirects for phrases from similarly-worded phrases, no? and the question mark in the redirect in this case guarantees that didyoumean won't find the correct page.—msh210 17:57, 29 July 2008 (UTC)

Note that the target is what's your phone number and we do have a redirect from what's your phone number?. So if you follow a link to the deleted redirect: [2] you will arrive ... (modulo an auto+redirect bug that I still need to fix) point being that we don't need or want both capitalizations of the form with the ?, and keep the one that "matches" the target case. And yes, the code is checking all this.
Try go/search on the form with the ?, Special:Search/What's_your_phone_number?. Robert Ullmann 22:57, 29 July 2008 (UTC)
And please do question anything you don't understand; there is some new code, it is going very slowly so I can make sure I recheck everything, and there always may be a bug ... Robert Ullmann 23:05, 29 July 2008 (UTC)
Fixed the above mentioned bug, note the link given does take you to the entry. Robert Ullmann 15:11, 30 July 2008 (UTC)
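
The rule described above (when a redirect exists in both capitalizations, keep the one whose case matches the target) can be sketched in Python as follows. This is only an illustration of the stated rule, not the actual deletion code; the function name is hypothetical and "matching case" is simplified to comparing the case of the first character.

```python
from collections import defaultdict


def split_case_variants(target, redirects):
    """Return (keep, delete) lists of redirect titles for one target.

    Redirects are grouped by their case-insensitive form. A redirect with
    no case twin is always kept. Within a group of case variants, only the
    titles whose first letter has the same case as the target's are kept.
    """
    groups = defaultdict(list)
    for title in redirects:
        groups[title.casefold()].append(title)

    target_is_lower = target[:1].islower()
    keep, delete = [], []
    for variants in groups.values():
        if len(variants) == 1:
            keep.extend(variants)
            continue
        for title in variants:
            if title[:1].islower() == target_is_lower:
                keep.append(title)
            else:
                delete.append(title)
    return keep, delete
```

With the target what's your phone number and redirects from both What's your phone number? and what's your phone number?, the lowercase form is kept and the capitalized one is listed for deletion, as in the case discussed here.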

eruciform

Thanks. I have a good source! Cheers, Dlohcierekim 21:30, 31 July 2008 (UTC)

Template:proto

I made some changes to {{proto}}, as no one heeded my comments at the talk page. Please take a look and make sure I didn't break it. Many thanks. -Atelaes λάλει ἐμοί 20:59, 1 August 2008 (UTC)

This looks fine. Robert Ullmann 10:33, 2 August 2008 (UTC)

Sorry to hear about SO

I hope you get some good news soon. Do you have any idea what the problem might be? Is calling around not an option? DCDuring TALK 00:25, 2 August 2008 (UTC)

I did get an SMS; is okay for now. Trying to call around would be a big problem here. In the US, I once had a friend I was trying to reach because he was in the ER waiting area at a (known) hospital; I took ~20 minutes to get through before it occurred to me (me being one of the top telecoms guys in the world ;-) to ask ATT for a sup, was talking to the ER doc in seconds ... doesn't work that well here, although I know a Safaricom sup personally. Would have taken all morning, with my stress level going steadily up. Anyway, seems okay for now. Thanks! Robert Ullmann 00:33, 2 August 2008 (UTC)
Glad to hear it. Rest easy. DCDuring TALK 01:30, 2 August 2008 (UTC)
Everything OK? DCDuring TALK 23:13, 5 August 2008 (UTC)
yes, she is fine, and niece and nephew (The Monsters, age 3 and 4) are dancing. (not at this exact moment, you know what I mean ;-) Thank you so much. Robert Ullmann 01:09, 6 August 2008 (UTC)

Template:wlink

Same thing as proto, just wanted to make sure I didn't screw anything up. Thanks. -Atelaes λάλει ἐμοί 23:10, 3 August 2008 (UTC)

no, you added in a newline, and didn't even get what you wanted (at least a missing :). Please can we talk about this first? wlink is intended to replace a seriously missing WM function, not be a general purpose section link. Robert Ullmann 23:20, 3 August 2008 (UTC)
job queue is 268K ... ;-) Robert Ullmann 23:22, 3 August 2008 (UTC)
Damn. I suppose I should have tried this out on {{grc-test}} first. Sorry. Any idea where the linebreak is coming from? -Atelaes λάλει ἐμοί 23:26, 3 August 2008 (UTC)
of course: you left a break after /noinclude. Is significant. And why is circeus (or whatever he is called) mucking with colour panel with no idea what he is doing? Geez. is 2:30 AM on a Monday here! (;-) Robert Ullmann 23:29, 3 August 2008 (UTC)
Well, don't I feel stupid. Is there any point in trying it again w/o the line break, or would you prefer to let sleeping dogs lie. I have no idea about Circeus. Looks like someone's got a case of the Mondays.  ;-) -Atelaes λάλει ἐμοί 23:33, 3 August 2008 (UTC)
How 'bout we talk about it when I am awake? (;-) Some kind of link-to-section would be good, but not overloaded on this. Or something like that. Robert Ullmann 23:36, 3 August 2008 (UTC)
Cool beans. I'll set up a trial run on {{grc-test}}, with a demo on User:Atelaes/Sandbox. Just take a look whenever you're feeling up to such an arduous task. :-) -Atelaes λάλει ἐμοί 23:42, 3 August 2008 (UTC)
oh shit, will need to clean up {{colour panel}} tomorrow. (today?) ah well. my nephew I haven't seen for a few months is coming on school break in a few hours (he is 4!) so we will have fun. Robert Ullmann 23:40, 3 August 2008 (UTC)
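
Why a line break next to /noinclude matters can be shown with a toy Python model of transclusion. This is an assumption-heavy simplification, not the MediaWiki parser: it only removes <noinclude> spans, and the sample template text is made up.

```python
import re

# Matches one <noinclude>...</noinclude> span, including any newlines inside it.
NOINCLUDE = re.compile(r"<noinclude>.*?</noinclude>", re.DOTALL)


def transcluded_text(template_source):
    """Approximate the text a page receives when it transcludes a template.

    Only the <noinclude> spans are removed; every character outside them,
    including a newline directly after </noinclude>, is passed through.
    """
    return NOINCLUDE.sub("", template_source)


# Made-up template bodies: identical except for the break after </noinclude>.
with_break = "<noinclude>documentation</noinclude>\n[[{{{1}}}]]"
without_break = "<noinclude>documentation</noinclude>[[{{{1}}}]]"
```

In this model transcluded_text(with_break) starts with a newline while transcluded_text(without_break) does not, which is the kind of stray line break described above.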

Redirects, categories, context tags for "." IP address elements and page extensions?

Would it make sense to have redirects from terms such as ".asp" to the corresponding entry, in this case ASP ? Would it be useful to have a category for the entry and/or a context tag for the definition. This is triggered by someone (anon, I think) adding .asp to English requested entries. The relevant line at ASP did not make it clear that it could appear with a dot in the context the user discovered it. I'm sure there are WP pages that provide a fairly complete listing of such "words". Just a thought. DCDuring TALK 21:53, 6 August 2008 (UTC)

Incorrectly formatted proto

When you get some time, would you be willing to create a list of all entries with an asterisk in an etymology section? I really don't want to do them......but I should. If you can think of a clever algorithm to give AF, that'd be nice. Perhaps something on the order of "if Common/Proto language (Germanic, Indo-European, Finno-Ugric, Oceanic, any of the ones currently in use) + *{{term|word}}/*{{term||word}}/*[[word]]/etc., change to {{proto|language|word|lang=L2}}"? It could probably ignore (read remove) everything between Common/proto and the etymon, as it's mostly unnecessary "root", "verb", etc. Worst case scenario is it makes the sentence a bit ungrammatical, but it'd at least be formatted and categorized properly. Perhaps this is beyond AF's skills..... Also, what might be nice is a list of all {{{1}}}'s in use for {{proto}}. I realize we're sitting on a rather old dump, and it's starting to smell, but I imagine it'll give me enough work to keep me busy til the next one.....and probably long after. Many thanks. -Atelaes λάλει ἐμοί 07:14, 9 August 2008 (UTC)

PS, I'm currently teaching myself Python, so it may not be that long until I'm asking you how to do this myself, instead of asking you to do it for me. That's probably not that comforting, but.........well.....you know the parable about teaching the man to fish and all. -Atelaes λάλει ἐμοί 07:16, 9 August 2008 (UTC)
I have plenty of Python for you to look at. Learning to fish is fine ... when you do, I'll take Tilapia, filleted and lightly fried, with a sauce beurre blanc. Robert Ullmann 11:27, 15 August 2008 (UTC)
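
The rewrite rule proposed at the top of this thread could be sketched in Python roughly as below. This is not AF's code: the function name, the short family list, and the exact pattern are assumptions for illustration, and it only covers the three spellings named in the request (*{{term|word}}, *{{term||word}}, *[[word]]).

```python
import re

# Families named in the request; a real list would be longer (assumption).
FAMILIES = ["Indo-European", "Germanic", "Finno-Ugric", "Oceanic"]

PATTERN = re.compile(
    r"(?:Proto-|Common )"
    r"(?P<family>" + "|".join(re.escape(f) for f in FAMILIES) + r")"
    r"[^*\n]*?"  # words between the language and the etymon ("root", "verb", ...)
    r"\*(?:\{\{term\|\|?(?P<t>[^|}]+)\}\}|\[\[(?P<l>[^\]|]+)\]\])"
)


def fix_proto(etymology, lang):
    """Rewrite starred proto-forms in an etymology line to {{proto}} calls.

    `lang` is the language code of the entry's level-2 section. Text
    between the language name and the etymon is dropped, as suggested.
    """
    def repl(match):
        word = match.group("t") or match.group("l")
        return "{{proto|%s|%s|lang=%s}}" % (match.group("family"), word, lang)

    return PATTERN.sub(repl, etymology)
```

As the request anticipates, dropping the words in between can leave the sentence slightly ungrammatical, and anything that does not fit the pattern is left untouched for manual review.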

Context for Southern US pronunciation and vocabulary

I've been wondering whether it would be worthwhile to have a context for "Southern US". It seems as if it is big (# of speakers) and distinct, especially in pronunciation. Some of the entries we now show as non-standard might more reasonably be shown as Southern US, sometimes with AAVE. Southern US English probably shares features with rural or extinct UK dialects, probably from the "West Country". Do we know if we have any southern US pronunciations at present? DCDuring TALK 02:40, 11 August 2008 (UTC)

y'all think? Yes, we should, but then "Southern US" can be broken down further in a lot of cases. Definitely need {a} accent templates for quite a number of things. you'se all think? Robert Ullmann 14:23, 12 August 2008 (UTC)
There is a great deal of opportunity for finer differentiation, especially in pronunciation, but initially I am looking to be less insulting to Southerners by substituting "Southern US" for "nonstandard" and to more correctly label some "AAVE" as "AAVE and Southern US". This affects some of the inflected forms (e.g., drug, simple past of drag) as well as lemmas and idioms. Southern US would accommodate a large share of the differences. Further differentiation can work adequately. In the meantime I'll be adding the Southern US context wherever applicable and see if I can recruit folks to provide Southern US-style pronunciations for some characteristic words. DCDuring TALK 14:46, 12 August 2008 (UTC)

teneo

Regarding [3], the verb is defective, which means it is missing a number of major inflectional forms and so cannot use one of the verb templates. --EncycloPetey 19:30, 13 August 2008 (UTC)

But it would still be a good idea if the table formatting was templatized, even if all the forms had to be entered by hand. It would probably be worth investing in a Latin counterpart to {{grc-conj-present-blank-full}}. -Atelaes λάλει ἐμοί 19:34, 13 August 2008 (UTC)
Perhaps the table is necessary, but is the use of {{#if:}}? —RuakhTALK 01:39, 14 August 2008 (UTC)
No. --EncycloPetey 03:13, 14 August 2008 (UTC)
A bit of automation and I modified the table to use bar instead of {{#if: |foo|bar}} and | instead of {{!}}. Another bit of automation and I checked that no text was changed in the table (copy/paste from browser rendering into text editor, run diff, confirm no changes), but if you'd like to make sure the table's accurate, please do. :-)   My change is not mutually exclusive with the idea of moving this into a one-use template (as discussed briefly at Wiktionary talk:Wikitext style#.2822.29 No HTML tables); it is mutually exclusive with Atelaes' suggestion, but I won't be offended if someone implements his suggestion and trashes my change in the process. —RuakhTALK 17:08, 15 August 2008 (UTC)
Given a layout template {la-2nd-blank} (or some such) {la-2nd (teneo)} could then use it, and not have another copy of the colours etc. (The "call depth" of 2-3 or so is not a problem, there is no branching.) Thanks for fixing that one; regardless of whether the table is in the page, I did want to get rid of the conditional parser syntax. (oh and the {!}'s were never needed; parser functions and piped links nest just fine in table syntax, it is t'other way 'round that doesn't work!) Robert Ullmann 17:15, 15 August 2008 (UTC)
I meant that my change is mutually exclusive with Atelaes' suggestion, because my change involved removing parser-functions, etc. that aren't the same changes we'd make for a blank template. But, it's fine. :-) —RuakhTALK 17:30, 15 August 2008 (UTC)
Yes, I'd messed up the referent of my turn of phrase; fixed now ... we could just "push" the table down into the single verb template as it stands. Robert Ullmann 17:32, 15 August 2008 (UTC)
Possibly, but for defective verbs this means a lot of unnecessary space taken up by blank inflection cells. It would also be nice to link to the Latin section of each target page, instead of simply to the entry page. --EncycloPetey 18:49, 15 August 2008 (UTC)
Once it is using a layout template, that in turn can do various bits of magic. (Making blank rows disappear, etc, or using variant layout templates.) I don't understand the latter comment: since the templates are still specific to language, why is there a problem linking to section? We're not talking about language independent layout templates. Robert Ullmann 18:53, 15 August 2008 (UTC)
Take a look at the blue links in the table for teneo. Some of these are pages with Italian entries, which will necessarily precede the Latin ones. So, someone following the link will be deposited at the Italian section rather than directly to Latin. --EncycloPetey 19:13, 15 August 2008 (UTC)
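
The check Ruakh describes above (copy the rendered table before and after the edit, run diff, confirm no changes) can be sketched roughly as follows. This is only an illustration using Python's standard difflib; the function name `rendered_text_changes` and the whitespace normalisation are assumptions, not the procedure actually used.

```python
import difflib


def rendered_text_changes(before, after):
    """Return unified-diff lines between two copies of rendered table text.

    Both arguments are plain text pasted from the browser rendering.
    Blank lines and surrounding whitespace are ignored (an assumption),
    so only differences in the visible text are reported.
    """
    before_lines = [line.strip() for line in before.splitlines() if line.strip()]
    after_lines = [line.strip() for line in after.splitlines() if line.strip()]
    return list(
        difflib.unified_diff(before_lines, after_lines, "before", "after", lineterm="")
    )


if __name__ == "__main__":
    old = "teneo  tenes\n\ntenet"
    new = "teneo  tenes\ntenet\n"
    # An empty list means the visible text did not change.
    print(rendered_text_changes(old, new))
```

An empty result suggests the markup change left the rendered text alone; it says nothing about colours or layout, which would still need a visual check.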

On a related note, could you generate a page (in my user space) listing all 108 Latin words that use table syntax instead of templates? Some of these are one-offs, where the inflection pattern is unique to the word, but there are many of these likely to be old additions to Wiktionary that need cleanup. --EncycloPetey 19:13, 15 August 2008 (UTC)

Category:zh:Beginning Mandarin and Category:zh-cn:Beginning Mandarin

Hi Robert. Category:Mandarin nouns contains pinyin, traditional, and simplified form entries together, but the Beginning Mandarin categories split them up. Should the Beginning Mandarin categories be merged so that they look like the other Mandarin categories? Kinamand 09:05, 15 August 2008 (UTC)

Re:Zähne

Ok. I just assume the policy is that definitions of inflected forms aren't supposed to be on the first line (where there's the information "plural, accusative, singular, dative, third-person, etc. of such-and-such"). So, is this version better? -- Frous 21:46, 15 August 2008 (UTC)

Is this the only plural form, or just the nominative plural? --EncycloPetey 02:18, 16 August 2008 (UTC)
It's the nominative, accusative and genitive plural. -- Frous 10:56, 16 August 2008 (UTC)

Re:Template:pap

I am sorry; I wanted to create the Chinese page for this template. It was my mistake. Thank you for correcting it.--Dingar 10:49, 17 August 2008 (UTC)

templates

You are right, of course. It is {{fr-conj}} that adds the link, not {{fr-conj-table}}, and I could have transcluded the latter. I'll fix that when I have the time. Circeus 18:57, 17 August 2008 (UTC)

Only one issue is left: switching a template from {{fr-conj}} to {{fr-conj-table}} breaks the auxiliary parameter, but I'm stumped ATM as to how to restore it. Circeus 20:00, 17 August 2008 (UTC)

Another quick note to say that I'm leaving the library now, so if I messed up on any of my template edits, I won't be able to fix it until tomorrow. Circeus 20:45, 17 August 2008 (UTC)

List of context labels

I don't know if you auto-generate User:Robert_Ullmann/Context_labels or not. Several rarely-used, 3-letter redirects were in the way of language codes. I orphaned them and then made them language templates (most recently {{txt}} and {{gui}} but I think there were some others). Cheers. --Bequw¢τ 21:29, 19 August 2008 (UTC)

reflexive verbs

I think something needs to be done to properly deal with those. I'm rapidly coming to the conclusion that entering these at their pronoun form (s'esclaffer) is downright improper, especially for verbs that are not restricted to reflexive use (se maquiller). Then there's the fact that "irregular" -er verbs (s'efforcer) need to be taken into account by the reflexive conjugation templates in the same fashion that {{fr-conj-er-imp}} now does (see neiger for an application).

So, in summary, this issue is a mess that might be best dealt with by adding a "reflexive verbs" section to Wiktionary:About French and linking to it from {{reflexive}}, or by the use of a conjugation note rather than a table. Any thoughts?

I'll come back to it when I'm done cleaning up the other conjugation templates. Circeus 18:36, 20 August 2008 (UTC)

In case your ears are tingling.....or is it your nose?

You're being talked about here, and it might be helpful if you could set us all straight on exactly what the limitations are on your cute, fuzzy little bot. -Atelaes λάλει ἐμοί 07:05, 21 August 2008 (UTC)

I'm curious: when you wrote "fuzzy little bot" did you know that AF contains a routine called fuzzy? Robert Ullmann 16:46, 23 August 2008 (UTC)
Nope. Simply trying to be demeaning. -Atelaes λάλει ἐμοί 02:42, 24 August 2008 (UTC)

Deletion Request

Could you delete מעלתא and מעלא please? Thanks in advance. --334a 03:20, 22 August 2008 (UTC)

Done. Robert Ullmann 03:24, 22 August 2008 (UTC)
Thanks again, Robert. I actually created those articles while I wasn't signed in (so my IP address showed up). I've re-created them under my username, so just ignore those articles (i.e. don't delete them). :) --334a 03:29, 22 August 2008 (UTC)

apron

moved to AF talk

Spelling differences:"nadzuke" versus Hepburn "nazuke"

I have added my 2 cents over at Template talk:ja-readings and Talk:nazuke. I don't think that a spelling change to the template parameter is a plausible solution (bajillion edits would have to be made for such change), but the links within the template could point to "nazuke" instead, if you agree that is. —Tokek 23:02, 22 August 2008 (UTC)

Changed it, so that "nadzuke" will work as well. But it is only used on a few dozen pages, so we could fix them. (The common nazuke are not listed for every kanji possible, this parameter is generally used for the uncommon or unique readings.) Robert Ullmann 14:13, 23 August 2008 (UTC)

Me messing with dialects again

Hey, I just figured I'd tell you about {{grc-alt}}, which can be seen in action at πρός (prós), just so we could get the you yelling at me bit out of the way. Cheers. -Atelaes λάλει ἐμοί 08:15, 24 August 2008 (UTC)

Template:see

Maybe it would be better just to pretend the Seneca language doesn't exist. --Jackofclubs 10:49, 24 August 2008 (UTC)

Maybe it would be better if you did not intentionally create a problem. Is disruptive, and you are warned; I will block you if you engage in any further wilfully disruptive behaviour. Robert Ullmann 10:51, 24 August 2008 (UTC)
I understand. It was not meant to be disruptive, just that eventually this template problem will crop up anyway. Also, my first comment wasn't meant to be entirely serious. --Jackofclubs 10:55, 24 August 2008 (UTC)

How about using {{iro}}, which is Seneca's ISO 639-2 code (although, I must admit, I'm a little clueless on this area)? --Jackofclubs 11:13, 24 August 2008 (UTC)

iro is code for the collection of all "Iroquoian languages". It's not a code for an individual language, like we require here. --Bequw¢τ 09:09, 25 August 2008 (UTC)
Hi Robert,
I think I took care of all the remaining {{see}} templates in User's talk pages and other pages outside the main namespace. I saw you did something very clever with it when I wanted to adapt the template itself. What does subst'able mean in this context? Does that have something to do with the other neat trick you did recently making the :lang: versions obsolete? --Polyglot 17:22, 6 December 2008 (UTC)

no XML dump

I've posted a request on Brion Vibber's WP talk page w:User_talk:Brion_VIBBER#Wiktionary_constipation asking about this. It's ridiculous that we haven't had an XML dump since March. Connel didn't know whom to contact to get action, but recommended contacting Brion. --EncycloPetey 22:35, 25 August 2008 (UTC)

The last XML dump was 13 June. Robert Ullmann 15:44, 26 August 2008 (UTC)

{{ethnologue}}

I don't know if you watch the page, but see Template_talk:ethnologue. Basically, I'd like language name entries that have a 639-1 and a 639-3 code to list both codes on the page. I was thinking it could be added to {{ethnologue}} as an optional parameter. --Bequw¢τ 23:34, 25 August 2008 (UTC)

I'd rather not overload {ethnologue} since SIL has nothing to do with the -1 (2 letter) codes. So it doesn't really belong there. But you are correct something in the references/external links section should highlight the -1 code. How about {{iso 639-1|(code)}} to put first, and we'll figure out a useful place to point it? Robert Ullmann 15:47, 26 August 2008 (UTC)
Over on the talk page I mentioned that I created {{ISO 639}} which can take the -1 code as a parameter in addition to the -3. You can also specify showing a link to the Ethnologue, Linguist List (for artificial, ancient, etc.) or just the SIL page (if no link parameter is given). It'd be nice to have all the code stuff on one line. This template (with the ethnologue parameter) could be used instead of {{ethnologue}} on the languages with duplicate codes (others left untouched). This way {{ethnologue}} isn't overloaded, the info is compact, and we show the -1 code (helping people pick the right code when making their entries). How does that sound?--Bequw¢τ 07:35, 27 August 2008 (UTC)

Category:ja:Physics

I think you should take a look at this since you're an expert when it comes to things like templates and programming. In a nutshell here's the problem (I hope it can be fixed):Acoustics is listed as a sub-category of Sound but the reverse is also true so something like this happens:

X=Other Categories

Acoustics

Sound
Acoustics
Sound
Acoustics
X
X
X
X
X
X

...and so on in a terrible, unending way.--50 Xylophone Players talk 16:52, 28 August 2008 (UTC)

must be an echo (:-) ... can't look at it right now Robert Ullmann 16:56, 28 August 2008 (UTC)
Thanks for responding, anyway.--50 Xylophone Players talk 22:19, 28 August 2008 (UTC)

Yes, they are subcategories of each other. Our category structure is not strictly a hierarchy. --EncycloPetey 22:31, 28 August 2008 (UTC)
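
Since the category structure is not strictly a hierarchy, anything that walks it has to remember where it has already been, or it will loop exactly as in the Acoustics/Sound listing above. A minimal sketch, assuming the subcategory links are available as a plain dictionary (the `subcats` mapping and the `walk_categories` name are hypothetical, not part of any existing tool):

```python
def walk_categories(subcats, root):
    """List every category reachable from root exactly once.

    subcats maps a category name to a list of its subcategory names.
    The seen set is what stops the walk when two categories are
    subcategories of each other.
    """
    seen = set()
    order = []
    stack = [root]
    while stack:
        cat = stack.pop()
        if cat in seen:
            continue  # already expanded: this is where the "echo" is cut off
        seen.add(cat)
        order.append(cat)
        # Push children in reverse so they are visited in listed order.
        stack.extend(reversed(subcats.get(cat, [])))
    return order


if __name__ == "__main__":
    subcats = {
        "Physics": ["Acoustics"],
        "Acoustics": ["Sound"],
        "Sound": ["Acoustics", "X"],
    }
    print(walk_categories(subcats, "Physics"))
```

With the mutual Acoustics/Sound link in the sample data, each category is still listed only once.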

Identifying space wasters

I'm not sure of your motivation for working on Pronunciation tables, but I found that they wasted space. Some other space-wasters are vertical lists. My priority on these would be the English section, especially items that appear above definitions (Alternative spellings and forms, Pronunciation), and Etymologies that are longer than 2 lines (on most screens) as well.

Is there a way to highlight these for changes? For Alternative spellings there seems to be tentative agreement on the adequacy of horizontal lists.

I'm not sure about Pronunciation. I thought there is agreement that all phonetic alphabet representations of a given accent-pronunciation could go on the same line. I'd favor all pronunciation that was specific to a given pronunciation (including audio and rhymes) appearing on the same line. I don't want to get under anyone's skin with this, however. DCDuring TALK 15:46, 30 August 2008 (UTC)

Then don't try to push a new format secretively when you already know there are community members who object. Make a coherent proposal in the BP and then take it to a vote. --EncycloPetey 15:54, 30 August 2008 (UTC)
In what possible way is he "push[ing] a new format secretively"? Huh? Robert Ullmann 16:09, 30 August 2008 (UTC)
What I have been doing is knocking out as many things as I can do with various bits of magic from User:Robert Ullmann/Pronunciation exceptions so that Thryduulf isn't looking at a lot of things that don't have to be done manually. The tables are one of them. At the same time, AutoFormat is folding cases we've already agreed on. See this edit for example. (and it finds them by itself, they don't have to be flagged) By all means comment on ideas at WT:PRON talk page (best place) Robert Ullmann 16:09, 30 August 2008 (UTC)

Pronunciation sections

Thank you for all the work you're doing on this. Thryduulf 18:39, 30 August 2008 (UTC)

Ditto. I see a lot of the same mistakes / inconsistencies in formatting of this section, and it's nice to see it (finally) being cleaned up. I do them when I see them, but there are too many for one person to find and fix by hand. --EncycloPetey 22:27, 30 August 2008 (UTC)
You are both welcome. I had been asked several times to do or have AF do something with pronunciations, but never had a starting point; Thryduulf asked for an exception list and when I did that I found I had 20-30K entries listed, with some obvious classes of things to fix. The exception list less the things AF has rules for is now < 1K. Not bad. (while AF has some 60K tasks queued, between this and {see} to {also}, less whatever overlap there is, which is significant ;-) Robert Ullmann 16:22, 31 August 2008 (UTC)

References for translations (re: ice skate)

I have added a references section as you suggested in your edit summary to ice skate. My feeling is that references supporting red-links in the Translations section are a good idea, while references supporting blue-links are unnecessary. I am not following a guideline in this, nor have I run across discussion of this previously. Regards Ceyockey 02:44, 2 September 2008 (UTC)

Mismatched wikisyntax

The entry for Mitteleuropa includes a quotation that uses "1)" in the quote. So, we can't "fix" the unpaired right-hand parenthesis, since there wasn't a left-hand one in the source of the quote. Where would something like this be listed to keep it from being marked every time as an error? --EncycloPetey 21:01, 4 September 2008 (UTC)

Nevermind. Thryduulf provided the answer to me. --EncycloPetey 21:59, 4 September 2008 (UTC)

Flooding "structure problems"

Semper is helping me add conjugated forms of regular Latin verbs. As a result Category:Entries with level or structure problems will be flooded with entries tagged by AF. Each regular 1st conjugation verb will produce 3 such entries. The goal is to add conjugations for about 10 Latin verbs per day, which works out to 30 newly tagged "problems" per day for the foreseeable future. This can't be fixed with Etymology, because the two pronunciations have the same etymology; they're both inflections of the same verb. We can't mark pronunciations by part of speech, because that's also the same. See laudaveris for a "simpler" example, and levaveris for one even more complicated (2 etymologies this time, each with two pronunciations). Again, there isn't variation in the etymology or POS, and there will be three such entries for each regular verb in Latin. The same phenomenon will occur when we start inflecting Adjectives and Nouns. The maintenance categories will be flooded if AF tags all of these as "problems". --EncycloPetey 02:12, 5 September 2008 (UTC)

Would you kindly remind me where the last discussion of "pronunciation N" (or whatever it is) was in BP? I seem to recall it had a non-obvious title. (Not the current discussion of line formats.) Robert Ullmann 17:00, 5 September 2008 (UTC)
Looks like it's at WT:BP#rfc-level_false_positives. I've also started drafting an explanation of my current thoughts in the draft WT:ALA. The relevant section is WT:ALA#Multiple_etymology_or_pronunciation_sections, which is intended to cover situations with either multiple ety sections, multiple pron sections, or both. The section "Macron forms" deals with the current situation. The following section uses palma as a complex example that's more thoroughly fleshed out (because there are two lemmata on the page). --EncycloPetey 17:47, 5 September 2008 (UTC)
That might be useful for Hebrew as well, where the diacritics make an even bigger difference than vowel length; the Hebrew Wiktionary actually uses the diacriticked versions as L2 headers (which we can't do). —RuakhTALK 20:59, 5 September 2008 (UTC)

While you are thinking about it, it would be handy (not essential) if the structure problems list could be divided by the language(s) that have the structure problem. Perhaps the entries could be added to a hidden language-specific cleanup category. Even before the great flood, there were mini-concentrations in Aramaic, Hebrew, Faroese, and Latin. Some of the structure problems are hard to resolve without knowledge of the language. Even if they can be, it is much faster if one can get the pattern of structure problems, often from a single contributor or due to language-specific difficulties. DCDuring TALK 12:46, 5 September 2008 (UTC)

AF already does that (sort of). At least for Latin entries, it double categorizes the entry in both "Structure problems" and in "Latin words needing attention". However, for Latin, this merely floods two categories. --EncycloPetey 15:29, 5 September 2008 (UTC)
I was looking for lists that excluded what I (and others like me) couldn't competently do anything about and sort items into categories that called for the right kind of assistance. I like intersections, not unions of sets. DCDuring TALK 16:34, 5 September 2008 (UTC)
While that can be a good idea, it can also be problematic to have dozens of such categories for each language. When the category is split into every little language, some entries are not going to get corrections as quickly. So, there's a tradeoff between the two. Robert would know better than I which splinter categories might be populous enough to create, but as a general solution for all languages it's probably not a good idea. If it's possible to generate categories only for the languages you mentioned above, then it sounds like a good idea. --EncycloPetey 16:57, 5 September 2008 (UTC)
In addition to the considerations I've mentioned there is the waste of time in opening up a multi-lingual entry that exists in one of one's languages only to find that the problem is in some other language, which requires two page loads to determine. Small wonder that items sit on some of the maintenance lists for a long time. I wouldn't care if entries were on inclusive maintenance lists, so long as they were also on more selective short ones. I also like the idea of the longest-on-list sublists, though they, too, seem to get clogged with refractory entries. There is no general solution to problem of matching specialized problems and specialized problem-solvers when the latter are sometimes quite scarce. DCDuring TALK 20:37, 5 September 2008 (UTC)
I had been using the list to clean up structure problems. It is getting simply too tedious to sort through the list now to find the English entries. Even among English entries, I need to remember which English entries have problems that I can't solve. Adding the Latin entries, which I have learned to avoid, makes the list unfruitful.
The numbered pronunciation heading issue simply needs to be resolved.
I don't really understand why we have uninformative non-lemma entries (in any language) when we still lack good quality lemma entries for many words. Hard links to lemmas would be nicer to users, instead of making them click on a link and wait. DCDuring TALK 10:48, 24 September 2008 (UTC)
I don't know what to do about EP continuing to plow ahead creating more and more entries with Pron N in flat violation of WT:ELE instead of properly proposing it and getting it adopted first. I am thinking it may be necessary to propose a vote completely prohibiting Pron N, to remain in effect until superseded by a proper proposal. Of course the fundamental problem here is that Pron N does not work; it is a bad hack that looks like it works in some very simple cases, but since pronunciation is an attribute and not a structural characteristic (like etymology), it doesn't.
I have found it difficult to participate in discussions on the matter without feeling hounded and belittled. DCDuring TALK 12:02, 24 September 2008 (UTC)
The "non-lemma" problem is because some people insist that those entries must be stupid sub-minimal redirects instead of being useful (like actually having definitions and examples and so on). So others think they ought to be that way. The fact is that those "form-of" non-definitions were invented by Wonderfool, wanting to add a bunch of French entries but being too lazy to create proper entries. (do note that we can't have hard links in any case: the forms will frequently be for multiple languages)
I keep on forgetting about the problems of having multiple languages. There are many constraints that we face as a result of "all words in all languages" that make it likely that the single-language dictionaries will be more user-friendly to those who only care about one language at a time. I hope we get enough advantages from multiple languages to offset the costs and limitations. DCDuring TALK 12:02, 24 September 2008 (UTC)
I'll think some more on how to sort these. "Entries with illegal Pronunciation N headers"? 11:08, 24 September 2008 (UTC)
That would be fine. The language sort might offer additional advantages, but has drawbacks as well. DCDuring TALK 12:02, 24 September 2008 (UTC)
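
To illustrate the kind of sorting discussed above, here is a rough sketch of how an entry's wikitext could be split by language (L2) section and checked for numbered Pronunciation headers, so that a page is listed only under the language that actually has the problem. It is not AF's code; the function name, the regular expressions, and the assumption that headers look like `==Latin==` and `===Pronunciation 1===` are illustrative.

```python
import re

# Level-2 header such as "==Latin==" (but not "===Verb===").
L2_RE = re.compile(r"^==([^=].*?)==\s*$", re.M)
# Any header of the form "Pronunciation N" at any depth.
PRON_N_RE = re.compile(r"^=+\s*Pronunciation \d+\s*=+\s*$", re.M)


def languages_with_pron_n(wikitext):
    """Return the language names whose section has a 'Pronunciation N' header."""
    headers = list(L2_RE.finditer(wikitext))
    flagged = []
    for i, header in enumerate(headers):
        # A language section runs up to the next L2 header or the end of the page.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(wikitext)
        if PRON_N_RE.search(wikitext, header.end(), end):
            flagged.append(header.group(1).strip())
    return flagged


if __name__ == "__main__":
    page = (
        "==Italian==\n===Verb===\nx\n"
        "==Latin==\n===Pronunciation 1===\ny\n===Pronunciation 2===\nz\n"
    )
    print(languages_with_pron_n(page))
```

The result could feed a per-language list or hidden category, which is the intersection DCDuring asks for; whether that is worth doing for every language is the tradeoff EncycloPetey raises.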

Sense

I had noticed that quite often (and this is a problem that also shows up with the context templates), people use the template without putting a space, assuming one is included (not necessarily a bad assumption), and was trying to fix that issue. Circeus 22:48, 5 September 2008 (UTC)

Assuming a space is included is a terrible assumption. Whyever would you (or anyone)? Robert Ullmann 23:49, 5 September 2008 (UTC)

Template:ISO 639

Pardon me, but this is horrible. You've taken a couple of perfectly simple templates that can each be used as needed or desired, and turned it into a confusing mess.

It makes it look like the ethnologue uses the 2-letter codes, which it doesn't. Link can't be both Linguist list and Ethnologue, and it can't display the SIL link if the link is Ethnologue. A mess.

Well, from the user's perspective I'd ranked the usefulness of the pages as Ethnologue > Linguist List > SIL (as this page really has no info). So if an Ethnologue entry exists, its link is shown. If there isn't one, but there is an LL entry (like for some conlangs/ancient languages), it is preferred. As a backup there's the SIL page. Someone might get confused thinking you could use the 2-letter codes in LL or Ethno, though there is the link. I'm fine with rewording. --Bequw¢τ 18:27, 7 September 2008 (UTC)

And you are leaving out the * wikisyntax before it, which needs to be in the page text.

I'm not sure what you mean here, the template does include the bullet (it was nested in the #if's). --Bequw¢τ 18:27, 7 September 2008 (UTC)

Please can we just use {ethnologue}, {linguist list} separately, and have this perhaps show the -2 code and perhaps link a -3 code to SIL? As it is it is just a confusing layer. Please? Robert Ullmann 13:43, 7 September 2008 (UTC)

I'm sorry, you did leave a note on my talk page, and I didn't point out all these problems sooner. Can we please just define {ISO 639} as {{ISO 639|(2-code)|(3-code)}} and nothing else, linking to SIL, and leave the links to ethnologue and linguist list separate, and clear?
At Arabic is good to add all those, but it starts out confusing: we use ar = arb = (standard) Arabic, and don't use ara because it is a -3 that is not an individual language. (Note that no application using 2-letter codes ever "redefines" them as "macrolanguages": they remain the specific individual language in actual use.) This entry should start with {{ISO 639|ar|arb}}, {{ethnologue|code=arb}}, and then the list of dialects.
I was not aware of that. Why then does SIL equate ar to ara instead of to arb? And why is our {{ar}}={{ara}} not {{arb}}? Should Arabic be changed? --Bequw¢τ 18:27, 7 September 2008 (UTC)
Because it is political. The entire concept of redefining languages as "macrolanguages" is political. (You would not believe the whinging: "But MY language is NOT a VARIANT!") Is pathetic. So ISO/SIL pretend Arabic is a "macrolanguage" and pretend to re-assign the meaning of "ar". But of course, that is and was always used for "standard" Arabic, and will continue to be, thus in practice ar = arb (and ara is ignored). Our {{arb}} is wrong because Widsith "fixed" it a couple of weeks ago. EP had it right. Robert Ullmann 10:32, 9 September 2008 (UTC)
As I said, I'm sorry I didn't say anything sooner, but can we please fix this? Robert Ullmann 14:21, 7 September 2008 (UTC)
I'll try and have the format show the separation between SIL/Ethnologue/LL. --Bequw¢τ 18:27, 7 September 2008 (UTC)
How's that change? Is the separation more clear? I did all the format changing inside of {{ISO 639}} because I didn't want to convert all occurrences of {ISO 639}→ {ISO 639}{ethnologue}. I think it's cleaner this way, but feel free to change the formatting or provide more suggestions. --Bequw¢τ 19:08, 7 September 2008 (UTC)
You took them all out, and now complain it is too much trouble to put them all back? Yowza. Anyway, they still work. Robert Ullmann 11:27, 9 September 2008 (UTC)
Is still a mess. The wikitext * should be outside the template, and there simply is no reason to add this layer of cruft over {{ethnologue}} and {{Linguist List}}. Why couldn't you just leave them alone and add a simple template? It is confusing to call and confusing to understand what it is saying. Robert Ullmann 10:32, 9 September 2008 (UTC)
I changed it to provide a simple clean call. (without breaking the existing ones you set up) See? Robert Ullmann 11:07, 9 September 2008 (UTC)
Looks good. 2 lines are fine. FWIW I included the "* " in the template because {{ethnologue}} did. Thanks for taking the time. --Bequw¢τ 04:55, 10 September 2008 (UTC)

bot help

At Semper's suggestion, I've spent today trying to figure out how to run a bot like the one he has (loading pages from a file). I seem to have everything working but for one small problem...the test I ran to upload eight pages sent them to Wikipedia. Could you help me figure out why? --EncycloPetey 01:52, 10 September 2008 (UTC)

Never mind. Problem solved. --EncycloPetey 03:46, 10 September 2008 (UTC)

Yes, I've borrowed SB's code for Italian verb entries. He explained how to modify it for Latin, and tested his bot on some data I set up for him. I've now learned to use it myself, and tried it out on a small number of Galician conjugations, so I know I understand the basic setup and how to modify for a new language (easy). The hardest part was understanding the set-up instructions posted on Meta, which told me to do things without explaining what the purpose was. The steps included links to several off-site instructions not specific to the process, which led me to waste a lot of time trying to figure out which instructions I actually needed, and which were intended for other purposes I had no need for.

Somehow the name PetaBot doesn't grab me. Like several other punny ideas I had, it sounds a bit rude. My favorite discarded idea was JohannSebastianBot, but I think I'd tire of that name too soon. ;) I'd post my current short-list of options, but don't want to find that the account has been taken by some unscrupulous person when I go to set it up. --EncycloPetey 20:09, 10 September 2008 (UTC)

AF and Template:also

Robert, what does AF do when it encounters more than one {{see}} / {{also}} on a page? Can it detect this? Can it merge them and eliminate duplicate items? I ask because one of the things I'd like to do with FitBot is conjugate Galician verbs. There are a number of forms in regular Galician verbs that differ only in the presence/absence of an accent (e.g. miraras & mirarás), and since this is always the case for these regular verbs, I'd like to add it in with the Galician entry. The only catches are:

(1) If the page already has other languages, then {{also}} won't be added at the top. However, as I understand it, AF can already deal with moving {{also}} to the top of the page.
(2) If the page already has an {{also}}, then there will now be two such templates on the page, and their content may differ.

So, can AF handle the second situation and consolidate the two potentially different {{see}} / {{also}} templates? If not, would it be an easy bit of coding to add? --EncycloPetey 01:05, 12 September 2008 (UTC)

Could I get at least a partial answer? Can AF handle pages that have more than one {{also}}/{{see}}? This will let me know whether I can safely add the {{also}} by bot, or whether I need to watch and handle such cases manually. --EncycloPetey 18:58, 15 September 2008 (UTC)
I want to and will do it, but have to figure out a few details. Removing duplicates is easy. But if you have two templates with different references, they need to be sorted together, and the order matters; the order already given in one or the other should be preserved, and the other merged in. There is also the issue of see vs xsee; I'd like to combine those two, so that {{also}} can do both; and that introduces more cases.
So the simple case is easy, the general case needs to be worked out. What I'd say for you now is don't worry about it; add the correct see/also template, AF will sort it to top. If it is in fact a duplicate, it will get fixed at some point; when I have taught AF to handle this, it will also be taught to hunt down the duplicates. In the meantime, I do review a number of AF edits based on the edit summary, I may catch them myself. Robert Ullmann 19:08, 15 September 2008 (UTC)
OK, thanks. --EncycloPetey 19:11, 15 September 2008 (UTC)
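
For what it's worth, the "simple case" described above might look something like the sketch below: collect the arguments of every {{see}} / {{also}} on the page in the order they first appear, drop duplicates, and put one merged {{also}} at the top. This is not AF's code; the regular expression and the `merge_also` name are assumptions, it ignores {{xsee}}, and it does not attempt the harder problem of interleaving two differently ordered lists.

```python
import re

# Matches {{see|a|b}} or {{also|a|b}} with no nested templates (an assumption).
ALSO_RE = re.compile(r"\{\{(?:see|also)\|([^{}]*)\}\}\n?")


def merge_also(wikitext):
    """Merge all {{see}}/{{also}} templates into one {{also}} at the top."""
    items = []
    for match in ALSO_RE.finditer(wikitext):
        for item in match.group(1).split("|"):
            item = item.strip()
            # Keep the order of first appearance; skip duplicates.
            if item and item not in items:
                items.append(item)
    if not items:
        return wikitext
    body = ALSO_RE.sub("", wikitext)
    return "{{also|" + "|".join(items) + "}}\n" + body


if __name__ == "__main__":
    page = "{{also|mirarás}}\n==Galician==\n{{see|miraras|mirarás}}\ntext"
    print(merge_also(page))
```

The order of the first template is preserved and new items from later templates are appended, which is only one of the possible merge policies.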

Re: Chinese characters

See: User_talk:Nbarth#Chinese_characters

As per your recent comments – thanks for the feedback, and no worries! Trust your trips have gone well, and happy hacking.

Nils von Barth (nbarth) (talk) 20:25, 14 September 2008 (UTC)

Wiktionary:Votes/pl-2008-09/Whitelisted users autopatrol

Would you mind rewording per the talk page, since you can do so more intelligently than I? Thanks.—msh210 16:42, 15 September 2008 (UTC)

Yes, I'll get to this :-) Robert Ullmann 18:55, 15 September 2008 (UTC)
Oh, sorry, I saw msh210's {{premature}} tag and WAS BOLD before seeing this here. (Not sure why I have your talk-page watchlisted, actually …) Anyway, please feel free to trample my changes. :-)   —RuakhTALK 01:54, 16 September 2008 (UTC)
Thank you, gentlemen.—msh210 16:55, 16 September 2008 (UTC)

Template:FFFF

I noticed you added this template to requests for deletion. I was about to reply, but there was an edit conflict and you removed it. Are you withdrawing the rfd? I was about to write this:

And you completely misunderstand. It's for letters that are themselves uniquely collated. For example, in Romanian, t and ţ are collated as entirely different letters. t < tzzzz < ţ < u. If you want to collate letters with another letter, you just type in that letter for the sort order. But with this particular template, it's not the case.
But you understand now what it's actually for? Maybe I just don't understand how Spanish is collated, but my understanding from years ago was not to collate letters with diacritics as the same. :3 But the letter collation order for Romanian specifically treats letters with diacritics as separate letters collated after the letters they modify but before the next letters. - Gilgamesh 18:09, 16 September 2008 (UTC)
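
The ordering t < tzzzz < ţ < u described above can be illustrated with a sort key that replaces each separately collated letter by its base letter plus a very high code point. The sketch below assumes, from the template's name only, that U+FFFF is the character used; the mapping table is a small illustrative subset and `sort_key` is a hypothetical name, not the template's actual output.

```python
# Assumed mapping: diacritic letter -> base letter followed by U+FFFF.
# Only t/s variants are listed here; a full Romanian table would need more.
SEPARATE_LETTERS = {
    "ţ": "t\uffff",
    "ț": "t\uffff",
    "ş": "s\uffff",
    "ș": "s\uffff",
}


def sort_key(word):
    """Build a key that sorts a diacritic letter after its base letter
    followed by anything else, but before the next base letter."""
    return "".join(SEPARATE_LETTERS.get(ch, ch) for ch in word.lower())


if __name__ == "__main__":
    words = ["u", "ţ", "tzzzz", "t"]
    print(sorted(words, key=sort_key))
```

Sorting by raw code point would instead put ţ after u, which is the behaviour a sort key like this is meant to avoid.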

see User talk:Gilgamesh, nothing else about this here.

Latest crisis

Have you noted this discussion in the Grease Pit? WT:GP#Navigation issues Someone messed with some code in a way they shouldn't have, and it's broken the link to the main page, among other things. --EncycloPetey 19:30, 18 September 2008 (UTC)

Should be okay now. Robert Ullmann 20:03, 18 September 2008 (UTC)
The navigation bar is fixed, yes. RC still doesn't display the page count like it did yesterday, but that could be a separate problem. --EncycloPetey 20:09, 18 September 2008 (UTC)
Fixed. Robert Ullmann 23:45, 18 September 2008 (UTC)

Ambient/Ambience

Apologies for not being au fait with the rules on Wiktionary, but I'm mainly a Wikipedia editor. I have the problem that the Wikipedia article "Ambient" has suggested links to "Ambient" and "Ambience" in Wiktionary. You deleted the "Ambient" redirect (from what I can see, Wiktionary doesn't redirect capitalisations), but the "Ambience" redirect to "ambience" still exists. Should you delete that as well? I've just fixed the Wikipedia problem by making the words lower-case, so this is just a suggestion to aid consistency. Hope it's helpful. Cheers --RexxS 00:13, 19 September 2008 (UTC)

We use page titles as the exact spellings, not as "subject names" as the 'pedia; so the re-direct rules are different. Queries or links will end up on the correct (or present) capitalization. "Ambient" was deleted as it had no internal links (which would just break), "Ambience" had such, since fixed, and will get deleted at some point. Thanks, Robert Ullmann 00:21, 19 September 2008 (UTC)

Where are we again?

Two brief questions: First, in lieu of a dump (an issue which, I understand, is of no concern to you :P), and thus in lieu of an updated User:Robert_Ullmann/L2/invalid, is AF tagging entries with L2's which don't match up with language templates? I seem to recall that it was, but I can't seem to find any documentation on it, nor the cat. Secondly, I was just curious as to whether you had seen my response to your comment on my talk page. That is all. -Atelaes λάλει ἐμοί 06:25, 25 September 2008 (UTC)

In May, you asked at User talk:AutoFormat#Is AutoFormat getting sleepy in its old age? about this. Of course then I was expecting we would still have XML dumps, even if only once a month. I did read your reply. My bookmarks for wiktdev are (both) broken? Don't know where it is. Robert Ullmann 06:56, 25 September 2008 (UTC)
I've added a progress report to the WT:GP#XML dumps discussion. I'm back to having very variable amounts of time on my hands as the holidays are almost over :). Generating XML dumps ourselves is easily possible. Conrad.Irwin 00:21, 26 September 2008 (UTC)

note to self: starting from [4] et seq, look at revids that we don't have current. Robert Ullmann 09:15, 26 September 2008 (UTC)

ping from Template talk:documentation

--Yecril 15:03, 25 September 2008 (UTC)

Wikipedia crap, deleted/ignored Robert Ullmann 23:17, 25 September 2008 (UTC)

The word you used above is vulgar and offensive. --Yecril 09:31, 26 September 2008 (UTC)

crap (sense 2: worthless, of poor quality) is slang, but neither vulgar nor offensive. And quite precisely correct. (If you are imagining a comparison to feces, that is in your mind. ;-) Robert Ullmann 10:03, 26 September 2008 (UTC)

Template:new en noun

Yes, I deleted the wrong entry. (I've done a few things like that before, but I think I was too tired to catch it this time. Thank you for catching it!) --Neskaya talk 13:57, 26 September 2008 (UTC)

User:Robert Ullmann/Trans languages

I don't suppose you'd be willing to update this with a slightly less putrid dump, would you? Many thanks. -Atelaes λάλει ἐμοί 07:16, 28 September 2008 (UTC)

Okay, I'll update it to 13 June plus some. I've written an XML updater, if I can run it for a (longish) while, or start with a new dump sometime in the near future, I can spin new dumps on demand ;-) Robert Ullmann 14:14, 28 September 2008 (UTC)

lang: templates

OK.....it's just, I keep being asked for them. Eg {{lus}} exists, but not {{lang:lus}}, which is breaking in translation tables. Ƿidsiþ 08:31, 28 September 2008 (UTC)

Why do people want to add {t} for languages that are never going to have any iwiki links?
The language named at bee is Mizo: which isn't what template lus says.
I see, if the xxx template does exist, and has wikilinking, then the lang: template is needed (for now; one day they will all be put out of their misery). Robert Ullmann 14:21, 28 September 2008 (UTC)

Autoformat table balancing

Hi,

Can you make AutoFormat balance tables other than trans tables as well? Or does it already? Specifically, I’d be very glad if it took over the trouble of balancing the table at -ure! H. (talk) 14:17, 8 October 2008 (UTC)

I'm happy to help with the manual standardization steps for the non-conforming tables. I don't think I've seriously screwed up too many of the translation tables. (I would like to know if I have, BTW.) DCDuring TALK 14:31, 8 October 2008 (UTC)
This was brought up somewhere, and the conclusion was that the general case of these tables can't (or shouldn't) be automatically balanced because the columns sometimes are used to mean something (this case in col 1, this other in col 2). Also, while a simple flat bullet list could be handled, sublists or text or headings would be a problem. AF can handle this for trans tables by looking for a fairly rigorous format, and tagging exceptions; but the general case is not tractable. Robert Ullmann 14:41, 8 October 2008 (UTC)
How about if we created an {{rfc-auto-sort}} template, which would tell AF that the table really is just a single bulleted list at heart? I know it sounds silly, but it can sometimes be a fair bit of effort to sort and balance tables manually. BTW, major props on the XML dumps. Rock. :-)   —RuakhTALK 18:39, 11 October 2008 (UTC)
Can the special cases be flagged for manual inspection and either reformatting or long-term exclusion from the process? DCDuring TALK 14:45, 8 October 2008 (UTC)
A possible solution would be to add a parameter to {{rel-top}} etc that didn't display anything on the page, but would indicate to AF that it should balance that table. Perhaps {{rel-top|AF=yes}}, so that people could add AF=no to explicitly say this shouldn't be sorted in case we want to change the default behaviour in the future. Thryduulf 15:43, 15 October 2008 (UTC)
I was (sort of vaguely) thinking of something like that; the special cases can't otherwise be distinguished. Maybe it might be useful to have a tag to tell AF to pick the entry up and sort/balance the table once, removing the tag. That would be helpful in the worst cases. (Note that there is both sorting and re-balancing, it might be told to do either/or.) Robert Ullmann 15:51, 15 October 2008 (UTC)
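
As a rough sketch of what "sort and balance" could mean for a table that really is a single flat bulleted list, the following Python refuses to touch anything else. It is not AF's code: the function name is hypothetical, the input is assumed to be the lines between the top and bottom templates, and the `{{rel-mid}}` and `{{rel-bottom}}` names are assumed to be the companions of `{{rel-top}}`.

```python
import math

def balance_flat_list(lines):
    """Sort a flat bullet list and split it into two balanced columns.

    Returns None when the table contains anything other than
    single-level "* item" lines, since such tables may use the
    columns to mean something and are better left for a human.
    """
    items = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        # Reject free text, headings, and sublists ("**", "*:", "*#").
        if not stripped.startswith("*") or stripped[1:2] in ("*", ":", "#"):
            return None
        items.append(stripped[1:].strip())
    items.sort(key=str.lower)
    half = math.ceil(len(items) / 2)  # left column gets the extra item
    out = ["{{rel-top}}"]
    out += ["* " + item for item in items[:half]]
    out.append("{{rel-mid}}")
    out += ["* " + item for item in items[half:]]
    out.append("{{rel-bottom}}")
    return out
```

Returning None for anything that is not a flat list mirrors the "tag the exceptions" approach mentioned above: the caller can flag the entry instead of guessing.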

New dump (forever!)

I would love to see an updated User:Robert Ullmann/L2, User:Robert Ullmann/Trans languages, and Wiktionary:Statistics if you have time. I realize Connel generally did the latter, but we've not seen hide nor hair of him in some time, and the page desperately needs updating. Oh, and did I mention "fantastic job, you've saved the day!"? Whatever, get to work. -Atelaes λάλει ἐμοί 18:54, 8 October 2008 (UTC)

Need to look at the level2.py code a bit, I have different programs generating different language tables, and should sort it. Trans languages I can run again (will in a few minutes). The statistics are generated by code Connel has, not me. Oh, I re-ran User:Robert Ullmann/t18, AF and others have done about 1/2 of them in the last two weeks. Robert Ullmann 16:56, 9 October 2008 (UTC)
Reran L2 and Trans languages Robert Ullmann 17:45, 9 October 2008 (UTC)

Not counted

I'm quite happy to have a fresh run. It seems to me that some of the entries are not fixable, inasmuch as they consist entirely of templates that do not allow brackets in their arguments. On the next occasion that it is run - and only if it is easy to execute - it would be nice to exclude such, unless my imagination or knowledge is simply deficient in knowing how to make them "count". Another approach would be to alter the templates, but, in some cases that seems contrary to our goals (eg, {{only in}}). Let me know your thoughts. DCDuring TALK 23:22, 10 October 2008 (UTC)

Languages for entries created by Tbot

Hi, Robert. For what languages does Tbot create new entries (based on existing translation tables)? Will languages such as Bislama or Volapük have their entries created by Tbot? Cheers, Malafaya 21:06, 11 October 2008 (UTC)

The languages that have their own wikts. Tbot cross checks with the corresponding entries there. So yes for both of those languages, using entries from bi.wikt and vo.wikt. But if no-one is populating entries in those wikts, Tbot will not be able to do much. Robert Ullmann 13:59, 14 October 2008 (UTC)
Well, given that the Bislama wikt is closed, not so much. (Why do they do that, it makes it impossible to get the project started!) Robert Ullmann 14:02, 14 October 2008 (UTC)

server/xml/dumps

So... what's happening with the xml dump? The dump on the server, is that all entries, all namespaces, all revisions, or what? How would I import it into devtionary.org/mw/phase3 ?? - Amgine/talk 22:41, 11 October 2008 (UTC)

The dump is all content entries (no talk space, no User, Wiktionary, etc). I don't know anything about loading a wiki from the XML format ... Robert Ullmann 13:55, 14 October 2008 (UTC)

Webster's

I have a copy of the 1913 edition. And its title is "Webster's New International Dictionary of the English Language." Wahrmund 01:16, 14 October 2008 (UTC)

You have this.
We are referring to this, which is the source of the MICRA/ARTFL online copy and the other online sites, including the imports to Wiktionary. Robert Ullmann 06:47, 14 October 2008 (UTC)

User:Robert Ullmann/L2/invalid

This is now sorted, with the exception of a few on which I am awaiting replies from folks or are simply beyond my reach (now noted on the talk page). If it's not too much trouble, it would be sweet if this could be updated on a weekly basis. Many thanks. -Atelaes λάλει ἐμοί 06:44, 17 October 2008 (UTC)

Antarctican

"do essentially the same thing, without DL and DD tags in text (how would anyone parse that? they'd just end up with them in the quote(s))"

I'm not sure I follow... sewnmouthsecret 15:34, 17 October 2008 (UTC)

Think of the poor person trying to parse the XML or page source to re-use our content in some application (which is, of course, the primary purpose of the whole project). All of the quotations parse fairly nicely, and then they hit an explicit DL tag in this one anomalous case. What to do with it? If noticed, it is a pain, if not, then "DL" shows up in their application output somehow. Robert Ullmann 15:40, 17 October 2008 (UTC)
Excuse my lack of knowledge: I am not a coder, firstly. I removed the Neologism tag simply because this is not a neologism. I don’t think we should leave a tag in for potential coders when the tag misleads readers.
Also, what are DL and DD tags? sewnmouthsecret 15:46, 17 October 2008 (UTC)
Sorry, I was re-editing from a previous version when you removed the warn neo template, that wasn't what I was fixing. I was taking out the HTML tags put in to try to force formatting in two of the quotations. Robert Ullmann 15:57, 17 October 2008 (UTC)
I was wondering what coding had to do with what I did! :) sewnmouthsecret 16:02, 17 October 2008 (UTC)
MediaWiki gives us these tags for a reason: there are some things that can be done with the syntax that resembles (X)HTML, but not with the syntax that tries to be editor-friendly. It sucks that wikisyntax has so many special cases for external applications to worry about, and I support efforts to standardize our wikitext by choosing only one variant when two things are equivalent; but when something should be formatted a certain way for users' sake — both our readers, and those using well-done external applications — the external applications need to suck it up. Especially, your statement that "'DL' shows up in their application output somehow" seems like a stretch: a parser who doesn't want to deal with (X)HTML-like syntax should strip out tags, not escape them. (The existence of (X)HTML-like syntax is a fundamental property of wikitext. Applications that don't want to handle all the details, don't have to, but they have to take basic steps to not produce garbage.) —RuakhTALK 16:59, 17 October 2008 (UTC)
MediaWiki doesn't "give us" those tags at all. They just don't get stripped by HTML tidy.
You are being WAY too demanding of applications reading the wikitext. Most people doing things are nowhere near the level of ability to deal with several levels of syntax parsing. (Strip out HTML tags? even some very good programmers don't know how to strip tags properly! And of course if they do that, they get "page 62" in the quotation text.) We have a format for quotations that is at least decently tractable. Hacking in HTML to try to push some bits around is an atrocity. (I'm one of the better programmers around, and I would scream curses at the perpetrator if maintaining an external app and I came across a crock like that.) Robert Ullmann 17:10, 17 October 2008 (UTC)

good job

I'm claiming that is the millionth article! --Jackofclubs 16:03, 18 October 2008 (UTC)

that's the way I count it too by the {NUMBEROFARTICLES} counter; but that is known to be broken. (By its criteria, entries with brackets in them, we are supposedly not there yet; it was mis-reset; by a real count, we were there two days ago.) But so what? that is what people look at most of the time, even if badly wrong ;-) Robert Ullmann 16:06, 18 October 2008 (UTC)
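
The counting criterion mentioned here (an entry counts only if its text has brackets in it) can be approximated in a few lines of Python. This is a sketch of the rule as described in this thread, not the software's actual implementation; the function name and the sample page texts are made up for illustration.

```python
def is_countable(wikitext):
    """Approximate the counter's rule: the page text contains a wikilink."""
    return "[[" in wikitext

# Hypothetical page texts, for illustration only.
pages = {
    "template-only entry": "# {{some form-of template|example}}",
    "entry with a link": "# Plural of [[example]].",
}
print(sum(is_countable(text) for text in pages.values()))
```

Under this rule an entry built entirely from templates is not counted until a literal link appears in its wikitext.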

Exclusions from maintenance lists

Answering your note, I haven't come across any major classes of items to remove.

My last thought was that I would like a way to "subscribe" to maintenance lists and then have the items removed, either by elimination of the tag (by anyone) or by my own determination that it was beyond my skills or interest. It's a wishlist item that interested Ruakh a bit, I think, but not anyone else, AFAICT. DCDuring TALK 01:36, 21 October 2008 (UTC)

{{legal}}

Per the BP discussion of {{law}} I was going to change the label of {{legal}} to read (legal) but I hesitated because I wasn't sure if I should change the topcat param also. It would be better to have the cat=label, but would I have to delete and create all the new categories with that change? --Bequw¢τ 07:25, 22 October 2008 (UTC)

The cats aren't a problem; the presentation in the entries is the problem. It should read "law"; legal makes no grammatical sense in some (perhaps all) cases. And note that people can (perhaps should) just use {{Law}}, the "legal" name is just a workaround. Robert Ullmann 10:06, 22 October 2008 (UTC)
I'll "let sleeping cats lie" I guess:) Thanks. --Bequw¢τ 01:28, 23 October 2008 (UTC)

SemperBlottoBot borken

How do I "re-synch" or otherwise fix this new problem? SemperBlotto 17:00, 25 October 2008 (UTC)

Gory details in the WT:GP. Robert Ullmann 17:16, 25 October 2008 (UTC)

Template:pt-verb form of

Thanks. :-) As you said, {{qualifier|Brazil}} looks better. Done.

When I made the {{pt-verb form of}}, I was trying to make a simpler way to gather all the information required to generate one or more definitions. So, it would be something like {{pt-verb form of|achar|ar|indicative-present-singular-third|imperative-affirmative-singular-second}} to generate: Third-person singular (ele, ela, also used with você) and second-person singular (tu) affirmative imperative of verb Template:wlink. Now it is possible to use {{{1}}} instead of infinitive, but even if I chose to use named parameters, it could change anytime. Daniel. 14:26, 28 October 2008 (UTC)

Looks good. I added 'pt-verb form of' to the list that AF can add links to so as to make the page countable. It will find any existing or new ones. Robert Ullmann 14:43, 28 October 2008 (UTC)
It will be useful, thanks again. Daniel. 17:24, 28 October 2008 (UTC)

Just curious...

I recently saw a message left by you in one of the discussion rooms (mind you the date wasn't very recent at all) as a reply to someone who said we were just 300 entries behind the French. You said they would be back from their vacation soon. Were you referring to the rentrée?--50 Xylophone Players talk 16:05, 28 October 2008 (UTC)

I don't remember. If it was in August, quite possibly. Robert Ullmann 16:07, 28 October 2008 (UTC)
Thanks, so you mean the French Wiktionary is not very active during that period?--50 Xylophone Players talk 17:06, 28 October 2008 (UTC)
In France, public life is more or less suspended during the summer holidays. Compare w:fr:Rentrée and references therein. -- Gauss 21:25, 3 November 2008 (UTC)

cites from WP

Once the entry is added and properly cited (or in clearly widespread use), can the cites page be deleted?—msh210 17:40, 28 October 2008 (UTC)

The general idea is that illustrative or otherwise useful quotations go in the entry; other stuff on the citations page. (This is just like dictionaries like the OED, which maintain huge clippings files, and then put useful quotes in entries.) If it ends up with the WP cite the only thing on the citations page, and it is useless, then could be deleted. But in any other (i.e. most) circumstance, it is harmless to leave it. Robert Ullmann 17:45, 28 October 2008 (UTC)
Oh, okay. Seems to me the WP cite of agencies falls into the former category (the WP cite is the only one, and it's useless, since the cite uses the word exactly the way it's used countless times daily), so it's deletable. That said, I won't delete it. Thanks for the clarification.—msh210 17:51, 28 October 2008 (UTC)
Interesting thing there is that the code picked up an English plural missing from an existing entry. Citations:gyms is probably even more useless once we have the plural. I should think about forms and such. Which is the main purpose of the experimental code. (I also seem to spend more time telling it how to fix WP spelling errors than getting new cites for us. ;-) Robert Ullmann 18:14, 28 October 2008 (UTC)

AutoFormat bot

Hi Robert,

Is the Python source code of your bot available somewhere? I don't like reinventing warm water, especially since you have done such a good job already. I may restart work on the bot I started to work on about 3 years ago. I'm aiming for a bot that would be semi automatic, rather more a help for editing and interchanging with other projects. Polyglot 20:57, 3 November 2008 (UTC)

User:AutoFormat/code -Atelaes λάλει ἐμοί 21:03, 3 November 2008 (UTC)
But notice that the code is not released under the GPL; also, notice that the page is 13.5 months out of date, so might not use the new edit API (I say "might not" because I haven't actually looked through it; it might actually be using an external library, in which case the current version of that library might well use the new edit API). —RuakhTALK 21:42, 3 November 2008 (UTC)
I have no idea what the implication is of it being under GFDL. Shouldn't I use parts of it in a GPL'd project? Should I even have looked at it, if I wanted to incorporate parts of it in a GPL'd project. Should I now release whatever I do under GFDL from now on? Oh well coding is hard enough labour as it is, without such headaches, but that's life, I guess.
All that said, I must say the code is beautiful. I don't understand half of it, with all the regular expression magic et al., but it would certainly speed up my own development if I could borrow parts of it. With the code I had started creating (which is under the wiktionary directory of pywikipediabot and probably looks extremely amateurish), I was trying to parse a Wiktionary entry and store it into Python objects/classes. The intention was not so much to simply clean up en.wikt, but to be able to reexport to whatever wiktionary project including what is now known as OmegaWiki. My focus was on translations. Transtool comes close to what I wanted to do, but no cigar. Anyway, Transtool and AutoFormat do something that's actually useful and my own code has never been used for production use, since my priorities and focus changed all of a sudden three years ago, when my daughter was born. Now my interest has been rekindled and I'm thinking of it more and more while taking care of ttbc:Dutch. Transtool is not suitable since the format coming out of it is not exactly right (*Langname instead of * Langname) and it drops information (sc etc). So it helps for formatting translation entries, but not enough. Polyglot 23:44, 3 November 2008 (UTC)

User:JAnDbot

First, I should say that I agree with blocking this unauthorized bot, especially given the problems I've seen it cause on WP and the lack of constructive responses from its owner. However, the particular edits it made here all look valid to me, since it was removing only interwikis that point to redirect pages, which I've always understood to be incorrect for Wiktionary iw links. Does your understanding on this issue differ from mine? --EncycloPetey 07:48, 4 November 2008 (UTC)

We had several discussions while I was creating Interwicket to replace the (then) defunct RobotGMWikt. The general outcome was that it is a good idea to link to redirects; this respects whatever policy the FL wikt has on using redirects.
As an example, we redirect variant forms of idioms to a canonical form (which is sometimes fairly arbitrary), an FL wikt doing the same, but using a different canonical form, would want to have iwikis to our redirect, and vice versa. I'm sure you can easily see a number of other cases, where one or the other wikt uses redirects for various purposes.
Other wikts, notably sv.wikt, also link to redirects. And VolkovBot is well behaved here, using the standard pybot framework.
The iwiki bot operators are annoying in general, they seem to think that nothing they do can possibly be wrong. I'm sure I don't have to explain to you the problems with these in the 'pedias, where the assumption that A->B means B->A is sometimes invalid, and A->B->C means A->C (or C->A) is frequently so. At least we can match exact spelling. (;-) In the 'pedias, they should only be adding links where the bot runner knows both languages being linked, and something about the subject area. Little hope of that. Robert Ullmann 16:51, 4 November 2008 (UTC)
A comment regarding interwiki redirect links: In some cases, the redirect isn't what it first seems. Consider our link from Aries to la:Aries, which is a redirect to la:aries. This is linked directly from our entry aries, (which is where our Latin entry is anyway).  :P --EncycloPetey 17:10, 4 November 2008 (UTC)

battle

Hi Robert,

I switched on Connel's Javascript code to help with semi-automatic formatting. It created a top/mid/bottom and I should have paid attention to it. I saw it do it, but I'm kinda focused on the translations. That's why I missed it. I didn't know about the top2 template though, so I would probably have fixed it the wrong way anyway. Polyglot 13:26, 5 November 2008 (UTC)

Yes, I recognized it. (CM's JS) Btw: "French" comes after "Finnish" ;-) (but AF easily fixes that) Robert Ullmann 13:35, 5 November 2008 (UTC)

User:Robert Ullmann/Trans languages

I don't suppose this could be set up for a weekly update too? -Atelaes λάλει ἐμοί 22:42, 6 November 2008 (UTC)

When we only had long-delayed XML dumps (which we still do ...) I and CM and so on ran a number of things whenever we could. Now a bit easier, but I am not cron. Okay, it is late, and I still have my Obama t-shirt on (national holiday here). Will run. Robert Ullmann 23:08, 6 November 2008 (UTC)
Well, clearly you're not, but I figured some of your programs could play that little game. :) In any case, thanks. -Atelaes λάλει ἐμοί 23:11, 6 November 2008 (UTC)
Oh, and while I'm being a whiny pain in your ass, I don't suppose you could separate it into valid and invalid, like you have for the L2's, do you? It makes it quite a bit easier to focus on what needs attention, and keep track of progress and whatnot. -Atelaes λάλει ἐμοί 23:33, 6 November 2008 (UTC)

Luo and Swahili

A question about these languages: is 'ck' sometimes used in words in Luo or in Swahili? My feeling was that it's not used, and that Barack is an English spelling of the name (but I don't know the Luo spelling). Am I wrong? Lmaltier 14:16, 7 November 2008 (UTC)

True

I don't have a Wiktionary account.

Of course not. What could I have been thinking?

mwclient

The API based https://fisheye.toolserver.org/browse/bryan/mwclient/trunk/ may be of interest for the next time pywikipedia gets broken by interface changes. Conrad.Irwin 16:07, 13 November 2008 (UTC)

Yes, I've seen that, and already stolen code from it for my mwapi.py ;-) Thanks! (and convenient to have a link here in case I lose it somewhere) Robert Ullmann 16:13, 13 November 2008 (UTC)

Changing the names of given name and surname categories

Would you have time for this problem in the near future? I need to create 20-30 new subcategories for given names, and as topic categories they would just make the task worse later. New categories are created and surnames added every week.

There are about 190 given name categories and 28 surname categories to be renamed. I could create the new categories and check that the old ones are empty, but someone else has to delete them. There is no surname template, should they all be changed by hand?(about 1 000 surnames) I have time in the next two weeks, ca. 2 hours each weekday.--Makaokalani 16:15, 13 November 2008 (UTC)

Okay, I'll find a bit of time to look at this. Robert Ullmann 16:39, 13 November 2008 (UTC)
There are enough given name categories created now so that you can make that magic change. I've purposely left some categories out in order to think about them. I'll be busy sending out "delete" messages for the rest of the week.
An ideal surname template would be similar to the given name one. {surname|from=Latin|lang=de} would produce "A surname" and Category:German surnames from Latin. Ideally an adjective could be inserted: "A patronymic surname", "An occupational surname". SemperBlotto defines surnames like that. But if it's not practical I'll take your word for it. No need to explain because I wouldn't understand in any case.
In the spirit of a child asking Santa Claus for new toys, --Makaokalani 12:32, 18 November 2008 (UTC)
Okay! I have changed the template. Seems to check out properly, will take a little bit for the job queue to move them.
re surname: all very doable; I'll move the existing one to {{new surname}} and create a new one. There is a little bit of magic needed: note your examples "A patronymic", "An occupational" ... "A" and "An" (;-). This takes a small tweak. You will see ... all very good. Robert Ullmann 14:37, 18 November 2008 (UTC)
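
The "A"/"An" tweak amounts to choosing the article from whatever word ends up first. A minimal Python sketch of that choice follows; the real template is written in wikitext, so the function names here are hypothetical, and the first-letter test is a spelling heuristic that would get words like "honorific" or "unique" wrong.

```python
def indefinite_article(phrase):
    """Pick "A" or "An" from the first letter (a spelling-based heuristic)."""
    first = phrase[:1].lower()
    return "An" if first and first in "aeiou" else "A"

def surname_definition(adjective=""):
    """Build the definition text, with an optional adjective before "surname"."""
    phrase = (adjective + " surname").strip()
    return indefinite_article(phrase) + " " + phrase

print(surname_definition())
print(surname_definition("patronymic"))
print(surname_definition("occupational"))
```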

note to self

mbira, bhaji

exported, feathers, scuttled

zii

iwiki stat: 4245067 in union index, 1097695 entries, 6604 possible, 6237 updated

AF bot source code

Hi Robert,

Would you mind sharing the code of AF with us once more? I would like to learn from it to know how things are done nowadays on en.wikt. The code of Tbot would probably also be interesting, but I haven't looked at it yet. I'm trying to create a bot to automate the 'harvesting' work I've been doing manually lately. I was doing this to get a feel for what it involves. Of course if you don't like that I might take inspiration from your code, it's probably better not to post it. I'll find out how to do things the hard way then. I'm still a bit confused about what the implications are of the code being GFDL. I probably won't be copying it wholesale, since what I want to do is different from what AF, Tbot and Interwicket are doing. Also, I didn't think of a license yet. What I did three years ago is publicly visible as Wiktionary.py on PyWikipediaBot. Now, me and Conrad want to pick it up again.

Thanks. --Polyglot 17:15, 16 November 2008 (UTC)

Sorry, I have been meaning to get to this. I updated the code to current. There are a number of issues and a few missing bits that should be documented, but I can't do that right now; will try to get to it. Robert Ullmann 21:46, 16 November 2008 (UTC)
Note in particular that the prescreen queuing is broken, because it expects that the set of see->also conversions remaining to be done is (approximately) the set not in its cache; and Conrad broke that assertion this morning by converting most of them. Hence lots of null edits, hence it just finds that there (supposedly) is not much to do, and sleeps for many minutes at a time. Fixing it now, with a new dump, and keeping myself awake for two extra hours ... 22:21, 16 November 2008 (UTC)
Thanks for updating it. Don't worry about it not being complete. I'm not so much into the prescreen stuff anyway. What I was interested in is how you were sorting the translations tables, given that Chinese, Norwegian and Serbian need special treatment nowadays. After looking at the code and reviewing my 'test balloon entries', I notice that it's not doing that. It was Stephen G. Brown who did it manually. Of course this makes me wonder what's going to happen the next time AF touches it. I checked at pig and AF doesn't break it. nice!
To illustrate, this is what I'm talking about (taken from pig and hydrogen):




My problem is that I take it all apart even further than you do, considering Norwegian and Nynorsk as two completely distinct languages, for instance. So when I'm building up the translations table again, I have to sort it, then move nn under no. Then find out whether more than one Chinese entry is present, find out where it fits, then put Cantonese, Mandarin, Min Nan, Old Chinese and Wu with *: in front. I haven't decided how I'm going to treat Serbian with its two scripts yet. For Japanese with its three scripts, we simply use tr=, but apparently it's not a simple transcription but an actually valid spelling.
Only after doing this reshuffling will it be possible to rebalance, so I'll probably have to do that in a separate loop. I'll share the result once it's finished, but I don't know if it's going to be of much help, since I stored them in a totally different way (indexed by their iso codes). It's probably also not that important for AF to get this right, as long as it doesn't break it once it's done (which it doesn't). --Polyglot 10:36, 18 November 2008 (UTC)
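
The regrouping step described above (sort by language name, then move Nynorsk under Norwegian and the Chinese varieties under a Chinese heading with `*:`) might look roughly like this in Python. It is a sketch under assumptions, not code from AF or from the bot being discussed: the `PARENT` table, the function name, and the placeholder translations are all hypothetical, and it always nests sub-languages even when only one is present.

```python
# Hypothetical grouping table: sub-language name -> parent heading.
PARENT = {
    "Cantonese": "Chinese", "Mandarin": "Chinese", "Min Nan": "Chinese",
    "Old Chinese": "Chinese", "Wu": "Chinese",
    "Nynorsk": "Norwegian",
}

def build_translation_lines(translations):
    """translations: dict of language name -> wikitext for that language."""
    groups = {}
    for lang, text in translations.items():
        head = PARENT.get(lang, lang)
        group = groups.setdefault(head, {"own": None, "subs": []})
        if head == lang:
            group["own"] = text
        else:
            group["subs"].append((lang, text))
    lines = []
    for head in sorted(groups):
        own = groups[head]["own"]
        # A parent with no translation of its own becomes a bare heading.
        lines.append(("* " + head + ": " + (own or "")).rstrip())
        for lang, text in sorted(groups[head]["subs"]):
            lines.append("*: " + lang + ": " + text)
    return lines
```

For example, passing entries for Nynorsk, Mandarin, Dutch, Norwegian and Cantonese yields a bare "* Chinese:" line followed by the two "*:" sub-lines, then Dutch, then Norwegian with Nynorsk nested beneath it. Rebalancing into columns would then run as a separate pass over the returned lines.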

Chinese Zhuang et al

Dear Robert, advice, or help, on making templates for Zhuang would be greatly appreciated. Zhuang is written using several different scripts, the 3 main ones being an official Latin-based script, a Cyrillic script (the Latin script is a transliteration of this script, which it replaces), and Zhuang characters, which are basically Chinese characters. In practice just the Latin and Character scripts require adding; the Cyrillic script is no longer used and if required can be generated from the Latin form by a simple automated process. Entries for Zhuang would therefore appear in two locations, Latin and Chinese. I would like to develop a good format for such entries, after which adding more good quality entries would be straightforward, whether done by myself or others. The format for vunz has a problem in that, whilst it indicates this is a Zhuang noun, this is part of a larger group, namely all Zhuang words. How would it be best to have both categories - Zhuang (for all Zhuang words), and Zhuang noun (which is really a sub-category, a useful category)? This would seem to require a template of some sort. Also the character form is part of a category Zhuang characters; some sort of template may be useful for that.

One problem is that the Chinese entries seem to be a bit of a mess, for example definitions are frequently placed in the translingual section, but such definitions are in fact sometimes language, or even dialect dependent. The result is that almost all Chinese character pages are labeled as needing definitions, and where there is a difference of usage this is not shown. Which is certainly not helpful to readers to say the least. Some automated, or semi-automated solution may work best. This is not a question that has a quick solution, though it would be something worth talking about. As to myself I am a native speaker of English, who is fluent in Mandarin Chinese, with some knowledge of Zhuang. As far as computing is concerned, I write perl scripts to process language data from time to time. I also made some earlier comments in the Greasepit on some of the above. Johnkn63 02:34, 21 November 2008 (UTC)

formatting with 2 etymologies?

Could you check taxis for me? I just changed the format because one version has an alt spelling but not the other so I put etymology 1 and etymology 2 sections around them but I'm not sure those are standard headings. Thank you. RJFJR 14:42, 22 November 2008 (UTC)

Helping to run a tbot instance

Hi Robert,

It seems like I'm back from my, apparently much needed, wiki break. Do you think it would help if I ran an instance of tbot to convert translation entries to t templates and check whether they exist on the foreign Wiktionary? I understand you run an adapted version of pywikipediabot, but I would simply run it with the standard version or with mwclient, not sure yet. If it fails because of timeouts, I'll simply restart it. But the basic question is: would it help? I would be running it as the PolyBot user and I will have to finally ask bot permission for that account, I guess. But first I would want to let it do a few test runs, slowly, so it can be checked that it doesn't do anything wrong. Of course, if I start running the same code you use, there shouldn't be anything wrong with it. --Polyglot 13:58, 25 November 2008 (UTC)

Tbot uses the iwiki links that Interwicket updates. The usual sequence I intended was (run Iwikt) (get XML dump) (run Tbot) (get XML dump) ... repeat. That got extremely disrupted by WMF's inability to produce dumps for 7+ months. Now that we have dailies, I can do that much better. Tbot uses the pywikipediabot framework + a module called mwapi, which is designed to play nicely with the framework (one can do mwapi.getedit and then wikipedia.put on a page). It also uses a local cache to keep from trying the same things over and over; since you would have a different local cache, each instance would duplicate the other. Still something to think about. Robert Ullmann 14:07, 25 November 2008 (UTC)
What if I took all the entries in the Category of English words and simply got it working on all those, sequentially or randomized? It might be that I'm not fully understanding the issue of the local cache. When you say "get XML dump", is that a full dump the next day, or is it getting the next item that needs attention from the XML dump?
I'm glad the answer isn't a flat no. Would the extra processing power be useful? The idea of proposing this came about because I did something to detect existence on another Wiktionary for the Python code I created to harvest those interwikipedia links. It's not that hard to do. It only uses some bandwidth and probably processing power on the WM servers. Of course converting what is in the entries now to a t template is another matter. I'll probably have a look at your code to see how you accomplished that.
Oh, and do you think it makes sense to propose to add the translation comment to the t template as well? I would like that because the data becomes a little bit more structured that way. I'm talking about things like (archaic), (used in such and such region), (typically ...), (older sister), (parental uncle), etc. --Polyglot 00:18, 26 November 2008 (UTC)
The code uses a lot of heuristics, and takes a bit of watching when running; it also has various bits and parts floating around because I hadn't/haven't yet tried to package it for someone else to use. You could probably deal with all that.
It looks at the FL.wikt entries (just as easy as getting the en.wikt, but then there are lots of formats), and uses the latest XML dump to minimize references to the live DB; not so much to save load on the servers, but to keep from doing a lot of unneeded ops. Then it uses the cache to keep from repeating the ops it does do. (If "foo" isn't in xx.wikt today, no sense looking tomorrow or next week. After 2-3 months or so, sure.) Do look at the code, but it isn't current. (May fix that in a while, I'm running it now to see where I am.)
I wouldn't add the other stuff to t (even the genders aren't really needed, except that it makes it a bit simpler in a lot of cases). Consider what happens when the link is part of a phrase (with an article or such): {t} needs to fit inside. Even in that case, it is often necessary to separate gender so the marker(s) aren't in the phrase. And the qualifiers very often apply to several terms, especially when they are regions. (And so on ;-) Robert Ullmann 14:22, 27 November 2008 (UTC)
When editing manually it's easier/quicker to type the genders when they are included in the t templates. What I like about it is that it keeps everything for one term together. It makes them inappropriate for use in phrases though, that's true. When the qualifiers apply to more than one term I think this qualifier should be in front followed by a colon. But how does one know how many of the following terms belong to that qualifier in an unambiguous way? Maybe the ones that belong together should be separated by semicolons instead of commas then. If they had been included in the t template there would have been no other option than to repeat them all the time. Very verbose, but also very clear. You probably already understood that I'm trying to be able to process the contents of Wiktionary in an automated way. Having qualifiers that apply to more than one translation makes this endeavour a little bit harder than it already was.
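The convention floated in the comment above (a qualifier in front followed by a colon, with groups that belong together separated by semicolons instead of commas) can be sketched as a small parser. This is only an illustration under stated assumptions: the function name `parse_translation_line` and the exact input format are hypothetical, not anything Tbot or the t template actually uses, and terms that themselves contain a colon or semicolon would need extra handling.

```python
def parse_translation_line(line):
    """Split a translation line into (qualifier, terms) groups.

    Assumed, hypothetical format: groups are separated by ';', and a
    group may start with 'qualifier:' that applies to every term in it.
    The qualifier is None when a group has no colon.
    """
    groups = []
    for chunk in line.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        qualifier, sep, rest = chunk.partition(":")
        if sep:
            qualifier = qualifier.strip()
        else:
            # No colon: the whole chunk is a list of unqualified terms.
            qualifier, rest = None, chunk
        terms = [term.strip() for term in rest.split(",") if term.strip()]
        groups.append((qualifier, terms))
    return groups


if __name__ == "__main__":
    for group in parse_translation_line(
        "Brazil: ônibus; Portugal: autocarro, camioneta; carro"
    ):
        print(group)
```

With semicolons marking the group boundaries, the scope of each qualifier is unambiguous without repeating it for every term.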

I'll hold off a bit with the idea of running a tbot instance. I understand what you mean by cache better now. It's the information you gathered from the XML dumps in a more accessible way for the python code. If I were to start running the same code, they will probably start chasing each other's tails. I'll keep updating the t templates for the entries I'm processing and leave it at that for the time being. You are right it doesn't make sense updating them on a daily/weekly basis.

Kind regards, --Polyglot 21:58, 28 November 2008 (UTC)
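For readers following the cache discussion above, here is a minimal sketch of the "don't look again for 2-3 months" idea. It is not Tbot's code, which is not shown in this thread; the class name `NegativeCache`, its methods, and the 75-day interval are assumptions chosen for illustration.

```python
import time

# Assumed retry interval: roughly two and a half months, in seconds.
RETRY_AFTER_SECONDS = 75 * 24 * 60 * 60


class NegativeCache:
    """Remember failed lookups so they are not retried too soon."""

    def __init__(self, retry_after=RETRY_AFTER_SECONDS, clock=time.time):
        self._retry_after = retry_after
        self._clock = clock
        self._missing = {}  # (wiki_code, title) -> time of the failed lookup

    def record_missing(self, wiki_code, title):
        """Note that `title` was not found on the wiki `wiki_code` just now."""
        self._missing[(wiki_code, title)] = self._clock()

    def should_check(self, wiki_code, title):
        """Return True if the title was never checked or the entry has aged out."""
        checked_at = self._missing.get((wiki_code, title))
        if checked_at is None:
            return True
        return self._clock() - checked_at >= self._retry_after
```

Because the cache is local, two independent instances would each start empty and repeat each other's lookups, which is the duplication concern raised earlier in the thread.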

User:Robert Ullmann/L2/invalid

Something's wrong here. A bunch of good L2's have been picked up, presumably because of the new template format? -Atelaes λάλει ἐμοί 19:50, 25 November 2008 (UTC)

Yes, needs some work .... Robert Ullmann 23:16, 25 November 2008 (UTC)
That should be better. Robert Ullmann 05:56, 26 November 2008 (UTC)
Many thanks Robert. -Atelaes λάλει ἐμοί 07:09, 26 November 2008 (UTC)

Overriding inline styles in script templates

The nested spans method which we discussed at Wiktionary:Grease_pit#Move_inline_fonts_from_script_template_to_common.css has a problem. It can't be used when the font-size is set, because relative font sizes accumulate in the cascade.


<style>
 .CLASS { font-size: 125%; }
</style>

<span style="font-size: 125%;"><span class="CLASS"> footext </span></span>

In the above example, “footext” ends up being resized 125% × 125% = 156.25%.
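The arithmetic can be checked with a short Python sketch. The helper name `effective_scale` is made up for illustration; it only models the multiplication of nested percentage font sizes, not any browser's actual rounding.

```python
def effective_scale(percentages):
    """Multiply nested relative font sizes (given in percent) into one factor."""
    scale = 1.0
    for pct in percentages:
        scale = scale * pct / 100
    return scale


if __name__ == "__main__":
    # An outer span at 125% wrapping an inner class at 125%.
    print(f"{effective_scale([125, 125]):.4%}")  # prints 156.2500%
```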

My tentative solution is to nest a dummy span inside, to provide a hook for a more-specific style-sheet rule:

<style>
 .CLASS span { font-size: 125%; }
</style>

<span class="CLASS" style="font-size: 125%;"><span> footext </span></span>

Unfortunately, it makes the code for overriding the style rather specific. Any thoughts? Michael Z. 2008-11-26 18:03 z

(sigh, did a hour of testing this morning before I discovered you have modded ug-Arab ...)
I changed it again, using the old <font size=3> inside the class, to be removed later. This means that the size can't be customized until the 30 days runs out, but then it couldn't be customized before either (;-). size=3 seems to do about the right thing but not perfect, for the tests I've run; and it will go away presently. It is supposed to be =large (which doesn't work! it does relative, not absolute!) which is =120% if done to spec. You think? Robert Ullmann 08:09, 27 November 2008 (UTC)
See this rev of my sandbox, especially the last two lines. They look the same to me in FF, IE7 and Chrome (WinXP on a fairly high-res screen). The fonts are customizable, as is everything else except size, which will be in a month when we can remove the cruft from the template. Good? Robert Ullmann 08:30, 27 November 2008 (UTC)
I think I've restored the font size in all of these templates for now, and I will look into the overriding issues in detail after I finish my cleanup. This morning I still have to restore the non-MSIE font specs for a few of the templates. I've responded about that on my talk page in a bit more detail.
Sorry for the screw-up, and I really appreciate your patience and understanding. Regards. Michael Z. 2008-11-27 16:33 z
Karibu. To be precise, I am suggesting doing:
<span style="font-family: (fonts) ;"><span class="UG"><font size=3>{{{1}}}</font></span></span>
as font size=3 does 120% absolute, and that is very close in all the cases where we are increasing the size. Everything except size can be customized ordinarily (not "span.UG span" ;-) and in 30 days we remove the outer span and the font element. Robert Ullmann 16:57, 27 November 2008 (UTC)
I think I've made the style sheet emulate the original templates now, regarding MSIE/non-MSIE specs. I posted details at the bottom of Wiktionary:Grease pit#Migrating inline styles to the style sheet. Please check the display of scripts you are familiar with.
In your sandbox rev, the last two lines look identical in Safari/Mac and Firefox/Mac (actually, a few pixels are different in FF, but the size is virtually the same). They also stay identical to each other, or very close, as I step my browser's text display much smaller or larger.
<font size=3> should be equivalent to <span style=font-size:medium;>, or 16px, which I think would be one step larger than the default size of text in Wiktionary (it is set to x-small in the body, and 127% in div#globalWrapper, which rounds out to 13px or equivalent to small, in Safari.). I'm not sure how the font element interacts with CSS.
By the way, I just found out that CSS 2.1 no longer mandates 120% per step, but I don't know if or how this is reflected in released web browsers.[5]
Short answer: looks like it works. Michael Z. 2008-11-27 19:31 z

Game 4

Hmmm... you may have already killed that game, unless someone thinks of a word starting with "lls-". Maybe something Welsh? :) --EncycloPetey 16:37, 28 November 2008 (UTC)

As you can see, I reconsidered that. Just too cruel. (I've already discovered an advantage in having my Wikamusi ya Swahili, with lots and lots of words that start with "nd" and "mt" and things like that. Wanted to use (Catherine) Ndereba, but I think I'll avoid proper nouns.) Robert Ullmann 16:48, 28 November 2008 (UTC)

Dual redirect fixes

Yeah, those were dual redirect fixes... Hope nothing broke :( Sfan00 IMG 20:49, 28 November 2008 (UTC)

Surname template and Japanese given names

Could you please link the word "surname" in the surname template? And the dot could be taken away, too; it's easy to type a dot, but taking it out is complicated. I'll be starting on surnames next week.

In the Category:ja:Female given names there is always a hiragana spelling after the category name, like this for 亜華実:

 [[Category:ja:Female given names|あけみ]] 

Am I allowed to take it out and delete the category? If not, there's no point in using the template. But EncycloPetey is planning to change all male/female categories into masculine/feminine (sigh); if I don't use the template, he'll have to repeat everything I do - for 293 names.--Makaokalani 14:48, 1 December 2008 (UTC)

He is planning on doing WHAT? And where was THAT discussed? It does NOT make any sense for languages without noun gender (English for example). Is just so totally wrong. And I for one haven't heard jack about it until this moment.
In the surname template, the dot is a good default, and you just use dot= in the cases where it has to be replaced or suppressed. The given name template should have been done this way, as are all the others (with dot or nodot).
Japanese categories are sorted by hiragana, so we need a sort key added to the template. I'll look at it. Presently. (;-) Robert Ullmann 15:00, 1 December 2008 (UTC)
Wiktionary:Beer parlour#Category:Lithuanian male given names and co. I'm glad if somebody can dissuade him:-).--Makaokalani 15:17, 1 December 2008 (UTC)
Yes, thank you; I've found it. The others who have commented disagree with him as well. And something you might notice: we didn't change the male/female part of the category names, just the form of the language specifier. And the previous names have been male/female for years without EP ever objecting or suggesting there might be an "improvement" to be made. So don't get caught in the argument he tries to make (in the BP disc.) that these changes weren't discussed or appropriate. okay? Robert Ullmann 15:29, 1 December 2008 (UTC)

sc- classes

I figured someone would miss this until I got to work on it. Not too late to change, but please see Wiktionary:Grease pit#New classes for script templates and comment. (Maybe I should cross-post to the BP.)

Short story: having a prefix will help avoid problems in managing up to 130 or more script classes. Michael Z. 2008-12-01 23:35 z

They are already in a fairly unique space. We had this discussion re: the template names, and decided that they were fine. If we were to use a class prefix, "sc-" is a fairly bad choice: what does "sc-Latn" mean? Sardinian in Latin script? (;-) Robert Ullmann 07:48, 2 December 2008 (UTC)

Iwikt stats

last run:

4227953 in union index, 1084763 entries, 6115 possible, 5507 updated

about to start next run Robert Ullmann 23:32, 4 December 2008 (UTC)

4245067 in union index, 1097695 entries, 6604 possible, 6237 updated

ending 6 December Robert Ullmann 11:43, 7 December 2008 (UTC)

User:Robert Ullmann/Trans languages

Poke. -Atelaes λάλει ἐμοί 10:57, 7 December 2008 (UTC)

Wow, that was fast. However, there's the same problem with the new language template format. Now, I imagine you are already aware of this, and are currently tweaking the code to fix it (and if not, you should pretend this is the case). Also, while I've already mentioned this, would it be at all possible to divide the sheep and the goats, like the L2's? Makes it more convenient for us drones to clean up. Many thanks. -Atelaes λάλει ἐμοί 11:32, 7 December 2008 (UTC)
Yes, I know, running again (give it about two more minutes). The table is sortable; you can sort on the code column to separate them. Robert Ullmann 11:42, 7 December 2008 (UTC)

missing languages

For some reason, the latest run of User:Robert Ullmann/Trans languages jumps from Japanese to Polynesian, omitting languages like Kurdish, Lao, Mandarin, Novial, and Occitan that certainly are used in Translations sections. Any idea what happened? --EncycloPetey 17:04, 7 December 2008 (UTC)

Is all there, some cruft from an entry that got into the language name hiding it. Fixed both. Robert Ullmann 17:10, 7 December 2008 (UTC)

Italian missing forms

Hi there. I'm about half way through cleaning up this little lot. Do you want me to remove batches from your list as I do them? SemperBlotto 10:16, 11 December 2008 (UTC) p.s. I think I've now finished them - perhaps you might like to recreate it to see if I've missed any. SemperBlotto 12:26, 11 December 2008 (UTC) p.p.s This showed up a bug in my bot's noun program - now fixed.

I'll recreate it presently (maybe about 7-8 hours from now? 01:00 UTC? Not sure what I'll be doing). Robert Ullmann 17:18, 11 December 2008 (UTC)
yes, much better ... Robert Ullmann 00:45, 12 December 2008 (UTC)

split page Missing forms/English?

How hard would it be to split User:Robert Ullmann/Missing forms/English into multiple pages next time you generate it? It's so long that it is slow to load. RJFJR 16:49, 11 December 2008 (UTC)

Reducing the size of it might be better (;-). I'm going to get rid of the rest of the FL forms, and a couple of other things tbd; I'll see about splitting it if still out of control. Robert Ullmann 17:13, 11 December 2008 (UTC)
I'm working on the entries listed, but every time I go back to the page for the next entry there's a delay while I wait for it to load, which slows down shortening it. RJFJR 20:34, 11 December 2008 (UTC)
You aren't just switching tabs (or windows?) Robert Ullmann 19:17, 12 December 2008 (UTC)

Language attributes

I've written up a proposal at User:Mzajac/Language attributes. Would you mind having a look and commenting on the talk page? Thanks. Michael Z. 2008-12-12 19:04 z

I've looked at it just for a minute or two, looks good, but I am going out for a few hours. There are some fixes to the template syntax to make it do what you are trying to do (;-); is it okay if I just edit it later for that? Cheers, Robert Ullmann 19:16, 12 December 2008 (UTC)
Yes, please go ahead. Thanks. Michael Z. 2008-12-12 20:48 z
I think I have figured out the parser functions, and I've added a working test case. Please check if the code is optimal. Michael Z. 2008-12-14 19:45 z

Entries which need [x] script

Would it be possible to get a list (or have autoformat tag) of entries that have Greek (or another language) in the etymology section, but which are immediately followed by something that is not in the correct script? I realize that there are a lot of variables in etymologies and it might not be possible to recognize automatically when the wrong script is used, just wondering if it's feasible. Nadando 03:15, 14 December 2008 (UTC)

Oooh, I second that; would be very nice. -Atelaes λάλει ἐμοί 04:27, 14 December 2008 (UTC)
Sorry, I haven't had much time, and been busy with what I had. Suppose I create a list of entries with Etymology sections that (a) contain "Greek", "etyl|el" or "etyl|grc" but (b) do not contain any Greek characters? As a first pass. Then we will have a better idea whether and how to check for immediate adjacency, and how many there might be? Others are then just some sort of table changes. Robert Ullmann 19:05, 18 December 2008 (UTC)
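The first pass proposed above could look roughly like the following Python sketch. It is not the code behind the eventual list; the function name `needs_greek_script`, the header regex, and the choice of Unicode ranges (Greek and Coptic plus Greek Extended) are assumptions made for illustration.

```python
import re

# Assumed character ranges: Greek and Coptic, plus Greek Extended (polytonic).
GREEK_CHARS = re.compile(r"[\u0370-\u03FF\u1F00-\u1FFF]")

# Condition (a): the section mentions "Greek", "etyl|el" or "etyl|grc".
MENTIONS_GREEK = re.compile(r"Greek|etyl\|el\b|etyl\|grc\b")

# An Etymology header (optionally numbered) and its body, up to the next
# header line or the end of the text.
ETYMOLOGY_SECTION = re.compile(
    r"^=+[ \t]*Etymology(?: \d+)?[ \t]*=+[ \t]*$(.*?)(?=^=+[^=\n]+=+[ \t]*$|\Z)",
    re.MULTILINE | re.DOTALL,
)


def needs_greek_script(wikitext):
    """True if some Etymology section mentions Greek but has no Greek characters."""
    for match in ETYMOLOGY_SECTION.finditer(wikitext):
        body = match.group(1)
        if MENTIONS_GREEK.search(body) and not GREEK_CHARS.search(body):
            return True
    return False
```

A plain match on "Greek" will also catch entries that merely mention the word, so some false positives are to be expected from a check this coarse.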
Sounds good to me. -Atelaes λάλει ἐμοί 19:51, 18 December 2008 (UTC)
Okay, sorry it took a while (about 20 minutes work, but took forever to get to ;-) see User:Robert Ullmann/t23. 1233 listed, a few extras like gringo and such which mention "Greek". Also references to Mycenaean Greek and so forth. But almost all what you are looking for. Robert Ullmann 13:49, 19 December 2008 (UTC)
Wow, that'll take me awhile to go through. Thanks Robert. -Atelaes λάλει ἐμοί 06:07, 20 December 2008 (UTC)

Sorry about that.

About the {{see}} templates, that is. Hope everything got ironed out without too much trouble. bd2412 T 06:29, 14 December 2008 (UTC)

kangxi zidian

The pages are back so you can use the link in {{Han KangXi link}}. Koxinga 23:34, 19 December 2008 (UTC)

Did you forget or is there another reason ? Koxinga 00:28, 25 December 2008 (UTC)
Sorry, just hadn't gotten to it. Done. Robert Ullmann 06:24, 25 December 2008 (UTC)
No problem. Thanks for all your work! (Your parsing of the dump to detect possible problems is very good, and I am trying to do something similar on the French Wiktionary.) Koxinga 11:01, 26 December 2008 (UTC)

Template:myn

For codes which are only 639-2, should we be deleting them on site (obviously after taking care of links), or is there a way to tag them so AF doesn't confuse them for kosher language codes or....what? -Atelaes λάλει ἐμοί 03:25, 25 December 2008 (UTC)

(you mean "on sight"? ;-) hmmm... I eliminated the B-codes (so they don't show up even if the template exists, some did at the time). Haven't looked at excluding -2 T codes not in -3. It would be simplest to get rid of the templates (such as this one), but I still need to have the code that analyses the templates check this, as they come and go. Is there any reason why we would want the template? Robert Ullmann 06:22, 25 December 2008 (UTC)
Gah! What an embarrassing typo. As far as I am aware, there is no specific use for these right now. However, there has been a little talk of how to format descent from a language group (e.g. {{Ger.}}). For this template specifically, it is most often incorrectly used, where {{proto|Germanic}} should be, but sometimes it is simply a borrowing from a Germanic language, but no further specification is possible (generally with very old borrowings). I continue to waffle on whether it would be a good idea to open the floodgates of using {{etyl}} and family codes. The beauty of using the 639-2 codes is that it would limit how many groupings we would use (e.g. since 639-2 codes for, say, Germanic but not West Germanic, we would be limited to Germanic), resulting in a workable number of these cats. Then again....there are other things that we might want to note descent from that don't (and shouldn't) have codes, such as Pre-Greek. Perhaps one of these days we should revisit your proposal for ad-hoc etyl stuff. I still say my dialect codes were a good idea, even if I couldn't figure out how to get the code working for them and even if you continue to hate them (was that in any way related to this conversation thread? Perhaps not). Anyway, since we have no current use for these, and since the small number of them would make them relatively easy to create when necessary, I think we should just delete them....er.....on site. ;-) Finally, have you looked at the recent proposal for Latin templates? I don't know what your Latin background is, but I don't think you'd need much to have an informed opinion here, as it's largely technical. Anywho, sorry for the loquaciousness. -Atelaes λάλει ἐμοί 07:49, 25 December 2008 (UTC)

Category:Neapolitan derivations

I created this category along with the associated parents / description templates but for some reason the categories aren't showing up. Do you know what's causing this? {{nap}} looks correct. Nadando 09:33, 26 December 2008 (UTC)

lang=en has to be explicit; Atelaes has fixed it. Robert Ullmann 09:46, 26 December 2008 (UTC)

{{defective spelling of}}

Good to know, thanks! —RuakhTALK 20:09, 27 December 2008 (UTC)

{es-verb-form}

Hi Robert. Now that I have some free time, I decided to try to finish the job that got me blocked last time around. I had been adding {es-verb-form} like this, which was a problem. I think this fixes it, by adding a line break after the inflection line, not before. Let me know if that looks right to you. Dmcdevit·t 10:40, 29 December 2008 (UTC)

Yes, that looks good. Is a quibble in any case, but if one is going to have a bot do something, might as well get it just right. (And of course I didn't block it because it was doing something evil, but because I couldn't get your attention (as you know ;-)). Very good, pray carry on. Cheers, Robert Ullmann 10:46, 29 December 2008 (UTC)