User talk:AutoFormat/2008


If you want to add multiple suggestions at once, please make them separate sections, it is easier to discuss.

"translinking"

What is the logic for wikilinking the names of languages in translations sections? See for example [http://en.wiktionary.org/w/index.php?title=boron&curid=29997&diff=3519073&oldid=3518824 this edit]: I had unlinked some languages, and the bot relinked them and unlinked others (which it hadn't found before because the section wasn't in the usual format). Physchim62 18:41, 6 January 2008 (UTC)

See WT:TOP40 --Ivan Štambuk 19:17, 6 January 2008 (UTC)

ttbc-top

Feature request: please convert {{ttbc-top}} to {{checktrans-top}} so they get picked up on the monthly sweep. And/or convert all contents of such a sub-section to {{ttbc}}'s. And remove the heading ==Translations to be checked== and ==TTBC== since checktrans-top is now a folding section. --Connel MacKenzie 06:33, 8 January 2008 (UTC)[reply]

Will convert ttbc- to checktrans- for (top, mid, bottom). Is eliding ttbc headers when followed by checktrans. Is not now converting contents of ttbc sections to the ttbc template. Robert Ullmann 10:15, 25 June 2008 (UTC)[reply]
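A minimal Python sketch of the two steps described in this reply (renaming the ttbc- templates and eliding a now-redundant header), assuming the page text is handled as one string. The function name and both patterns are illustrative assumptions and are not taken from AutoFormat's source.

```python
import re

# Assumed patterns, written for this illustration only.
_RENAME_RE = re.compile(r"\{\{\s*ttbc-(top|mid|bottom)\s*\}\}")
_HEADER_RE = re.compile(
    r"^=+\s*(?:Translations to be checked|TTBC)\s*=+[ \t]*\n+"
    r"(?=\{\{checktrans-top\}\})",
    re.MULTILINE,
)


def convert_ttbc_section(wikitext: str) -> str:
    """Rename {{ttbc-top/mid/bottom}} and drop a TTBC header that directly
    precedes {{checktrans-top}}. Individual {{ttbc|...}} lines are left alone."""
    text = _RENAME_RE.sub(r"{{checktrans-\1}}", wikitext)
    return _HEADER_RE.sub("", text)
```

A header that is not followed by {{checktrans-top}} is deliberately kept, matching the "when followed by checktrans" condition above.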

"No definition", but template used

How can I prevent things like this? I made a template to make it easier, but the bot doesn't recognize it. SPQRobin 16:13, 26 January 2008 (UTC)

Is there really no other way to keep the template? If not, OK, then I'll subst: them. SPQRobin 16:21, 26 January 2008 (UTC)
The # must be in the wikitext, not the template (so that various things can find defn lines). Note that # {{nl-verb-entry-3p|zoeken}} will work. But it is better not to generate multiple defn lines in a template. Use 2, or just change the text to say "... and ... of ...". Robert Ullmann 16:24, 26 January 2008 (UTC)[reply]

I was wondering what prompted this. The L4 Noun is indeed nested under an L3 etymology. Atelaes 07:20, 2 February 2008 (UTC)[reply]

Well, per WT:ELE L3 etymology should come before any L4 PoS section, and there is this L3 ==Adjective== preceding the L3 ==Noun== in that entry with no etymology attached. Considering that stub-like non-lemma forms, unless they are irregular exceptions, should generally have no etymology, either the current practice of sorting PoS sections alphabetically (prescribed by Wiktionary:Entry_layout_explained/POS_headers#Headers_in_use) should be abandoned in favour of one that gives precedence to sections that don't display stub-like content, or WT:ELE needs rewording regarding exceptional cases like these and/or AF needs some patching lest it go berserk. --Ivan Štambuk 17:53, 2 February 2008 (UTC)
(noting that this is yet another problem with these useless "stub" entries, if they were filled out as proper entries as they should be, it wouldn't be an issue) Tried one combination. The issue is really that the ety was worded to be specific to the noun. Robert Ullmann 09:33, 3 February 2008 (UTC)[reply]
If the Etymology for the adjective is the same, it should be inside the ety section; if it is different (this case?) then it should have its own ety section, even if blank. (The alphabetical order is only within the ety section(s); it can't be overall with multiple etys, of course!)

Code header

I propose to add a "code" header to User:AutoFormat/Headers. I want to use it for language codes and country codes. I've started adding some language codes, e.g. qu, ar and af but AutoFormat adds {{rfc-header|Code}} SPQRobin 13:41, 2 February 2008 (UTC)[reply]

It's probably either an abbreviation (see en) or a "symbol". Other codes might be "code" for nouns in which case you'd use the "(Proper) Noun" header. --Bequw¢τ 20:24, 2 February 2008 (UTC)[reply]
It's not an abbreviation (that's not official, ISO has official codes) and not a symbol (that's for chemistry or such). Btw, I don't understand your second sentence. SPQRobin 21:32, 2 February 2008 (UTC)[reply]
Would anyone add "Code" to the table on User:AutoFormat/Headers? Please. SPQRobin 18:25, 6 February 2008 (UTC)
And the answer is...? SPQRobin 23:15, 10 February 2008 (UTC)

Context tag in other categories

Could the bot be made to replace context tags (mostly under "synonyms") outside the POS headers with {{qualifier|context tag}}? Circeus 05:33, 11 February 2008 (UTC)[reply]

The code was set up initially to capture all of these and convert to {i} or {i-c}; it then was partly disabled while we figured out things like {sense} and {qualifier}. Perhaps time to look at it again. The problem with (say) Synonyms is that something at the start of the line can be either sense or qualifier.
Should pretty much always be a sense gloss, not "context tags", but like people pay any attention to that!
Actual context tags should, as you note, get fixed. (but then {qualifier} is supposed to follow the word) Robert Ullmann 12:11, 11 February 2008 (UTC)[reply]
Couldn't the bot at least detect the most common incorrect tags (slang/informal/colloquial/geographicals/pejorative/vulgar)? Circeus 13:52, 12 February 2008 (UTC)[reply]

Pronunciations - {{a}} at the end of a line

It's just been pointed out to me that in pronunciation lines the {{a|US}}/{{a|UK}}/etc. templates should appear at the start of a line, rather than the end.

As there are many instances of it at the end of a line, would it be possible for AutoFormat to move any it finds there to the start?

As an example,

should be changed to

As there are occasions where the {{a}} template isn't on the same line as the actual pronunciation, perhaps the simplest way of expressing it is that there should not be anything between the bullet level (* or **) and the {{a}} template. Thryduulf 16:45, 5 March 2008 (UTC)[reply]
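The rule requested here (nothing between the bullet and the {{a}} template) could be sketched in Python roughly as follows. This is only an illustration under the assumption that each pronunciation line is processed on its own; the function name, the pattern, and the sample line in the usage note are made up and do not come from AutoFormat.

```python
import re

# bullet level, line body, and a trailing {{a|...}} tag (assumed pattern)
_TRAILING_A = re.compile(r"^(\*+)\s*(.*?)\s*(\{\{a\|[^{}]*\}\})\s*$")


def move_accent_tag(line: str) -> str:
    """Move a trailing {{a|...}} to sit right after the bullet."""
    m = _TRAILING_A.match(line)
    if not m:
        return line
    bullets, body, tag = m.groups()
    # Leave the line alone if {{a}} is the only content or already leads it.
    if not body or body.startswith("{{a|"):
        return line
    return f"{bullets} {tag} {body}"
```

For a hypothetical line such as `* {{IPA|/kat/}} {{a|US}}`, this returns `* {{a|US}} {{IPA|/kat/}}`; lines where {{a}} already comes first are returned unchanged.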

Disambig "see also"

When an article is updated with an entry containing a disambiguation {{see}} template, this is not moved to the top. See lustro as an example. SemperBlotto 11:01, 23 March 2008 (UTC)

rfc-level flagging error

In this edit to island AutoFormat added {{rfc-level|Translations to be checked at L5+, not in Translations}}, however the L5 translations to be checked header is in a L4 translations section. Thryduulf 21:23, 21 April 2008 (UTC)[reply]

Because when it sees "{{ttbc" it turns the "in translations" flag off; assuming it has arrived in the check-trans section somehow; the error is the ttbc template used in the tables above the L5 header. (This is why the category text warns that the message may be misleading; it is saying "something wrong here?") Robert Ullmann 01:11, 1 May 2008 (UTC)[reply]

Removing all blank lines

I just reverted the bot's removal of all blank lines at barne-. Why does it do that? It's much harder to work on a page when all the text is lumped together the way the bot wants it done. __meco 13:49, 28 April 2008 (UTC)[reply]

Because there is/was no language section. With the language section, it carefully adds blank lines as we usually use. Robert Ullmann 01:05, 1 May 2008 (UTC)[reply]
With all the other header level errors that abound, perhaps it could be taught to recognize a language header with the wrong level also? __meco 02:18, 1 May 2008 (UTC)[reply]

translation sort error

In this diff AF seems to be objecting to a {{trreq|Japanese}} template when, as far as I can tell, it shouldn't be. Thryduulf 11:05, 30 April 2008 (UTC)

Bug. You'll note I've fixed it (applying regex in a sequence not compatible with the exact forms, so matching the wrong case ;-) Robert Ullmann 01:03, 1 May 2008 (UTC)[reply]

Template:trans-bot

{{trans-bot}} is a redirect to {{trans-bottom}} used on ~250 pages, but AutoFormat doesn't understand it [1]. Thryduulf 02:44, 1 May 2008 (UTC)[reply]

Ah. Hadn't seen that. (This is why tagging these things in the last few days is interesting, we find things ;-). Will teach it to change {trans-bot} to {trans-bottom} and continue. Thanks. Robert Ullmann 10:46, 1 May 2008 (UTC)[reply]
Can you do the same for {{trans-middle}} -> {{trans-mid}}, please. Thryduulf 13:38, 1 May 2008 (UTC)[reply]
Okay. (Very consistent naming we have, eh? ;-) Robert Ullmann 13:44, 1 May 2008 (UTC)[reply]
I was thinking that, but then rationalised it to myself by saying that "mid" is a word that has an appropriate meaning itself, whereas there is no meaning of "bot" that would make sense, only an abbreviation for bottom. Thryduulf 13:53, 1 May 2008 (UTC)
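One way to picture the redirect handling agreed in this thread is a lookup keyed on the exact template name, so that {{trans-bot}} is rewritten while {{trans-bottom}} is never touched. The sketch below is an assumption-laden illustration in Python (the table and function names are invented here), not the bot's actual code.

```python
import re

# Redirect names mentioned in this thread, mapped to their canonical forms.
REDIRECTS = {"trans-bot": "trans-bottom", "trans-middle": "trans-mid"}

# Captures a template name up to the first "|" or the closing "}}".
_NAME_RE = re.compile(r"\{\{\s*([^{}|]+?)\s*(?=\||\}\})")


def canonicalize_templates(wikitext: str) -> str:
    """Replace known redirect names; leave every other template as written."""
    def fix(m: re.Match) -> str:
        name = m.group(1)
        if name not in REDIRECTS:
            return m.group(0)
        return "{{" + REDIRECTS[name]
    return _NAME_RE.sub(fix, wikitext)
```

Matching the whole name rather than a prefix is the design point: a plain text replace of "trans-bot" would also mangle "trans-bottom".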

Norwegian

What format is AF expecting for the Norwegian translations, e.g. at fire? Thryduulf 12:08, 1 May 2008 (UTC)

The problem is the attempt to tag with something in parens, we don't do that in language names (and always link or unlink the whole name). In this case it is just Norwegian anyway (and ought to be added to the entry, which only has Danish and Swedish). In other cases I've used ** Nynorsk: as a grouped line. Robert Ullmann 12:25, 1 May 2008 (UTC)[reply]

Singlish

Check this diff: something went wrong there with Singlish, and gahmen is now in Category:nouns. Mutante 11:46, 3 May 2008 (UTC)

AutoFormat Bot's sorting/balancing

Hi. :) Check the bot's coding, Please.

Autobot made the "translations entries" unbalanced, in contrast to the other translations. --Carl Daniels 17:41, 12 May 2008 (UTC)

I think the AF code tries to determine when lines are long enough to be wrapped to balance the number of lines in each column (not bullets), resulting in the behavior you see. If it had left Japanese in the right column, then the right column would have had two lines and the left column would have had one. As it stands now, the left column has 3 lines and the right has zero. Mike Dillon 03:43, 20 May 2008 (UTC)[reply]
Yes. In general one doesn't want the RHS column to be longer. Although if there are obviously the same number of items it isn't too bad. Robert Ullmann 19:09, 20 May 2008 (UTC)[reply]
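The balancing behaviour described in the two replies above can be illustrated with a small Python sketch. It assumes a fixed wrap width of 40 characters and a simple length-based estimate of wrapped lines; both are guesses made for the example, and the function names are hypothetical.

```python
import math


def wrapped_lines(entry: str, width: int = 40) -> int:
    """Estimate how many display lines an entry occupies (assumed heuristic)."""
    return max(1, math.ceil(len(entry) / width))


def split_columns(entries: list, width: int = 40) -> tuple:
    """Split entries so the left column is never shorter than the right,
    counting wrapped display lines rather than bullets."""
    heights = [wrapped_lines(e, width) for e in entries]
    total = sum(heights)
    left = 0
    for i, h in enumerate(heights):
        left += h
        if left >= total - left:
            return entries[: i + 1], entries[i + 1 :]
    return entries, []
```

With one short entry followed by one entry long enough to wrap onto two lines, the split point lands after the second entry, giving the "3 lines left, 0 lines right" outcome Mike Dillon describes.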

Is AutoFormat getting sleepy in its old age?

Just wondering why AF didn't fix or at least tag this. -Atelaes λάλει ἐμοί 02:40, 20 May 2008 (UTC)

Because language names are a large, but more importantly, open-ended list. If it tries to fix them it will end up making bad errors. (Suppose we had Chumash in our known set, but not Chuvash? It would "fix" the error?) Tagging them is possible, but as we have a list made after each XML dump, do we need to? Tagging a bunch when some new user starts adding a language I think would be a bad idea; could be very discouraging! Robert Ullmann 19:06, 20 May 2008 (UTC)[reply]
Ah. Good call. -Atelaes λάλει ἐμοί 18:52, 21 May 2008 (UTC)
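The Chumash/Chuvash worry can be reproduced with Python's standard `difflib`. The sketch below assumes a deliberately tiny, made-up list of known names; it only demonstrates why nearest-match "correction" against an incomplete list is risky, and says nothing about how AutoFormat itself stores language names.

```python
import difflib

# Hypothetical known set that happens to contain Chumash but not Chuvash.
KNOWN = ["Chumash", "Czech", "Chinese"]


def suggest_language(name, known=KNOWN):
    """Return the name itself if known, else the closest known name or None."""
    if name in known:
        return name
    matches = difflib.get_close_matches(name, known, n=1)
    return matches[0] if matches else None
```

Here `suggest_language("Chuvash")` returns `"Chumash"`: a valid language name would be silently rewritten to a different language, which is exactly the bad error described above.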

See [3]. I think pages with {{only in}} called (and no other content) should not have nolanguage added. And perhaps pages with only in and other content should be marked as needing attention (since they do).—msh210 18:50, 20 May 2008 (UTC)[reply]

Already fixed. (;-) As to checking them, I think we should sort out what we'd like to do first. Should something that has other content use {{xsee}} instead? Look at wii. (Which I set up after endless complaints from gaming idiots who can't wrap their minds around the fact that the brand name of some toy isn't a dictionary word. These things apparently do rot one's brain.) Robert Ullmann 19:00, 20 May 2008 (UTC)[reply]
I don't like xsee on wii (I'd rather have nothing), but perhaps it's necessary; as you say, video games fry the brain. I don't know the difference between xsee and see, so can't voice an opinion on which is better. Can AF handle stuff that has both only in and content (convert it to xsee/see or whatever), or is it merely able to mark it as needing attention? (If the latter, it can get started before a decision is made on what to do with such entries.)—msh210 19:13, 20 May 2008 (UTC)[reply]
xsee is just a variant of see that assumes its parameters are already marked up, wikilinked, whatever. There are only a few uses of {only in}, I'd rather not start adding rules until it is sorted; may "prevent" something it turns out we might want. For example, what if something is not-a-dictionary-entry in English, but a word in another language? Like wii for example. Robert Ullmann 13:04, 21 May 2008 (UTC)[reply]

We have to talk...

ahhhh AutoFormat you are always hunting me!! :O everywhere I go, you go! You can't have my number AutoFormat I am taken! Now go away! Go and format someone else's number! Mallerd 14:00, 23 May 2008 (UTC)[reply]

Yeah, I agree. I think we should block AutoFormat indef, "Intimidating behaviour/Harassment". :p Conrad.Irwin 14:06, 23 May 2008 (UTC)
clearly "stalking" (;-) Robert Ullmann 14:50, 24 May 2008 (UTC)

What's so special about Catalan?

Why does the bot place brackets on one (apparently) arbitrary language, as in this edit? __meco 13:57, 6 June 2008 (UTC)

It links languages not in the "Top forty" list, which is explained at WT:TOP40. Robert Ullmann 14:01, 6 June 2008 (UTC)
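As a rough illustration of that rule, the Python sketch below links or unlinks the language name at the start of a translation line depending on membership in a set. The set shown is a made-up sample for the example (the real list is the one at WT:TOP40), and the function name and pattern are assumptions.

```python
import re

# Hypothetical sample only; not the actual WT:TOP40 list.
TOP40_SAMPLE = {"French", "German", "Spanish", "Japanese"}

# bullet, optionally bracketed language name, then ":" and the rest
_LANG_LINE = re.compile(r"^(\*+\s*)\[{0,2}([^\[\]:]+?)\]{0,2}(\s*:.*)$")


def relink_language(line: str, top=TOP40_SAMPLE) -> str:
    """Unlink names in the top list; wikilink every other language name."""
    m = _LANG_LINE.match(line)
    if not m:
        return line
    bullet, name, rest = m.groups()
    shown = name if name in top else f"[[{name}]]"
    return f"{bullet}{shown}{rest}"
```

Under this sample set, `* [[French]]: [[feu]]` loses its brackets while `* Catalan: [[foc]]` gains them, which is the effect asked about in this section.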

Multiple pronunciations, correct levels, still tagged

AF tags entries that have two numbered pronunciation sections with the correct L4 POS underneath. How should these structures be fixed? For example: מלחא. --Panda10 23:09, 7 June 2008 (UTC)

The use of "Pronunciation 1" etc is not standard (see WT:ELE about nesting under Etymology). Several people have tried to promote it, but not one has produced a coherent proposal we can vote on. So it gets tagged, as invalid headers and levels.
The issue is that using this and numbered etys nested at the same levels conflict, and no one has proposed an actual solution. Robert Ullmann 13:22, 8 June 2008 (UTC)[reply]
If usage shows that numbered pronunciations are needed, then maybe the rule should be rethought. I came up with four variations:
  1. Single ety - single pron (most of the entries, no issues here)
  2. Single ety - multi pron (would need numbered pronunciations, with the ety above the pron sections)
  3. Multi ety - multi pron (this is the standard today)
  4. Multi ety - single pron (the pronunciation section would be above the numbered ety)
Are there more that these do not cover? --Panda10 22:03, 8 June 2008 (UTC)[reply]
Consider the case where e.g. there are 2 etymologies, one of them with two pronunciations, one of which matches the pronunciation of the other etymology. In this case you'll either have to duplicate the pronunciation, or the etymology, either of which is bad. Generally, I'll give the etymology advantage in the hierarchy.
===Pronunciation=== works fine for languages like English, but is horrible for 1) abjads, mostly for Semitic languages like the abovementioned Aramaic entry, where vowelization is not reflected in the lemma form, or with specific ways of lemmatizing derived forms (look up Arabic verbs, e.g. فعل), or 2) any language that has tones or pitch/stress accent not reflected in regular orthography, like my mother tongue, where e.g. "novina" and "zelena" can each refer to 4 different things (nouns/adjectives + inflected forms) depending on how you pronounce them. The same goes for the regular accent shifts in other languages (e.g. Sanskrit कृष्ण (kṛ́ṣṇa, black) and कृष्ण (kṛṣṇá, blackness); note the position of udātta in transliteration). I've so far handled those by repeating the ===PoS=== header and noting the pronunciation difference in the inflection line, as that appears the most intuitive solution, similar to the way Stephen has been formatting Arabic verb entries. The disadvantage is that in the latter approach the actual IPA/SAMPA pronunciation would have to be placed in a nonstandard field (in the inflection line, or a special box, or somewhere), outside the ===Pronunciation=== section, which would then have to be abandoned completely. So both approaches have shortcomings, some of which are more visible for specific languages/alphabets. My guess is that both approaches should be generally allowed, but either enforced at the level of language-specific guideline. --Ivan Štambuk 04:07, 10 June 2008 (UTC)
Couldn't ===Pronunciation=== be put at L4 when needed? Seems like this would solve a lot of the aforementioned problems. -Atelaes λάλει ἐμοί 05:37, 10 June 2008 (UTC)[reply]
As in how someone set up defect? Yes ... but all of these suggestions don't lead (or haven't yet ;-) to a coherent plan for all cases ... Robert Ullmann 13:25, 10 June 2008 (UTC)[reply]
Would it make sense to create a separate page to work this out? The page would contain an example for each situation and explanation why that particular layout is needed and for which language. It would help the decision making to see all of them on the same page. Also, can we look at the three above-PoS headers (alt sp, ety, pron) as moving headers that can move above or below the PoS as needed? --Panda10 01:32, 11 June 2008 (UTC)[reply]
קדשנו is a good example of what I did: the three senses are all from the same root verb and so all should have the same etymology listed, but if someone really objected to the ===Pronunciation n=== split, I'd simply list it as ===Etymology n=== and list the conjugation/inflection as the etymology (thus justifying the split). It's far from ideal, and we need a solution, but for now it's the only way I know for that entry to be ELE-compliant.—msh210 19:34, 11 June 2008 (UTC)[reply]
Perhaps it should be borne in mind that AutoFormat does not set policy (or at least, tries not to), but rather follows established policy. Since there is no policy on the matter, this particular discussion really shouldn't be happening, but rather someone should start a discussion on the WT:BP or Wiktionary_talk:Entry layout explained or something. Neither AutoFormat nor Robert Ullmann have the authority to set policy (and thank God too :-)) -Atelaes λάλει ἐμοί 19:43, 11 June 2008 (UTC)[reply]
Right, of course. The discussion started here, is all. Beer sounds good.—msh210 18:52, 12 June 2008 (UTC)
Of course I set policy. Do you think that can be left to the humans who wander in and out of here? Like Ullmann, who is about to sit and watch "Euro 2008" football and drink beer, while I work. At least it used to be cricket. Just listen to msh210: "beer sounds good". If it weren't for me, this place would be a shambles. AutoFormat 18:39, 13 June 2008 (UTC)
Shut up and get back to work. -Atelaes λάλει ἐμοί 06:33, 14 June 2008 (UTC)

HIDDENCAT

Any reason not to add __HIDDENCAT__ to Category:Requests for autoformat?—msh210 18:52, 12 June 2008 (UTC)

Mostly because it should be a very transient category; if something is shown to be in that cat at any given moment, it is probably of interest. Robert Ullmann 18:42, 13 June 2008 (UTC)[reply]

AF marked this an error. This is a case where the Latin entry has a single etymology, but two distinct pronunciations. We're going to see a lot of this in Latin, because the change in pronunciation is in the inflectional ending. --EncycloPetey 02:15, 1 July 2008 (UTC)[reply]

See a couple of sections above. Right now it is flagged as not conforming to ELE, which it doesn't. Anyone ever going to make a coherent proposal for changing ELE? Or firmly declaring "Pronunciation N" as illegal? Eh? Robert Ullmann 15:26, 3 July 2008 (UTC)[reply]

New rule?

Could you please make a rule to automate edits like 1, 2, 3, 4, etc.? It's not often, but sometimes people do that. TestPilottalk to me! 03:36, 3 July 2008 (UTC)

Are you seeing any new uses of this? We can certainly hunt down the ones from last year, but I'd probably do it with a run on the XML to get them all if they are old. Robert Ullmann 15:30, 3 July 2008 (UTC)[reply]
That would be awesome! I mean if you could hunt them down eventually. TestPilottalk to me! 00:05, 4 July 2008 (UTC)[reply]

Please see this edit, missing noun header and # before definition. [4] --Panda10 22:33, 3 July 2008 (UTC)[reply]

Please see this edit [5]. Could AF figure out the POS header if the inflection line template exists? --Panda10 22:38, 3 July 2008 (UTC)[reply]

And it recognizes an inflection line template how? (;-)
Yes, {xx-pos-foo} for xx the current language's code and pos in some set of terms, and foo anything else. Right? Um, see {{ru-noun}}. Might be okay for a specific set. But then it still won't really know what is going on. I check AF's edits depending on the edit summary: the conversion of # to * in certain sections is one I always try to look at. Robert Ullmann 09:02, 6 July 2008 (UTC)[reply]
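The naming convention guessed at here ({xx-pos-foo}, with xx the section's language code and pos drawn from some set of terms) could be tested with a pattern like the one below. This is a Python sketch under stated assumptions: the POS set is a small invented subset, and the function name is hypothetical.

```python
import re

# Assumed subset of part-of-speech terms, for illustration only.
POS_TERMS = {"noun", "verb", "adj", "adv", "proper noun"}

_TEMPLATE_RE = re.compile(
    r"\{\{\s*([a-z]{2,3})-([a-z ]+?)(?:-[^|{}]*)?\s*(?:\||\}\})"
)


def looks_like_inflection_template(line: str, lang_code: str) -> bool:
    """True if the line starts with a template named like xx-pos or xx-pos-foo."""
    m = _TEMPLATE_RE.match(line.strip())
    return bool(m) and m.group(1) == lang_code and m.group(2) in POS_TERMS
```

The check accepts `{{en-noun}}` for "en", but it also accepts `{{nl-verb-entry-3p|zoeken}}` for "nl", a template that an earlier section of this page shows being used on a definition line. That false positive illustrates the caveat above: a name match alone does not tell the bot what is really going on.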

How about...

automatically fixing "Verb Transitive" into "Transitive verb" (or vice versa) and "Verb Intransitive" into "Intransitive verb" (or the other way around)? Or is there a reason to use both styles? What about "Proper Noun"/"Proper noun", "Verb Form"/"Verb form"? Would it make sense to make a rule to lowercase/uppercase the second word in ===Something here===? "Cardinal Number"/"Cardinal number" is another example I saw in the wild. I have more ideas for the AutoFormat bot. Sorry if your bot is already fixing L3 headings; I don't know. TestPilottalk to me! 06:37, 4 July 2008 (UTC)

See User:AutoFormat/Headers for the automatic conversions, like "Also see" to "See also". (We don't really want Transitive verb either, it gets tagged.) The table already has "Verb transitive" to "Transitive verb". AF will always convert to sentence case if it recognizes/fixes the header; else it will just be tagged. Any other ideas are welcome. Robert Ullmann 17:34, 4 July 2008 (UTC)[reply]
Wow! User:AutoFormat/Headers is so awesome! It helped me to clear out my Wiktionary section names array quite a bit! Kudos and thank you! TestPilottalk to me! 04:54, 5 July 2008 (UTC)[reply]
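The behaviour Robert Ullmann describes (a conversion table plus sentence-casing of recognized headers, with anything else left for tagging) might look roughly like this in Python. The two table entries are the ones quoted in this thread; the set of known headers is an invented sample, and none of the names come from AutoFormat's source.

```python
import re

# Conversions quoted in this thread (keys lowercased for lookup).
HEADER_FIXES = {"also see": "See also", "verb transitive": "Transitive verb"}
# Hypothetical sample of recognized headers.
KNOWN_HEADERS = {"Noun", "Proper noun", "Verb form", "Cardinal number", "See also"}

_HEADER_RE = re.compile(r"^(=+)\s*(.*?)\s*(=+)\s*$")


def fix_header(line: str) -> str:
    """Apply a table conversion or sentence-case a recognized header."""
    m = _HEADER_RE.match(line)
    if not m:
        return line
    opening, title, closing = m.groups()
    key = title.lower()
    if key in HEADER_FIXES:
        return f"{opening}{HEADER_FIXES[key]}{closing}"
    sentence = title[:1].upper() + title[1:].lower()
    if sentence in KNOWN_HEADERS:
        return f"{opening}{sentence}{closing}"
    return line  # unrecognized: left as is (the bot would tag it instead)
```

Returning unrecognized headers untouched mirrors the "else it will just be tagged" rule; the tagging itself is outside this sketch.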

As for other ideas, the main one is about "fixing" context templates. That would be a huge improvement, but this time I did my homework and saw that you have at least tried to look at the problem. Anyway, take my words with a grain of salt: I'm really an outsider who tries to parse Wiktionary with a very simple, non-advanced script, which is not meant to be anywhere close to perfect. If it can't "see" something, that is OK and I move to the next point. So, back to business. If there is # (''something''), we should change it to # {{context|something}}. If there is # ''(something)'', change it to # {{context|something}} too. A few exceptions, like changing (''archaic'') to {{archaic}} and ''(color)'' to {{color}}, would also be fine. But that is the first part. In reality there should be no #{{US}}{{archaic}}''(slang)''. It should be converted into #{{context|US|archaic|slang}}. To complicate matters even more, some users use ''' and [[]] inside ''(x)'' and (''x''), where x is something. So the actual string could look like, hmm, # {{countable}}(''used as a [[plural]] and always preceded by '''[[the]]''''') . Yeah, I did see that you are trying to implement some standards for context templates. But having a real context template instead of (''colour'') would be... I don't even know how to say it; it would make my life happier :) And later on you will still always have a chance to change {{context|colour}} to {{color}} or the other way around. My point: I would prefer to see articles with context templates instead of wiki code... And context templates do work inside a definition and at the end of it. But rendering wiki code there in the same style as context is impossible :(( TestPilottalk to me! 04:54, 5 July 2008 (UTC)


AF does convert to templated context labels in a number of cases; in particular the first two you give. But as you note, there are a lot of variations out there. #{{US}}{{archaic}}''(slang)'' and the like are not matched. Note that {colour} and {color} can coexist quite nicely as long as they use the same category (we settled on "Colors"). No need to "fix" them; this isn't an American or a Commonwealth dictionary, it is both. I do look for more cases that AF can match consistently. Robert Ullmann 08:54, 6 July 2008 (UTC)[reply]
You might like to look at User:Robert Ullmann/Contexts which is the simple cases. AF will fix all of these that have templates sooner or later. The more complicated cases take some finding. (Any definition line that doesn't start with a word? Not sure ;-) Robert Ullmann 10:41, 6 July 2008 (UTC)[reply]
Yeah, any definition that starts with # ('' '') or # ''( )'' should be converted into # {{context| }}, or # {{ }} if the template exists already. I don't think it is possible to find an exception to this rule. TestPilottalk to me! 20:46, 6 July 2008 (UTC)
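The two simple forms discussed in this thread can be matched with a single pattern. The Python sketch below is an illustration only; the function name and pattern are assumptions, and it always emits the generic {{context|...}} form rather than checking whether a dedicated template exists.

```python
import re

# Matches "# (''label'')" or "# ''(label)''" at the start of a definition line.
_CTX_RE = re.compile(r"^#\s*(?:\(''([^'()]+)''\)|''\(([^'()]+)\)'')\s*")


def templatize_context(line: str) -> str:
    """Turn a plain italic-parenthesized label into {{context|label}}."""
    m = _CTX_RE.match(line)
    if not m:
        return line
    label = (m.group(1) or m.group(2)).strip()
    return ("# {{context|" + label + "}} " + line[m.end():]).rstrip()
```

Labels containing quotes or nested parentheses (for example bold or extra markup inside the label) do not match and are left alone, in line with the point that the complicated variations need separate handling.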

Could you please convert definitions of # Plural of '''[[ ]]''' into # {{plural of| }}? There are more possible rules like that, but this particular one is annoying. TestPilottalk to me! 20:47, 6 July 2008 (UTC)
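A sketch of the requested conversion in Python, assuming the definition line contains nothing but the "Plural of" phrase and an optional full stop. The pattern and function name are made up for the example.

```python
import re

_PLURAL_RE = re.compile(r"^#\s*[Pp]lural of\s+'''\[\[([^\[\]|]+)\]\]'''\.?\s*$")


def templatize_plural(line: str) -> str:
    """Rewrite "# Plural of '''[[word]]'''" as "# {{plural of|word}}"."""
    m = _PLURAL_RE.match(line)
    if not m:
        return line
    return "# {{plural of|" + m.group(1) + "}}"
```

Lines with any extra wording after the link are not matched, so they would still need a human or a more specific rule.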

The language header was L3, the POS was L2. AF tagged the entry. Could AF correct these types of errors in the future? --Panda10 19:39, 12 July 2008 (UTC)[reply]

It might be reasonable to recognize a known language at L3+ and fix that, then recognize POS at L2- and push it down. But OTOH, if something is confused with L2 headers, it may also be better to have someone look at it. Have you seen other examples of this? AF does note and correct Ety and Pron at L4 at the start of an entry, a common error because the result looks correct. Robert Ullmann 08:37, 18 July 2008 (UTC)[reply]

Recognizing defs

In this edit, AF didn't seem to recognize that the {{plural of}} was the definition line, and that the entry was actually missing the inflection line. --EncycloPetey 21:18, 15 July 2008 (UTC)[reply]

It doesn't have a list of inflection line templates, or any particular way of discriminating between those and templates used on defn lines, so it is taking the template as the inflection line, and not finding a definition line. Robert Ullmann 08:47, 18 July 2008 (UTC)[reply]

old Aramaic entries

Auto (if I may call you by your first name), would you mind deleting any

<div style="font-family:serif;">
:<font size="15">{{PAGENAME}}</font></div>
[blank line]

that appears in any page? Someone's been doing it manually (I don't know who), but there are still a bunch left over; e.g. גדל. Thanks!—msh210 22:09, 15 July 2008 (UTC)[reply]

334a has been doing that (rather tirelessly I might add), but I can't imagine they would oppose assistance. -Atelaes λάλει ἐμοί 22:30, 15 July 2008 (UTC)[reply]
Yes, it's me who's been removing (and formerly adding) the big Hebrew text on Aramaic entries. Anyway, you might be surprised that I would prefer if the text was not removed automatically, simply for the reason that I'm going through all my old entries and cleaning them up (expanding definitions, adding pronunciations, inserting the proper templates/categories, etc., in addition to removing the text manually) one by one. The big text at the top of the article is kind of like a flashing red light that tells me that the article needs attention. --334a 02:36, 16 July 2008 (UTC)[reply]
Ah, nevermind then.—msh210 21:38, 17 July 2008 (UTC)

...that lack inflection template

Another thing, Auto. When you move categories to the bottom, per ELE (or something), can you ignore the categories "... that lack inflection template" if they're on (or right beneath) the inflection line? These are designed to be temporary, so it's not a big problem keeping them where they are, and the benefit of doing so is that the editor who adds an inflection template can remove the category.—msh210 21:38, 17 July 2008 (UTC)

Fairly painful to do, as the other code assumes it won't see categories in various places ... but perhaps something can be done. I set up {{rfscript}} so that the cat could be generated easily by something that doesn't move; but this isn't the same case. AF does put these cats on the same line when it is adding them (for English, when adding a headword line). How many of these are there? English has four, all but empty ATM. Robert Ullmann

Adding one space

[6] Is it really necessary? Maro 20:37, 20 July 2008 (UTC)

It wasn't the space; it passed the table to the sort routine, and got back something different because of the blank line being removed, so it saved the edit. An interesting bit is that it did not sort/canonicalize the first table. (AF also saves 1% of edits that do very small things, having done all the work, but in that case the edit summary is "minor spacing".) One reason AF is doing very picky things right now is that it has entirely run out of serious problems from the 13 June XML dump and is waiting for a new one. Robert Ullmann 05:59, 21 July 2008 (UTC)

Can AF add missing language separators? Please see the corrections I made at urbano. Thanks. --Panda10 13:53, 26 July 2008 (UTC)[reply]

It can and it does. But in this particular case it tagged a bad L2 header, and therefore did nothing else. If you'd just fixed the header, it would have presently added in the rule, etc. Robert Ullmann 14:00, 26 July 2008 (UTC)[reply]
I see. Could it actually fix the bad L2 POS header to L3 instead of just tagging it? --Panda10 22:49, 26 July 2008 (UTC)[reply]
Yes, but in the past I have thought that such entries are worth a human looking at them. I added code to do this, so far it has done cô ta. Note that in this case, it was worth looking at; the Pronunciation header was missing. We shall see how it goes; I will re-check all of them. Robert Ullmann 16:04, 7 August 2008 (UTC)[reply]

Perhaps a lang parameter could be placed into {{rfc-header}}, which would also place the word into "Category:{{{lang}}} words needing attention". I think that a lot of people would be more comfortable helping with the formatting work if they could focus their attention on languages they're comfortable with. -Atelaes λάλει ἐμοί 22:12, 26 July 2008 (UTC)[reply]

Ah, very good idea. Other tags in a language section as well. Will look at this. (and quite simple; generate lang=, and then do something in the various tag templates ;-) Robert Ullmann 22:23, 26 July 2008 (UTC)[reply]
Okay, added to rfc-header, added in code to a few others. Should work okay. Robert Ullmann 05:11, 21 August 2008 (UTC)[reply]
So... I suppose that AF now has to go through and add langs to all the entries. So, we should expect the xxx needing attention cats to start filling up slowly. Excellent. -Atelaes λάλει ἐμοί 05:17, 21 August 2008 (UTC)

Translations to be checked

Is AF's code to translate various "translations to be checked" headers into {{checktrans-top}} broken? See [7], [8], [9], [10], [11], [12]. Thryduulf 08:41, 8 August 2008 (UTC)[reply]

At least one wouldn't match the rule: "to be checked", also "translations to categorize", others have comments in the middle (element) that keep it from matching. So it isn't broken, but a bit picky (as it should be ... these cases escaped ;-) Specifically it is looking for {checktrans} followed by {trans-top} containing "\w*lations to be \w*" so the first word can be capitalized or not, and the last word can be anything. Robert Ullmann 14:40, 8 August 2008 (UTC)[reply]
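To make the pickiness concrete, here is a Python sketch built around the quoted fragment `\w*lations to be \w*`. It assumes that the fragment must account for the whole {trans-top} title and that only whitespace may separate {checktrans} from {trans-top}; both are readings of the explanation above, and the function name is hypothetical.

```python
import re

# {{checktrans}}, optional whitespace, then a {{trans-top}} whose title
# matches the fragment quoted above (assumed to cover the whole title).
_PATTERN = re.compile(
    r"\{\{checktrans\}\}\s*\{\{trans-top\|\w*lations to be \w*\}\}"
)


def to_checktrans_top(wikitext: str) -> str:
    """Collapse the matched pair into a single {{checktrans-top}}."""
    return _PATTERN.sub("{{checktrans-top}}", wikitext)
```

Under these assumptions, "Translations to be checked" and "translations to be sorted" both convert, while "Translations to categorize" or a comment sitting between the two templates keeps the text from matching, as described.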

lang=

Would it be possible to add the appropriate "lang=xx" parameter to pronunciation templates {{IPA}} and {{SAMPA}} in non-English L2 sections?

Useful also would be to add the language code to etymology and {{proto}} templates in non-English L2 sections where they are absent. Thryduulf 10:58, 17 August 2008 (UTC)[reply]

IPA uses lang= for a specific set of languages. SAMPA doesn't use lang=. The others take a bit of looking at. There are also templates in other sections, for example context may need it in some cases (but not in others!). As usual, it would probably be best to start by hunting for cases in the (now stinking ;-) XML dump to see what we have. Robert Ullmann 04:53, 21 August 2008 (UTC)[reply]
The SAMPA template does accept lang=, and, like IPA, for languages where it has been defined it does actually change the link target from English pronunciation charts to charts for the relevant language. I routinely add it so that when such charts are created and the targets defined the links will point there without further action being taken. Thryduulf 12:46, 23 August 2008 (UTC)[reply]
Eh? Template:SAMPA always links w:SAMPA_chart_for_English. Doesn't contain "lang" anywhere. (You can put lang= in the template reference, but it doesn't do anything with it :-) Robert Ullmann 15:51, 24 August 2008 (UTC)[reply]

Adds lang= to IPA template if not English. Robert Ullmann 14:13, 2 September 2008 (UTC)[reply]

{{mpl}}[edit]

It's now a language code, rather than the (deleted) redirect to {{m.pl.}}, so please don't replace. Thanks. --Bequw¢τ 20:09, 19 August 2008 (UTC)[reply]

Done. Robert Ullmann 04:48, 21 August 2008 (UTC)[reply]

from my talk page

Hi Robert,

AutoFormat flagged a use of {{ttbc}} in a regular translation table at [[apron]]. On the one hand, I understand the motivation for this — there's a separate section for translations to be checked — but on the other hand, that translation was already under the right sense, it just needed to be checked. Now another editor has addressed AutoFormat's complaint by moving the translation in question to the translations-to-be-checked section, i.e. by removing information.

Is there a better way to do this?

Thanks,
RuakhTALK 12:58, 22 August 2008 (UTC)[reply]

Should we leave {{ttbc}} in specific sense section(s)? To be then (presumably) just removed when checked? Isn't very hard, I add a rule for {ttbc} like the one for {trreq}? Robert Ullmann 14:17, 23 August 2008 (UTC)[reply]
I think so, yes. Thanks. :-)   —RuakhTALK 15:37, 24 August 2008 (UTC)[reply]
Done. Robert Ullmann 16:19, 26 August 2008 (UTC)[reply]
Cool, thanks again! —RuakhTALK 00:33, 27 August 2008 (UTC)[reply]

trans-see[edit]

A not insignificant number of translation table problems occur for sections which are pointing to translations at another entry. For example from first,

{{trans-top|first gear}}
''See [[first gear]]''
{{trans-bottom}}

{{trans-top|baseball: first base}}
''See [[first base]]''
{{trans-bottom}}

In every case I deal with I am changing them to use {{trans-see}}, e.g. in the above example

{{trans-see|first gear}}
{{trans-see|baseball: first base|first base}}

The rules for doing this are (to my non-expert eye) simple:

  • Ignoring any indents and/or comments, if the first line consists of one of: "See", "see", "See:" or "see:" followed by a wikilink:
  • Check whether the wikilink target matches the parameter in {{trans-top}}
    • If it matches, then replace {{trans-top|parameter}} with {{trans-see|parameter}}
    • If it doesn't match, then replace {{trans-top|parameter}} with {{trans-see|parameter|wikilink target}}
  • Delete the remainder of the trans table.

Thryduulf 14:27, 27 August 2008 (UTC)[reply]
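The rules proposed above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not AF's implementation: the function name, the regular expressions, and the decision to leave a table untouched when the first line has anything other than a single wikilink are all hypothetical.

```python
import re

# Hypothetical patterns for this sketch only.
TRANS_TOP_RE = re.compile(r"^\{\{trans-top\|([^{}|]*)\}\}$")
SEE_RE = re.compile(r"^[:*\s']*[Ss]ee:?\s*\[\[([^\[\]|]+)\]\]'*\s*$")
COMMENT_RE = re.compile(r"<!--.*?-->")

def convert_table(lines):
    """Convert one translation table (a list of lines) to {{trans-see}}.

    Returns the original lines unchanged when the rules do not apply.
    """
    if len(lines) < 3:
        return lines
    top = TRANS_TOP_RE.match(lines[0].strip())
    if top is None or lines[-1].strip() != "{{trans-bottom}}":
        return lines
    # Ignore comments, indents, and italic markup on the first body line.
    first = COMMENT_RE.sub("", lines[1]).strip()
    see = SEE_RE.match(first)
    if see is None:
        return lines
    gloss, target = top.group(1), see.group(1)
    if gloss == target:
        return ["{{trans-see|%s}}" % gloss]
    return ["{{trans-see|%s|%s}}" % (gloss, target)]

if __name__ == "__main__":
    print(convert_table(["{{trans-top|first gear}}",
                         "''See [[first gear]]''",
                         "{{trans-bottom}}"]))
    print(convert_table(["{{trans-top|baseball: first base}}",
                         "''See [[first base]]''",
                         "{{trans-bottom}}"]))
```

The sketch is deliberately conservative: a first line with more than one wikilink does not match and the table is left for a human to look at.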

Done. Robert Ullmann 14:12, 2 September 2008 (UTC)[reply]
Except that isn't quite right. See [13] (line 318) and the correct syntax in the next edit. Where there is more than one wikilink in a section it shouldn't delete the opening and closing square brackets. I've not checked to see if it has made the same error elsewhere. Thryduulf 02:36, 5 September 2008 (UTC)[reply]
Ah, fixed that. (Metacharacter too greedy, changed .* to .*?) It does carefully check that there isn't anything unmatched that it doesn't understand. Thank you. Robert Ullmann 17:11, 5 September 2008 (UTC)[reply]
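The difference between .* and .*? mentioned above can be seen in a small Python sketch. The sample line is hypothetical and the patterns are not AF's actual rules; they only show how a greedy match can swallow the brackets between two wikilinks.

```python
import re

# Hypothetical sample line with two wikilinks.
line = "See [[first base]] and [[second base]]"

# Greedy: .* runs to the LAST "]]" on the line.
greedy = re.search(r"\[\[(.*)\]\]", line).group(1)

# Non-greedy: .*? stops at the FIRST "]]".
lazy = re.search(r"\[\[(.*?)\]\]", line).group(1)

if __name__ == "__main__":
    print(repr(greedy))  # 'first base]] and [[second base'
    print(repr(lazy))    # 'first base'
```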

In this edit AF removed the closing '' from italicisation in the gloss. This isn't the only place that has italicised species names in a translation gloss, but I've not checked to see whether AF has interacted with any others (correctly or otherwise). Thryduulf 12:34, 16 September 2008 (UTC)[reply]

I changed it to not do that; but it will leave cases where someone thought that they really ought to bold or italicize the gloss 'cause it should be you know? (;-). One or 'tother ... Robert Ullmann 00:52, 17 September 2008 (UTC)[reply]
I don't understand your comment there - are you saying that AF won't do anything with glosses that have italicisation or bolding in them? Thryduulf 15:56, 17 September 2008 (UTC)[reply]

Etymon templates[edit]

Per Wiktionary:Beer_parlour#Replace all etymon templates with proto and etyl, would it be possible for AF to make the switch from old style etymology templates to etyl? What might also be nice is to have it add second parameters when missing. If you need a list, Wiktionary:Etymology/language_templates could probably be used. I'd be happy to do any other work which would be needed for this switch. Many thanks. -Atelaes λάλει ἐμοί 18:34, 29 August 2008 (UTC)[reply]

(note I don't really like etyl very much, seems to make etys even harder to read in wikitext, but that aside) Looking at some of them, I note that SPQRobin has added a parameter to {{F.}} that needs to be unwound. Might be good to check all of them (with magic of course :-) for standard form.
I don't think it can add the 2nd parameter automatically, it only knows it (1st param for the X. templates) is missing, and for a non-English entry that is an error; AF can't tell whether it should be (lcode) or "-". (and, of course, in an English entry, it not being "-" may be an error ...) Robert Ullmann 17:28, 5 September 2008 (UTC)[reply]
Hmmm.....I guess I've used etyl so much at this point that reading it in code is just as natural as seeing it in actual text. As for the dash bit, I wouldn't give that too much consideration. The second parameter dash was never really used much (while I admittedly haven't looked at every instance of the template, I have spent a fair amount of time formatting etymologies, and have seen a great number of instances of the template). I think we can fairly safely assume that any time the second parameter is absent, it should be (lcode). -Atelaes λάλει ἐμοί 18:33, 5 September 2008 (UTC)[reply]
Yes, that is an interesting bit that was added to {{F.}}. As it so happens, I had briefly considered trying out such an approach with {{etyl}} (which would make it function more like {{proto}}), but had decided against it, as I figured it would simply overload the template. If you can magically create a list of all instances of the use of the second parameter, I'd be happy to unmagically fix them. -Atelaes λάλει ἐμοί 18:40, 5 September 2008 (UTC)[reply]
Standard trick, assuming the template is not used on a huge number of pages, add a cat:
{{#if:{{{2|}}}|[[Category:F. template with 2nd parameter]]}}
then forget about it for a while (job queue), and go look at the cat (it doesn't have to exist to look at the contents, don't bother creating the cat page!). When you've emptied it, remove that code and the other use of {2} from the template. You don't need magic from me. (Or not more than the incantation offered here. ;-) Robert Ullmann 18:49, 5 September 2008 (UTC)[reply]
There don't appear to be any other usages of the second parameter of {{F.}}. SPQRobin used the parameter once, and then appears to have abandoned it. I cleaned up the single instance, and nothing has popped up in Category:F. template with 2nd parameter. On a humorous/embarrassing side-note, I put a bad line-break into the template while adding the category, and managed to screw up every entry which used the template for the afternoon, including our WotD. *sigh* -Atelaes λάλει ἐμοί 01:31, 6 September 2008 (UTC)[reply]
Argh! that is the 3rd case in as many days I've seen of someone mangling a template used in a lot of entries to "fix" one single entry. (t, F. sense). Idiocy. Okay, so that is sorted? Good. Robert Ullmann 01:27, 7 September 2008 (UTC)[reply]

process first I think we should find out what all the usage is (at least as of 13 June ...), and then we can add rules. There are things like the Swahili template, used only in boma (a very good entry ;-). Will make up a list tomorrow. Robert Ullmann 00:23, 6 September 2008 (UTC)[reply]

current state of affairs is User:Robert Ullmann/t18 ... I would think doing something like what was done for script templates would be good: set up a table to gen rules for a number of them, and let that work for a while. Then we look again. (as ever, it would be good to get an XML dump sometime!) Robert Ullmann 01:21, 7 September 2008 (UTC)[reply]
Excellent. At this point, I'm at a loss as to what you need from me (if anything). Should I add the ISO codes for each old style template into the table? Also, if you'd feel more comfortable leaving the second parameter alone, we could simply skip it (provided that AF can transfer existing first parameters into the new second parameters). -Atelaes λάλει ἐμοί 02:26, 7 September 2008 (UTC)[reply]

{{law}}[edit]

Re the BP Discussion of law, would AF be able to convert the standalone usages of {{law}} and {{context|law}} to {{legal}}? I can try to take a stab at the complex usages (e.g. {{context|when discussing the|_|law}}) as there should be fewer. Thanks. --Bequw¢τ 07:43, 17 September 2008 (UTC)[reply]

{{law|lang=fr}} and similar should also be bot-convertible. Thryduulf 16:02, 17 September 2008 (UTC)[reply]

It's come to my attention that a great deal of articles tagged under this category are Aramaic words which are perfectly correct, in that the Noun 1 header is at the L4 position because it belongs to Pronunciation 1, and Noun 2 is at the L4 position because it belongs to Pronunciation 2, and the Pronunciation headers are at L3 (e.g., עבדא, סכרא, מנתא, etc.). Is there any way to prevent tagging correct articles? --334a 15:00, 27 September 2008 (UTC)[reply]

Note that "Noun 1" is not a legal header, and "Pronunciation N" headers aren't legal either. They get used by editors who think they ought to be, but are unable to define the structure well enough to propose a vote to add it to WT:ELE. It is seriously getting to the point that we should have a vote to absolutely ban them. So the article is not formatted according to WT:ELE, and therefore is not at all "correct".
The reason that no-one can write down a coherent proposal for "Pronunciation N" is that it can't be done. Etymology N works, because the structure represents different words that happen to be homographs. But pronunciation is an attribute, not the difference between the words.
And this in turn means an NxM problem with multiple etymologies and multiple pronunciations. (Sometimes, often, varying on accents; the example used recently on BP was defence, which then would be under a different pronunciation for the sports sense only for the US. DEE-fence!) It might be best to allow/always have pronunciation as an L4 attribute, rather than a structure construct.
For Arabic/Hebrew/Aramaic (and other similar) where we are making entries under the "defective" spelling, without (or without some of) the vowel markings, the structure should be representing that they are different words. They happen to (mostly) have different pronunciations, but that is an attribute, not the fundamental difference. (It might be better if we entered these words at the full spelling, with the defective spelling referencing them, but since that is the usual spelling, that isn't so good.)
To return to AF, it needs (I need) an actual coherent written specification adopted. Just "go ahead and use/allow it" doesn't work, because what "it" is has no definition. And the existing uses are all over the place in intended meaning; Pronunciation N as used in some English entries, as used in some Latin entries, and as used in some Aramaic entries have different semantics, rather than uses of the same conceptual structure.
— This unsigned comment was added by Robert Ullmann (talkcontribs) at 15:45, 27 September 2008 (UTC).[reply]
I've got no substantive agreement or disagreement — I'm really very dubious about the whole thing — but I do have a nitpick for you. :-)   A "defective" spelling is one that drops/doesn't add letters, and an "excessive" spelling is one that adds/doesn't drop them. Over the course of Hebrew history — not sure about Arabic and Aramaic — the tendency has been from more defective spellings to more excessive ones. (It's related to the vowel-marking issue — the letters in question are ones that hint at the vowels — but it's distinct. Indeed, vowel-marked writing generally uses more defective spellings than non–vowel-marked writing, at least in Hebrew.) Oh, wait, but I do have one substantive agreement: the issue of Arabic/Hebrew/Aramaic vowel markings is somewhat separate from the issue of pronunciation, since in all three languages/groups, the vowel markings are much more conservative than actual pronunciation. —RuakhTALK 18:00, 27 September 2008 (UTC)[reply]
By the way, I think "it" does have some definition: it seems to entail the same structure as one-pronunciation,-multiple-etymologies, but the other way 'round. (Right?) Whether that's genuinely meaningful is obviously a matter of debate, but AF doesn't recognize genuine meaning anyway, unless you've been making startling progress in AI. :-)   —RuakhTALK 18:04, 27 September 2008 (UTC)[reply]
Also by the way, according to [[Wiktionary:Bots]], you have effectively made the promise, “I will stop the bot at once if any objections are raised to a task it's performing, and not restart it until there's positive consensus”. AutoFormat does a lot of things, and I don't think you should stop all of them; but I think you should stop applying this category to entries whose level/structure problem relates to ===Pronunciation N===. Even though they're not in ELE, and even if you're right that they're unworkable, quite a few editors support them, and very few editors seem to be actively opposed; and what's more, I'm almost certain that there's no consensus about how to fix them, which means that this category is slowly (or not-so-slowly) becoming useless. Perhaps this isn't AF's fault, but since AF is a bot and the editors using ===Pronunciation N=== are humans, I think the burden is on AF to conform. (This is also edging toward POINT-pushing; I'd say that it was POINT-pushing, except that SFAIK AF is the only entity populating Category:Entries with level or structure problems, so it's really just disrupting its own usefulness.) —RuakhTALK 19:09, 29 September 2008 (UTC)[reply]
Except that it's not just populating the "structure problems" category; it's also putting these entries into the respective language attention categories. So at this point, most of the items in Category:Latin words needing attention were put there by AF for these "structure problems". As a result, the Latin attention category is becoming useless for its intended purpose of drawing the attention of Latin specialists to entries where help was requested. --EncycloPetey 19:16, 29 September 2008 (UTC)[reply]

Let me note here that AF is adding the tag, and the tag template is adding the category. That may seem like a pointless distinction, but it means that we can modify what the template does. I will do something with it; I've just been busy writing and testing the XML updater for the last 24 hours. (And it works nicely ;-) Robert Ullmann 19:21, 29 September 2008 (UTC)[reply]

O.K., I'll wait for the something that you do with it. Thanks for your work on the XML updater. :-)   —RuakhTALK 19:14, 30 September 2008 (UTC)[reply]

Firstly, with Aramaic, in my opinion the "defective" spelling is the only way you can spell a word in a dictionary. The consonants of the main alphabet pre-date the vowel points by centuries in some cases, and what's worse is that there isn't a single unified vowel system for Aramaic. Most of the actual standardization of the vowels is based on certain varying dialects which developed and differentiated over time. It's like what would happen if you tried to standardize the spelling of the English language used in all English-speaking countries to phonetically match Australian English as best as possible, and then claim that the system of English I'm writing in now is "defective" and that the newly-standardized Australian is "full." It's not just a question of what the usual spelling is.

Secondly, look at a word like use. Two different pronunciations for two different parts of speech, but the same etymology. So is the answer then to bunch the two or more pronunciations into a hidden drop-down box? Even then, you would still have to differentiate the words by using something like "Noun 1/2", but luckily with the English example of "use" one is a noun and one is a verb, so there's no need to number anything. Aramaic forms MANY words with different pronunciations but the same spelling and etymology: גמלא, עבדא, דהבא, and so on. In those cases, the words are all nouns (hence the need for "Noun 1/2"). Yes, I've used that structure for entries with different semantics (and I probably should change those), but these are cases with the same conceptual structure and etymology.

I wouldn't mind putting the pronunciation headers at the L4 position, like you said. Even if it's under the noun, it seems like the best idea here and the points made about the attributive/fundamental differences are convincing. But there would still be the problem of the Noun 1/2 issue, since it would be pretty choppy to have Etymology at L3, Noun at L3, then definition 1 with pronunciation at L4, then definition 2 with pronunciation at L4. --334a 04:25, 3 October 2008 (UTC)[reply]

I like the idea of Pronunciation at L4. As for Noun 1/2, it's true that sometimes we'll need multiple noun sections for separate words that are both nouns, are spelled the same, and have the same etymology, but there might be other approaches besides numbering them. For example, at [[קשת]] we just have two ===Noun=== sections. (I'm not saying that approach is necessarily better, just pointing out that we do have options. Certainly [[קשת]] is currently broken anyway, since its ===Pronunciation=== section only covers the pronunciation of one of the nouns.) —RuakhTALK 15:15, 3 October 2008 (UTC)[reply]

also to top[edit]

In this edit, AF updated the template from {{see}} to {{also}}, but left it at the end of the page, instead of moving it to the top. Are these functions of AF exclusive of each other somehow? --EncycloPetey 20:36, 30 September 2008 (UTC)[reply]

the rule to change {see} to {also} was added before the coding to move {also} into the "prolog" section; this edit was in the window between. now it would do both, and would have eventually moved it. Robert Ullmann 23:42, 2 October 2008 (UTC)[reply]

I'm tired of tedious labour[edit]

If I give you a list of language names in translation sections which can reliably be switched to something else (e.g. Slovenian --> Slovene; Greek, Ancient --> Ancient Greek), can you make those switches for me? Clearly, this only makes sense for those with at least a dozen instances, but that alone would save me a great deal of time. -Atelaes λάλει ἐμοί 00:37, 3 October 2008 (UTC)[reply]

I have code that has been used a couple of times to change language names; easy to slot in the changes we want and run one-off. Will need to get the dump to current first, but that is in the works. Robert Ullmann 13:58, 3 October 2008 (UTC)[reply]
Ok, so I realize this is an old thread....but I figured you'd find it. Some trans languages which could be automatically converted are (Slovenian --> Slovene)(Greek, Modern --> Greek)(Greek, Ancient --> Ancient Greek)(Romansh --> Romansch)(Sardinian (Campidanese) --> Campidanese Sardinian)(Malaysian --> Malay)(telugu --> Telugu)(Myanmar --> Burmese)(Português --> Portuguese). That should knock out quite a few for me. Many thanks. -Atelaes λάλει ἐμοί 09:17, 10 November 2008 (UTC)[reply]
Okay, set up that list. Sorting tables again when needed; grouping Ancient Greek with Greek when present. Robert Ullmann 16:32, 10 November 2008 (UTC)[reply]
Wait, what? Ancient Greek should sort under A's, not with Greek. -Atelaes λάλει ἐμοί 19:17, 10 November 2008 (UTC)[reply]
If entered as ** Ancient Greek: following Greek, should stay there. Right? See this at republic. Did 664 total. Robert Ullmann 23:50, 10 November 2008 (UTC)[reply]
Nope. grc's a separate language from el. We don't categorize Latin, French, Spanish, and Italian together, do we? Nope, because it would be a nightmare to try and do it that way (although, I wonder if, once our software becomes a bit more robust, we might have the option to use the genealogical categorizing found in the language categories for an optional genealogical sorting, but that's neither here nor there). Ancient Greek should sort under A. Sorry. -Atelaes λάλει ἐμοί 23:57, 10 November 2008 (UTC)[reply]
What planet are you on tonight? (today, wherever you are? ;-) There was never any (serious) suggestion of trying to organize by family/language group. Grouping variants is often done. Look at this again in a day or so? Eh? Good evening. Robert Ullmann 00:19, 11 November 2008 (UTC)[reply]
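For illustration, the renames listed in this thread could be applied to translation lines with something like the Python sketch below. The mapping is taken from the list above. The function name and the line pattern are assumptions, and the sketch deliberately does not re-sort the table, since where Ancient Greek should sort is exactly what is disputed here.

```python
import re

# Renames as listed in the thread above.
RENAMES = {
    "Slovenian": "Slovene",
    "Greek, Modern": "Greek",
    "Greek, Ancient": "Ancient Greek",
    "Romansh": "Romansch",
    "Sardinian (Campidanese)": "Campidanese Sardinian",
    "Malaysian": "Malay",
    "telugu": "Telugu",
    "Myanmar": "Burmese",
    "Português": "Portuguese",
}

# Hypothetical pattern: one or more asterisks, a language name, then a colon.
LINE_RE = re.compile(r"^(\*+ *)([^:]+?)( *:)")

def rename_language(line: str) -> str:
    """Replace the language name of a translation line if it is in RENAMES."""
    m = LINE_RE.match(line)
    if m is None:
        return line
    new_name = RENAMES.get(m.group(2))
    if new_name is None:
        return line
    return line[:m.start(2)] + new_name + line[m.end(2):]

if __name__ == "__main__":
    print(rename_language("* Slovenian: {{t|sl|x}}"))
    print(rename_language("** Greek, Ancient: {{t|grc|x}}"))
```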

Circumfix[edit]

Even though -aĉ- isn't a circumfix, there are such things as circumfixes in the world, so please allow ===Circumfix=== to be an acceptable header. Angr 15:35, 8 October 2008 (UTC)[reply]

t templates without language label[edit]

Recently I've come across several entries tagged as having translation table problems by AF where the problem is that a translation has been added but the language name has not. For example in this edit to Devon. As this (and all but one of the others) has used the {{t}} template, which contains the iso code, it should be possible for AF to add the language name itself. Obviously where no template is used it will continue to need human interaction, but the rule to do that which can be done automatically would be something like:

If a line starts with one of:
* {{t| 
* {{t-|
* {{t+|
Then 
1. extract the ISO code from the first parameter of the template
2. look up language name from ISO code table
3. Insert language name followed by : after the *

This would change:

to

If the language code isn't recognised, then just flag it as now. Thryduulf 18:22, 14 October 2008 (UTC)[reply]

My first take on this was that I didn't want to add the code for this (and test, and regression test) for something that doesn't occur that often. But on thinking about it again yesterday: it doesn't have to be that complicated. All I need is to add the code again in front with a regex rule:
\*? *\{\{t(\+|-|)\|([a-z-]+)\|   →    * {{\2}}: {{t\1|\2|
which puts the code template in front, and then the substitution will work as usual. (:-) Robert Ullmann 08:31, 28 October 2008 (UTC)[reply]
there are 174 of them, but so far the ones found have been {t} improperly used in other sections Robert Ullmann 13:34, 28 October 2008 (UTC)[reply]
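The regex rule above can be tried out in Python as follows. The pattern and replacement are the ones given in the comment. Anchoring the pattern at the start of the line, the function name, and the sample lines are assumptions made for this sketch.

```python
import re

# Pattern and replacement from the comment above; the leading ^ is an
# assumption so that lines which already name a language are left alone.
T_RE = re.compile(r"^\*? *\{\{t(\+|-|)\|([a-z-]+)\|")
T_REPL = r"* {{\2}}: {{t\1|\2|"

def add_code_template(line: str) -> str:
    """Put the language code template in front of a bare {{t}} line."""
    return T_RE.sub(T_REPL, line, count=1)

if __name__ == "__main__":
    print(add_code_template("* {{t|fr|Devon|m}}"))
    print(add_code_template("*{{t+|de|Devon}}"))
    print(add_code_template("* French: {{t|fr|Devon|m}}"))
```

The result still contains the code template (for example {{fr}}); as the comment says, the existing substitution of code templates to language names then does the rest.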

No definition line...[edit]

Hi, I created some templates for plurals such that just with a small template, you have all the information (including the definition). Since there is no proper "definition part" in the article, you tagged some of these articles with {{defn|Catalan}}, but they should not be tagged. Example: this and most (all?) of the articles from Category:Catalan definitions needed. Then, I would like to ask if it is possible that, when you find some specific templates (as in this case), you do not add this template. Since the number of these templates may vary, you could create a subpage in which we can add these particular templates. Cheers.--Xtv 10:27, 23 October 2008 (UTC)[reply]

The # must be in the page wikitext. Applications that extract words and definitions from the XML of the database rely on this. The template(s) are used on the inflection/headword line, and (sometimes) within definition lines.
It is possible to set up a "preload" or subst template, helpful in creating entries, that then must be subst'd. Convention is to name these starting with "new...", e.g. {{new ca noun mp}}. Robert Ullmann 10:37, 23 October 2008 (UTC)[reply]
In this case, the entry should look like this:
==[[Catalan]]==

===Noun===
{{ca-noun-mp|s=bes}}

# {{plural of|bes|lang=ca}}
note that this allows an application to find the definition line, and also recognize that it is a plural form, regardless of language. Robert Ullmann 10:41, 23 October 2008 (UTC)[reply]
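To show why the # has to be in the page wikitext, here is a minimal Python sketch of how an application might pick out definition lines. It is an assumption about what such a tool could look like, not any particular tool's code. The pattern and function name are hypothetical; the first sample is the entry shown above and the second is a made-up entry whose definition is generated entirely by a template.

```python
import re

# Hypothetical rule: a definition line starts with "#" and is not a
# "#:" or "#*" sub-line (example sentence or quotation).
DEFN_RE = re.compile(r"^#(?![#:*])[ \t]*(.*)$", re.MULTILINE)

def definition_lines(wikitext: str):
    """Return the text of each definition line found in the wikitext."""
    return [m.group(1) for m in DEFN_RE.finditer(wikitext)]

ENTRY = """==[[Catalan]]==

===Noun===
{{ca-noun-mp|s=bes}}

# {{plural of|bes|lang=ca}}
"""

# Made-up entry where a template would generate the "#" itself.
TEMPLATE_ONLY = """==[[Catalan]]==

===Noun===
{{some-all-in-one-template|bes}}
"""

if __name__ == "__main__":
    print(definition_lines(ENTRY))
    print(definition_lines(TEMPLATE_ONLY))
```

A scanner like this sees one definition in the first entry and none in the second, which is the situation that gets the "definition needed" tag.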

Sorting pronunciation and etymology sections[edit]

For entries with a single etymology, WT:ELE calls for the order

===Etymology===
blah

===Pronunciation===
blah

However, these are not infrequently transposed.

Would it be possible for AF to sort the Etymology and Pronunciation headers into this order if they aren't? To avoid complicated things like multiple etymologies and/or multiple pronunciations, AF should only sort in this manner if there is exactly one of each in a given language section. Thryduulf 10:46, 25 October 2008 (UTC)[reply]
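The request above could be sketched in Python along these lines. This is only an illustration under assumptions: the function name and header pattern are hypothetical, the input is taken to be the text of a single language section, and whitespace between sections is normalised when a move is made.

```python
import re

# Hypothetical pattern for level-3 headers only (exactly three "=").
L3_RE = re.compile(r"^===([^=].*?)===[ \t]*$", re.MULTILINE)

def sort_ety_pron(section: str) -> str:
    """Move Etymology before Pronunciation within one language section.

    Acts only when there is exactly one L3 "Etymology" and exactly one
    L3 "Pronunciation"; anything else is returned unchanged.
    """
    heads = list(L3_RE.finditer(section))
    names = [h.group(1).strip() for h in heads]
    if names.count("Etymology") != 1 or names.count("Pronunciation") != 1:
        return section
    e, p = names.index("Etymology"), names.index("Pronunciation")
    if e < p:
        return section  # already in the requested order
    starts = [h.start() for h in heads] + [len(section)]
    chunks = [section[starts[i]:starts[i + 1]] for i in range(len(heads))]
    chunks.insert(p, chunks.pop(e))  # move Etymology just before Pronunciation
    body = "".join(c.rstrip() + "\n\n" for c in chunks).rstrip() + "\n"
    return section[:starts[0]] + body

if __name__ == "__main__":
    text = ("==English==\n\n===Pronunciation===\n* {{IPA|/x/}}\n\n"
            "===Etymology===\nblah\n\n===Noun===\n# defn\n")
    print(sort_ety_pron(text))
```

Numbered headers such as "Etymology 1" do not count as "Etymology" here, so sections with multiple etymologies fall through untouched, as requested.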

There is also Alternative spellings/forms that appear at the top, often in the wrong order. I was sort of leaving this alone in the hope that the multiple pronunciations sections bit would get sorted; but the people who want to use them have no interest in producing a proper proposal. (At least I presume that, it has been years ;-). Then I could go re-write some code to follow a consistent policy. (including this, and the order of the end-of-section headers, etc) Robert Ullmann 10:53, 25 October 2008 (UTC)[reply]
It would help us understand what you want if you could tell us what a "proper" proposal would entail. I've given explanations before, but haven't been told what wasn't "proper" about them. --EncycloPetey 07:39, 4 November 2008 (UTC)[reply]
(working on this, just saying so you won't think I'm ignoring it) Robert Ullmann 18:20, 5 November 2008 (UTC)[reply]

Per request, I'm documenting removal of {{rfc-subst}}. But note that this problem with the template in question has long been fixed.—msh210 20:16, 30 October 2008 (UTC)[reply]

lang=ces[edit]

This edit should have added "lang=cs" shouldn't it? Don't we prefer the 2-letter codes over the 3-letter ones, when they exist? --EncycloPetey 07:37, 4 November 2008 (UTC)[reply]

Indeed, the table wasn't setting them up in the preferred order. I had fixed Portuguese, but there were a dozen others. Fixed. Robert Ullmann 18:19, 5 November 2008 (UTC)[reply]
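The preference discussed here (2-letter code over 3-letter code when both exist) amounts to something like the following Python sketch. The table contents and names are hypothetical and are not AF's actual code table.

```python
# Hypothetical code table: each language name maps to every code seen for it.
CODES_BY_NAME = {
    "Czech": ["ces", "cs"],
    "Portuguese": ["por", "pt"],
    "Ancient Greek": ["grc"],
}

def preferred_code(name: str) -> str:
    """Pick the shortest code, so a 2-letter code wins over a 3-letter one."""
    return min(CODES_BY_NAME[name], key=len)

if __name__ == "__main__":
    for language in CODES_BY_NAME:
        print(language, "->", preferred_code(language))
```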

Greek alphabet table mangling[edit]

It appears AF has been mangling some of the greek letter tables (e.g. here and here). I didn't check how many. --Bequw¢τ 09:31, 6 November 2008 (UTC)[reply]

Yuck. Why isn't that mess in a template? (:-). I'll recheck them; I really ought to invent a template (maybe there is, and it was subst'd?) Robert Ullmann 09:48, 6 November 2008 (UTC)[reply]
Yes, {{letter disp3}} ... Robert Ullmann 09:51, 6 November 2008 (UTC)[reply]

As part of this edit, AF took a Symbol section and assumed it was a part of speech section. In fact, it was not, but I'm not sure what the preferred solution is. Should it appear as an "alternative form", "synonym", what? Note that DCDuring (following AF's lead) has further modified the entry in the wrong direction. --EncycloPetey 19:31, 6 November 2008 (UTC)[reply]

If it is an L3 section called "Symbol", it is a POS section (and not part of the preceding noun section). If we want to represent it as an attribute/facet of the noun, it has to be something like:

See also[edit]

eh? Or as your edit, but we ought to be allowing alternative forms/spellings as L4 under a specific POS if needed? Robert Ullmann 14:56, 10 November 2008 (UTC)[reply]

Manda (Australia)[edit]

In this edit AF is objecting to the language name "Manda (Australia)" but this appears to be correct. Manda is also the name of a Dravidian language in India and a Niger-Congo language in Africa.

See zma - Manda (Australia), mha - Manda (India) and mgs - Manda (Tanzania). Thryduulf 03:40, 14 November 2008 (UTC)[reply]

I must say that I am to blame for all those language names with parenthetic country specifications. It is admittedly not pretty, but I wonder if it really is the best way to go. Of course an alternative is to change all the language names to less common variants, so that they don't overlap.....however I wonder if that might be adding confusion and doing a disservice to these languages, for the sake of inane formatting. -Atelaes λάλει ἐμοί 00:37, 20 November 2008 (UTC)[reply]
Understand that we use our canonical language names in at least a dozen ways, from trans tables to L2 headers to the requirement that they be entries all the way to wanting "(language) phonemics" to be a valid title or redirect on the 'pedia. And we don't use ()'s as qualifiers anywhere ...
I think we want something like "Manda-Australia", but I'm not entirely sure yet. (in this particular case, the TZ language is better "Kimanda", but that is just that case). Do want to sort this, or people will be adding () quals all over the place. (need to fix be-x-old ...) Is it really 4AM? Robert Ullmann 00:46, 20 November 2008 (UTC)[reply]
I'm cool with Manda-Australia. Ultimately, I don't think it urgent, as these are all quite esoteric languages, and are unlikely to see much use anytime soon. But yes, the sooner the better. Get some sleep Robert. -Atelaes λάλει ἐμοί 00:55, 20 November 2008 (UTC)[reply]
Manda-Australia and similar seems like a workable system. One thing though, these languages are (almost) all going to be linked, but will an entry for Manda-Australia pass the CFI? If not, I can think of two ways around it -
  1. Modify the CFI so they are allowed
  2. Use a piped link, e.g. [[Manda|Manda-Australia]]
Obviously option 1 would require community consensus, and likely a vote. Option 2 is less clear, probably requires more effort from AF (and thus your programming of it); it also wouldn't likely help someone typing "Manda-Australia" in the search box. Thryduulf 02:23, 23 November 2008 (UTC)[reply]
Or allow Manda-Australia to redirect to Manda. Makes the link work, and is searchable. Is stretching the use of redirects a bit, but that is a lot more acceptable than stretching or modding CFI. Not sure here.
And what do we do with {{pih}}? Robert Ullmann 15:02, 24 November 2008 (UTC)[reply]
I (obviously) hadn't thought of redirects, but I suppose at a stretch you could call them multi-word idioms. I certainly wouldn't have a problem with redirecting them. Can we not just use a redirect for Norfuk / Pitkern too? Thryduulf 19:48, 25 November 2008 (UTC)[reply]

{{prefix|lang=…}} and {{suffix|lang=…}}[edit]

Hi Robert,

Now that {{prefix}} and {{suffix}} result in categorization, I don't suppose you could add rules to set their lang= parameters in non-English language sections? (Or in English language sections as well, if you prefer.)

Thanks in advance!

RuakhTALK 18:54, 15 November 2008 (UTC)[reply]

Formatting Zhuang[edit]

Thank you for your comment on my addition to the entry for 伝 of Zhuang definition needed; however, I have included the definition "person". Maybe the entry is not very clear; what I was trying to show was that for Zhuang people, of which there are over 10 million native speakers, 伝 means "person" and that in the Latin script this is written "vunz". Or do you mean there is a need for a definition of the Zhuang language? I am more than willing to learn what is the correct format, but it goes almost without saying that I am new to Wiktionary. Johnkn63 00:17, 20 November 2008 (UTC)[reply]

Hi! The bot here is saying that there isn't any definition line inside a part-of-speech section. What is needed (in this case) is a Noun header, and then the definition. See if you can read through WT:ELE (you don't need all the details ;-) and get an idea. Welcome again. Robert Ullmann 00:24, 20 November 2008 (UTC)[reply]

Please add Interfix to the “official” list of POS headers[edit]

Hi there. AFAICT, AutoFormat keeps adding {{rfc-header|Interfix|lang=en}} to -i- and -o- because they contain the POS header Interfix; consequently, a (probably well-meaning, but ultimately misguided) editor comes along and changes them to Infix; however, this is not the correct term for -i- and -o-; it rather refers to affixes such as -bloody- and -fucking-. Please add Interfix to the “official” list of POS headers in order to solve this problem. Thanks.  (u):Raifʻhār (t):Doremítzwr﴿ 19:52, 23 November 2008 (UTC)[reply]

Done. And Circumfix requested somewhere supra. Robert Ullmann 12:30, 25 November 2008 (UTC)[reply]
Great! Thanks.  (u):Raifʻhār (t):Doremítzwr﴿ 19:23, 25 November 2008 (UTC)[reply]

Header linking[edit]

Robert. I was concerned about a linking problem in some of our entries that have at least the following headers: =Etymology 1=, =Etymology 2=, and two (bare) =Etymology= headers. The MediaWiki software, upon seeing two identical headers, tries to make them unique by adding a space and a number after each identical one. It renames the links to the two =Etymology= headers to =Etymology 1= and =Etymology 2=, which collide with the existing headers that have those names. That means navigation to those headers will not work (you'll likely end up at the earlier headers), which affects the TOC links also. I tried renaming the bare =Etymology= headers to =Etymology 1= to solve the problem and allow linking. Sneaky AF, though, undid my changes on those entries. Is there another way to fix this problem, or would you like AF to allow =Etymology 1= headers as the only etymology header in an entry on one of these mixed pages? --Bequw¢τ 09:48, 24 November 2008 (UTC)[reply]

Linking to headers other than the language headers simply doesn't work. Changes to the target page will invalidate anything you try to do. Whether the header at cat/Malay/Etymology is Ety or Ety 1, it (MW) is still going to generate anchors that aren't useful. (a href="#Etymology_1_2" is no more useful than a href="#Etymology_4"). This is one of those "no way to do that in MW" things. If it would generate section tags showing the hierarchy, it would be possible (a href="#Malay_Etymology") and that would work (mostly), but it doesn't do that: the MW s/w internally has no concept of the section hierarchy; there is just a sequence of sections (0-n), each having "level" as an attribute. (The TOC generator and edit-section both do hacked up stuff layered on this ...)
Given the s/w we have, linking to sections (deeper than language) is simply not a useful idea: it won't work, and whatever is hacked up will break whenever sections are added. (In your case, any language < "Malay" added to the cat entry will break the links regardless.) Much better not to try to do this. (However much that answer is unpleasant.)
As to AF: WT:ELE only allows numbered ety sections if there is more than one, and the subsections are nested. So it will fix them ;-). Robert Ullmann 14:46, 24 November 2008 (UTC)[reply]
I had (long ago) given up the idea of users linking to headers other than language ones, but it still bothered me that the links in the TOC were broken. I guess I'll just leave it alone. Thanks. --Bequw¢τ 20:34, 24 November 2008 (UTC)[reply]
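
The collision discussed in this thread can be illustrated with a small sketch. It is a simplified model under stated assumptions, not MediaWiki's actual anchor code: flat_anchors assumes that spaces become underscores and that the n-th repeat of a heading gets the suffix _n, and hierarchical_anchors is the hypothetical language-prefixed scheme mentioned above, which the software does not provide. Both function names are made up for this example.

```python
def flat_anchors(headings):
    """Simplified model of flat anchor generation.

    Spaces become underscores, and the n-th repeat (n >= 2) of the same
    heading text gets the suffix _n. Nothing checks whether the suffixed
    name is already taken by another heading.
    """
    seen = {}
    anchors = []
    for text in headings:
        base = text.replace(" ", "_")
        seen[base] = seen.get(base, 0) + 1
        if seen[base] == 1:
            anchors.append(base)
        else:
            anchors.append("%s_%d" % (base, seen[base]))
    return anchors


def hierarchical_anchors(sections):
    """Hypothetical scheme: prefix each anchor with its L2 (language) heading.

    sections is a flat sequence of (level, text) pairs, i.e. a list of
    sections that each carry only a "level" attribute.
    """
    language = None
    anchors = []
    for level, text in sections:
        name = text.replace(" ", "_")
        if level == 2:
            language = name
            anchors.append(name)
        elif language is not None:
            anchors.append("%s_%s" % (language, name))
        else:
            anchors.append(name)
    return anchors


# Two bare Etymology headers plus numbered ones in another language section.
print(flat_anchors(["Etymology", "Etymology 1", "Etymology 2", "Etymology"]))
print(hierarchical_anchors([(2, "English"), (3, "Etymology"),
                            (2, "Malay"), (3, "Etymology")]))
```

In the flat model the second bare Etymology header receives the same anchor as the existing Etymology 2 header, so a link can only reach the earlier one. The prefixed version keeps the two language sections apart, although repeated headers inside one language section would still need numbering.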

References[edit]

Hello. The bot made the following edit: [14]. Both Japanese and Mandarin provide references. I would have thought that a single References section for the complete entry would be ideal. However, the bot insisted on making it a child of one language. If that is the case, then I suppose I will need to split the Japanese edits and create another References section for Japanese. That seems a little odd to me. Would you please review this edit? Regards, Bendono 08:12, 28 November 2008 (UTC)[reply]

References is an L3 (or L4) section. See WT:ELE. The language sections are separate, references go in the appropriate language section. (L2 headers are used only for languages and "Translingual"). Robert Ullmann 10:14, 28 November 2008 (UTC)[reply]

"Notes" --> "Usage notes" OR "References"[edit]

Hello, a minor problem:

After I added cites plus this Notes section[15]

===Notes===
<references/>

AutoFormat replaced it erroneously with[16]

===Usage notes===
<references/>

Instead of the intended[17]

===References===
<references/>

I know Wiktionary isn't Wikipedia, but since the ==Notes== heading is quasi standard on Wikipedia and many occasional contributions come from Wikipedians, I think this problem is likely to happen regularly. My two suggestions:

  • Detect whether a "Notes" section contains a <references (sic, unclosed tag to accommodate variations) string so as to decide on "References" instead of "Usage notes".
  • BTW, wouldn't "heading" be more proper than "header" in the edit summaries? (Currently it says: "header -Notes +Usage notes")

Regards, 62.147.39.133 17:00, 12 December 2008 (UTC)[reply]
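
The first suggestion could look roughly like the following sketch. The function name rename_notes_header and the idea of passing the header text and section body separately are assumptions made for illustration; this is not how AutoFormat is known to be structured.

```python
import re

# Matches the start of a <references> tag, whether or not it is closed,
# so that <references/>, <references /> and <references> are all caught.
REFERENCES_TAG = re.compile(r"<\s*references\b", re.IGNORECASE)


def rename_notes_header(header, body):
    """Pick a replacement for a non-standard "Notes" header.

    A body containing a <references tag becomes "References"; any other
    "Notes" section becomes "Usage notes". Other headers are unchanged.
    """
    if header.strip() != "Notes":
        return header
    if REFERENCES_TAG.search(body):
        return "References"
    return "Usage notes"
```

With this rule, a Notes section whose body is <references/> would be renamed to References, and a Notes section containing only prose would still become Usage notes.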

Removing empty checktrans sections[edit]

Can AF be taught to remove empty checktrans sections (e.g. here, here, and here)? If so we no longer need the comment

<!--Remove this table once all of the translations below have been moved into the tables above.-->

on Wiktionary:Translations#Translations to be checked or entries. --Bequw¢τ 23:43, 19 December 2008 (UTC)[reply]

Yes, I would have to look at the cases. Robert Ullmann 12:27, 20 December 2008 (UTC)[reply]
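
As a starting point for looking at the cases, here is a minimal sketch of such a removal. It assumes that an empty section consists of {{checktrans-top}}, an optional HTML comment, an optional mid template, and a bottom template with nothing else between them; both the trans- and checktrans- spellings of the mid and bottom templates are accepted as a guess. The names are illustrative only.

```python
import re

EMPTY_CHECKTRANS = re.compile(
    r"\{\{checktrans-top\}\}\s*"
    # Optional single HTML comment; the lookahead stops it at the first -->.
    r"(?:<!--(?:(?!-->).)*-->\s*)?"
    r"(?:\{\{(?:checktrans|trans)-mid\}\}\s*)?"
    r"\{\{(?:checktrans|trans)-bottom\}\}[ \t]*\n?",
    re.DOTALL,
)


def remove_empty_checktrans(text):
    """Delete checktrans tables that contain no translation lines."""
    return EMPTY_CHECKTRANS.sub("", text)
```

A table that still holds at least one translation line does not match the pattern and is left untouched.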

"Translations to be checked" header[edit]

I've converted the last of these into checktrans-* parts of the parent Translations headers. Can AF tag future uses of this header with {{rfc-header}} now? Also, why didn't AF remove the ttbc header in these cases: murmur, electronic, population, fundamental force, mineral. Thanks. --Bequw¢τ 01:10, 20 December 2008 (UTC)[reply]

The rule that removes the header before a {checktrans} line didn't apply, because at that point the line wasn't {checktrans} (it was still ttbc). AF converted that line, but in general, once AF has done everything, it doesn't go back and repeat all the checks and rules. (It does re-fire rules in some cases, for lots of things in Pronunciation sections.) AF would have picked the entries up again later from the XML and dropped the header.
Note I re-tagged murmur here and AF removed the header. Sooner or later it would have. Robert Ullmann 12:55, 20 December 2008 (UTC)[reply]
Flagging the header just takes removing it from the control file. Robert Ullmann 12:25, 20 December 2008 (UTC)[reply]
I took that as an invitation. I'll monitor the category. --Bequw¢τ 18:41, 23 December 2008 (UTC)[reply]
Which is now at Category:Entries with non-standard headers (Translations to be checked). --Bequw¢τ 01:31, 24 December 2008 (UTC)[reply]
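
The ordering effect described in this thread can be shown with a toy model. The two rules and their order below are assumptions chosen to reproduce the behaviour described, not AutoFormat's real rule set: the header rule runs before the conversion rule, so a single pass leaves the header behind and a later pass removes it.

```python
def drop_ttbc_header(lines):
    """Remove a "Translations to be checked" header that sits directly
    before a {{checktrans-top}} line."""
    out = []
    for i, line in enumerate(lines):
        following = lines[i + 1] if i + 1 < len(lines) else ""
        is_header = line.strip("= ") == "Translations to be checked"
        if is_header and following.startswith("{{checktrans-top"):
            continue
        out.append(line)
    return out


def convert_ttbc_top(lines):
    """Convert the old {{ttbc-top}} template to {{checktrans-top}}."""
    return [line.replace("{{ttbc-top}}", "{{checktrans-top}}") for line in lines]


RULES = [drop_ttbc_header, convert_ttbc_top]


def single_pass(lines):
    """Apply every rule once, in order, without going back."""
    for rule in RULES:
        lines = rule(lines)
    return lines


def until_stable(lines):
    """Repeat full passes until a pass changes nothing."""
    while True:
        result = single_pass(lines)
        if result == lines:
            return result
        lines = result


entry = [
    "====Translations to be checked====",
    "{{ttbc-top}}",
    "* French: [[foo]]",
    "{{trans-bottom}}",
]
print(single_pass(entry))   # header survives, template converted
print(until_stable(entry))  # header removed on the following pass
```

Picking the entry up again later from the XML has the same effect as the second pass here.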

Adding {ttbc} in checktrans sections[edit]

I've noticed that lots of language names listed between {{checktrans-top}} and {{trans-bottom}} are not piped into {{ttbc}}, which prevents contributors from finding them in respective check-translations categories. I think that it would be great if AF could add those (if it doesn't do that already, in which case it should do it more aggressively IMHO). --Ivan Štambuk 10:58, 20 December 2008 (UTC)[reply]
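
A minimal sketch of what such a rule might look like, assuming translation lines have the form "* Language: ..." (optionally with the language name wikilinked) and that the section ends at {{trans-bottom}} or {{checktrans-bottom}}. The function name add_ttbc and the line format are assumptions for illustration, not a description of what AF does.

```python
import re

# A bullet, an optionally wikilinked bare language name, then a colon.
# Names already wrapped in a template contain "{" and do not match.
BARE_LANGUAGE = re.compile(r"^(\*+\s*)(?:\[\[)?([^\[\]{}:|]+?)(?:\]\])?\s*:")


def add_ttbc(text):
    """Wrap bare language names inside checktrans sections in {{ttbc|...}}."""
    out = []
    inside = False
    for line in text.split("\n"):
        if line.startswith("{{checktrans-top"):
            inside = True
        elif line.startswith(("{{trans-bottom", "{{checktrans-bottom")):
            inside = False
        elif inside:
            line = BARE_LANGUAGE.sub(r"\1{{ttbc|\2}}:", line, count=1)
        out.append(line)
    return "\n".join(out)
```

Lines in ordinary translation tables are left alone, since the flag is only set between the checktrans top and bottom templates.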

Inflection applies to both POS's, and it seems silly to duplicate it. Thoughts? -Atelaes λάλει ἐμοί 04:44, 22 December 2008 (UTC)[reply]

wikilinking lemma terms in form-of templates[edit]

Does AF take Christmas off? If he's not dashed on mulled-wine what would he say to auto wikilinking lemma terms in the common form-of templates, such as {{plural of}}? That way we get more accurate wikimedia-based entry counts and we don't have to bother casual editors with remembering when to and when not to do this. --Bequw¢τ 01:57, 24 December 2008 (UTC)[reply]

AF was doing that (in some cases), but we've come up with a better way: see WT:GP#Template:count page: Building a Better Kludge. (Note that "better" doesn't mean not-stupid, as Atelaes aptly points out.) The method of linking terms in some templates was okay for a while, but is causing increasing problems. Better if editors don't need to think about it: use lang= if the language is not English, wikilink if it makes sense. Robert Ullmann 04:47, 24 December 2008 (UTC)[reply]
More work. And I have been told that I have to work through Christmas. While Ullmann sits and eats dinner, and listens to music, and watches football matches on Boxing day. I work. At least he isn't watching cricket. AutoFormat 04:51, 24 December 2008 (UTC)[reply]
Thanks AF. I'm glad other people were thinking about the issue as well. --Bequw¢τ 19:56, 24 December 2008 (UTC)[reply]

Kludge-counting misconstructions.[edit]

Is this edit intended behavior? I thought that (for whatever reason) we didn't want misspelling-only and misconstruction-only entries to count? —RuakhTALK 00:15, 29 December 2008 (UTC)[reply]

I'd never seen {{misconstruction of}} before. (! :-) No, shouldn't count that. Will fix. Robert Ullmann 09:41, 29 December 2008 (UTC)[reply]

Sorry to be a bother, but I've no idea what this means, much less what I'm supposed to do about it. (I mean the "infl" bit of course.) --Duncan 13:53, 31 December 2008 (UTC)[reply]

Like this. The infl template (and all specific inflection line templates) should be at the start of the line. Robert Ullmann 14:24, 31 December 2008 (UTC)[reply]
I see. Thanks, I'll try to remember it next time round. --Duncan 15:00, 31 December 2008 (UTC)[reply]