Wiktionary:Grease pit/2015/January


Right-to-left problem

Somehow Uyghur transliteration is skewed on Happy_New_Year#Translations (note the position of brackets) but no problem here: Uyghur: يېڭى يىل مۇبارەك (yë'ngi yil mubarek). --Anatoli T. (обсудить/вклад) 07:19, 1 January 2015 (UTC)[reply]

There was a right to left mark (U+200F) in the template- please make sure I fixed it correctly. DTLHS (talk) 16:54, 1 January 2015 (UTC)[reply]
Now I'm wondering if there's any context in which U+200E / U+200F in page text would be appropriate- can we just bot remove all occurrences? DTLHS (talk) 17:09, 1 January 2015 (UTC)[reply]
Thanks. The occurrence would be appropriate if a translation into a RTL language had no transliteration or other Roman letters, such as gender, qualifiers, etc. E.g. see two identical translations into Persian without transliterations: دریا, فانوس. They now appear in the right order - LTR. --Anatoli T. (обсудить/вклад) 00:10, 2 January 2015 (UTC)[reply]
Wiktionary:Todo/Pages containing LTR marks, Wiktionary:Todo/Pages containing RTL marks, if anyone wants to help clean up the ones that don't need the LTR/RTL marks. - -sche (discuss) 18:20, 21 January 2015 (UTC)[reply]
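
For anyone working through those lists, here is a minimal Python sketch of how the marks could be located and stripped in a piece of page text. The function names are hypothetical. Because some occurrences are appropriate (see the Persian example above), it reports positions first so that removal can be decided case by case.

```python
LRM = "\u200e"  # LEFT-TO-RIGHT MARK (U+200E)
RLM = "\u200f"  # RIGHT-TO-LEFT MARK (U+200F)


def find_direction_marks(text):
    """Return (index, code point label) for every LRM/RLM in text."""
    labels = {LRM: "U+200E", RLM: "U+200F"}
    return [(i, labels[ch]) for i, ch in enumerate(text) if ch in labels]


def strip_direction_marks(text):
    """Return text with every LRM and RLM removed."""
    return text.replace(LRM, "").replace(RLM, "")


sample = "Uyghur: \u200fabc (x)"
print(find_direction_marks(sample))
print(strip_direction_marks(sample))
```

The marks are invisible in most editors, which is why the sketch prints their positions instead of relying on visual inspection.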

Simplifying Catboiler Templates For Editors

I do a lot of category creation, and, though it's less arcane and complex now that User:CodeCat has luafied most of the infrastructure, there's still a lot of typing involved.

This is mostly unnecessary: we already have very strictly-enforced rigid constraints on the format of category names, and they generally contain most or all of the information needed in the name itself- so the modules that power the templates should be able to parse it from the page name using the modules that CodeCat has put in place (see Category:Pagename-based auto-fill-in templates for some templates that provide a substitutable front end for existing templates using such techniques).

Here are my ideas with regards to specific templates:

{{topic cat}}: For language-specific cats, the first part of the name is the language code, followed by a colon, followed by the topic name. For the non-language-specific parent categories, the page name is the topic name. See {{tcez}}, which was developed for me by User:Kc kennylau and User:Wyang. I found Wyang's version more useful and robust, so I modified it slightly to get {{tcez1}}.

  • Implementation:
  1. If the name contains a colon:
    1. the language code is everything before the colon
    2. the topic name is everything after the colon
  2. If the name contains no colon, the language code is empty and the topic name is the page name
  • Problems:
  1. When the language code is "sms", the string "sms:" is converted to a form in which the colon is no longer a literal ":" (apparently by the Lua string-function backend), and the colon isn't recognized.
  2. Any topic name containing a colon will cause parsing of the non-language-specific parent category to fail.
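
The implementation steps above can be sketched in Python as follows. This is only an illustration of the splitting rule, not the code behind {{tcez}} or {{tcez1}}; the function name is hypothetical and the page name is assumed to be passed without the "Category:" namespace prefix.

```python
def parse_topic_category(page_name):
    """Split a topic category name into (language_code, topic_name).

    Everything before the first colon is taken as the language code and
    everything after it as the topic name. Without a colon, the language
    code is empty and the whole page name is the topic name.
    """
    code, colon, topic = page_name.partition(":")
    if not colon:
        return "", page_name
    return code, topic


print(parse_topic_category("en:Birds"))
print(parse_topic_category("Birds"))
# Problem 2 from the list: a colon inside a parent category's topic name
# is misread as the language-code separator (made-up name for illustration).
print(parse_topic_category("Some topic: subtitle"))
```

The last call shows why a pure delimiter approach needs either a check against the list of valid language codes or an explicit parameter as a fallback.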

{{prefixcat}}: the first part of the page name is the canonical name of the language, followed by " words prefixed with ", followed by the prefix, followed by "-".

  • Implementation:
  1. the canonical language name is everything before " words prefixed with "
  2. the prefix is everything after " words prefixed with ", minus the "-" at the end

{{suffixcat}}: the first part of the page name is the canonical name of the language, followed by " words suffixed with -", followed by the suffix.

  • Implementation:
  1. the canonical language name is everything before " words suffixed with -"
  2. the suffix is everything after " words suffixed with -"

{{charactercat}}: the page name is always the canonical language name, followed by " terms spelled with " followed by the character.

  • Implementation:
  1. the canonical language name is everything before " terms spelled with "
  2. the character is everything after " terms spelled with "
  • Note: As far as I know, there's no way to parse the sort parameter from the page name, so those would still have to be entered by hand where necessary.
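
The three delimiter-based rules for {{prefixcat}}, {{suffixcat}} and {{charactercat}} can be sketched together in Python. All function names here are hypothetical, and the sketch covers only the page-name splitting; as noted above, a sort parameter would still have to be supplied by hand.

```python
def split_on_marker(page_name, marker):
    """Return (language_name, remainder) split at the first marker."""
    language, found, rest = page_name.partition(marker)
    if not found or not language:
        raise ValueError(f"{page_name!r} does not contain {marker!r}")
    return language, rest


def parse_prefix_category(page_name):
    language, prefix = split_on_marker(page_name, " words prefixed with ")
    if prefix.endswith("-"):
        prefix = prefix[:-1]  # drop the trailing hyphen
    return language, prefix


def parse_suffix_category(page_name):
    return split_on_marker(page_name, " words suffixed with -")


def parse_character_category(page_name):
    return split_on_marker(page_name, " terms spelled with ")


print(parse_prefix_category("English words prefixed with re-"))
print(parse_suffix_category("English words suffixed with -ation"))
print(parse_character_category("English terms spelled with Æ"))
```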

These methods can be applied to just about every template that uses Module:category tree, with one important exception (below), and quite a few others.

{{poscatboiler}}: the language-specific categories all consist of the canonical language name, followed by a space, followed by what currently goes in the template's second parameter.

This one is trickier to implement, because there's no unique delimiting text, and because of the potential for overlap between parts of language names and parts of the second parameters.

I came up with a kludgy workaround: require a single parameter consisting of the first few characters of the current second parameter. Everything before the first instance of a space + this string in the page name is the canonical language name, and the string + everything after the first instance of a space + the string in the page name is the current second parameter.
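
A minimal Python sketch of that workaround, assuming the hint is matched case-sensitively. The function name is hypothetical and this is not the code of {{pcbez}}.

```python
def split_pos_category(page_name, hint):
    """Split a category name into (language_name, label) using a hint.

    The hint is the first few characters of the label. The language name
    is everything before the first occurrence of a space followed by the
    hint; the label is the hint plus everything after it.
    """
    index = page_name.find(" " + hint)
    if index <= 0:
        raise ValueError(f"hint {hint!r} not found after a language name")
    return page_name[:index], page_name[index + 1:]


print(split_pos_category("Old English nouns", "no"))
print(split_pos_category("Low German verb forms", "verb"))
```

Because only the first match is used, a language name that itself contains a space followed by the hint would be split in the wrong place, which is the weakness described in the next paragraph.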

This workaround is potentially defeatable by new canonical language names that would contain a match for the string as originally entered, so it's probably best not implemented in {{poscatboiler}} itself, but in a substitutable fill-in template. I have a working proof-of-concept at {{pcbez}}, but I don't understand substitution and/or templates in general well enough to make it substitutable without a lot of clueless trial and error. Can someone do that for me?

Thanks! Chuck Entz (talk) 21:13, 1 January 2015 (UTC)[reply]

I think it would be more workable, at least in the short term, to provide only the language code. The module can then determine that everything else must be the label. I would rather not make things too dependent on "delimiters" because my goal for the long term was to integrate {{suffixcat}} and company into {{poscatboiler}}. I believe that it's beneficial to have less templates, so that users don't have to remember which one does which. —CodeCat 21:18, 1 January 2015 (UTC)[reply]
Using the language code certainly looks like the only workable way to adapt {{poscatboiler}} itself, and may someday cause problems with converting other templates to {{poscatboiler}}, but I'm talking about the real short term here: I suspect that all the specific examples I gave here could be implemented in an hour by someone who really knows what they're doing (troubleshooting could drag that out much longer, of course). Since no one currently uses these templates without parameters, there's no problem with backwards-compatibility: you can either ignore any positional parameters, or you can use them instead of the pagename-based ones if they're present (the latter is probably better, just to be safe- see the problem with "sms:" in the {{topic cat}} section, above). Chuck Entz (talk) 22:05, 1 January 2015 (UTC)[reply]
As for the philosophical issue: it's true that the proliferation of catboiler templates was a serious problem. I'm sure someone would eventually have come up with "rfquoteoldestylecatboiler" which would render quote-request categories for earlier authors in appropriate fonts to stylistically match their era, or "trreqsundaycatboiler" for translations requested on a specific day of the week. Reducing the number of templates is a worthwhile goal, but it needs to be kept in the context of the overall demand on the editor. It's nice not to have to remember 57 different catboilers, but it's also nice not to have to look up the language code- especially for the <canonical language name> terms derived from <language name or language family name> categories, which often have redlinks to more obscure language-family categories. Chuck Entz (talk) 22:35, 1 January 2015 (UTC)[reply]
I've added this to {{poscatboiler}} now (see the edits I made to Module:category tree and Module:category tree/poscatboiler). If you leave out the label, it will try to extract it from the page name. If the category doesn't begin with the specified language, or if the autodetected label doesn't exist, it shows a somewhat nondescript error message, but at least the basic idea works. —CodeCat 14:12, 2 January 2015 (UTC)[reply]
Very nice! If you create a new category that has multiple redlinks in the breadcrumbs, it's now possible to copy the wikitext from the first unchanged into all the redlinked categories with just a few clicks and keystrokes. The error handling is definitely a problem, though. Perhaps you could compare the expanded language code with the beginning of the page name and give a message along the lines of "The language code XX is for the language YYYY, which doesn't match the category name". Chuck Entz (talk) 18:18, 2 January 2015 (UTC)[reply]
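
The language-code approach and the suggested error message might look roughly like this in Python. It is a sketch only: the real logic lives in Lua in Module:category tree, and the small code-to-name table and function name below are illustrative stand-ins for the actual language data.

```python
# Illustrative subset of a code-to-canonical-name table (assumption).
LANGUAGE_NAMES = {"en": "English", "ang": "Old English", "nds": "Low German"}


def label_from_page_name(lang_code, page_name):
    """Return the label: the page name minus the language name and a space."""
    lang_name = LANGUAGE_NAMES.get(lang_code)
    if lang_name is None:
        raise ValueError(f"Unknown language code {lang_code}")
    prefix = lang_name + " "
    if not page_name.startswith(prefix):
        raise ValueError(
            f"The language code {lang_code} is for the language {lang_name}, "
            f"which doesn't match the category name {page_name!r}"
        )
    return page_name[len(prefix):]


print(label_from_page_name("ang", "Old English nouns"))
```

Since the language name is looked up from the code, overlap between language names and labels is no longer a problem: "Old English nouns" with code "en" is rejected instead of being read as English plus some other label.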

missing important and common words

What kind of software solutions exist and which are we using to ensure we know which of the most common words are missing? I was very surprised to discover that, for example, news stream is completely missing in this and all other dictionaries. So Wiktionary has the chance to be the first dictionary to record one of the most important and common and descriptive words of our time. I found several lists like User:Brian0918/Hotlist and User:Robert Ullmann/Missing and User:Visviva/Tracking, but these don't seem to make any kind of frequency analysis. I don't understand what to do with the red links at the beginning of the last list. --Espoo (talk) 11:32, 7 January 2015 (UTC)[reply]

"hu-conjugation of" in verb form category

How do we change the template {{hu-conjugation of}} so that it isn't in Category:Hungarian verb forms but puts verb forms there? --Lo Ximiendo (talk) 15:02, 7 January 2015 (UTC)[reply]

Fixed. You have to put the category call inside the "includeonly" part and not inside the "noinclude" part. —Aɴɢʀ (talk) 15:11, 7 January 2015 (UTC)[reply]
@Angr but you should have taken a look at the list in Category:Hungarian verb forms and seen the template there. Besides, I wish {{hu-conjugation of}} got rewritten into Lua. --Lo Ximiendo (talk) 15:16, 7 January 2015 (UTC)[reply]
Hmm, I don't know why that's happening. It would be good for the template to be luacized, but that wasn't the problem you brought up. —Aɴɢʀ (talk) 15:22, 7 January 2015 (UTC)[reply]
I disagree with making {{hu-conjugation of}} add entries to the category. The category is already added by the headword template. —CodeCat 15:24, 7 January 2015 (UTC)[reply]
{{hu-conjugation of}} was created several years ago before the current categorization direction. There is no need to recreate the template in Lua. It could be replaced by {{inflection of}} applying the parameters needed for Hungarian. For example: vadászok
{{hu-conjugation of|vadászik|1|s|indic|pres|indef}} would become
{{inflection of|vadászik||1|s|indicative|pres|indefinite}}. --Panda10 (talk) 17:39, 7 January 2015 (UTC)[reply]
I've added most of the grammar tags from {{hu-grammar tag}} to Module:form of/data. But there was a conflict in one case: sub is already used to mean subjunctive, so it can't also mean sublative. Furthermore, the following tags shouldn't be added to the module, so some solution for them should be found: ban, ben, 1s, 2s, 3s, 4s, 5s, 6s, 1p, 2p, 3p, 4p, 5p, 6p. I'm not sure what to do with pos and nonattr. It also needs to be checked if any pages use {{hu-grammar tag}} with a tag that it doesn't recognise (in which case it's shown as-is) but which {{inflection of}} does recognise. —CodeCat 18:52, 7 January 2015 (UTC)[reply]
Ok, thanks. Those parameters are used by {{hu-inflection of}}. This would be more complicated to replace (it is used in about 19,000 entries). In my above note I meant to replace only {{hu-conjugation of}} with {{inflection of}} because it is used in a little over 200 entries. --Panda10 (talk) 19:41, 7 January 2015 (UTC)[reply]
{{hu-conjugation of}} now just calls {{inflection of}}. You can replace it if you want. —CodeCat 22:34, 7 January 2015 (UTC)[reply]
Thanks. Is it feasible to semi-automate the creation of Hungarian verb forms? Similar to the noun forms. It is very convenient to create a noun form entry just by clicking the declension table cell. --Panda10 (talk) 22:38, 7 January 2015 (UTC)[reply]
The biggest limitations of WT:ACCEL are that it only works for red links, and it can only create entries for one form at a time. So when the same form is actually several distinct forms that just happen to be identical, then it doesn't work either. This is likely a problem for verbs, which have a lot more forms than nouns do and so there is more risk of one form appearing more than once in the table. In theory, the table module (it would have to be a module; it's not feasible with a template alone) could be modified to alter the acceleration tags it puts in the links, so that WT:ACCEL is told that the entry is for multiple forms. But that would make it a lot more complicated as well. —CodeCat 22:54, 7 January 2015 (UTC)[reply]
Ok, it will just stay as is. However, the changes you made in the templates created a problem in the possessive forms, e.g. ablaka - the definition line contains a category name in wikilinks. It also emptied the Category:Hungarian noun forms - possessive. Can you reverse this? I appreciate your help but I really don't want more problems, it will be just too overwhelming for me to correct them. --Panda10 (talk) 00:25, 8 January 2015 (UTC)[reply]
I just removed the category for now. Having a category for every single kind of inflected form is just overkill, as I have mentioned before in other discussions. —CodeCat 00:29, 8 January 2015 (UTC)[reply]

Fonts

Why do the Navajo, pinyin and romaji words need to be written in those ugly and different fonts? --Biolongvistul (talk) 13:11, 8 January 2015 (UTC)[reply]

For Pinyin and Romaji, it happens when the software assumes that the word is being written in Hanzi/Kanji even though it really isn't: if I write {{l|zh|Běijīng}} it shows up as Běijīng because it assumes that everything labeled "zh" is in Hanzi, so it uses a font that's better suited to Hanzi. I would have expected {{l|zh|Běijīng|sc=Latn}} to force it to show up using the default Latin font, but it doesn't; it still shows up as Běijīng, which is annoying. (Interestingly, if I specify the language as "cmn" instead of "zh", Pinyin shows up using the default Latin font, even without being explicitly labeled "sc=Latn", so {{l|cmn|Běijīng}} shows up as Běijīng.) For Navajo, I have no idea since Navajo is only written in the Latin alphabet, so the software shouldn't be assuming anything else. —Aɴɢʀ (talk) 20:59, 9 January 2015 (UTC)[reply]
If I remember correctly, that font was chosen for Navajo to accommodate its many diacritics, assuming that our standard fonts can't. Stephen G. Brown (talkcontribs) can shed some light. --Vahag (talk) 00:36, 10 January 2015 (UTC)[reply]
That is correct. Navajo uses diacritics on some letters that are spaced incorrectly in regular Roman fonts, so we use the Aboriginal Sans Serif. —Stephen (Talk) 09:57, 10 January 2015 (UTC)[reply]

Confix template

There is a problem with this template where listings for the suffix don't appear in the correct alphabetical order, when there's a root word as well as a prefix. For example: repristination. Donnanz (talk) 12:36, 9 January 2015 (UTC)[reply]

I don't see it, can you elaborate? —CodeCat 13:57, 9 January 2015 (UTC)[reply]
You can see it in the -ation listings, in between prioritization and privatisation. (https://en.wiktionary.org/wiki/Category:English_words_suffixed_with_-ation) Donnanz (talk) 14:06, 9 January 2015 (UTC)[reply]
It looks like the module is stripping the prefix from the sort key for the prefix category and using this prefix-stripped sort key for the suffix category, too, where it should be using the whole word. Chuck Entz (talk) 17:19, 9 January 2015 (UTC)[reply]
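
The suspected behaviour can be reproduced with a small Python sketch. This is an assumption-based model of the diagnosis above, not the template's actual code; the function names and the buggy flag are hypothetical.

```python
def confix_sort_keys(word, prefix, buggy=False):
    """Return the sort keys used for the prefix and suffix categories."""
    stripped = word[len(prefix):] if word.startswith(prefix) else word
    return {
        # In the prefix category every entry starts with the prefix,
        # so sorting by the remainder is useful there.
        "prefix_category": stripped,
        # In the suffix category the whole word should be the key.
        "suffix_category": stripped if buggy else word,
    }


def suffix_listing(entries, buggy):
    """Order (word, prefix) pairs as the suffix category would list them."""
    ordered = sorted(
        entries,
        key=lambda entry: confix_sort_keys(entry[0], entry[1], buggy)["suffix_category"],
    )
    return [word for word, _prefix in ordered]


entries = [("prioritization", ""), ("privatisation", ""), ("repristination", "re")]
print(suffix_listing(entries, buggy=True))
print(suffix_listing(entries, buggy=False))
```

With the stripped key, repristination sorts as "pristination" and lands between prioritization and privatisation, matching what is seen in the -ation category.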
I have found a workaround for repristination by altering it to {prefix|re|pristine|lang=en} {suffix||ation|lang=en} (doubled brackets shown as single). But the problem still remains for the unwary. Donnanz (talk) 21:28, 9 January 2015 (UTC)[reply]
Does it work if you use {{affix|en|re-|pristine|-ation}}? —CodeCat 21:36, 9 January 2015 (UTC)[reply]
Yes, that works OK. Donnanz (talk) 22:02, 9 January 2015 (UTC)[reply]
Please don't apply that workaround to lots of entries! We should fix confix rather than propagating hacks. Equinox 21:38, 9 January 2015 (UTC)[reply]
Hopefully there's not a lot of entries like this. Donnanz (talk) 22:02, 9 January 2015 (UTC)[reply]
I think confix only works properly when a prefix is linked to a suffix with no word in between. Donnanz (talk) 10:03, 10 January 2015 (UTC)[reply]
I commented on exactly the same problem at Template talk:confix#Category sorting key. This really needs to be fixed. --Mormegil (talk) 11:26, 12 March 2015 (UTC)[reply]
Yeah, I have found it better to use the affix template (as mentioned by CodeCat above) in certain cases. The affix template is a relatively recent creation. Donnanz (talk) 11:38, 12 March 2015 (UTC)[reply]

Can anyone fix the tagging of the "Appendix" namespace, please?

See discussion here: http://sourceforge.net/p/kiwix/discussion/604122/thread/1eacb6d8/

Thank you.— This unsigned comment was added by 198.23.103.67 (talk) at 13:59 9 January 2015.

No comments?  :( — This unsigned comment was added by 198.23.103.67 (talk) at 10:11 15 January 2015.

Could you state exactly what you are seeking and why? Generally speaking, only parts of Appendix namespace have content that meets minimum standards IMO. Separating the wheat from the chaff will take some time if contributors are willing to undertake the task. DCDuring TALK 15:17, 15 January 2015 (UTC)[reply]
We really need a separate namespace for reconstructions. —CodeCat 15:42, 15 January 2015 (UTC)[reply]
Maybe, but the link above is to a discussion about Appendix:Japanese verbs. I rather doubt that reconstructions are tops on the list of Appendix content that linkers seek. DCDuring TALK 16:01, 15 January 2015 (UTC)[reply]

Bot Request

Would someone with a bot please be so kind as to perform null edits on all the entries in Category:Pages with module errors? With over 2100 entries from a problem that was quickly fixed a day or two ago, the real module errors are very hard to spot (see błyskać for the one I know about). Thanks! Chuck Entz (talk) 19:47, 9 January 2015 (UTC)[reply]

I'm running it now. It's not actually performing null edits though, just hard purges. It's faster that way. —CodeCat 21:37, 9 January 2015 (UTC)[reply]
Thank you! It was fun watching the member count dropping rapidly every time I refreshed the page. I see that the problem with the Polish entries has been fixed (the Chinese ones seem to have been fixed earlier), and the pt-adj monstrosity has been dealt with, so we now have an empty category for the first time in what seems like a month or more. I can handle the new ones that will straggle in from the edit queue, but I gave up on these after clearing about 1,500 of them (what can I say- I'm stubborn!). Chuck Entz (talk) 22:17, 9 January 2015 (UTC)[reply]
Making a bot that does null edits is very easy. If you want, I can give you some Python code that does it? You'll need the Pywikibot package. You probably don't need a bot account as null edits aren't really edits, nothing is actually changed. —CodeCat 22:19, 9 January 2015 (UTC)[reply]
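
For reference, a null-edit run with Pywikibot might be sketched as below. This is not the code offered above, only an illustration under assumptions: Pywikibot is installed and configured for en.wiktionary, and the helper name touch_pages is hypothetical. Whether a login is required depends on the site and Pywikibot configuration.

```python
def touch_pages(pages):
    """Null-edit each page and return (title, error text) for failures."""
    failed = []
    for page in pages:
        try:
            page.touch()  # Pywikibot's null edit: saves the page unchanged
        except Exception as error:
            failed.append((page.title(), str(error)))
    return failed


def main():
    # Third-party package, imported here so the helper above can be
    # exercised without Pywikibot installed.
    import pywikibot

    site = pywikibot.Site("en", "wiktionary")
    category = pywikibot.Category(site, "Category:Pages with module errors")
    for title, error in touch_pages(category.articles()):
        print(f"Could not touch {title}: {error}")
```

Calling main() starts the run. Replacing page.touch() with page.purge() would correspond to the purge-based approach mentioned earlier in this thread.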

Template for variant spellings.

I would like to create a template that creates a collapsable table, replaces input strings according to a specific key and logs every single replacement. An example: The input of the template is "a, d, e". Each of these letters has two results according to the key, the alternatives being "ä, ð, 0". I would want to log the variants "ade", "aðe", "äde", "äðe", "ad", "äd", "að", "äð" in an annotated table. Basically an automated form-table like the one that can be seen at enmity.
I tried to figure it out myself through the help pages, but they all seem to be for people who already know to some extent how it all works. For example I could find no help pages on Wiktionary on how to create a table that is collapsable.
So if anyone could give me a link to a help page with the relevant formatting data and explain to me how to include a string-replacement into a template, I'd be grateful. Korn (talk) 10:08, 10 January 2015 (UTC)[reply]
ps.: I'm aware the WT-templates page has a section giving the code to make a table collapsable. But what I meant is that if you've no prior Wiki-experience, making your baby steps here is at least a bit confusing. Korn (talk) 10:31, 10 January 2015 (UTC)[reply]
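
The combinatorial part of the request (every combination of original and replaced segments, with a log of the replacements) can be sketched in Python. This only enumerates candidate forms offline; it is not a template or module, the function name is hypothetical, and an empty string stands in for the "0" (nothing) alternative in the example.

```python
from itertools import product


def spelling_variants(segments, key):
    """Return (form, replacement log) for every combination of choices."""
    options = []
    for segment in segments:
        choices = [(segment, None)]  # keep the segment unchanged
        if segment in key:
            replacement = key[segment]
            note = f"{segment} -> {replacement or '(nothing)'}"
            choices.append((replacement, note))
        options.append(choices)

    results = []
    for combination in product(*options):
        form = "".join(text for text, _note in combination)
        log = [note for _text, note in combination if note]
        results.append((form, log))
    return results


for form, log in spelling_variants(["a", "d", "e"], {"a": "ä", "d": "ð", "e": ""}):
    print(form, log)
```

For the input "a, d, e" this yields the eight forms listed above, each paired with the replacements that produced it.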

If I understand you correctly you would like to create a table for variant spellings without regard to whether they were actually attestable or sourced from a reference like the OED. That would be something we wouldn't want in entries. We already have some handmade tables like that which are not useful. The table at enmity might be useful, but the plethora of redlinks suggests that we haven't collected evidence and are relying exclusively on the OED. DCDuring TALK 11:09, 10 January 2015 (UTC)[reply]
Well, I was primarily intending to use it for IPA to translate the basic structure of a word into all the narrow transcriptions for different dialects for Middle Low German. Though, in modern Low German, 99% of words in almost all but one or two varieties are indeed not attestable, because people use German for written communication. And people use a pronunciation spelling for writing, so it could be used in that area as well. Korn (talk) 13:07, 10 January 2015 (UTC)[reply]

I triggered a spam filter?

I just tried to edit my userpage with a short introduction saying who I am and that I'm an admin and CU over at en.wikibooks. It seemed to have triggered a spam filter, I think. Can someone add this to my userpage?

I am an administrator and check user at English Wikibooks. I'm not very active at English Wiktionary.

Thanks. --Xania (talk) 23:14, 12 January 2015 (UTC)[reply]

 Done. — Ungoliant (falai) 23:18, 12 January 2015 (UTC)[reply]
For future reference, I believe the filter was triggered because you had few contributions and were adding an external link; because your target was another WMF site, you could have avoided the filter by using a link of the form [[:wikibooks:User:Xania|English Wikibooks]]. Cheers, - -sche (discuss) 23:58, 12 January 2015 (UTC)[reply]
Thanks. I had thought that a WMF URL would have been exempt but I'd forgotten that I could have used a shortcut instead.--Xania (talk) 00:12, 13 January 2015 (UTC)[reply]

In e.g. indogermanisch, the name of the "dialect" to which the alt forms belong (in this case the qualifier is not a dialect but an explanation that the alt forms are abbreviations) should be in parentheses, like it would be if the old method of formatting were used. - -sche (discuss) 01:46, 13 January 2015 (UTC)[reply]

Bump. - -sche (discuss) 20:26, 19 January 2015 (UTC)[reply]
See this discussion. --Vahag (talk) 10:42, 26 January 2015 (UTC)[reply]
Why are the forms listed in a different font face? —Aɴɢʀ (talk) 20:39, 19 January 2015 (UTC)[reply]

Template question

Hello,

I'm new to the Mediawiki markup language. I've already read several help pages about templates.

According to m:Help:Parameter_default, {{{p|q}}} outputs the value of "p" if "p" is defined, otherwise "q".

I still don't get a particular syntax:

{{{a|{{{b|c}}}}}} gives c
{{{{{{a|b}}}|c}}} gives c - parameter b is undefined

Can someone explain to me, why the result is "c"? In both examples, "a" might be defined?

The assumption in the examples is that there are no defined parameters.
{{{a|{{{b|c}}}}}}
is interpreted as
  • is parameter a defined?
  • Yes: return the contents of parameter a.
  • No: is parameter b defined?
    • Yes: return the contents of parameter b
    • No: return the literal string "c".
{{{{{{a|b}}}|c}}}
is interpreted as
  • is parameter a defined?
  • Yes: return the contents of parameter a
  • No: return the literal string "b"
  • what was just returned will now be interpreted as a parameter name, being either what was in a, or "b". Assuming that there are no defined parameters, this will be "b", so...
  • is parameter b defined?
  • Yes: return the contents of parameter b
  • No: return the literal string "c"
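
The two evaluation orders can be modelled in Python to make the difference concrete. This is a toy model, not how MediaWiki is implemented: a dictionary stands in for the parameters passed to the template, and the function names are hypothetical.

```python
def param(args, name, default):
    """Model {{{name|default}}}: the parameter's value if defined, else default."""
    return args[name] if name in args else default


def nested_default(args):
    """Model {{{a|{{{b|c}}}}}}: b is only a fallback for a."""
    return param(args, "a", param(args, "b", "c"))


def nested_name(args):
    """Model {{{{{{a|b}}}|c}}}: the inner result is used as a parameter name."""
    return param(args, param(args, "a", "b"), "c")


print(nested_default({}), nested_name({}))
print(nested_default({"a": "z"}), nested_name({"a": "z"}))
print(nested_name({"a": "z", "z": "hello"}))
```

With no parameters defined both forms give "c", which is why the examples look alike. They diverge as soon as a is defined: the first returns the value of a, while the second looks up a parameter named after that value.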

I'm also struggling with an expression like this one:

{{#if: {{{A|{{{B|{{{C|}}}}}}}}} | XXX | YYY }}

If I understand the syntax correctly, the output will be "XXX" if any one of the three variables "A", "B" or "C" is defined. Otherwise (if none of these variables are defined) "YYY" will be the output. Is my assumption correct?

Almost.
So #if has three elements: the truth statement, the return value if true, and the return value if false.
If the parameter A is defined, its contents will be returned as the truth statement. If it isn't defined, then parameter B will be evaluated. If it isn't defined, C. If that isn't defined either, then the empty string "" will be returned.
The truth value is then determined. Simply: if the return value is undefined or an empty string, then it evaluates as false, otherwise as true.
So if all of the parameters are undefined, the result of the #if statement will be "YYY". It will also be "YYY" if the contents of the first defined parameter in A, B, or C, is the empty string (that is, if it is called from an invocation like A=.) So you can force it to output a negative result, even if B and C are defined.
If you don't put a default value in an expansion, and the parameter is not defined, then the result will be the string unexpanded.
So if you have
{{{A|{{{B|{{{C|}}}}}}}}}
with no parameters defined, the result will be {{{C}}}, which is, C is not defined, so the result is {{{C}}}, which is a valid string, which is what is passed back as the result. This is not an empty string, so it evaluates as true, and the result of
{{#if: {{{A|{{{B|{{{C}}}}}}}}} | XXX | YYY }}
(that is, with no default for the C parameter) will be "XXX"
Have I confused you utterly? --Catsidhe (verba, facta) 12:09, 13 January 2015 (UTC)[reply]
Hello Catsidhe, thanks for your fast and extensive response! ;))) Thanks!
Just for clarity:
{{{a|{{{b|c}}}}}}
works the same way
{{{{{{a|b}}}|c}}}
does?
I also learned, that a defined (but empty) parameter affects the logic flow. Until now I assumed, that empty values were input errors by user, but now I rescind that statement.
I think I've been way off with my interpretation of
{{#if: {{{A|{{{B|{{{C|}}}}}}}}} | XXX | YYY }}
I lost you at "(that is, with no default for the C parameter) will be "XXX"":
First you said, that if the variables A,B are not defined, the output will depend on the value of C. If C isn't defined either, the empty string will be returned and "YYY" will be the output.
On the other hand, you said that
{{{A|{{{B|{{{C|}}}}}}}}}
will be evaluated to
{{{C}}}
and that is not an empty string, independent of whether C is defined or not. Thus "XXX" will follow.
This is the point where I am confused. Let's assume C has no default value and no value is given for C, where the template is called. Will "XXX" or "YYY" follow?
Citronas (talk) 13:27, 13 January 2015 (UTC)[reply]
Recall that {{{x|y}}} means if x is defined, return "{{{x}}}", otherwise, return "y". So look at the evaluation step-by-step:
  • "{{{A|{{{B|{{{C|}}}}}}}}}" is what we start with. Is parameter A defined?
    • Yes? Then it becomes "{{{A}}}".
    • No? Then it becomes "{{{B|{{{C|}}}}}}". Is parameter B defined?
      • Yes? Then it becomes "{{{B}}}".
      • No? Then it becomes "{{{C|}}}". Is parameter C defined?
        • Yes? Then it becomes "{{{C}}}".
        • No? Then it becomes "" (empty string).
I hope this helps. —CodeCat 13:42, 13 January 2015 (UTC)[reply]
It's more like {{{C}}} evaluates to the contents of the parameter C, if it is defined. If it is not defined, then it evaluates to the string "{{{C}}}". The {{{C|a}}} says that if C is not defined, then return the string "a". {{{C|}}} says that if C is not defined, then return the empty string. I was contrasting the behaviour of {{{C|}}} and {{{C}}}. With the alternate value of the empty string, the whole construction will return the empty string if none of the parameters are defined, and the empty string has the truth value "false". Without that pipe character, if none of the parameters are defined, it will return the string "{{{C}}}", which has the truth value "true".
{{{a|{{{b|c}}}}}} and {{{{{{a|b}}}|c}}} do not work the same. If a is defined, then the first will return the value of a. Else if b is defined it will return the value of b, else it will return the string "c". The second will first evaluate whether a is defined, and return either that value or the string "b", and then use that string (either a or "b") and see whether that is a defined parameter. If it is, it will return the contents of that parameter, otherwise it will return "c". So if the contents of a is "z", it will evaluate {{{a|b}}}, which will return "z", which then becomes {{{z|c}}}. If parameter z is defined, then that value will be returned.
--Catsidhe (verba, facta) 19:49, 13 January 2015 (UTC)[reply]
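
The contrast between {{{C|}}} and {{{C}}} inside #if can be checked with a toy Python model. It is a sketch under assumptions rather than MediaWiki's implementation: a dictionary stands in for the template's parameters, default=None models a parameter written without a pipe, and the function names are hypothetical.

```python
def param(args, name, default=None):
    """Model {{{name|default}}}; default=None models {{{name}}} (no pipe)."""
    if name in args:
        return args[name]
    if default is None:
        return "{{{" + name + "}}}"  # left unexpanded as literal text
    return default


def if_parser(condition, then, otherwise=""):
    """Model {{#if: condition | then | otherwise }}: non-empty text is true."""
    return then if condition.strip() else otherwise


def with_empty_default(args):
    """Model {{#if: {{{A|{{{B|{{{C|}}}}}}}}} | XXX | YYY }}."""
    value = param(args, "A", param(args, "B", param(args, "C", "")))
    return if_parser(value, "XXX", "YYY")


def without_default(args):
    """Model {{#if: {{{A|{{{B|{{{C}}}}}}}}} | XXX | YYY }}."""
    value = param(args, "A", param(args, "B", param(args, "C")))
    return if_parser(value, "XXX", "YYY")


print(with_empty_default({}), without_default({}))
print(with_empty_default({"A": "", "B": "x"}))
```

The last call shows the point about A= made earlier: a defined but empty A wins over a non-empty B, so the result is "YYY".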
I finally got it ;) Thanks Catsidhe and CodeCat!! I wasn't aware of the difference between {{{C|}}} and {{{C}}}. I should have asked here earlier, instead of guessing for 2 weeks straight =) Citronas (talk) 10:10, 14 January 2015 (UTC)[reply]

Template:eo-head seems to be displaying noun and adjective inflections slightly differently: for adjectives, it presents them in the order "plural, accusative singular, accusative plural" (e.g., aĝa), whereas for nouns, it presents them in the order "accusative singular, plural, accusative plural" (e.g., ŝnuro). I think it is desirable to have the inflections presented in the same order for both adjectives and nouns, so I would appreciate it if someone could change Template:eo-head so that it uses the order "plural, accusative singular, accusative plural" for both parts of speech. This is the order that Template:eo-noun and Template:eo-adj both use. Thank you! —Mr. Granger (talkcontribs) 00:02, 16 January 2015 (UTC)[reply]

I think that putting the accusative singular first makes more sense, because the accusative plural is derived from the nominative plural. And what happens for nouns with no plural? With the ordering you propose, you would end up with the accusative singular changing positions because the plural before it disappears. It looks neater if the plural is just taken off the end instead. Furthermore, we already put singular cases before nominative plural in the headword lines of Russian and Slovene, and probably other languages too. —CodeCat 00:07, 16 January 2015 (UTC)[reply]
That's fine with me. I just want all three templates to use the same order for both parts of speech. —Mr. Granger (talkcontribs) 00:30, 16 January 2015 (UTC)[reply]
I think it's fixed now. —CodeCat 00:59, 16 January 2015 (UTC)[reply]
Thanks! —Mr. Granger (talkcontribs) 01:05, 16 January 2015 (UTC)[reply]

parameter id= has stopped working

The parameter id=, used in {{m}} and {{l}} for linking to {{senseid}}-generated targets, has stopped working. See for example in the etymology of भाति (bhāti). Please fix it. --Vahag (talk) 11:25, 18 January 2015 (UTC)[reply]

It works now. id= wasn't broken, it was just being ignored for Appendix pages because we don't need to link to language sections. But we do need to be able to link to ids, so I changed that now. —CodeCat 12:37, 18 January 2015 (UTC)[reply]
I see, thanks. --Vahag (talk) 15:17, 18 January 2015 (UTC)[reply]

The above category is suddenly collecting suffixes, proper nouns, pronouns and noun forms. Any idea what may be the cause? Thanks. --Panda10 (talk) 19:28, 19 January 2015 (UTC)[reply]

They have a noun declension template with n=sg. You need some way to distinguish between declensions of nouns and other parts of speech in the template. DTLHS (talk) 19:33, 19 January 2015 (UTC)[reply]
Thanks! --Panda10 (talk) 20:04, 19 January 2015 (UTC)[reply]
@CodeCat I believe this problem is coming from the new Module:hu-nominals. Is there a way to correct it? --Panda10 (talk) 20:04, 19 January 2015 (UTC)[reply]
I've removed the category for now, but I'm confused what's wrong with showing proper nouns there. They are nouns after all. —CodeCat 20:44, 19 January 2015 (UTC)[reply]
Ok, thanks for the correction. The nominal inflection module is probably not the best way to do this type of categorization since we are using it for nouns, adjectives, numerals, pronouns, and even suffixes. I will add the category when needed using other methods. --Panda10 (talk) 21:01, 19 January 2015 (UTC)[reply]
I used the modules from other languages as a base when making it, so there were some remnants like that. —CodeCat 21:15, 19 January 2015 (UTC)[reply]

Tamil transliteration rules are incomplete[edit]

In அஃகம், you can see that a couple letters (namely ஃக) aren't transliterated. I presume this should be addressed. - -sche (discuss) 20:26, 19 January 2015 (UTC)[reply]

@DerekWinters, Wyang pls help if you can. --Anatoli T. (обсудить/вклад) 01:17, 20 January 2015 (UTC)[reply]
I think it should be "aḥkam". It's visarga + ka (ka). --Anatoli T. (обсудить/вклад) 01:22, 20 January 2015 (UTC)[reply]
My diff didn't work in Module:ta-translit. --Anatoli T. (обсудить/вклад) 01:25, 20 January 2015 (UTC)[reply]
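For illustration only, here is a small Python sketch of the behaviour suggested above. It is not the code of Module:ta-translit, and its tables are an assumption that covers just the letters of this one word. Rendering ஃ as "ḥ" follows the suggestion in this thread. Dependent vowel signs are not handled.

```python
PULLI = "\u0bcd"                              # ் suppresses the inherent vowel
VOWELS = {"\u0b85": "a"}                      # அ
CONSONANTS = {"\u0b95": "k", "\u0bae": "m"}   # க, ம
SIGNS = {"\u0b83": "ḥ"}                       # ஃ, as proposed in the thread


def translit_sketch(text):
    """Transliterate the handful of Tamil letters listed above."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            if i + 1 < len(text) and text[i + 1] == PULLI:
                i += 1            # consonant + pulli: no inherent vowel
            else:
                out.append("a")   # bare consonant keeps inherent "a"
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        elif ch in SIGNS:
            out.append(SIGNS[ch])
        else:
            out.append(ch)        # unknown characters pass through unchanged
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(translit_sketch("\u0b85\u0b83\u0b95\u0bae\u0bcd"))  # அஃகம்
```

If ஃ is missing from the table, it simply passes through untransliterated. That is the behaviour this sketch gives for any unknown character, and it resembles the symptom reported above.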

Redlinks by language[edit]

Is it possible to acquire a list of words wanted in a given language? That is, pages with a redlink encased in a template such as {{m|xyz|word}} leading to them?

As far as I can tell this is not possible within MediaWiki software, but it sounds like information extractable from a database dump perhaps. --Tropylium (talk) 14:38, 20 January 2015 (UTC)[reply]

It would be possible, but difficult. But much easier would be to find those enclosed in {{l}}, {{m}}, {{term}} as all of these have a language parameter, position 1 for {{l}} and {{m}}, lang= for {{term}}. How are your skills with regular expressions? DCDuring TALK 15:50, 20 January 2015 (UTC)[reply]
For l and m this should be OK (in Python):
r"{{(?:l|m)(?:\|.*?=.*?)*(?:\|(LANGCODE))(?:\|.*?=.*?)*(?:\|(WORD))(?:\|.*?=.*?)*}}" #gives two groups: langcode and word. 
As for term, since its pattern can get rather complicated, I would first transform each term into an l or m like this:
from r"{{term(.*?)(\|lang=(LANGCODE))(.*?)}}" into r"{{l|\3\1\4}}"
P.S. I think regex will have a hard time working on that huge file, though.
--Dixtosa (talk) 18:11, 20 January 2015 (UTC)[reply]
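As a rough alternative to one large regular expression, the sketch below first finds each {{l}}, {{m}} or {{term}} call and then splits its parameters. It assumes that {{term}} takes the word as its first positional parameter and the language as lang=, as described above. The function name is hypothetical. Templates or piped wikilinks nested inside a parameter are not handled.

```python
import re

# Match a whole template call that contains no nested braces.
TEMPLATE_RE = re.compile(r"\{\{(l|m|term)\|([^{}]*)\}\}")


def linked_terms(wikitext):
    """Yield (langcode, word) pairs for l/m/term calls in the wikitext."""
    for match in TEMPLATE_RE.finditer(wikitext):
        name, raw = match.group(1), match.group(2)
        positional, named = [], {}
        for part in raw.split("|"):
            key, sep, value = part.partition("=")
            if sep:
                named[key.strip()] = value.strip()
            else:
                positional.append(part.strip())
        if name == "term":
            lang = named.get("lang")
            word = positional[0] if positional else None
        else:
            lang = positional[0] if positional else None
            word = positional[1] if len(positional) > 1 else None
        if lang and word:
            yield lang, word


if __name__ == "__main__":
    sample = "From {{m|la|aqua}}; compare {{l|es|agua|gloss=water}} and {{term|eau|lang=fr}}."
    print(list(linked_terms(sample)))
```

Splitting after matching keeps the pattern simple and avoids the backtracking that repeated lazy groups can cause on a large dump.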
I run a Perl script every month that extracts and counts instances of taxa enclosed in {{taxlink}} (on 11K pages). It runs in less than 30 seconds, but virtually all instances are red links, so it doesn't have to compare the list of all terms enclosed in {{l}} (on 362K pages) and {{m}} (on 42K pages) with a list of all headwords, let alone a list of all entries in a given language. In addition, all terms enclosed in {{taxlink}} are Translingual lemmas.
There are also templates such as {{l/es}} (223K pages) (Compare {{l|es}} (10K pages).) that enclose words from only a single language. IOW, it would be easy to generate, for example, l|es-, m|es-, and l/es- linked words in Spanish. I think they are supposed to all be lemmas. Subtracting members of Category:Spanish lemmas shouldn't be too hard. DCDuring TALK 19:47, 20 January 2015 (UTC)[reply]
I wouldn't assume that all terms linked with {{l}} and {{m}} (and l/XX templates) are lemmas. There are all sorts of times when nonlemma forms might find themselves inside those templates. —Aɴɢʀ (talk) 20:39, 20 January 2015 (UTC)[reply]
If they mostly are, the exercise would still probably be worth it. But it would probably be worthwhile to subtract all entries in a given language, rather than just all lemmas. In any event, the remaining entries would still have to be looked at one at a time for purposes of actually adding new L2 sections or new pages. DCDuring TALK 22:04, 20 January 2015 (UTC)[reply]
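The subtraction step could look roughly like this in Python. The input shapes are assumptions: `links` is any iterable of (langcode, word) pairs, and `existing` maps each language code to the set of words that already have an entry in that language. How those two inputs are built from a dump is left open.

```python
from collections import defaultdict


def wanted_by_language(links, existing):
    """Return {langcode: sorted list of linked words that lack an entry}."""
    wanted = defaultdict(set)
    for lang, word in links:
        # A word counts as wanted when no entry exists for it in that language.
        if word not in existing.get(lang, set()):
            wanted[lang].add(word)
    return {lang: sorted(words) for lang, words in wanted.items()}


if __name__ == "__main__":
    links = [("es", "agua"), ("es", "ñango"), ("la", "aqua"), ("es", "ñango")]
    existing = {"es": {"agua"}}
    print(wanted_by_language(links, existing))
```

Using sets means repeated links to the same word are counted once, and the remaining words can then be reviewed by hand.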
"Take all links with a language code, subtract all existing entries" is the obvious brute force option, sure. I'm wondering more if it is possible to speed things up a bit: first acquire a list of redlinks in the main namespace, then retrieve the referring wikicode(s) for each? --Tropylium (talk) 11:12, 21 January 2015 (UTC)[reply]
We have Category:Terms having red links in their inflection table by language already, which works with various templates. Examples of such templates are Template:es-adj and Template:ast-noun. --Walled brick (talk) 11:25, 21 January 2015 (UTC)[reply]

This module error wasn't here a couple of days ago, but the entry's edit history doesn't show anything since January 11, and when I look at "Templates used in this section:" for the section that has the error, none of the templates listed has any edits in the past week:

  • Template:ja-phrase last edit July 27, 2014 (my time zone)
  • Module:ja last edit December 24, 2014
  • Module:ja-headword last edit December 25, 2014
  • Module:languages last edit September 26, 2014
  • Module:languages/data2 last edit January 13, 2015

Can anyone explain where this module error came from? It looks like it materialized out of thin air. Has there been a system change that might explain this? Chuck Entz (talk) 03:47, 21 January 2015 (UTC)[reply]

Fixed. The function find_kana in Module:ja-headword tries to find from the arguments a pure kana parameter, and it fails to detect one if the fullstop "。" is included. Wyang (talk) 05:59, 21 January 2015 (UTC)[reply]
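To illustrate the kind of check involved, here is a Python sketch. It is not the Lua code in Module:ja-headword. The function names, the punctuation set and the character ranges are my own assumptions. The idea is to ignore punctuation such as "。" before testing whether an argument is pure kana.

```python
# Characters ignored when deciding whether an argument is pure kana (assumed set).
IGNORED = set("。、！？・ \u3000")


def is_kana(ch):
    """True for hiragana and katakana letters and the prolonged sound mark."""
    return "\u3041" <= ch <= "\u3096" or "\u30a1" <= ch <= "\u30fa" or ch == "\u30fc"


def is_pure_kana(text):
    core = [ch for ch in text if ch not in IGNORED]
    # Require at least one kana so that bare punctuation does not qualify.
    return bool(core) and all(is_kana(ch) for ch in core)


def find_kana_sketch(args):
    """Return the first argument that is pure kana, or None."""
    for arg in args:
        if is_pure_kana(arg):
            return arg
    return None


if __name__ == "__main__":
    print(find_kana_sketch(["日本語。", "にほんご。"]))
```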
Thank you! Any idea why it waited a week from the last edit before the error showed up? Chuck Entz (talk) 07:01, 21 January 2015 (UTC)[reply]
No idea, it might be the reason these edits were needed. Wyang (talk) 08:53, 21 January 2015 (UTC)[reply]

Mismatch between L2 and language declared in etymology[edit]

Is this and this, i.e. the use of a language code as the lang= parameter of {{borrowing}} or as the second parameter of {{etyl}} that doesn't correspond to the L2 header, something a bot could check for periodically? It doesn't always need to be cleaned up to the language code that corresponds to the L2; sometimes it needs to be switched to use "-", as here. - -sche (discuss) 17:58, 21 January 2015 (UTC)[reply]
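A periodic check of this kind might be sketched as follows in Python. This is only an assumption about how such a scan could work, not the code behind any existing bot. The name-to-code table is a hypothetical two-entry stand-in for the real language data, and "-" is accepted as a deliberate non-categorizing value.

```python
import re

L2_RE = re.compile(r"^==([^=]+)==[ \t]*$", re.M)
# Second positional parameter of {{etyl}} and lang= of {{borrowing}}.
ETYL_RE = re.compile(r"\{\{etyl\|[^|{}]*\|([^|{}]+)[^{}]*\}\}")
BORROWING_RE = re.compile(r"\{\{borrowing\|[^{}]*?\blang=([^|{}]+)")

NAME_TO_CODE = {"English": "en", "German": "de"}  # hypothetical, incomplete


def etymology_mismatches(page_text):
    """Return (L2 name, code) pairs where the code does not fit the L2 header."""
    results = []
    parts = L2_RE.split(page_text)  # [preamble, name1, body1, name2, body2, ...]
    for name, body in zip(parts[1::2], parts[2::2]):
        expected = NAME_TO_CODE.get(name.strip())
        if expected is None:
            continue  # unknown language name: nothing to compare against
        for regex in (ETYL_RE, BORROWING_RE):
            for match in regex.finditer(body):
                code = match.group(1).strip()
                if code not in (expected, "-"):
                    results.append((name.strip(), code))
    return results


if __name__ == "__main__":
    page = "==English==\n\n===Etymology===\nFrom {{etyl|la|de}} {{m|la|aqua}}.\n"
    print(etymology_mismatches(page))
```

The output is only a list of candidates. Each hit still needs a human decision on whether the code should match the L2 header or be switched to "-".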

@-sche: User:DTLHS/bad etymology. I have excluded Chinese from the list. DTLHS (talk) 01:24, 22 January 2015 (UTC)[reply]
Some of the pages in the list use {{compound}} and related templates with nocat=. Those should really be excluded. —CodeCat 01:38, 22 January 2015 (UTC)[reply]
Excluded. DTLHS (talk) 01:42, 22 January 2015 (UTC)[reply]
Thank you! - -sche (discuss) 02:13, 22 January 2015 (UTC)[reply]
I really don’t like the way {{borrowing}} works. I think it should work the same way as {{etyl}}. — Ungoliant (falai) 02:36, 22 January 2015 (UTC)[reply]
It seems that people expect {{unk.}} to work the same way as {{etyl}}, too. - -sche (discuss) 02:38, 22 January 2015 (UTC)[reply]
I wonder if {{rfe}} should have multiple parameters, for the probable language of the etymology (if you knew it was Latin) as well as the requesting entry language. DTLHS (talk) 02:45, 22 January 2015 (UTC)[reply]

Since I don't expect that page is monitored that well, I'm posting here requesting that someone take a look at my request on Wiktionary talk:AutoWikiBrowser/CheckPage#Technical 13. Thank you. Technical 13 (talk) 20:59, 21 January 2015 (UTC)[reply]

Partial string search[edit]

Is it possible to search for a partial string with regular expressions or something else? Currently, I need to find all Russian words with Cyrillic "-вств-" in them (e.g. чу́вство (čúvstvo)) to fix a pronunciation rule in Module:ru-pron. I mistakenly defined the rule with a silent first "в", as in чу́вство (čúvstvo) and здра́вствуйте (zdrávstvujte), but there are cases where it is pronounced, and I have forgotten which words those are. One example is де́вственница (dévstvennica).

I think advanced search functionality would be useful in various cases, e.g. when looking for words that share a stem or suffix. --Anatoli T. (обсудить/вклад) 22:56, 21 January 2015 (UTC)[reply]

AWB search returned 31 results.
безнравственность, Соединённое Королевство Великобритании и Северной Ирландии, королевство, Соединённое Королевство, чувствовать, почувствовать, чувствоваться, почувствоваться, здравствуйте, здравствуй, чувство, девственник, колдовство, предчувствие, девственница, кумовство, девственность, воровство, здравствовать, девственная плева, да здравствует, сочувствовать, сочувствие, нравственный, чувствительный, лукавство, нравственность, отцовство, рыболовство, самочувствие, чувствительность --Panda10 (talk) 00:45, 22 January 2015 (UTC)[reply]
@Panda10 Thanks a bunch! Is that a complete list? (As it turns out, a silent "v" at the beginning of the cluster is less common than the pronunciation "/fstv/".) --Anatoli T. (обсудить/вклад) 02:10, 22 January 2015 (UTC)[reply]
Yes, and it is very easy if you are on a Unix-like machine (for example Linux). You just download the "List of all page titles" (it is only 55MB) and run the command
$ grep вств enwiktionary-20150102-all-titles
but it doesn't filter by language.
Yes this is a complete list as of 2015-01-02. --Dixtosa (talk) 08:46, 22 January 2015 (UTC)[reply]
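For anyone without grep, roughly the same search can be sketched in Python. The function names are hypothetical. Taking the last tab-separated field is an assumption about the layout of the titles file, and stripping the combining acute accent is an extra step so that stressed spellings such as чу́вство also match.

```python
import unicodedata


def strip_stress(word):
    """Remove the combining acute accent (U+0301) used as a stress mark."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace("\u0301", ""))


def titles_containing(lines, fragment):
    """Yield page titles whose unstressed form contains the fragment."""
    for line in lines:
        # Assumed layout: the title is the last tab-separated field,
        # with underscores standing for spaces.
        title = line.rstrip("\n").split("\t")[-1].replace("_", " ")
        if fragment in strip_stress(title):
            yield title


if __name__ == "__main__":
    sample = ["чувство\n", "0\tСоединённое_Королевство\n", "слово\n"]
    print(list(titles_containing(sample, "вств")))
```

To scan the real dump, pass an open file object instead of the sample list, for example `open("enwiktionary-20150102-all-titles", encoding="utf-8")`. Like the grep command, this does not filter by language.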
Special:Search/insource:/вств/ (warning: slow!) Keφr 09:24, 22 January 2015 (UTC)[reply]
Thank you all! --Anatoli T. (обсудить/вклад) 14:09, 22 January 2015 (UTC)[reply]

Alternative font for alternative forms?[edit]

Why does {{alter}} and/or Module:Alternative forms display forms in a font different from the default font? How do we fix that? —Aɴɢʀ (talk) 12:56, 25 January 2015 (UTC)[reply]

Maybe because local sc = args["sc"] or "polytonic"? Keφr 18:06, 25 January 2015 (UTC)[reply]
So what should it say? —Aɴɢʀ (talk) 18:30, 25 January 2015 (UTC)[reply]
Probably this. What is the actual point of {{alter}} anyway? {{l}} paired with {{qualifier}} works well enough for me. Keφr 18:42, 25 January 2015 (UTC)[reply]
I dunno, I've never used it myself. I just noticed that it looked funny when other people use it. —Aɴɢʀ (talk) 19:10, 25 January 2015 (UTC)[reply]
It still has the problem that it doesn't display the qualifier label in parentheses; see Wiktionary:Grease_pit/2015/January#Module:Alternative_forms. If it can't be fixed soon, I'm tempted to start restoring functional manual formatting in entries that use it. - -sche (discuss) 19:18, 25 January 2015 (UTC)[reply]

Automating removal of Category:German lemmas categories (and presumably other languages)[edit]

I've never really dabbled in the automation side of Wiktionary, so I thought I'd ask here: is it possible, using AWB or a bot, to go through the German parts-of-speech categories and remove categories like Category:German nouns/Category:German adjectives etc. from pages that already have template:head or one of its descendants? The problem is that template:head automatically parses German words with special characters in order to correctly alphabetise them in dictionary order (so it puts gären between garen and garnieren). However, putting a lemma category on the page then overrides this and causes the default sort to take precedence, which puts non-ASCII characters after ASCII (which means gären gets sorted after gustieren, between gähnen and gönnen). Simply removing the category where it's unnecessary would ensure that terms including special characters get correctly sorted. I've corrected a few entries by hand (e.g. [1], [2]) but it's hard to find these improperly categorised pages manually when they don't start with an umlaut.

Presumably other languages have this problem too. Category:Spanish adjectives has ñango (which is only categorised through template:es-adj) next to namibio, but ñoño (which is explicitly categorized) is sorted next to zurdo. I'm using German as an example solely because that's a language with collation rules I know fairly well. Smurrayinchester (talk) 09:39, 26 January 2015 (UTC)[reply]
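The difference between the two orderings can be reproduced with a short Python sketch. The key function below is a hypothetical stand-in, not the sort-key logic that template:head actually uses. It strips diacritics and expands ß for comparison, and falls back to the original spelling to break ties.

```python
import unicodedata


def de_sort_key(word):
    """Dictionary-style key: compare base letters first, then the spelling."""
    lowered = word.lower().replace("ß", "ss")
    decomposed = unicodedata.normalize("NFD", lowered)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base, word)


if __name__ == "__main__":
    words = ["gustieren", "gönnen", "gären", "garnieren", "garen", "gähnen"]
    print(sorted(words))                   # default code-point order
    print(sorted(words, key=de_sort_key))  # dictionary-style order
```

With the key, gären lands between garen and garnieren. Without it, gären falls after gustieren, between gähnen and gönnen, as described above.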

I've been going through the German topic categories and fixing the ones with diacritics to use {{catlangcode|de|Blah}} instead of bare [[Category:de:Blah]] for the same reason: {{catlangcode}} uses smart sorting, and the bare Category: code doesn't. —Aɴɢʀ (talk) 20:19, 26 January 2015 (UTC)[reply]