Wiktionary:Grease pit

Welcome to the Grease pit!

This is an area to complement the Beer parlour and Tea room. Its purpose is specifically for discussing the future development of the English Wiktionary, both as a dictionary and as a website.

The Grease pit is a place to discuss technical issues such as templates, CSS, JavaScript, the MediaWiki software, extensions to it, the toolserver, etc. It is also a place to think in non-technical ways about how to make the best free and open online dictionary of "all words in all languages".

It is said that while the classic beer parlour is a place for people from all walks of life to talk about politics, news, sports, and picking up chicks, the grease pit is a place for mechanics, engineers, and technicians to talk about nuts and bolts, engine overhauls, fancy paint jobs, lumpy cams, and fat exhausts. That may or may not make things clearer... Others have understood this page to explain the "how" of things, while the Beer parlour addresses the "why".

Permanent notice

  • Tips and tricks about customization or personalization of CSS and JS files are listed at WT:CUSTOM.
  • Other tips and tricks are at WT:TAT.
  • Everyone is encouraged to expand both pages, or to come up with more such stuff. Other known pages with "tips-n-tricks" are to be listed here as well.

February 2014

Topic categories link to nonexistent (misnamed) Wikipedia articles

For some reason, some topic categories link to Wikipedia articles which match the category name, e.g. Category:de:Biochemistry says "Wikipedia has an article on: De:Biochemistry" and Category:xpr:Biology says "Wikipedia has an article on: Xpr:Biology". But the English Wikipedia does not have an article titled "De:Biochemistry", and the German Wikipedia does not have an article titled "Biochemistry". Other categories, e.g. Category:de:Anatomy, link in what I assume is the intended way, to "Anatomy". - -sche (discuss) 02:44, 2 February 2014 (UTC)

Piped links to the specific page. DTLHS (talk) 02:54, 2 February 2014 (UTC)
I don't follow. How can the links to Xpr:Biology etc be fixed (to be links to w:Biology etc)? 03:35, 2 February 2014 (UTC)
It seems that the problem here is that the {{topcatdescdefault}} in Template:topic cat description/Biochemistry was supposed to be substed. In my opinion, that's a design flaw and going and substing all of them is not a good solution. Clearly this is an issue with a large number of category pages (see Special:WhatLinksHere/Template:topcatdescdefault). --WikiTiki89 04:02, 2 February 2014 (UTC)
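
A minimal sketch of the mapping that seems to be intended above, written in Python purely for illustration. The real logic lives in wiki templates, and the function name and the "lowercase prefix means language code" rule are assumptions, not anything taken from {{topcatdescdefault}}. It strips the namespace and the language-code prefix so that Category:de:Biochemistry yields "Biochemistry" rather than "De:Biochemistry".

```python
def wikipedia_title_for_topic_category(category):
    """Return the bare topic name for a topic category title.

    Hypothetical helper: assumes titles look like 'Category:<code>:<Topic>'
    or 'Category:<Topic>', and that language codes are lowercase letters
    (optionally with hyphens), while topic names start with a capital.
    """
    name = category
    if name.startswith("Category:"):
        name = name[len("Category:"):]

    prefix, sep, rest = name.partition(":")
    looks_like_code = (
        prefix.replace("-", "").isalpha() and prefix == prefix.lower()
    )
    if sep and rest and looks_like_code:
        return rest   # drop the language-code prefix
    return name       # no prefix found: keep the name unchanged


print(wikipedia_title_for_topic_category("Category:de:Biochemistry"))
print(wikipedia_title_for_topic_category("Category:Anatomy"))
```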

Template:rare form of versus Template:rare spelling of

Difference??? TeleComNasSprVen (talk) 02:49, 2 February 2014 (UTC)

The same as between {{alternative form of}} and {{alternative spelling of}} probably. —CodeCat 02:50, 2 February 2014 (UTC)
They literally perform the same function though. No reason to have two templates when we can simply parameterize one to accommodate the other. TeleComNasSprVen (talk) 02:52, 2 February 2014 (UTC)

Category:English British forms versus Category:British English forms

Same question as above. TeleComNasSprVen (talk) 04:00, 2 February 2014 (UTC)

The first category is terribly named; it should be deleted and its contents moved into the second category. I thought it was the Missgeburt of a newb until I saw with considerable surprise who had created it. - -sche (discuss) 09:49, 2 February 2014 (UTC)
User:CodeCat must have had a mental block when creating Category:English British forms. --WikiTiki89 09:53, 2 February 2014 (UTC)
"English British forms" clearly refers to forms of the British language as spoken in England. —Aɴɢʀ (talk) 11:11, 2 February 2014 (UTC)
I only created the category because it wasn't empty, it was being filled by {{spelling of}} and I didn't really know what else to do with it. I figured that if the category existed, its oddness would be noticed more easily by people who knew what to do. That worked, apparently. :) —CodeCat 14:17, 2 February 2014 (UTC)
Emptied and deleted the former. Keφr 12:36, 2 February 2014 (UTC)
  • It would seem that the relatively few uses of {{spelling of}} need to be reviewed so that dumb category names are not forced on us by template writers. Say, wouldn't that be a good idea for several of the categorizing templates? DCDuring TALK 14:35, 2 February 2014 (UTC)

How would I process special characters in Python?

Discussion moved from Wiktionary:Beer parlour/2014/February.

I can't even print them out. I tried encoding with utf-8, utf-16, iso8859-1, iso8859-7, iso8859. I also tried decoding with them, encoding then decoding, and even decoding then encoding. None of it worked. --kc_kennylau (talk) 16:50, 2 February 2014 (UTC)

This question doesn't belong in the Beer Parlour. It's not about Wiktionary at all. —CodeCat 16:57, 2 February 2014 (UTC)
I am writing a script to feed my bot. How is it not related to BP? --kc_kennylau (talk) 17:00, 2 February 2014 (UTC)
It's not a policy question, it's a technical question, so it should be asked at Grease Pit. In cases where you're not sure where to ask, you can start at the Information desk. Chuck Entz (talk) 17:11, 2 February 2014 (UTC)
I would start with reading the error message. And giving details about the OS, Python version, and other such things. Trial-and-error programming is never the solution. Keφr 18:20, 2 February 2014 (UTC)
.encode('iso8859-7'): UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-2: ordinal not in range(256), Python 2.7.6, Windows Vista --kc_kennylau (talk) 18:33, 2 February 2014 (UTC)
.encode('utf-8') and .encode('utf-16') give me no error, but a cluster of question marks. --kc_kennylau (talk) 18:34, 2 February 2014 (UTC)
Are you running this from inside cmd.exe? Try 0) running chcp 65001 at the command prompt, 1) changing console font. Keφr 18:38, 2 February 2014 (UTC)
chcp 65001:

Traceback (most recent call last):
  File "orphan-el-altern.py", line 2, in <module>
    import catlib
  File "C:\Users\Lau's family\Desktop\compat\compat\catlib.py", line 21, in <module>
    import wikipedia as pywikibot
  File "C:\Users\Lau's family\Desktop\compat\compat\wikipedia.py", line 9723, in <module>
    exec "import %s_interface as uiModule" % config.userinterface
  File "<string>", line 1, in <module>
  File "C:\Users\Lau's family\Desktop\compat\compat\userinterfaces\terminal_interface.py", line 12, in <module>
    from terminal_interface_win32 import Win32UI as UI
  File "C:\Users\Lau's family\Desktop\compat\compat\userinterfaces\terminal_interface_win32.py", line 10, in <module>
    import terminal_interface_base
  File "C:\Users\Lau's family\Desktop\compat\compat\userinterfaces\terminal_interface_base.py", line 13, in <module>
    transliterator = transliteration.transliterator(config.console_encoding)
  File "C:\Users\Lau's family\Desktop\compat\compat\userinterfaces\transliteration.py", line 2019, in __init__
    while value.encode(encoding, 'replace').decode(encoding) == "?" and value in self.trans:
LookupError: unknown encoding: cp65001

--kc_kennylau (talk) 18:51, 2 February 2014 (UTC)
Oh my God, I used .encode('cp950') and everything is so beautiful. --kc_kennylau (talk) 19:05, 2 February 2014 (UTC)
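
For readers who land on this thread with the same problem, here is a small sketch of the two failure modes seen above: an encoding name Python has no codec for (the LookupError), and characters the chosen codec cannot represent (the UnicodeEncodeError, or question marks when replacement is allowed). It is written for Python 3, whereas the thread used Python 2.7.6, and the helper names are made up for illustration. It does not attempt to diagnose the console itself.

```python
import codecs


def first_known_encoding(candidates):
    """Return the first candidate name Python has a codec for, else None."""
    for name in candidates:
        try:
            codecs.lookup(name)   # raises LookupError for unknown names
        except LookupError:
            continue
        return name
    return None


def encode_for_console(text, encoding):
    """Encode strictly; if that fails, fall back to '?' replacement."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        # The codec cannot represent some characters: substitute '?'.
        return text.encode(encoding, errors="replace")


greek = "\u03b1\u03b2\u03b3"   # three Greek letters
chinese = "\u4e2d\u6587"       # two Chinese characters

# ISO 8859-7 covers Greek, so this round-trips cleanly.
print(encode_for_console(greek, "iso8859-7").decode("iso8859-7") == greek)
# Latin-1 has no Chinese characters, so they are replaced.
print(encode_for_console(chinese, "latin-1"))
# cp950 (the codec that worked in the thread) can represent them.
print(encode_for_console(chinese, "cp950").decode("cp950") == chinese)
```

Checking the name with `codecs.lookup` first gives a clear answer about whether a code page such as the one set by `chcp` is usable before any text is encoded.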

Embrace the wiki

So we use templates with a lang= parameter to generate links to language sections, JavaScript to make section red-links orange, special templates to neuter self-links, and a framework of CSS and JavaScript (tabbed languages) to make language sections look like pages.

Wouldn’t it just be simpler to put a term at en/term instead of term#English already? —This unsigned comment was added by Mzajac (talk • contribs).

  • We have this discussion every so often, and I still oppose it. Putting langcodes in titles complicates searching beyond MW capabilities and thus requires disambiguation pages. Plus, it makes it harder for editors who work in different languages on the same page. It's not worth the switch. —Μετάknowledgediscuss/deeds 19:35, 2 February 2014 (UTC)
    • Then maybe that's where our customisation efforts should be. We should adjust our software to fit our needs, instead of trying to patch our way around it to fit a model that fundamentally doesn't work. The "one page per word in all languages" model just doesn't work. It's cumbersome and no other dictionary would do it that way. The only reason we do it that way is that Wiktionary started off long ago as a mostly English-only dictionary, and it only really had Wikipedia to look to for ideas on how to structure a wiki. But now that we have so much more experience with how a wiki-based dictionary works, we are also in a much better position to judge what works and what doesn't. I think it's pretty clear that there is a lot of room for improvement. I think the argument that it's more convenient for people who want to work on different languages on the same page is so minor that it shouldn't even count. We shouldn't be letting convenience for a minority of editors (and editors are themselves a minority compared to users) get in the way of improving the usability at the more fundamental level. Of course the inertia for such a big change is going to be high, but I think if you ignore the need for changing this, then you're really sticking your head in the sand. —CodeCat 19:46, 2 February 2014 (UTC)
      • Is there any technical reason why, if we did this, each page couldn't transclude all of its subpages automatically so it would look the same as it does now? DTLHS (talk) 19:52, 2 February 2014 (UTC)
        • It should be possible, but I don't know how easily. We could also make automatic disambiguation pages, which would be nice for people with slower connections. Maybe it could be an option to either transclude or just list links. —CodeCat 19:55, 2 February 2014 (UTC)
I like the idea of an automatic index to common spellings, to serve as an index for readers or search aid. But I think it’s mainly not that useful to aggregate whole entries because they are spelled the same in different languages. More useful to index common meanings (thesaurus), common etymology or direct descendants of a term, cognates, synonyms, or things like that.
Translingual entries could be at term, at mul/term (= multiple languages), or at und/term (= undetermined language). Michael Z. 2014-02-02 20:14 z
This change actually simplifies the process of searching when a user wants to perform a language-based search, and also simplifies language-based monitoring of changes to entries. I would support it if we were at the beginning of things. The site badly needs changes, at the MediaWiki level. --Z 20:29, 2 February 2014 (UTC)
  • @Metaknowledge: is the search issue still true under the new Cirrus search engine? - Amgine/ t·e 21:35, 2 February 2014 (UTC)
I used to oppose this idea, but now I support it. Actually, I’ve been writing a list of pros and cons for some time:
  1. Watchlist more useful: edits to en/foo (or should it be foo/en?) won’t appear in the watchlist of someone who only cares about fr/foo;
    • The model foo/en is one of the primary multi-lingual templating models. - Amgine/ t·e 21:32, 2 February 2014 (UTC)
  2. What links here more useful: terms linking to language X won’t appear in the WLH page of terms linking to language Y. Consequently, it may become a useful tool for searching for related terms, descendants, derived terms and whatnot;
  3. Wiktionary will run faster: if a person only wants to read information about the Portuguese word a he can load the Portuguese content alone, instead of the Portuguese content with content in 77 other languages he doesn’t care about. (This will require the addition of a box below the search box where the person can add the language name);
  4. More flexible use of redirects: the existence of a word in language X won’t prevent us from making the same word in language Y a redirect;
  5. No need for language code parameters: templates will be able to fetch it from the page title (but see con number 1). Consequently:
    1. Less typing;
    2. Creating identical entries will be easier: if multiple entries are identical, as is common for terms in closely related languages, you only need to write the wikicode once and paste it into the other language pages without the need to manually change the lang codes;
    3. Way less incorrect categorisation: is there anyone here who has never seen a weird word in some Terms derived from X category only to find out {{etyl}}’s second parameter was wrong?
  6. Link blueness more useful: if a link is blue, you won’t have to click it to check if the entry you want does indeed exist. Consequently:
    1. Black links less useless: black links are useless because the same word existing in another language completely defeats their purpose. With this change, their purpose will remain undefeated (though the purpose itself will still be pretty stupid IMO);
    2. Easier automatic creation of entries: an entry existing in another language won’t prevent those green creation links from working;
  7. Language name redirects: people will no longer have to memorise which name of a language we use. If someone types Anglo-Saxon in the language box it will take him to ang/foo. If we were nice, we could even have some JavaScript magic to change the lvl 2 heading to match the person’s preferred name;
  8. More flexible use of section linking: we will be able to link to a POS section without fear of the link being rendered useless by someone who adds another language section containing the same POS;
  9. Patrolling easier: with the language in the pagename, it will be easier to skip unpatrolled edits in languages one doesn’t know the first thing about;
  10. IWs more useful: fr/foo would only display IWs whose pages contain a French entry. The hub pages (foo) can continue following our current practice of only allowing IWs with the same page name.
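
To make pros 5 and 7 concrete, here is a rough Python sketch of how a language-prefixed title might be built from a language box and taken apart again. Everything in it is an assumption for illustration: the code/term order (the thread also mentions term/code), the tiny name table, and the function names. The last call shows the slash problem that the cons below raise.

```python
# Illustrative table only; a real one would cover every language name.
LANGUAGE_NAMES = {"english": "en", "french": "fr", "anglo-saxon": "ang"}


def build_title(language, term):
    """Map a language name or code plus a term to a 'code/term' title.

    An empty language box falls back to the plain hub page.
    """
    key = language.strip().lower()
    if not key:
        return term
    code = LANGUAGE_NAMES.get(key, key)  # unknown names pass through as codes
    return code + "/" + term


def split_title(title, known_codes):
    """Return (code, term); code is None when no known prefix is present."""
    prefix, sep, term = title.partition("/")
    if sep and term and prefix in known_codes:
        return prefix, term
    return None, title


codes = {"en", "fr", "ang"}
print(build_title("Anglo-Saxon", "foo"))
print(split_title("fr/foo", codes))
print(split_title("and/or", codes))   # a term that itself contains a slash
```

Note that the slash case only stays unambiguous here because "and" is not in the illustrative code set; with a full list of language codes, such titles would need an explicit rule.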

Cons:

  1. Conversion will be a nightmare: millions of pages to create and move, thousands of templates to rewrite, new software to create, loads of new practices to decide, a decade of tradition thrown in the dust. How do we undertake the transition without making Wiktionary unavailable during its certainly long duration?
  2. Entries containing / will be invalid: there are very few of these, I think, but we still need to agree on what should become of them;
  3. Adding multiple entries at once will be more difficult: multiple pages will need to be opened. Some of this disadvantage may be offset by pro #5.2;
  4. Translingual issues: by Translingual, I understand terms used across multiple languages (thus its language code mul). So, an entry like cm is relevant to someone interested in French as much as it is to someone who is interested in English. However, if this change is undertaken and a person searches for something like a with French as the language, he might never see the Translingual entry which may contain the sense he is looking for.
Ungoliant (falai) 21:11, 2 February 2014 (UTC) (feel free to comment on individual pros and cons using #: or #*)

I've created Wiktionary:Per-language pages proposal, and added some of the points raised here to it. —CodeCat 21:51, 2 February 2014 (UTC)

I will simply repeat what I said one of the last times this was proposed. At that time, the concern was that a very small number of pages were too large, so some of my comments were specific to that concern:
  • Only a small percentage of pages will ever have more than two language sections [...]. I estimate that a supermajority will only ever have one language section: the inflected forms of Georgian verbs, for example, are unlikely to have homographs; in fact, most inflected forms are unlikely to have homographs: sure bats has some, but fugiebamus? arrodillasen? Likewise words containing clicks (ǃʻûĩ ǂʻàn ǀàũ), hieroglyph transliterations (m3-ḥs3), etc. I vehemently oppose splitting pages by language, but if pages are to be split, I suggest they should be split only after a certain threshold is passed, e.g. only once they contain 2+ language sections or surpass a certain byte size... otherwise, a tiny tail will be wagging an enormous dog. Are there any prohibitively large pages not in Latin script?
  • If pages are split, how will users know to type rottweiler/en and rottweiler/fi to find the definition of "rottweiler" in English and Finnish, respectively? What will plain rottweiler look like? Will rottweiler transclude all the subpages, so that its display is unchanged? (I could live with that.) Or if you want main pages to be stripped-down disambiguation pages saying "for the Finnish definition, click here", what happens to users who don't know what language a word they want to look up is in, or who want to know what water means in all the languages which use it? They have to click through to each subpage, then go back to the main page and click to the next subpage, to slowly get a picture of the definitions in each language?
  • I disagree with [the occasional] assertion that the current format is only convenient for editors and that subpages would be easier for readers; I think the current format is easier for readers.
- -sche (discuss) 22:20, 2 February 2014 (UTC)
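
The threshold idea in the first point above could be checked mechanically. The following Python sketch counts level-2 headings (which is how language sections are marked in entry wikitext) and applies the two example criteria, section count and byte size. The function names and default values are placeholders, not proposed policy.

```python
import re

# A level-2 heading such as "==English==", but not "===Noun===".
LANGUAGE_HEADING = re.compile(
    r"^==[ \t]*([^=\n][^\n]*?)[ \t]*==[ \t]*$", re.MULTILINE
)


def language_sections(wikitext):
    """Return the names of the level-2 (language) sections, in page order."""
    return LANGUAGE_HEADING.findall(wikitext)


def should_split(wikitext, min_sections=2, max_bytes=None):
    """True if the page meets either example threshold from the comment."""
    if len(language_sections(wikitext)) >= min_sections:
        return True
    if max_bytes is not None and len(wikitext.encode("utf-8")) > max_bytes:
        return True
    return False


two_languages = "==English==\n===Noun===\n# x\n\n==Dutch==\n===Noun===\n# y\n"
one_language = "==Georgian==\n===Verb===\n# z\n"

print(language_sections(two_languages))
print(should_split(two_languages), should_split(one_language))
```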
  • @Amgine, Ungoliant MMDCCLXIV: My issues centre around Ungoliant's pros #3 and #7, with his con #2 as a smaller but also problematic point. My concern is that Ungoliant is assuming that there will be a second search bar for language, which presumably could be autofilled by user preference (yes, even by anons). If someone were to have made the JS for that, or to have pledged to make it, I think I could support. But while we are simply relying on MW search like we are now, without any promise of a search bar that would be truly functional for a multilingual dictionary, I cannot support this. However, I believe this is the only deal-breaker for me. Actually, there might be more. The transclusion idea is very important. Μετάknowledgediscuss/deeds 22:34, 2 February 2014 (UTC)
    I agree. The search bar for language is absolutely necessary. Come to think of it, it would be a useful feature whether we change the page structure or not. — Ungoliant (falai) 22:49, 2 February 2014 (UTC)
    I agree a couple of these are concerns, and I do not know if they are surmountable.
    1. The idea that Wiktionary will 'run faster' is not true. Most pages are served from cache.
    2. The concept of using a language name in the search box is cool. I suspect it will affect less than 3%, more likely less than 1%, of all searches once it is widely accepted and used, but this is a large improvement even at that rate of use. As long as you understand it's an insider's trick, aimed at the tiny portion of our readership who visit here regularly, then it's cool.
    3. / is used within Mediawiki software to indicate a subpage, which has substantial effects within the software (see mw:Help:Subpages.) I believe this is turned off by default in the main namespace in WMF projects, but I do not know the status of Wiktionary. I seem to recall there's a work-around for using solidii in titles to avoid subpaging.
    - Amgine/ t·e 17:59, 3 February 2014 (UTC)
    Additional comment: there is a greater concern, imo: how will external apps using the API to serve our content to users (there are about 50 last I checked) find a term in a given language? - Amgine/ t·e 18:02, 3 February 2014 (UTC)
    Long pages really do load slower, but it's even more noticeable with editing. Long pages are very hard to edit, in a few cases even impossible because of time-outs. —CodeCat 18:15, 3 February 2014 (UTC)
    That is, generally, a browser issue, not a Mediawiki issue. Furthermore due to the cache-poisoning model used in MW a long page (for example water) has a greater likelihood of needing on-load reparse/recache. IOW: this proposal could make the long-page load problem worse, not better, due to the number of page transclusions. - Amgine/ t·e 18:20, 3 February 2014 (UTC)
    A huge cached page takes longer to load than a small cached page. — Ungoliant (falai) 18:30, 3 February 2014 (UTC)
    Yes, it takes longer to download, and the browser may take longer to display it, but it takes less server time than using a search engine to find and send a page. Using the example of water/en, the savings would be negligible. The savings for water/de might be noticeable by the average broadband user, and would be very obvious to someone with limited internet. In my opinion it is unlikely someone who would benefit from the filesize savings will also know about a probably obscure method of searching for it. - Amgine/ t·e 18:39, 3 February 2014 (UTC)
    I don’t see how a search bar placed next to the current one will be obscure. — Ungoliant (falai) 18:43, 3 February 2014 (UTC)
    What will be typed in the second search box? The term being sought. And the search will guess which language is being sought, because the user has not declared it. Some correlative studies of en.WT readers suggest the primary audience who are not using an English user interface are students learning English: your search will find the term in their interface language and not the English language which they are actually searching for. Among those who do use an English interface the evidence suggests they are often (but not mostly) English speaking students of a non-English language - their search terms are not English terms.
    For that portion of our readership, and I believe it to be substantial, they would need to search for both language and term - e.g. "Deutsch water" - to find what they want. Most of the English speaking browsers would likewise be completely unaware a term is also used in other languages if they use the second search box. And finally, when there are two optional boxes next to each other on a page, the top item will be used in practice with the unused option relegated to obscurity. - Amgine/ t·e 18:57, 3 February 2014 (UTC)
    @Amgine: Editing and previewing pages causes more server load on large pages as well. --WikiTiki89 19:16, 3 February 2014 (UTC)
    Yes, no disagreement on that. - Amgine/ t·e 19:19, 3 February 2014 (UTC)
    Empty language input should take users to the hub page. — Ungoliant (falai) 19:25, 3 February 2014 (UTC)
    Why shouldn’t they display entries? The page titles should help decode search results. If someone searches for “water”, the results headings should look something like water (English), water (Afrikaans), water (Dutch), &c. I wonder if there’s a simple way to add such boilerplate to the titles displayed in search results, as well as title and h1 elements on the page.
    My hope is that if a person searches for water with no language specified, it would take him to a page that looks exactly the same as the current page. I’m trying to figure out a way of doing this without negating pro number 5. — Ungoliant (falai) 19:34, 3 February 2014 (UTC)
    If we can get the developers to do this, we could have a second type of transclusion which transcludes the page in its own context rather than in the context of the current page. --WikiTiki89 19:40, 3 February 2014 (UTC)
Yes, if (a big if) we could get transclusion to work we could just use the existing tabbed languages infrastructure. Otherwise I couldn't support this without major changes to the mediawiki software. DTLHS (talk) 22:39, 2 February 2014 (UTC)
The fundamental idea of hypertext is that you link to pages without duplicating their full text. Have we twisted the wiki so hard that we are thinking backwards about how a website works?
I don’t really understand the urgent need to associate dictionary entries based on spellings in a writing system. Maybe this belongs at the bottom or sidebar of the page, and not up above the heading. If I’m looking at ale#Polish (“but”), I’d prefer a list of links that includes cognates ale#Czech (“but”) and але#Ukrainian (“but”), and perhaps αλλά#Greek (“but”), but omits ale#English (“malt liquor”). Michael Z. 2014-02-03 17:21 z
You would be able to do both. You can either look at the language-less page and see all the words in all languages with that spelling, which is useful if you don't know what language you are searching for or if you are just browsing. Or you can look at the language-specific page and see only that language section. --WikiTiki89 19:18, 3 February 2014 (UTC)
True enough. And cognates are linked from a complete etymology section. I am thinking that a simple index to subpages on the root page would be more useful and simpler to understand than a full transclusion of them. Michael Z. 2014-02-03 19:32 z
One of the last times this issue arose, I think I was the one who brought it up. Various database-y things are made more difficult by the way each "entry" is currently a bucket for all homographs, as described in that older thread here.
I knocked up a mock up at [[User:Eirikr/Sandbox3/ni]] for one way that this might work: the main page looks effectively the same to the reader as it would now, but each language entry resides at its own sub-page -- so the Kedah Malay entry is at [[User:Eirikr/Sandbox3/ni/meo]], the Zulu entry is at [[User:Eirikr/Sandbox3/ni/zu]], the Japanese entry is at [[User:Eirikr/Sandbox3/ni/ja]]. The main entry (everything-on-one-page entry) at [[User:Eirikr/Sandbox3/ni]] still shows the blue Edit links, and clicking on the Edit link for any language section opens just that language section in the editor (i.e. the edit is of the transcluded lang-specific subpage), in a way that is transparent to the user. The only real difference for editing is that there isn't a way to edit with everything on one page, but there are workarounds for this (opening other language sections for editing in other browser tabs, etc.).
Searching would be the same as now -- typing a specific spelling in a specific script into the search bar would direct the user to the main page for that entry, provided it exists. We could also find a way of implementing language-specific searches, where searches would automatically direct the user to the [[term]]/[[lang]] page instead.
This would apply to *all terms*, regardless of the number of languages for any given spelling (entry). Entries with only one language would not be "split", they'd be moved -- so all Georgian entries would be under [[term]]/[[ka]], for instance. This is much more straightforward and easier to implement than any approach where only some entries have lang code pages and some don't. Combined with the previously described way in which the main page for an entry (without lang code) would be used as the search result landing page, we avoid any need for casual users to even be aware of the lang codes. Moreover, this work of splitting and moving is eminently bot-able.
My 2p, anyway. ‑‑ Eiríkr Útlendi │ Tala við mig 19:45, 3 February 2014 (UTC)
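
As a rough illustration of the layout described above, a bot could generate the wikitext of the everything-on-one-page hub from the list of language subpages. This Python sketch is only an assumption about how that might look: it uses the standard `{{:Page name}}` syntax for transcluding a main-namespace page, it sorts by language code for simplicity (entries are normally ordered by language name), and it is not taken from the mock-up itself.

```python
def hub_page_wikitext(term, lang_codes):
    """Build hub-page wikitext that transcludes each 'term/code' subpage.

    Hypothetical helper: assumes subpages live at '<term>/<code>' in the
    main namespace, hence the leading colon in the transclusion.
    """
    lines = ["{{:%s/%s}}" % (term, code) for code in sorted(lang_codes)]
    return "\n".join(lines)


print(hub_page_wikitext("ni", ["zu", "ja", "meo"]))
```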
Nicely done. Is there an easy way to make the all-entries page just a list of links that the reader could peruse without scrolling? Michael Z. 2014-02-06 19:55 z
There is a way to get all subpages without having to manually maintain the list: mw.text.unstrip(frame:preprocess('{{Special:PrefixIndex/'..pagename..'/}}')) I am not sure how expensive this is, or how stable. DTLHS (talk) 20:36, 6 February 2014 (UTC)
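
For tools working outside the wiki (bots, or the external API clients mentioned earlier in this thread), a comparable subpage listing can be requested from the MediaWiki API with `list=allpages` and a title prefix. The Python sketch below only builds the request URL and sends nothing. The helper name and default values are illustrative, and this is a client-side analogue rather than a replacement for the Lua call above.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wiktionary.org/w/api.php"


def subpage_query_url(term, namespace=0, limit=50):
    """Build an API URL listing pages whose titles start with '<term>/'."""
    params = {
        "action": "query",
        "list": "allpages",
        "apprefix": term + "/",     # title prefix, i.e. the subpages of term
        "apnamespace": namespace,   # 0 is the main (entry) namespace
        "aplimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


print(subpage_query_url("ni"))
```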
  • I imagine (perhaps naively) that it might be relatively simple to create a Lua template that would transclude all subpages (or perhaps all subpages that are lang codes, to open the door to having other kinds of subpages, perhaps for maintenance or other reasons), optionally sorted in certain ways. ‑‑ Eiríkr Útlendi │ Tala við mig 20:55, 6 February 2014 (UTC)
    • Listing or transcluding subpages is easy (if we want to transclude them; I don't) but it's hard to find out what subpages there are. I'm also not sure if it's worth the trouble to have to put the template on every single main entry page. It would be much preferred if we could automate it so that these pages didn't need to be created at all. —CodeCat 21:11, 6 February 2014 (UTC)
  • User:DTLHS: Transcluding special pages disables page caching. No other performance issues are known. This was already discovered once, there are a few interesting discussions over at w:WT:Lua. Keφr 21:18, 6 February 2014 (UTC)
    • Right, no caching is definitely unacceptable I think, and we'll have to ask the developers for a solution if this is the path we want to take. DTLHS (talk) 21:30, 6 February 2014 (UTC)
      • Why would no caching be unacceptable? It's not as if the "main" pages would ever change... —CodeCat 21:31, 6 February 2014 (UTC)
        • Won't the lack of a cache reduce performance significantly? And the main page doesn't ever change, but any subpages would need to be loaded each time. DTLHS (talk) 21:41, 6 February 2014 (UTC)
          • Only if they're transcluded, but there's no reason to do that when we can just list them. —CodeCat 21:53, 6 February 2014 (UTC)
  • Transclusion vs. Listing:
My sense from the above, and from past threads, is that there is a substantial demand to keep something that resembles the everything-on-one-page model, with the option to display on a per-language basis. That was the basic understanding in my head when I created the mock-up page at [[User:Eirikr/Sandbox3/ni]].
Just listing all of the subpages sounds to me like a usability regression -- we're forcing users to click another link before they can get to the information. If everything is at least displayed on one page (as currently, or as at the mock-up), then users don't need to navigate another set of links.
What do others think? Would a list of language links be useful / desirable? ‑‑ Eiríkr Útlendi │ Tala við mig 22:49, 6 February 2014 (UTC)
In past discussions, I have noted that I vehemently oppose splitting content onto subpages, but feel that transcluding all the subpages onto the main page would be a must if the pages were to be split onto subpages; I still feel that way. As Eirikr says, to simply list subpages is an egregious usability regression, particularly because (as I have noted) a supermajority of pages contain (and will only ever contain) one language section. - -sche (discuss) 22:58, 6 February 2014 (UTC)
A lot of the points that are raised here are already raised at Wiktionary:Per-language pages proposal, so they should probably be discussed there, not here. I don't see how it's a usability regression, because you first have to argue that people widely use Wiktionary to look up the same word in many languages at a time. I think you'll find that it's only a small minority of users, and the majority is only interested in one language, so that is what this improvement is targeted for. I think that putting all content on one page like we do now would be the regression, or rather a missed opportunity at progression. —CodeCat 23:03, 6 February 2014 (UTC)
I think the usability regression that -sche is referring to is the fact that we would need to create a hub page and a subpage for every entry, even one that only has one language. --WikiTiki89 23:17, 6 February 2014 (UTC)
That's why I suggested that we should look at ways to do this automatically, without having to create pages at all. Ideally, every possible non-subpage would display an automatically-generated list of all languages that have that word, and we'd only need to create subpages. The list entries would update themselves and wouldn't even need to be created as real pages. —CodeCat 23:20, 6 February 2014 (UTC)
Actually they will need to be. Otherwise how would interwikis work? --WikiTiki89 23:25, 6 February 2014 (UTC)
That's the kind of details that we would work out on that dedicated page. It's meant for us to gather all information, pros and cons of each approach, other consequences, and so on. That way we can make an informed decision once we do. Right now we're trying to decide things before we even have all the facts addressed yet. —CodeCat 23:29, 6 February 2014 (UTC)

Template for eg over usex like label over context

Having discovered and used "label|en", I've abandoned "context|lang=en" in its favor. Liking its brevity and clarity, I then figured that "eg|en" would be a similar if not better improvement over "usex|lang=en".
Although a very inexperienced editor, I thought to put such a template together. Pages like "Help:Template" and "Help:A quick guide to templates" informed me about how they worked but were mysterious about just how to put one up.
If Grease pit people agree with me about this potential template, I would be grateful to have them put it up so that I (and others) can use it. ReidAA (talk) 05:14, 3 February 2014 (UTC)

I agree on the concept, but I think {{eg}} is not the best name. How about {{ux}}? --WikiTiki89 05:18, 3 February 2014 (UTC)
What's wrong with "eg" ? (see here) ReidAA (talk) 21:32, 3 February 2014 (UTC)
One reason is people are more used to {{usex}}, so it would be easier for them to switch to {{ux}}. Also, "e.g." implies that it would be an example of the meaning of the word rather than an example of the usage of the word. For example, "apple, e.g. Granny Smith" makes more sense than "apple, e.g. I ate an apple.". --WikiTiki89 21:38, 3 February 2014 (UTC)
I think {{usex}} as it is is short enough. It's not used that widely to make it prohibitively long. Compare {{etyl}}, which is used more. —CodeCat 21:40, 3 February 2014 (UTC)
It's not so much the length as the convenience of having the language code as the first positional parameter rather than as a named parameter. In order to do that, we need to create a template with a different name. --WikiTiki89 21:51, 3 February 2014 (UTC)
We don't need to necessarily. In the past, we've "migrated" templates like this by first making the language mandatory, then using the presence of lang= to determine whether the first parameter is the language code or not. —CodeCat 21:55, 3 February 2014 (UTC)
But that's a lot of work. Why not just create an alternate template? --WikiTiki89 21:57, 3 February 2014 (UTC)
Because that's even more work. And I like the current name more. —CodeCat 22:00, 3 February 2014 (UTC)
It's less work. It's three short lines of code added to Module:usex and the creation of another template. And that's all the work that needs to be done. --WikiTiki89 22:01, 3 February 2014 (UTC)
But now we have to support both templates, and will probably want to migrate them all eventually. So now, a bot has to go through all uses of {{usex}}, orphan it, then go back and change all uses of {{ux}} to {{usex}} again later. —CodeCat 22:10, 3 February 2014 (UTC)
That could work, though. It's how I changed the behaviour of {{only in}}: I created {{only-in}}, and (with considerable assistance from Mg) switched entries to use both it and the new format. - -sche (discuss) 22:17, 3 February 2014 (UTC)
I don't see what's wrong with having both templates. --WikiTiki89 22:25, 3 February 2014 (UTC)
I agree with CodeCat that {{usex}} is the best title, and we should just change its behaviour. However, I expect that using the presence or absence of lang= to tell whether or not the first parameter is the language code is not viable, because I expect that lang= is often omitted (or would be if we made it so that omitting it didn't cause an error, because the template interpreted it as meaning the language code had been supplied in a different way). I suppose temporarily creating a new template à la {{only-in}} may be the best idea. - -sche (discuss) 22:15, 3 February 2014 (UTC)
Why not name the new temporary template {{usex/t}} then, just to make it clear that it's not a name editors should get accustomed to? —CodeCat 22:21, 3 February 2014 (UTC)
I created it at {{ux}} already. --WikiTiki89 22:25, 3 February 2014 (UTC)
Yes, and I'd like it to be moved to {{usex/t}}. —CodeCat 22:38, 3 February 2014 (UTC)
I guess if you have your mind set on it replacing {{usex}} entirely then go ahead. Personally, I don't see why we can't have both. --WikiTiki89 22:43, 3 February 2014 (UTC)
So what's the state of the game ? Is it alright to start using {{ux}} ? ReidAA (talk) 06:32, 8 February 2014 (UTC)
In the absence of an answer and the continuing presence of {{ux}} I'll assume the go-ahead. ReidAA (talk) 21:23, 12 February 2014 (UTC)
The usual railroading process seems to be at work:
  1. Make a change that affects everyone, but only discuss it here.
  2. Start a conversion process.
  3. Then, take the fiat demi-accompli to BP when as and if objections start to surface.
  4. Commence a template deletion process in [[WT:RFDO]]. DCDuring TALK 22:11, 12 February 2014 (UTC)
To be fair, I never intended {{ux}} to replace {{usex}}, but to be an alternative. --WikiTiki89 22:18, 12 February 2014 (UTC)
Then, let a thousand flowers bloom. Experimentation doesn't need much discussion. But the uniformitarian impulse may kill it prematurely, by going directly to step 4 after a suitable interval. DCDuring TALK 02:05, 13 February 2014 (UTC)
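
The migration CodeCat describes in this thread (treat the first parameter as the language code unless lang= is present) can be illustrated with a minimal sketch. It is written in JavaScript for illustration only; the real logic would live in Module:usex, and the function name and argument shape below are hypothetical.

```javascript
// Hypothetical helper: decide how to read the arguments of a usex-style
// template call. `args` mimics template parameters: numeric keys for
// positional parameters, string keys for named ones.
function resolveUsexArgs(args) {
	if (args.lang !== undefined && args.lang !== '') {
		// Old format: {{usex|text|lang=xx}}
		return { lang: args.lang, text: args[1] };
	}
	// New format: {{usex|xx|text}}
	return { lang: args[1], text: args[2] };
}

console.log(resolveUsexArgs({ 1: 'I ate an apple.', lang: 'en' }));
console.log(resolveUsexArgs({ 1: 'en', 2: 'I ate an apple.' }));
```

As -sche points out, this only works if every old-format call really does carry lang=; an old-format call that omits it would be misread as the new format.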

Uncategorized pages

Special:UncategorizedPages has gotten relatively long again. One reason seems to be that the deletion of the useless "alternative forms" categories has uncovered the fact that some entries weren't in any other categories. In many cases, this is because their headword lines aren't templatized, they're just bolded pagenames. I know we have bots that add {{head|foo|POS}} to newly-created entries that lack it; perhaps those bots could be sicced on the special page. - -sche (discuss) 05:15, 3 February 2014 (UTC)

@-sche: I am now using my bot (Kennybot (talk • contribs)) to clear Special:UncategorizedPages. --kc_kennylau (talk) 09:17, 3 February 2014 (UTC)
Nice work! I spot-checked a dozen of the bot's edits and they all looked good. - -sche (discuss) 19:29, 3 February 2014 (UTC)
@Kc kennylau: Shouldn't changes like this one have actually added "noun form" rather than just "noun"? --WikiTiki89 19:33, 3 February 2014 (UTC)
Wait, are second-person singular simple present tense forms of things nouns (as the POS header claimed even before the bot edited the entry)? Or are they verbs? Turkish has some strange grammar... - -sche (discuss) 19:42, 3 February 2014 (UTC)
I think the "present tense" part is a mistake. It's just the "second-person singular possessive" (as in "thy öğreti"). --WikiTiki89 19:45, 3 February 2014 (UTC)
No, it's just User:Sae1962. We have a proper template for these, {{tr-possessive form of}}. —CodeCat 19:47, 3 February 2014 (UTC)

Strange "green link" entry

I created Angriffen by clicking on a green link in Angriff. It produced a ===Related terms=== heading instead of a ===Noun=== heading, and a headword containing "related terms form" instead of "noun form". What's going on? SemperBlotto (talk) 15:05, 3 February 2014 (UTC) p.s. I haven't corrected it yet.

The acceleration script tries to guess the part of speech based on the header in the entry. Here it apparently thinks that "Related terms" must be the header. It might be because Related terms should come after Declension, but in any case green links have never really been applied to inflection tables, so the script hasn't been properly adapted to them. Currently it just "scrolls back" from where the link was, and uses the first header it finds (regardless of the level) as the PoS. There should be a way to solve it, but I'm not sure how. —CodeCat 15:19, 3 February 2014 (UTC)
OK. Anschlag tried to do the same thing, but was OK when I moved the section. SemperBlotto (talk) 15:24, 3 February 2014 (UTC)
It's a temporary fix at best... if you add a Usage notes section (which does go before Declension according to WT:ELE) the same thing will happen again. I could tell it to ignore that section... —CodeCat 15:26, 3 February 2014 (UTC)
I don't think there's any way to solve it, it just eats whatever header comes before it. See WT:SB. --kc_kennylau (talk) 15:35, 3 February 2014 (UTC)
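
One possible direction, sketched here under assumptions rather than taken from the acceleration script itself: instead of accepting the first header found when scrolling back from the link, keep scrolling until a header that is a known part of speech turns up. The list of headers and the function name are hypothetical, and the list is deliberately incomplete.

```javascript
// Assumed, incomplete list of headers that name a part of speech.
var POS_HEADERS = ['Noun', 'Proper noun', 'Verb', 'Adjective', 'Adverb', 'Pronoun', 'Numeral'];

// `headersBefore` lists the headers that precede the green link, in page
// order and regardless of level. Returns the nearest one that is a part of
// speech, or null when none is found.
function guessPartOfSpeech(headersBefore) {
	for (var i = headersBefore.length - 1; i >= 0; i--) {
		if (POS_HEADERS.indexOf(headersBefore[i]) !== -1) {
			return headersBefore[i];
		}
	}
	return null;
}

console.log(guessPartOfSpeech(['German', 'Noun', 'Related terms', 'Declension']));
```

With an allow-list like this, adding a Usage notes or Related terms section before Declension would no longer change the guess, at the cost of having to keep the list complete.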

Editing rights for a user

DerekWinters (talk • contribs) has been a responsible editor so far. He hasn't been long with Wiktionary but he is requesting editing rights on protected language modules, in particular related to automatic transliteration, such as Module:languages/data2. Can this be granted to him? Is this a BP question? --Anatoli (обсудить/вклад) 00:18, 4 February 2014 (UTC)

Right now that page is protected so that only admins can edit it, which is the way it should be because mistakes there can cause all of Wiktionary to fail. --WikiTiki89 00:28, 4 February 2014 (UTC)
That's understandable. In such cases a user can request administrator rights or rather someone can nominate him. He has been with Wiktionary and all his edits are good, working with a number of Indic languages, so I think I can nominate the user. --Anatoli (обсудить/вклад) 00:38, 4 February 2014 (UTC)

Can someone update WT:EDIT for the "new" translation check format?

Main article: Wiktionary:Beer parlour/2014/January#Proposal to change how translation checks and requests are formatted

For the new translation format with {{t-check}} and {{t-needed}}, some changes need to be made to WT:EDIT. It has to understand both the old and the new format before we can make any other changes, otherwise things will start breaking when it no longer understands our translation tables. I'm not familiar enough with WT:EDIT to trust myself to make the changes properly. Can someone do it? —CodeCat 15:07, 4 February 2014 (UTC)

+1. - -sche (discuss) 21:30, 26 February 2014 (UTC)

"Add translation" script might break when translation table contains {{trreq}}

The entry where I found this is roil. Some experimentation indicates that the error ("Could not find translation entry for 'sv:foo'. Please reformat") appears if the entry is supposed to be inserted right before a trreq entry. For example, attempting to add an se entry ('Northern Sami') to the first set of translations on that page works, but attempts to add jv ('Javanese'), sv ('Swedish') or sw ('Swahili') fail with an error as above.

Adding an entry after a trreq seems to work, however - cf se. Also, in the second table, where a Swedish translation is already present, it is possible to add a translation to Swahili so I doubt it's a problem with the language templates, but with the context into which it's supposed to be added. \Mike (talk) 17:55, 4 February 2014 (UTC)

This is related to the discussion right above this one. —CodeCat 18:15, 4 February 2014 (UTC)
Oh. I did not make the connection - but duly noted. \Mike (talk) 19:47, 4 February 2014 (UTC)
I asked Mike to report this here, after he asked about the issue in IRC. "Is broken" has a slight difference from "will break". - Amgine/ t·e 21:08, 4 February 2014 (UTC)
The reason it's being changed is because it's already broken, though. The "main" link explains more. To fix the small breakage, we need to make a change that will cause a large breakage unless it's accounted for in advance by adjusting the script. —CodeCat 21:12, 4 February 2014 (UTC)
Oddly enough, I don't care. I suspect some change was initiated which has caused the current condition of brokenness. Any change which does this, when it was predictable before the change was implemented, is by definition a bad change or implementation. Just like the use of obscure/non-intuitive abbreviations, specialty jargon in naming, and undocumented code; there are reasons why sane coding conventions are developed and used. - Amgine/ t·e 21:19, 4 February 2014 (UTC)
I just want to be sure that we're on the same page: you are aware that the bug Mike describes has existed for at least three years and wasn't caused by any recent change, right? (It's possible the bug was present even in the earliest versions of WT:EDIT; I don't know.) The threads above are about how to make changes (they have not been made yet) to fix the script. - -sche (discuss) 21:29, 4 February 2014 (UTC)
Nope, not aware of any of that and I want to keep it that way. Bug reports should be welcomed with open arms because they are the best feedback. - Amgine/ t·e 21:37, 4 February 2014 (UTC)
I'm glad Mike reported the bug. Even though we've known about it for years and have been trying to fix it as recently as the thread directly above this one, it's good to know that other people are aware of the bug and want it fixed. I'm less thrilled with the way you revelled in your ignorance of the situation when CodeCat and I tried to explain it to you. - -sche (discuss) 22:01, 4 February 2014 (UTC)
What little I have seen of 'coding' on this project I mostly am either unqualified to help with, or extremely unwilling. My ignorance had nothing to do with my answers, and I would say much the same now having been 'enlightened'.
Bug-meister style: Thanks! We are aware of this issue. The section above is related to this issue as well.
Current style: This is related to the discussion right above this one.
- Amgine/ t·e 22:07, 4 February 2014 (UTC)

So it was a known, old, bug after all, and not a transitional glitch while the new "translation check format" is being rolled out? (Or whatever that section above is talking about). Then, do we, somewhere, have a source where casual editors can see if the strange behavior observed indeed is known, or if it should be reported? I am of course not talking about a full-blown Bugzilla behemoth, but a simple list of known issues, which would have saved me from some head-scratching and trouble-shooting yesterday. \Mike (talk) 11:50, 5 February 2014 (UTC)

Just a cursory look, but it looks like there are 3 trreq-related issues reported on the talk page User talk:Conrad.Irwin/editor.js. The coding section seems to have an overview of known issues, but I suspect you would need to work through the rest of the page to look for similar issue reports. - Amgine/ t·e 17:03, 5 February 2014 (UTC)

Are the redundant second copies of the simplified or traditional form of Chinese "inflection lines" a bug or a feature?

Recently I started noticing lots of Chinese "inflection lines" like this:

密码 (simplified, Pinyin mìmǎ, traditional 密碼, simplified 密码)

They start with one form, in this case simplified, indicate the pinyin, give the equivalent traditional form ...

But then give a second, redundant copy of the simplified form.

The equivalent contrary situation occurs for traditional form entries.

I'm assuming this is a low priority bug that's come about while Lua-ifying lots of templates combined with automatically constructed "inflection lines" which pass both the "sim" and "tra" parameter.

Is this done consciously as a feature? I can't imagine why. So perhaps we can look at modifying the templates / Lua modules involved to not display the "tra" parameter even if it's specified, when the form is already traditional, which I believe is specified with the "t" parameter.

And conversely to not display the "sim" parameter on simplified entries, those which have the "s" parameter.

Simplified only needs to tell us the matching traditional and traditional only needs to tell us the matching simplified. I don't believe entries which have identical traditional and simplified forms have any similar kind of redundancy. — hippietrail (talk) 11:19, 6 February 2014 (UTC)

Yes, it's a bit of a problem, which is new. The obvious part (the page itself) should not be duplicated as in 密碼. --Anatoli (обсудить/вклад) 11:30, 6 February 2014 (UTC)
Why is the sim= parameter given if the form is already simplified? That makes no sense. —CodeCat 11:50, 6 February 2014 (UTC)
I agree that it looks strange at first but the simplified entry (s) didn't display simplified, the traditional entry (t) didn't display traditional and shared (ts) didn't display anything. It's just easier to maintain entries when both trad. and simp. have the same part, e.g. |tra=愛好|sim=爱好| --Anatoli (обсудить/вклад) 12:24, 6 February 2014 (UTC)
Then the module should compare both the tra= and sim= parameters to the pagename to figure out which it needs to display. --WikiTiki89 14:14, 6 February 2014 (UTC)
Exactly. Now that we have LUA this kind of thing should be really easy to do. Though I admit I haven't got around to learning WikiLua myself yet. — hippietrail (talk) 14:32, 6 February 2014 (UTC)
Well actually, it would be pretty easy to do even with pure templates. --WikiTiki89 14:37, 6 February 2014 (UTC)
diff seems to have fixed it. —CodeCat 14:43, 6 February 2014 (UTC)
Yes I had a quick look and saw that this/these templates do not yet use Lua. I'll take a look to see the new version. Thanks to all who listened and/or contributed! — hippietrail (talk) 17:46, 6 February 2014 (UTC)
Yes, thank you very much! --Anatoli (обсудить/вклад) 20:24, 6 February 2014 (UTC)
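
The comparison WikiTiki89 suggests (check tra= and sim= against the pagename and show only what differs) can be sketched as follows. This is an illustration in JavaScript, not the code in the diff mentioned above; the function name is hypothetical, and the parameters are modelled as arrays on the assumption that an entry may list more than one form.

```javascript
// Return the traditional and simplified forms worth displaying on the page
// `pagename`: every supplied form except the page's own title.
function otherForms(pagename, tra, sim) {
	function without(forms) {
		return forms.filter(function (form) {
			return form !== pagename;
		});
	}
	return { traditional: without(tra), simplified: without(sim) };
}

// On the simplified entry, only the traditional form remains.
console.log(otherForms('密码', ['密碼'], ['密码']));
// On the traditional entry, only the simplified form remains.
console.log(otherForms('密碼', ['密碼'], ['密码']));
```

With this approach both entries can carry identical tra= and sim= values, which keeps the maintenance benefit Anatoli mentions while dropping the redundant copy.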

It is not redundant within the template. There may be multiple simplified or traditional forms (eg. 併存), in which case PAGENAME =/= 'simplified' or 'traditional' in the template.

Anyway, those templates are redundant externally, considering essentially all information contained within the template is duplicated elsewhere. There is no point in indicating simp/trad and the other forms if Template:zh-hanzi-box is compulsorily present on all pages. And Pinyin should go to Pronunciation (where it is also duplicated by the IPA pronunciation template) not definition, and be made visible by Template:Pinyin-IPA, since it is more pronunciation of the character in one variety than the inherent transliteration of a character into a different script. Thus those headword templates, which are useful for inflecting European languages, basically contain no critical information at all for Chinese or Vietnamese.

What is worse is the compulsory PoS split for unsuitable languages, which dictates that these redundant information be repeated. What results as a consequence is an entry in which information is unnecessarily duplicated multiple times. eg. 明白. Wyang (talk) 22:41, 6 February 2014 (UTC)

re "there may be multiple simplified or traditional forms (eg. 併存)": good point. AFAICT, the templates as newly modified still allow for that. {{cmn-verb}}, for example, no longer displays "traditional" forms redundantly on the pages of the traditional spellings of verbs that have only one traditional form, but it still displays the multiple forms of 併存.
re "those templates are redundant externally, considering essentially all information contained within the template is duplicated elsewhere": I agree. Someone should make a cmn-head template or module and have all the POS templates use it rather than repeating as much code as they do now.
re "Pinyin should go to Pronunciation": I disagree; it is a decent romanization / way of identifying characters in Latin script (and romanizations go on the headword line); it is an unintuitive guide to pronunciation (e.g. qǐlái being /tɕʰi˨˩lai˧˥/). - -sche (discuss) 23:10, 6 February 2014 (UTC)
Silly question (pinging @user:CodeCat, @user:Wyang): how do I add a call to {{Zhuyin}} passing the pin parameter, just after pin? (The template delinks pinyin). It should be automatic, so that entries won't need to be modified.
Current: 密碼 (traditional, Pinyin mìmǎ, simplified 密码)
Desired: 密碼 (traditional, Pinyin mìmǎ, Zhuyin ㄇㄧˋ ㄇㄚˇ, simplified 密码)
Module:PinyinBopo-convert seems to be fully functional. --Anatoli (обсудить/вклад) 01:19, 7 February 2014 (UTC)

Automatic sorting

Thanks to Lua, we've come a long way in automatically sorting entries in categories. {{head}} now knows, for example, that the German umlauts ä, ö, ü are to be sorted as a, o, u, and that ß is to be sorted as ss. {{de-noun}}, on the other hand, does not know that. (Maybe the other German headword-line templates don't know it either, I haven't checked them.) Could someone knowledgable please fix that? Thanks. —Aɴɢʀ (talk) 19:30, 6 February 2014 (UTC)

I've made the change for {{de-noun}}, diff. The others can be modified in the same way. —CodeCat 19:47, 6 February 2014 (UTC)
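
For readers unfamiliar with sort keys, the German rule described above (ä, ö, ü sort as a, o, u, and ß as ss) amounts to a small replacement table. The sketch below is in JavaScript for illustration; the actual logic is in the Lua modules behind {{head}}, and the function name and the handling of capital umlauts are assumptions.

```javascript
// Replacement table from the description above; the capital letters are an
// added assumption so that titles such as "Äpfel" are covered too.
var GERMAN_SORT_REPLACEMENTS = {
	'ä': 'a', 'ö': 'o', 'ü': 'u', 'ß': 'ss',
	'Ä': 'A', 'Ö': 'O', 'Ü': 'U'
};

// Build a category sort key from a page name.
function germanSortKey(pagename) {
	return pagename.replace(/[äöüßÄÖÜ]/g, function (ch) {
		return GERMAN_SORT_REPLACEMENTS[ch];
	});
}

console.log(germanSortKey('Straße')); // Strasse
console.log(germanSortKey('Äpfel'));  // Apfel
```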

Problem with {{ro-adj-form of}}

Apparently recent edits to {{ro-adj-form of}} have made it so that the first parameter is interpreted both as the lemma to be linked to and the gender/number parameter, the second as both the head parameter and the grammatical case, and I'm not sure what it thinks the third parameter is. In short, it looks like all the Romanian adjective form entries are broken. Can someone fix this? Chuck Entz (talk) 07:58, 7 February 2014 (UTC)

It has been fixed. It was a bit of an oversight, sorry. —CodeCat 13:57, 7 February 2014 (UTC)

getVanillaIndexOf

If someone could review (and, if appropriate, implement) User talk:Conrad.Irwin/editor.js#Fixing_getVanillaIndexOf.28.29, it'd be swell. (Posting here just to be sure the post there gets spotted.) - -sche (discuss) 23:00, 8 February 2014 (UTC)

Strange text-centering behaviour in collapsible boxes

I can't figure out why this edit diff causes the text in the tables to be centered. It seems totally counterintuitive... I remove things that say "center", and yet..? Is there a way around this? If I apply text-align: left to the main table, then all of the table header cells are also left-aligned, and I don't want that. Only the regular cells should be left-aligned. —CodeCat 02:09, 9 February 2014 (UTC)

The text is now centred because MediaWiki:Common.css has div.NavFrame {text-align: center}, which is inherited by the table cells.
I am guessing that it wasn’t centred before, because the centring property of the <center> element has some inheritance weirdness. All I could find is MDN saying “This is used to implement the legacy align attributes on some table-related element. Do not use these on production Web sites.”[1] Michael Z. 2014-02-09 06:10 z
The sentence you quote from MDN is referring to e.g. text-align: -moz-center, used for implementing e.g. align="center". (Not sure if you already realize that; the context you give it makes it sound like it's talking about <center>.) —RuakhTALK 08:06, 9 February 2014 (UTC)
Both <center> and <div align=center> are given the property text-align: -webkit-center; in Safari (these were both deprecated in HTML 4). So I suppose these obsolete elements may have similar, unpredictable effects on layout.
But life is short and I don’t want to spend it analyzing how obsolete HTML works. Let’s get rid of it and troubleshoot any real problems. Michael Z. 2014-02-09 19:05 z
Ok, so... how do we get the table to display right without resorting to outdated code? —CodeCat 19:49, 9 February 2014 (UTC)

Bot question

I haven't botted for a long time, and after a quick successful run it won't let me run it, instead spitting out the error message WARNING: Token not found on wiktionary:en. You will not be able to edit any page. A search around the blagotubes reveals but little. I'm on a Mac using SVN, and I just updated Pywikipediabot. —Μετάknowledgediscuss/deeds 22:12, 9 February 2014 (UTC)

Try updating manually or using something other than SVN? I have heard that causes problems. And make sure your config files are correct... DTLHS (talk) 01:13, 10 February 2014 (UTC)
I installed Git, but I'm not finding the documentation for what to do next. Feel like such a noob again. —Μετάknowledgediscuss/deeds 01:25, 10 February 2014 (UTC)
Could it be caused by the switch to HTTPS? Make sure that your family.py includes the bit added in this patch: https://github.com/wikimedia/pywikibot-compat/commit/6addd6f70a386fd131acbe5c2d0b47f21a0cd68f. (If it doesn't, you can just add it manually.) —RuakhTALK 01:57, 10 February 2014 (UTC)

Slower blue links?

I've noticed lately that it takes a little longer for black links to turn blue when I enter form-of pages for Latvian words from the declension table in the lemma. For instance, a word like rudmatains "redhaired" has a full declension. I usually add form-of pages by clicking on the black links to start a new page, write in the form-of information, save it, and then go back to the lemma page, where the respective form in the declension table would now be a blue link. Now, when I add a form-of page and return to the main lemma, the link in the declension table is still black -- if I click on it, it does take me to the form-of page, so I know everything is OK, but it just doesn't immediately turn blue. It eventually does turn blue, in about two or three minutes, so no biggie there; but I just wondered if something had changed at Wiktionary while I was away, something that would slow down the previously near instantaneous conversion of black (or red) links to blue links after you save the corresponding new page. --Pereru (talk) 01:11, 10 February 2014 (UTC)

It's been happening to me too lately; purging the page fixes it, but it's a PITA to have to keep purging pages all the time. —Aɴɢʀ (talk) 08:56, 11 February 2014 (UTC)

template:wikipedia doubly links

[Image: Double wikipedia.png]

Template:wikipedia currently has, after its box, a separate link, hidden by CSS (search within that page for interProject), and copied by JS (ditto) to a link in the left margin. This is (a great idea but) poor design: using CSS to hide stuff that's really there: then browsers that don't bother with that CSS will show the extra link, as in the picture to the right.

I propose:

  • that the JS be modified to read even links that have interProject among their classes (not only those that have it as their only class);
  • that the CSS hiding such links be removed;
  • that the extra link be removed from after the box generated by template:wikipedia (and likewise for any similar link generated by another template); and
  • that the remaining link to Wikipedia, which appears in the box generated by template:wikipedia (and any other template that currently uses the class interProject on a separate link), have interProject added as a class.

​—msh210 (talk) 06:20, 11 February 2014 (UTC)

Specifically, change the JS as follows: remove
	var spans = document.getElementsByTagName('span');
 
	// filter for projectlinks
	for (var i=0, j=0; i<spans.length; i++) {
		if (spans[i].className == 'interProject') {
			elements[j] = spans[i].getElementsByTagName('a')[0];
			j++;
		}
	}
and replace it with
	var spans = document.getElementsByClassName('interProject');
 
	// filter for projectlinks
	for (var i=0, j=0; i<spans.length; i++) {
		elements[j] = spans[i].getElementsByTagName('a')[0];
		j++;
	}
in [[mediawiki:Common.js]]. Remove the entire
/* InterProject */
 
.interProject {
	display: none;
	clear: both;
	border-top: 2px dotted #AAAAAA;
	margin-top: 2em;
}
from [[mediawiki:Common.css]]. And make the changes to the templates described above.​—msh210 (talk) 05:30, 12 February 2014 (UTC)
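
If getElementsByClassName cannot be relied on (older Internet Explorer versions such as IE8 lack it), a variant of the change would keep the original loop over span elements and only relax its test. The helper below is a sketch with a hypothetical name, not code from Common.js; it checks whether a class is among an element's classes rather than being its only class.

```javascript
// True when `wanted` is one of the whitespace-separated class names in
// `className` (the value of an element's className property).
function hasClass(className, wanted) {
	var classes = className.split(/\s+/);
	for (var i = 0; i < classes.length; i++) {
		if (classes[i] === wanted) {
			return true;
		}
	}
	return false;
}

console.log(hasClass('plainlinks interProject', 'interProject')); // true
console.log(hasClass('interProjectBox', 'interProject'));         // false
```

In the original loop, the condition spans[i].className == 'interProject' would then become hasClass(spans[i].className, 'interProject').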

Bot request (Hungarian plurals)

I am requesting a bot run through all Hungarian plural nouns to make the following change:

From:

# {{plural of|xx|lang=hu}}

To:

# {{hu-inflection of|xx|nom|p}}

The plural entries that are in Category:Hungarian noun forms - nominative already have the new structure since they were created after User:CodeCat added the accelerated noun form creation to the Hungarian declension table.

Thanks in advance! --Panda10 (talk) 14:39, 11 February 2014 (UTC)
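
The requested substitution is simple enough to express as a single regular-expression replacement. The sketch below is in JavaScript and only illustrates the text transformation; a real bot run would be done with a bot framework, and the function name is hypothetical. It assumes the template call has exactly the shape shown in the request (one positional parameter followed by lang=hu).

```javascript
// Rewrite {{plural of|xx|lang=hu}} as {{hu-inflection of|xx|nom|p}}.
// Calls with extra parameters or a different parameter order are left alone.
function convertHungarianPlural(wikitext) {
	return wikitext.replace(
		/\{\{plural of\|([^|{}]+)\|lang=hu\}\}/g,
		'{{hu-inflection of|$1|nom|p}}'
	);
}

console.log(convertHungarianPlural('# {{plural of|ház|lang=hu}}'));
// # {{hu-inflection of|ház|nom|p}}
```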

To be clear- are you referring to the entries in Category:Hungarian noun forms only? DTLHS (talk) 18:44, 11 February 2014 (UTC)
Yes. --Panda10 (talk) 18:57, 11 February 2014 (UTC)

I would also like to get rid of two templates: {{hu-noun-form}} and {{hu-noun form}} but they are still being used in many of the plural forms. These two templates should be replaced with {{head|hu|noun form}}. Can this be done in the same bot run or should it be a different request? --Panda10 (talk) 14:55, 12 February 2014 (UTC)

Update: This last part was completed manually. But the original request still stands. There are about 6500 entries with {{plural of|xx|lang=hu}} in Category:Hungarian noun forms. --Panda10 (talk) 16:34, 4 March 2014 (UTC)
@Panda10: I started doing it. --kc_kennylau (talk) 10:19, 5 March 2014 (UTC)
@Kc kennylau: Looking great. Thank you for your help. --Panda10 (talk) 19:10, 5 March 2014 (UTC)
@Panda10: Should be done. Running the script once more to check it. --kc_kennylau (talk) 11:28, 6 March 2014 (UTC)
@Kc kennylau: Thank you! :) --Panda10 (talk) 13:42, 6 March 2014 (UTC)
@Panda10: Checked, and you're welcome. Cheers! --kc_kennylau (talk) 15:55, 6 March 2014 (UTC)

Regex 'AND' operator

I am using AWB on a database dump on my PC. I'd like to create a list of entries that contain both 'Word1' and 'Word2' in the same line, not immediately after each other. What is the correct regular expression to do this? Thanks. --Panda10 (talk) 19:07, 11 February 2014 (UTC)

  • \bword1\b.*\bword2\b|\bword2\b.*\bword1\b. Or something similar. You get the idea. Keφr 19:15, 11 February 2014 (UTC)
Thanks! --Panda10 (talk) 19:23, 11 February 2014 (UTC)
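
To see the suggested pattern at work, here is a small JavaScript sketch that filters lines with it. It also shows a lookahead-based alternative that is not from the thread; AWB uses .NET regular expressions, which should accept the same lookahead syntax, but that has not been checked here. The helper name and sample text are made up for illustration.

```javascript
// Pattern from the reply above: either order, anything in between.
var eitherOrder = /\bword1\b.*\bword2\b|\bword2\b.*\bword1\b/;
// Alternative (an assumption, not from the thread): one lookahead per word.
var lookaheads = /^(?=.*\bword1\b)(?=.*\bword2\b)/;

// Return the lines of `text` that match `pattern`.
function matchingLines(text, pattern) {
	return text.split('\n').filter(function (line) {
		return pattern.test(line);
	});
}

var sample = [
	'word1 and then word2',
	'word2 before word1',
	'only word1 here',
	'word12 word2'
].join('\n');

console.log(matchingLines(sample, eitherOrder));
console.log(matchingLines(sample, lookaheads));
```

Both patterns select the first two sample lines. The lookahead form scales better when more than two words must all appear, since the either-order form needs one alternative per ordering.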

Bot task: add missing entries to 'Category:Terms spelled with...'

I started using AWB to (a) find all entries with 0 in their titles and ==English== in their contents which did not have [[Category:English terms spelled with 0]] in their contents, and (b) add [[Category:English terms spelled with 0]] to them. However, I realized that it made more sense to let a bot do that, and likewise catch entries missing from all the other subcategories of Category:English terms by their individual characters. - -sche (discuss) 06:00, 12 February 2014 (UTC)

Actually, it might even make more sense to have the headword template automatically add the category. --WikiTiki89 17:31, 12 February 2014 (UTC)
That would be ideal, assuming it wouldn't slow pages down. The way things are currently set up, {{head}} would have to check the language, then check all the characters in the pagename against a language-specific list (either of approved categories, or of unapproved categories; more on that in a moment), since different languages have categories for different characters. For example, English doesn't have categories for its typical letters, A through Z, but does have categories for Ä and Γ, since those letters are exceptional in English. In German, there are not categories for A or Ä, since both are typical letters, and in Greek there would not be a category for Γ (since that letter is typical) if someone ever got around to creating categories for Greek. If that set-up is retained, it would probably make the most sense to associate a list of "basic letters" with every language (in Module:languages?), and then have {{head}} add categories for the open-ended set of "all characters not in the list of typical letters". (Alternatively, we could revise our decision to delete Category:English terms spelled with ' and to not have categories for Category:English terms spelled with A, etc, and allow categories for every character for every language, but the categories for 'typical' letters would be very large.) As I processed the first batch of English-0 entries, I found several that contained no headword template, but a one-off bot or AWB search could catch those. - -sche (discuss) 18:24, 12 February 2014 (UTC)
I would be ok with doing this, but I'd suggest doing it first with a more regularly spelled language that has less entries, as a trial. —CodeCat 18:36, 12 February 2014 (UTC)
I've been wanting to do this for the Russian obsolete letters ѣ, ѳ, і, and ѵ, which already have categories, but most words that have these letters do not have the category listed on the page. --WikiTiki89 19:11, 12 February 2014 (UTC)
It will be important to be thorough with the list, though. We have to consider every single character, even punctuation and spacing. —CodeCat 19:45, 12 February 2014 (UTC)
Actually, I think that (in the case of Russian at least) each categorizing character should be specified explicitly, rather than specifying each non-categorizing character. --WikiTiki89 19:54, 12 February 2014 (UTC)
Hm... why? The class "characters other than Cyrillic letters which could potentially be used in Russian words" is open-ended, and "wanted categories" could help us find uses of such characters. Having all characters except the small, closed set of basic letters/spaces/punctuation categorize is more maintainable, I think. The one exception I see to that is not Russian but Chinese-character-using lects. For them, flipping things (and specifying which characters do categorize) does seem obligatory. - -sche (discuss) 09:15, 16 February 2014 (UTC)
Not every rare letter needs to have a category for it. For example, if lytdybr passes RFV, it doesn't need to be categorized for each of its letters, but perhaps is better off in a category such as Category:Russian terms written in the Latin script. Categories such as Category:Russian terms spelt with Ѣ are useful because spelling words with "ѣ" vs "е" was an important nuance of the old orthography. Their usefulness can be compared to that of the hypothetical Category:English terms spelled with "ei". Categories for random letters that happen to appear in a word or two are not nearly as useful. --WikiTiki89 22:27, 16 February 2014 (UTC)
Started doing it with Kennybot (talk • contribs). --kc_kennylau (talk) 10:34, 16 February 2014 (UTC)
I noticed; thank you! :) - -sche (discuss) 07:16, 17 February 2014 (UTC)
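The "everything outside the basic-letter list categorizes" approach discussed above can be sketched roughly as follows. This is only an illustration, not what {{head}} or Kennybot actually does: the `BASIC_CHARACTERS` and `LANGUAGE_NAMES` tables and the function name are hypothetical, and a real list would have to cover every typical letter, space and punctuation mark for each language.

```javascript
// Hypothetical per-language lists of "basic" characters (assumption: not real
// Wiktionary data). Anything outside the list triggers a category.
const BASIC_CHARACTERS = {
  en: "abcdefghijklmnopqrstuvwxyz '-",
  de: "abcdefghijklmnopqrstuvwxyz\u00E4\u00F6\u00FC\u00DF '-", // plus ä ö ü ß
};

// Hypothetical code-to-name table, used only to build category titles.
const LANGUAGE_NAMES = { en: "English", de: "German" };

// Return one "terms spelled with X" category per distinct character of the
// page name that is not in the language's basic list.
function spelledWithCategories(langCode, pageName) {
  const basic = new Set(BASIC_CHARACTERS[langCode]);
  const categories = new Set();
  for (const ch of pageName) { // iterates by code point
    if (basic.has(ch.toLowerCase())) continue;
    // Note: a few characters (e.g. ß) have multi-character uppercase forms
    // and would need special handling in a real implementation.
    categories.add(
      "Category:" + LANGUAGE_NAMES[langCode] + " terms spelled with " + ch.toUpperCase()
    );
  }
  return [...categories];
}
```

With these made-up lists, "doppelgänger" would be categorized for Ä as an English entry but get no category as a German entry, which matches the English/German contrast described in the first comment of this thread. WikiTiki89's alternative (listing the categorizing characters explicitly) would simply invert the membership test.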

Pinging vandals[edit]

Am I right in guessing that the {{vandal}} template has the unintended side-effect of pinging the account being reported? If so, is there any way to make it not do that? Chuck Entz (talk) 07:45, 14 February 2014 (UTC)

According to the documentation, it pings the vandal only if whoever used it signed his addition with tildes and the link to the vandal's userpage is constructed [[user:foo|thusly]] and not [//en.wiktionary.org/user:foo thusly]. If we want to make sure the template doesn't ping the vandal, we can change the link to the latter. If we don't want that, but some user doesn't want to ping the vandal, he can avoid using tildes to sign his name.​—msh210 (talk) 22:14, 14 February 2014 (UTC)
Actually, I see Kephir has made that change.​—msh210 (talk) 22:16, 14 February 2014 (UTC)
I'm glad he did. This diff shows why it was needed. I just happened to see the vandalism report about the same time as the vandal did, so it ended up being the vandal's last edit. Chuck Entz (talk) 22:44, 14 February 2014 (UTC)

Where to go for data dumps?[edit]

I'm curious where I should go to get an XML dump of Wiktionary, and more specifically, the ZH Wiktionary. The ZH one has Middle Chinese readings for many more entries than the EN WT, an area I'm currently interested in. My Chinese is just good enough for me to puzzle out the reading info on a given entry, but not good enough to be able to read their GP or other fora, or to ask there for dump data. Could anyone here point me in the right direction? ‑‑ Eiríkr Útlendi │ Tala við mig 08:31, 14 February 2014 (UTC)

Here: [2]. Dakdada (talk) 09:32, 14 February 2014 (UTC)

Template "subpages"[edit]

Any ideas why this doesn't work in User:Csörföly D? SemperBlotto (talk) 07:55, 16 February 2014 (UTC)

This solved the problem. --kc_kennylau (talk) 08:43, 16 February 2014 (UTC)

Proposed change to Welcome template[edit]

It says: "If you already have some experience with editing our sister project Wikipedia, then you may find our guide to Wikipedia users useful." But it's a guide for Wikipedia users, not a guide to them; i.e. we aren't listing specific users and their characteristics (although that would be amusingly controversial). Equinox 23:08, 18 February 2014 (UTC)

{{sofixit}}. —Aɴɢʀ (talk) 14:34, 19 February 2014 (UTC)
Don't know how to edit a template. Equinox 20:17, 19 February 2014 (UTC)
Really? Oops, AGF. Done in this edit. DCDuring TALK 20:23, 19 February 2014 (UTC)
The same way you edit any other kind of page. Especially since the welcome template does not actually have much logic in it, it's mostly just text. --WikiTiki89 20:23, 19 February 2014 (UTC)
And you (Equinox) have edited templates before, including {{welcome}}: diff, diff. —Aɴɢʀ (talk) 20:31, 19 February 2014 (UTC)

Language categories in tabbed languages[edit]

If you have tabbed languages turned on, then when you look at one language's entry, all and only that language's categories are supposed to be shown at the bottom. However, at [[chip]], it appears that the "Ireland" inside the context template of sense 6 is somehow triggering Irish language instead, because Category:Irish English—and all other English-language categories that are defined from the point until the end of the English entry—are appearing under the "Irish" tab instead. Any ideas how to fix that? —Aɴɢʀ (talk) 14:31, 19 February 2014 (UTC)

That would be moderately difficult to fix from the script side, I think. I don't really know why the regionalisms categories use this naming scheme. They seem rather unclear. Perhaps Category:French French, Category:English English, Category:Spanish Spanish, etc could be renamed things like "French of France", "English of England", "Spanish of Spain" (or maybe "France French", "England English", and "Spain Spanish"). Names like Category:Luxembourgish French and Category:Irish English are confusing. --Yair rand (talk) 00:22, 20 February 2014 (UTC)
FWIW, quite a few categories do use a [country noun] [language] format as opposed to a [country adjective] [language] format: Category:Louisiana French/English, Category:Quebec French/English, Category:New York English, Category:New England English, Category:New Zealand English, Category:Switzerland German/French/Italian. (The Swiss categories were moved as a result of an RFM to disambiguate them from Category:Alemannic German language.) - -sche (discuss) 00:49, 20 February 2014 (UTC)
I support changing the naming scheme to "(language) of (place)". —CodeCat 01:49, 20 February 2014 (UTC)
Will any of this solve the problem this thread is about? Why does the software think that Irish English and everything after it is a subcategory of Category:Irish language instead of a subcategory of Category:English language, and how can we persuade it to stop thinking that? —Aɴɢʀ (talk) 16:12, 20 February 2014 (UTC)
It will solve the problem by renaming the category "Irish English". —CodeCat 16:19, 20 February 2014 (UTC)
But why doesn't the tabbed languages software just display the categories that are under the language heading whether or not they seem to relate to the language? --WikiTiki89 22:42, 20 February 2014 (UTC)
That's not actually possible. The location of a category on a page/in the wikitext isn't something the script can access. Only the order is accessible. When the categories are sorted, if the category name begins with the name of the next language section ("Irish " in this case), it's assumed that the next language section has started (unless the category name ends in "letter names", "script characters", or "mythology"). This tends to work quite well in most cases. --Yair rand (talk) 23:03, 20 February 2014 (UTC)
Oh. I assumed that categories stay where they are in the HTML, but are made to float to the bottom with CSS. --WikiTiki89 23:10, 20 February 2014 (UTC)
This specific problem could be solved by renaming the category in question Category:Hiberno-English (though that may refer to something slightly different to Irish English), but the more general problem remains. And renaming our categories to various awkward phrases that no one actually uses just to prevent them from breaking seems very much like getting the wrong end of the stick. —Aɴɢʀ (talk) 23:45, 20 February 2014 (UTC)
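For anyone trying to follow the heuristic Yair rand describes above, here is a minimal sketch of it. It is a reconstruction from that description only, not the actual tabbed-languages script; the function and variable names are made up, and it assumes the page has at least one language section.

```javascript
// Category names ending in these are never taken as the start of a new
// language section (per the description in this thread).
const EXCEPTION_SUFFIXES = ["letter names", "script characters", "mythology"];

// sectionNames: language headings in page order (at least one).
// categories: category names in the order they appear on the page.
// Returns a Map from language heading to the categories assigned to it.
function assignCategories(sectionNames, categories) {
  const result = new Map(sectionNames.map((name) => [name, []]));
  let current = 0;
  for (const category of categories) {
    const next = sectionNames[current + 1];
    const isException = EXCEPTION_SUFFIXES.some((s) => category.endsWith(s));
    // A category starting with the next language's name is assumed to mark
    // the start of that language's section.
    if (next !== undefined && category.startsWith(next + " ") && !isException) {
      current += 1;
    }
    result.get(sectionNames[current]).push(category);
  }
  return result;
}
```

Feeding it the sections "English" and "Irish" with the categories "English nouns", "Irish English", "English slang" reproduces the problem reported for [[chip]]: "Irish English" is taken as the start of the Irish section, so it and everything after it ends up under the Irish tab.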

Per-browser preferences[edit]

Every single day, I have to go here, untick "Highlight the inflection line of some entries" and tick "Show the translation sections expanded, instead of having them collapsed". Why? SemperBlotto (talk) 09:20, 20 February 2014 (UTC)

  • It happens every time I switch the PC off and on again. SemperBlotto (talk) 15:35, 22 February 2014 (UTC)
    Don't you like cookies? DCDuring TALK 16:04, 22 February 2014 (UTC)
    Love 'em. I'm kept logged in, so I assume all my cookies are functioning normally. Other sites that require cookies all work normally. SemperBlotto (talk) 16:12, 22 February 2014 (UTC)
    Duplicated the problem with a restart of my PC. Perhaps the cure is like the one in the doctor joke: Patient: "Doc, it hurts whenever I do X" / Doctor: "Stop doing X". Maybe you could have your PC "hibernate" or something overnight, instead of flipping it off. DCDuring TALK 17:39, 22 February 2014 (UTC)
  • In fact, all I have to do is stop and start the browser for the cookies to be forgotten. It seems to be due to the recent changes to User:Hippietrail/custom.js - changing "getCookie" to "jQuery.cookie" logic. No idea how to fix it. SemperBlotto (talk) 10:40, 23 February 2014 (UTC)
    Now fixed by "Kephir" - thanks. SemperBlotto (talk) 11:23, 23 February 2014 (UTC)
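The thread doesn't say what the fix was, so the following is only a guess at the kind of problem involved. A cookie written without an expiry date is a session cookie, which browsers typically discard when they close; that would match the symptom described. A minimal sketch of the difference, with a hypothetical helper name:

```javascript
// Build a cookie string. Without `days` the string has no "expires"
// attribute, which makes it a session cookie.
function buildCookie(name, value, days) {
  let cookie = encodeURIComponent(name) + "=" + encodeURIComponent(value) + "; path=/";
  if (typeof days === "number") {
    const expires = new Date(Date.now() + days * 24 * 60 * 60 * 1000);
    cookie += "; expires=" + expires.toUTCString();
  }
  return cookie;
}

// In a browser one would then write, for example:
//   document.cookie = buildCookie("somePreference", "1", 30);
```

The cookie name in the usage comment is a placeholder, not the name the preferences script actually uses.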

Looking for feedback on List of languages, csv format[edit]

I've made this list:

Sample:

line;code;canonical name;category;type;family code;family;sortkey?;autodetect?;exceptional?;script codes;other names
100;ael;Ambele;Ambele language;regular;nic-bod;Bantoid;;;;;
101;aem;Arem;Arem language;regular;aav;Austro-Asiatic;;;;;
102;aen;Armenian Sign Language;Armenian Sign Language;regular;sgn;sign;;;;;
103;aeq;Aer;Aer language;regular;inc;Indo-Aryan;;;;;
104;aer;Eastern Arrernte;Eastern Arrernte language;regular;aus-pam;Pama-Nyungan;;;;Latn;
105;aes;Alsea;Alsea language;regular;qfa-und;unclassified;;;;;Yaquina,Yakwina,Alseya,Yakona

It's a machine-readable version of Wiktionary:List of languages. I've tried to include all the data from the existing lists.

It would be good to nail down the format so script/bot writers could depend on it. For the program I'm writing, I'm currently only using columns 2–4 (language code, canonical name, category name). If anyone has thoughts on what data should and shouldn't be included please chime in or edit the thing. I don't actually know how much data is stored per language. Pengo (talk) 02:20, 22 February 2014 (UTC)

User:Ruakh was working on a module for exporting data using JSON. That seems much more sensible than csv, which is kind of outdated and inflexible. —CodeCat 02:23, 22 February 2014 (UTC)
Module:JSON data is already there and working. I use it for User:Buttermilch and User:Kephir/gadgets/xte. Hardly documented, though. Keφr 10:07, 22 February 2014 (UTC)
Cool. The CSV is slightly more lightweight, but JSON is certainly more robust. I've added a link to the JSON module now from List of languages (and from my csv list) so others might find it.
Note that apart from lack of docs, the JSON module is missing an export of Etymology-only languages. Not an issue for me currently, but thought I'd point it out if anyone wants to be a completionist. Pengo (talk) 00:45, 23 February 2014 (UTC)
Right. The current format of the etymology-only language data is rather unsuitable for exporting, as it does not distinguish between "canonical" codes and aliases. show_etym in Module:list of languages uses a somewhat grotesque hack to present WT:LOL/E the way it currently does. Keφr 09:53, 23 February 2014 (UTC)
As for documentation, I remember thinking something about deliberately not documenting it with too much detail (e.g. not giving a URL to the API entry point which exports the data), so that it would not be so easy to abuse. (While the knowledgeable can just read mw:API:Expandtemplates to construct an API call to get the data they want.) Also, User:Pengo: can you switch to the JSON exporter now, so that we can get rid of the CSV version? I think maintaining two separate "official" data exporting methods (with CSV being, I imagine, somewhat less stable and straightforward) is not a very good idea. Keφr 10:31, 23 February 2014 (UTC)
Thanks for your feedback. I'll stick to using and developing the CSV, if you don't mind. The columns have clear meanings, it contains less redundant data, the page has less potential for "abuse" (it uses less CPU, if that's what you're worried about? Hiding the method for viewing it means it never gets cached, btw), and the format of the JSON is undocumented and seems to be geared towards showing the undocumented internal representation of the data rather than a user-centric data export. Thanks but no thanks. Pengo (talk) 08:31, 24 February 2014 (UTC)
Our data format is well-documented and it contains no redundancies, see Template:language data documentation. The JSON exporter simply remaps that format to JSON types (and can also filter out unneeded language data). Your CSV representation redundantly lists language family both as a code and name; the "line counter" field is also superfluous. I can also imagine problems with delimiter collision, if we ever decide to put a comma in an alternative language name (not likely, but possible). MediaWiki caching may be actually harmful here — pages sometimes fail to be purged after their dependencies are changed, which may make the bots/whatever use non-fresh data (I think it happened with WT:LOL a while ago).
If you want to keep using CSV, fine by me, but I suggest you keep it in your user space/user module sandbox (something like Module:User:Pengo/languages-csv). Keφr 15:52, 24 February 2014 (UTC)
Sorry that it doesn't have string quoting yet. It is a new module, which is why I was asking for feedback. Pengo (talk) 23:57, 24 February 2014 (UTC)
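For script writers who do want to consume the CSV as it stands, a minimal reader for the sample format above might look like this. It assumes exactly the twelve semicolon-separated columns shown in the sample header and no quoting (as noted, the module doesn't have string quoting yet), so it would break on the delimiter collision Keφr mentions. The property names are my own, not part of the format.

```javascript
// Column order taken from the sample header line; property names are made up.
const COLUMNS = [
  "line", "code", "canonicalName", "category", "type", "familyCode",
  "family", "sortkey", "autodetect", "exceptional", "scriptCodes", "otherNames",
];

// Split a comma-separated cell into a list; an empty cell gives an empty list.
function splitList(cell) {
  return cell === "" ? [] : cell.split(",");
}

// Parse one data line of the semicolon-delimited list into an object.
function parseLanguageLine(line) {
  const fields = line.split(";");
  if (fields.length !== COLUMNS.length) {
    throw new Error("Expected " + COLUMNS.length + " fields, got " + fields.length);
  }
  const record = {};
  COLUMNS.forEach((name, i) => { record[name] = fields[i]; });
  record.scriptCodes = splitList(record.scriptCodes);
  record.otherNames = splitList(record.otherNames);
  return record;
}
```

Because alternative names are themselves comma-separated inside one cell, a language name containing a comma or semicolon could not be represented without quoting; the field-count check at least makes a semicolon collision fail loudly rather than silently shifting columns.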

Can I make the edit window taller?[edit]

Is there a way for me to make the edit window taller? Including the enhanced editor that's used for modules? —CodeCat 15:51, 22 February 2014 (UTC)

@CodeCat: Special:Preferences#mw-prefsection-editing > Editing > Editor > Rows --kc_kennylau (talk) 15:56, 22 February 2014 (UTC)
I doubt it's important enough to switch browser, but Opera puts a resize handle on all text boxes. Quite handy. Equinox 17:15, 22 February 2014 (UTC)
Chrome does as well, and if I remember correctly, so does Firefox. --WikiTiki89 20:20, 22 February 2014 (UTC)
I'm using Firefox and there's no resize handle on the text boxes here at Wiktionary, though there is at Wikisource. —Aɴɢʀ (talk) 20:45, 22 February 2014 (UTC)
Now that's what I call strange. --WikiTiki89 20:47, 22 February 2014 (UTC)
I do get a resize handle, but it's not remembered. I wondered if there was a way to make it stay that way. It works now, thank you for the advice. —CodeCat 21:03, 22 February 2014 (UTC)
I'd never noticed that resize thing. (FF) Thanks for the Q&A that brought it to my attention. It's really useful for me. DCDuring TALK 21:51, 22 February 2014 (UTC)
Unfortunately it doesn't work for the enhanced editor. It shows me 26 lines, while I'd rather have 30. —CodeCat 22:20, 22 February 2014 (UTC)
I turned enhanced editing off, and the resize handle appeared for a moment, then disappeared again when Dot's Syntax Highlighter kicked in. I don't have that at Wikisource, which must be why I always have the handle there. —Aɴɢʀ (talk) 22:36, 22 February 2014 (UTC)
I highly recommend using external editors for editing large amounts of code and the default unenhanced plaintext editor for small amounts. The enhanced editor is horrible. --WikiTiki89 22:43, 22 February 2014 (UTC)

Lebanese Arabic[edit]

Just noticed that Category:Lebanese Arabic language is unique for languages in "Category:All languages" in that it has no boilerplate template and no language code in the list. Anyone care to tackle it? Pengo (talk) 23:55, 24 February 2014 (UTC)

Lebanese Arabic is currently treated as part of North Levantine Arabic ("apc"). We are actually in the middle of a discussion about merging North and South Levantine into just plain Levantine (see here). --WikiTiki89 00:16, 25 February 2014 (UTC)
Thanks Pengo (talk) 00:54, 25 February 2014 (UTC)
I've emptied and deleted the category. Its only entries were Template:arabic-dialect-pronunciation and two entries that template placed into the category even though they had ==Arabic== headers. - -sche (discuss) 02:34, 25 February 2014 (UTC)

Automatic redirects for characters/character combinations[edit]

This is not related to Wiktionary directly, but is an issue that has been discussed, so I'm asking here. With a wiki, how do you set two characters or sets of characters to be equivalent? For example, when the user searches for a word with dz in it, I want the wiki to automatically consider dz to be dᶻ (such as in a word like d̲ᶻ̲idᶻəlal̓ič). Even better, is it possible to have the search engine look for dz first, and then search for dᶻ if no word has dz? --BB12 (talk) 01:32, 25 February 2014 (UTC)

Kind of. Rather than searching for "dz" then "dᶻ", the search index might simply store dᶻ in an index of normalized word forms as "dz". Usually a search engine creates a "normalized" or "canonical" or "stemmed" form of the word for some or all of its indexes. Not sure why you're asking here. Pengo (talk) 01:55, 25 February 2014 (UTC)
I have a wiki and would like to do that to aid the user in searching for words. I've spent quite a bit time looking for a way to do that, but I've found mention of something like that only on Wiktionary (and possibly one other place), so I'm asking here in the hopes somebody knows how to do it. Googling on mediawiki text normalization brought up some interesting results, but they were not exactly what I'm looking for (or else they were over my head).
FWIW, this is something that also might be useful for Wiktionary. Requiring the user to type a superscript "z" or underlines in the word d̲ᶻ̲idᶻəlal̓ič dᶻidᶻəlal̓ič (Seattle in Lushootseed) is more than many people can handle. --BB12 (talk) 03:43, 25 February 2014 (UTC)
I just learned my text is out of date. No need for underlining, but the issue with dz and other superscripts still remains. --BB12 (talk) 04:59, 25 February 2014 (UTC)
Evidently the method used is the Universal Language Selector. I've put in a request to bugzilla, but I don't know what the process will be. Hopefully I can help out in some way..... --BB12 (talk) 20:42, 28 February 2014 (UTC)
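To make the "normalized form" idea mentioned earlier in this thread concrete, here is a minimal sketch of a search key function. It is not how MediaWiki's search or the Universal Language Selector works; the folding table and the function name are hypothetical, and a real wiki would need a fuller table. The idea is that both the stored title and the user's query are passed through the same function before comparison.

```javascript
// Hypothetical folding table: characters treated as equivalent to plain ones.
const FOLD = new Map([
  ["\u1DBB", "z"], // modifier letter small z (superscript z) -> z
]);

// Produce a search key: decompose, drop combining marks (underlines, carons,
// glottalization commas, ...), fold special characters, and lowercase.
function searchKey(text) {
  return [...text.normalize("NFD")]
    .filter((ch) => !/\p{M}/u.test(ch))
    .map((ch) => (FOLD.has(ch) ? FOLD.get(ch) : ch))
    .join("")
    .toLowerCase();
}
```

With this, a query typed as plain "dz" and a title containing d followed by superscript z get the same key, so the user wouldn't need to type the superscript. Note that this sketch also strips ordinary diacritics (č becomes c), which may or may not be wanted; searching for the exact form first and falling back to the key would be a separate step.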

Tlingit noun templates[edit]

I don't know the first thing about templating, and I'd appreciate some help with creating some templates for Tlingit nouns. I need a template that handles plurals, diminutives, possessed forms, and all combinations of these three, but doesn't require any of them.

Okkerdeis (talk) 04:30, 26 February 2014 (UTC)

You're talking about a headword template, similar to {{en-noun}}, correct? How many plurals / diminutives / possessed forms of one noun can there be? If there are a lot a full inflection table template might make more sense. DTLHS (talk) 04:38, 26 February 2014 (UTC)
I guess a table would make more sense since there are so many potential forms of a word. I'm still hesitant though because many nouns don't have plural forms or diminutive forms, so the tables would have a lot of dead links, or just forms identical to the base form. In any case, I'm not sure how to make an inflectional table either.

Okkerdeis (talk) 05:42, 26 February 2014 (UTC)

Right, well I've made a very basic start at {{tli-noun}} (only one pl, dim, and poss parameter) which can be extended later if you want. DTLHS (talk) 06:09, 26 February 2014 (UTC)
Thanks for the effort; I'm still having trouble using this template. I'm sure that it's easy to use, but I can't quite figure it out.

Okkerdeis (talk) 23:40, 26 February 2014 (UTC)

Grease pit (plural plural, diminutive diminutive, possessed form possessed form) DTLHS (talk) 23:57, 26 February 2014 (UTC)

Translating most common words first[edit]

I wanted a list of words that have not yet been translated into my language, sorted according to how common/popular these words are. First, because a lot of very common words have not yet been translated into my language (Greek). Second, because it is quite difficult to actually check if a word has been translated or not. When an entry does not exist at all, you see it in red. But if you want to check whether a translation in a particular language exists, you have to open existing entries, expand the translations, and then check for translations in that language. So, I downloaded the frequency lists from Project Gutenberg and the wiktionary dump and wrote some basic code to do this:

#!/bin/sh
# Create the frequency list from the downloaded HTML files (put them all in one folder):
# grep -h "</a></td>" * > freqlist
# perl -i -pe 's/<.*>(.*?)<.a><.td>/$1/' freqlist

# One line per article (all dump files go into a single output file)
perl -lpe 'BEGIN { $/ = "</page>" } s/\s/ /g' *.xml > untranslated

# Keep only articles that have a translation table, then drop those with Greek translations
perl -i -ne 'print if /\{\{trans-top/' untranslated
perl -i -ne 'print unless /\{\{t.?\|el/' untranslated
# Keep the title only
perl -i -pe 's/^.*<title>(.*)<.title>.*$/$1/' untranslated

# Create the final list: frequency-list words that are also untranslated titles
awk 'FNR==NR{a[$0];next}($0 in a)' untranslated freqlist > wordlist.html
perl -i -pe 's@^(.*)$@<a href="https://en.wiktionary.org/wiki/$1#Translations">$1</a><br>@' wordlist.html
mkdir -p wordlist
split -l 100 -d --additional-suffix=".html" wordlist.html wordlist/wordlist

The results are not perfect since they contain a lot of plurals/past participles etc., but they are definitely usable. I'd like to filter out some more words by also comparing the list to the wordlist of a "standard" dictionary, such as wordnet or the free version of webster's. Is anyone aware of another list I could use?

On a more ambitious note, would you consider adding a similar feature to the site? I'm thinking of a page where people would be able to press a button and be served with the next most common untranslated word in their language. The list would be updated every 2-4 weeks and the counter would be reset to zero. So, words that were clicked on but were not translated will return to the start of the list and will be served to new users, while words that were translated in the meantime will be removed. Jenniepet (talk) 20:18, 26 February 2014 (UTC)

I think a good way to add that feature would be a bot adding {{trreq}} to the translation tables. The problem is that currently {{trreq}} is not handled well by Conrad Irwin's accelerated translation-adding script, and Category:Translation requests is not easy to find for inexperienced users. Matthias Buchmeier (talk) 19:18, 26 February 2014 (UTC)
You mean a bot that would regularly add, say, a hundred new words to the Category:Translation requests page? That sounds good. Although I don't think it adds anything to make the missing translations visible inside the respective articles. For example, among the first hundred untranslated words for Greek I found "look" and "low". If I had opened these entries by chance and seen they had no translation I would probably have added them anyway. Jenniepet (talk) 20:18, 26 February 2014 (UTC)
I have a program that sorts words lacking translations based on how many translations the table has. The results have been great. — Ungoliant (falai) 21:05, 26 February 2014 (UTC)
That's a great idea. Could you send me your list of "English words with the most translations" so that I can combine it with mine and see if this results in something even more useful? I'd appreciate a longer list (somewhere between 20,000 and 40,000 words). I did try to combine my list with the wordnet and webster1913 wordlists after all, but the results are disappointing. Essentially, I'd like to have a list that doesn't include said and came on the first page of results. However, there is one thing I would change in your approach. I think that your condition "words that are missing one translation in a given language" is too strict. For languages with very few existing translations, I would suggest listing only "words without any translations". For very popular languages, the ideal solution would be something along the lines of "words missing at least 50% of the translation glosses/senses". The reason is that for many well-known words you have more than 10 translation glosses, and some of them can be really obscure (e.g. baseball or American football terminology) or almost identical. So, people might leave them blank on purpose.
@Jenniepet: It doesn’t generate such a list, as it would need to be regenerated after each dump. I just ran it for Greek (here), tell me what you think. — Ungoliant (falai) 23:58, 27 February 2014 (UTC)
@Ungoliant MMDCCLXIV:. The lists are great! Are you able to generate such lists for Russian, Japanese and Mandarin, please? I could use other languages, such as German, Arabic, Korean, etc. but I won't push my luck now :) --Anatoli (обсудить/вклад) 00:10, 28 February 2014 (UTC)
@Ungoliant MMDCCLXIV: I uploaded a list of words that haven't been translated into Greek created using my original script: list1. Also, two differently sorted lists of the common elements between my list and yours: list2 and list3. I think that the combined lists 2 and 3 are much better. My original list had too many inflected forms and your list has too many proper names. Their combination looks perfect! (I might have a slight preference for list2)
And now we come to the fun part: For these lists to be really useful, they should not lead you to words that have been translated in the meantime. So, they should be updated every 1-3 months using the latest dump. But what I'd also like to propose is that once someone clicks on a link on the list, that link should be removed (or hidden) from the list. So, if two or more contributors are working their way down the list, none of them will come up against a multitude of "dead" links. Sadly, I don't know how to implement this. I don't even know if it can be done using only javascript. Does anyone have any suggestions? Jenniepet (talk) 03:48, 28 February 2014 (UTC)
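The "missing at least 50% of the translation glosses" condition suggested above could be checked per entry along these lines. This is only a sketch of that one rule, not Ungoliant's program or the shell script in this thread: the function name is made up, it treats each {{trans-top}} ... {{trans-bottom}} block as one gloss, and it uses the same loose {{t...|code| match as the script, so it assumes plain language codes and would miss translations entered without a {{t}}-style template.

```javascript
// Share of translation tables in an entry's wikitext that contain no
// translation for langCode. Returns null if the entry has no tables.
function missingShare(wikitext, langCode) {
  const tables = wikitext
    .split("{{trans-top")
    .slice(1) // text before the first table is not a table
    .map((chunk) => chunk.split("{{trans-bottom")[0]);
  if (tables.length === 0) return null;
  // Matches {{t|el|, {{t+|el|, {{t-|el| and similar (assumes a plain code).
  const hasLang = new RegExp("\\{\\{t.?\\|" + langCode + "\\|");
  const missing = tables.filter((table) => !hasLang.test(table)).length;
  return missing / tables.length;
}

// Example rule: list the entry only if at least half of its tables lack Greek.
function shouldList(wikitext, langCode) {
  const share = missingShare(wikitext, langCode);
  return share !== null && share >= 0.5;
}
```

For a language with very few translations, the stricter "no translations at all" rule corresponds to requiring a share of exactly 1.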

Acceleration in {{cy-noun}}[edit]

Can't figure out why it's not working (probably something minor and stupid). Examples: migwrn, sebon. —Μετάknowledgediscuss/deeds 04:47, 27 February 2014 (UTC)

@Metaknowledge: Because User:Conrad.Irwin/creationrules.js lacks the rule for that particular language. --kc_kennylau (talk) 14:19, 27 February 2014 (UTC)
This code from me is not tested:
// Welsh
creation_rules['cy'] =
	function (params, entry)
	{
		var template = {
			'plural':'plural of',
			'equative': 'equative of',
			'comparative':'comparative of',
			'superlative': 'superlative of'};
 
		if (!template[params.form])
			throw new PreloadTextError('No rule for "' + params.form + '" in language "' + params.lang + '".');
 
		entry.def = '{{' + template[params.form] + '|' + params.origin + '|lang=' + params.lang + '}}';
	};
--kc_kennylau (talk) 14:35, 27 February 2014 (UTC)
Um, what? Normally it works without having to add the language to the JS. In fact, that's how it worked every single other time I've done this. But anyway, for Welsh the automatic plurals really ought to have a mutation table automatically appended to them, so I wouldn't mind if the logic for that went in creationrules.js... but maybe that should be done after we work out the templates involved. —Μετάknowledgediscuss/deeds 02:05, 28 February 2014 (UTC)
Because "every single other time" there is a rule already created. --kc_kennylau (talk) 00:56, 1 March 2014 (UTC)

Request for addition to spam filters[edit]

A bunch of spam talk pages have been created in the past couple of days, each using a different IP from any of several parts of the world, with mostly the same text and with the same words in the edit comment (I'm adding extra characters in even-numbered positions to avoid giving them a free search-engine hit): "Fqrqiqeqnqdq qFqiqnqdqeqrq". Admins can find plenty of examples through the deletion log.

Could someone who knows how add a filter to block such edits?

Also, at least one of the IPs was used a year ago to post a test edit with an edit comment of "Test, just a test" and text of "Hello. And Bye." If anyone sees such an obviously automated edit, don't just delete it- block the contributor as well. This might just reduce the chance of them coming back later to post spam.

In the past we've let these go because the edit itself doesn't violate any rules, overlooking the fact that using a bot to add content of any kind without going through the approval process is a blockable offense. We may not always block suspected bots if they're doing something innocuous like adding interwikis, but we have every right to do so. Chuck Entz (talk) 00:32, 1 March 2014 (UTC)

March 2014[edit]

Projectlinks[edit]

Something's gone wrong with the display of Template:projectlinks: it's showing some module invocation code instead of the correct link text. This, that and the other (talk) 10:31, 1 March 2014 (UTC)

Fixed, seemingly by Kephir. This, that and the other (talk) 09:19, 4 March 2014 (UTC)

quotations JS[edit]

At eye#Noun, I noticed that the JS that suppresses the display of quotations makes the "quotations" display control appear nicely at the end of each definition, unless there is a usage example (whether or not {{usex}} is used) immediately after the definition, in which case it appears on a separate line. A better appearance results from placing all the quotations for a definition immediately after the definition with the usage example after the quotations.

Would it be possible to amend the JS so that the little blue "quotations" display control always appeared on the definition line? DCDuring TALK 01:41, 3 March 2014 (UTC)

Done. --Yair rand (talk) 11:14, 3 March 2014 (UTC)
Thanks. That looks much better. I hope we don't discover any bad side-effects. DCDuring TALK 11:34, 3 March 2014 (UTC)

Accelerating Template:ast-adj[edit]

Hi. Kenny was kind enough to ACCELerate Template:ast-adj, but apparently it isn't yet fully ACCELerated. He says to do that, the "rules" need to be changed. I have a request at User talk:Conrad.Irwin/creationrules.js, and would appreciate it if an admin could make the necessary changes as explained there. Thanks in advance. --Back on the list (talk) 13:12, 4 March 2014 (UTC)

Done. —CodeCat 13:59, 4 March 2014 (UTC)
Thanks kitty! Love you! --Back on the list (talk) 14:42, 4 March 2014 (UTC)
Well, the neuter singular ACCEL doesn't actually work. Any suggestions? --Back on the list (talk) 18:02, 9 March 2014 (UTC)
Try again. Keφr 19:12, 9 March 2014 (UTC)

Pronunciation Recording[edit]

Visual workflow draft for pronunciation recording gadget; If you have trouble watching this video here, watch it on vimeo. A more extensive/explanative version is available.

Dear Wiktionary community!

About me
My name is Rainer Rillke, and I have been volunteering at Wikimedia Commons for 3 years now, gathering experience around media files. I've been always interested in how things work and how one could improve them.
The idea
One idea that appeared last summer was to allow recording small chunks of speech, uploading them to Wikimedia Commons in the background, and including them in a Wiktionary entry without the hassle of doing everything by hand or installing additional software. That idea led to the founding of the MediaWiki extension PronunciationRecording during the Google Summer of Code. However, it was not completed; development has been stalled for over 5 months now.
My proposal
To make this work, so that Wiktionary benefits from this feature right away, I would like to provide the work done so far as a gadget and do some more work on usability. You can see my plan at m:Grants:IEG/Finish Pronunciation Recording. And more importantly, you can give me a hand, if you are interested.
Looking for volunteers and 2 consultants
Often, software projects that are not aligned to the community's needs produce vapourware. Recently, that happened to a $ 15,000 grant. I don't want to write vapourware, I need your advice! Without community support, I cannot and will not continue this project. So here is what I am looking for:

Don't forget to comment. Thanks and kind regards -- Rillke (talk) 18:28, 6 March 2014 (UTC)

Hey guys, I got so excited by this project that I decided to join it. From personal experience, I see how uploading an audio file can be irritating. This project will make it a lot easier to do so, encouraging even casual users to upload audio pronunciations.
Wiktionary already has a decent coverage of IPA pronunciations, but we are still lacking in audio pronunciations. I know that for most of us IPA is sufficient, but, as evidenced by feedbackers every now and then, there are many users for whom IPA is nothing more than incomprehensible clutter.
If you agree, I ask that you post your support here. If not, tell us why. — Ungoliant (falai) 01:19, 7 March 2014 (UTC)
Thanks for the video. This seems like it would be a very useful gadget. I have left an encouraging comment here. - -sche (discuss) 00:05, 14 March 2014 (UTC)

ACCEL broken

A recent edit to User:Conrad.Irwin/creation.js seems to have broken it: you can see this on silver foil where silver foils is a red link with broken green underlining. If someone creates silver foils, you can just go to a gibberish pagetitle such as h@aw and preview the page with {{en-noun}} added. - -sche (discuss) 23:59, 8 March 2014 (UTC)

I have undone the edit in question until a bug-free version of it can be implemented. - -sche (discuss) 00:38, 9 March 2014 (UTC)

Recent changes - format gone haywire

We now have <span class="minifont">Show new changes starting from 22:05, 11 March 2014 | Current number of entries: <b>3,688,346</b></span> being displayed. Anyone owning up? SemperBlotto (talk) 22:08, 11 March 2014 (UTC)

Looks like it was caused by Mediawiki:Rclistfrom being changed to be inside a link instead of containing it in this change, associated with the MW1.23wmf17 deployment. I suspect this was not intentional. --Yair rand (talk) 22:18, 11 March 2014 (UTC)
You still need to remove the bolding. DTLHS (talk) 22:20, 11 March 2014 (UTC)
And we've now lost the "number of entries". SemperBlotto (talk) 09:20, 12 March 2014 (UTC)
I've added it back. It is linked like the rest of the line, but that seems tolerable. - -sche (discuss) 19:45, 30 March 2014 (UTC)

Lua replacement for Langrev

What is the Lua replacement for Langrev? That is to say, if I know the canonical name of a language, what code tells me the language's code? (If I know the code, I can find the canonical name by typing {{#invoke:languages/templates|lookup|en|names}}. Usefully, this is subst:able. It would be great if the code that turned canonical names into codes were also subst:able.) - -sche (discuss) 01:09, 12 March 2014 (UTC)

There is a pair of functions for it in Module:languages, but they're still a bit unoptimised. There's also no way to use them from a template yet, although that could be fixed fairly easily by adding that to Module:languages/templates. The main obstacle in migrating right now is deciding which of the two functions to use. The first only returns the canonical name, so it can't be used to look up alternative names, but it does always give a single unique result. The second searches alternative names too, but could give more than one possible answer. Our {{langrev}} template is a bit of a hybrid between these... it returns some alternative names, but doesn't handle conflicts, it just uses whichever one had its name added first. —CodeCat 02:12, 12 March 2014 (UTC)
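The difference between the two lookup strategies described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the actual Lua in Module:languages; the function names and the sample data (including the alternative names) are hypothetical and only serve to show why one lookup gives a single answer and the other may give several.

```python
# Hypothetical sample data; the real tables live in Module:languages and its data pages.
LANGUAGES = {
    "en": {"canonical": "English", "other_names": []},
    "nl": {"canonical": "Dutch", "other_names": ["Netherlandic", "Flemish"]},
    "vls": {"canonical": "West Flemish", "other_names": ["Flemish"]},
}


def code_by_canonical_name(name):
    """Return the single code whose canonical name is `name`, or None."""
    for code, data in LANGUAGES.items():
        if data["canonical"] == name:
            return code
    return None


def codes_by_any_name(name):
    """Return every code that lists `name` as a canonical or alternative name."""
    return sorted(
        code
        for code, data in LANGUAGES.items()
        if name == data["canonical"] or name in data["other_names"]
    )
```

With this made-up data, the first function finds nothing for "Flemish" because it is not a canonical name, while the second returns two candidate codes, which is the conflict a {{langrev}} replacement would have to resolve somehow.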

Can't add translations at urea

When I try, I get an error saying "Could not find translation table for 'nl:ureum'. Glosses should be unique". I suspect that it might be because of the subscripts in the gloss. —CodeCat 03:34, 13 March 2014 (UTC)

  • Yes. I removed the subscripts and could then add a German translation. (I would be surprised if the translations for the two senses were ever different). SemperBlotto (talk) 08:00, 13 March 2014 (UTC)

Old Church Slavonic transliteration

What module is responsible for the automatic transliteration of Old Church Slavonic (cu)? I wanted to tell it to transliterate Ѿ and ѿ as Otŭ and otŭ respectively, but there is no Module:cu-translit. This should probably be done for Old East Slavic (orv) as well. —Aɴɢʀ (talk) 19:00, 17 March 2014 (UTC)

You can look it up in Module:languages/data2. It's Module:Cyrs-Glag-translit. —CodeCat 19:04, 17 March 2014 (UTC)
(E/C) Module:Cyrs-Glag-translit. But before you do that, think about whether it is true in all cases. I would ideally transliterate it as ot or something to that effect. --WikiTiki89 19:06, 17 March 2014 (UTC)
If we really want to get fancy we can transliterate it oͭ. Does it have a conventional scholarly transliteration other than otŭ? Part of the point of automated transliterations is that they shouldn't be context-dependent. —Aɴɢʀ (talk) 19:27, 17 March 2014 (UTC)
But is otŭ a scholarly transliteration, or just a scholarly normalization? --WikiTiki89 21:45, 17 March 2014 (UTC)
What is the letter used for? —CodeCat 22:10, 17 March 2014 (UTC)
It is used as a sort of abbreviation of "от(ъ)", having a similar origin to German umlaut (ä, ö) and to Spanish/Portuguese tilde (ñ, ã, õ). I have only seen it representing the word отъ (otŭ) and as the corresponding prefix, but it may have other uses. I don't read Old Slavic texts all that much. --WikiTiki89 22:19, 17 March 2014 (UTC)
Then whatever transliteration we choose, it should be obvious enough that it represents that word. —CodeCat 22:29, 17 March 2014 (UTC)
I've added it as Otŭ and otŭ for now; if we decide later to transcribe it differently, we can change it then. —Aɴɢʀ (talk) 00:08, 18 March 2014 (UTC)
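For readers unfamiliar with how such a mapping works, here is a minimal Python sketch of a context-independent, character-by-character transliteration step. It is not the code in Module:Cyrs-Glag-translit (which is Lua); the table and function names are hypothetical, and only the two mappings discussed above are included.

```python
# Assumed illustration only: the real table is in Module:Cyrs-Glag-translit.
SPECIAL_CASES = {
    "\u047e": "Otŭ",  # Ѿ CYRILLIC CAPITAL LETTER OT
    "\u047f": "otŭ",  # ѿ CYRILLIC SMALL LETTER OT
}


def transliterate(text, table=SPECIAL_CASES):
    """Replace each character found in `table`; leave everything else untouched."""
    return "".join(table.get(char, char) for char in text)
```

Because each character is looked up on its own, the output never depends on neighbouring letters, which matches the point made above that automated transliterations should not be context-dependent.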

Formatted Arabic on iPad is displayed with disconnected letters

The iPad supports Arabic input and display, but Wiktionary entries that use formatted text currently show it right-to-left (as expected) with the letters not connected to each other, e.g. عربي looks like ع‌ر‌ب‌ي (ZWNJ added). By contrast, Persian looks okay. I can't test right now whether changing the fonts helps, but I will check later when I get to my iPad. --Anatoli (обсудить/вклад) 01:03, 20 March 2014 (UTC)
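To reproduce the broken appearance on a device that renders Arabic correctly, one can insert U+200C ZERO WIDTH NON-JOINER between the letters, as was done in the example above. A small Python sketch (the function name is made up for illustration):

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER: suppresses cursive joining between neighbours


def disconnect(word):
    """Insert a ZWNJ between every pair of adjacent characters."""
    return ZWNJ.join(word)


# Printing disconnect("عربي") shows the isolated letter forms described in the report.
```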

Can the iPad's browser show you the HTML source code of a webpage? --WikiTiki89 02:47, 20 March 2014 (UTC)
Yes, it can. I will check tonight. --Anatoli (обсудить/вклад) 03:06, 20 March 2014 (UTC)
It appears the latest versions of the OS don't allow viewing the source code. --Anatoli (обсудить/вклад) 20:49, 20 March 2014 (UTC)
It looks fine on my iPhone. Just to clarify the issue, do you see it incorrectly on this page or only on the entry itself? --WikiTiki89 21:18, 20 March 2014 (UTC)
Yeah, iPhone is fine. Any unformatted string like عربي looks fine (here and on the Internet), but when it's formatted with any template that uses MediaWiki:Common.css, it's disconnected. So the previous attempts to fix something else broke the display for iPad (possibly Mac as well, which also uses the Safari browser). --Anatoli (обсудить/вклад) 22:46, 20 March 2014 (UTC)
How do the following look:
  1. عربي (my expectation: broken)
  2. عربي (my expectation: broken)
  3. عربي (my expectation: working)
  4. عربي (my expectation: broken)
  5. عربي (my expectation: no idea)
  6. عربي (my expectation: no idea)
--WikiTiki89 23:15, 20 March 2014 (UTC)
I see examples 1,2,4 and 5 in different fonts from examples 3 and 6. (OSX, Chrome) --Catsidhe (verba, facta) 00:24, 21 March 2014 (UTC)
(E./C.)@Wikitiki89. Thank you. Hope it's alright to have quite long delays on this discussion. I only use our family iPad at home. So, I will check when I can and post here. There are other issues with the latest version of Safari, like inability to search on page or no support for some media files.
@Catsidhe. Fonts are probably OK, as long as Arabic letters are not disconnected, like in my first post. --Anatoli (обсудить/вклад) 00:31, 21 March 2014 (UTC)
Actually, looking at it more closely, example 5 is a third font, which is different again from 3,6 and 1,2,4. They all look connected to me, but maybe the equivalent on your iPad doesn't have the correct ligatures. --Catsidhe (verba, facta) 00:48, 21 March 2014 (UTC)
I am well aware of the style differences. And actually, they should all be the same font except number 5. 3 and 6 should just be a smaller size. On the iPad, the sizes may or may not be normalized. BTW, these are not "ligatures", because you can have an infinite string of connected letters: ههههههههههههههههههههههههههههههههههههههههههههههههههههههههههه. Arabic does also have ligatures such as لا, but they are implemented differently from connected letters. --WikiTiki89 00:58, 21 March 2014 (UTC)
Context-dependent joined forms, then. Taking #5 out of the equation, from what I see, the two fonts have different letter shapes. More to the point, when I look at it through Developer Tools, it says that 1,2,4 are in Arial Unicode MS, whereas 3,6 are in Geeza Pro. This may be relevant when Anatoli has a chance to look at it on the problematic device. I've taken a screenshot, but can't for the life of me figure out how to put it up here. --Catsidhe (verba, facta) 01:17, 21 March 2014 (UTC)
I see all letters connected on Safari/iPad/iOS 5. Fonts and rendering vary on Safari/Mac OS X 10.9.2 (see screenshot).  Michael Z. 2014-03-21 01:00 z
Thanks. I needed to know if Safari/Mac is affected. @Mzajac: What's the quick way to upload files to Wiktionary? --Anatoli (обсудить/вклад) 01:04, 21 March 2014 (UTC)
I just hit the Upload File link in the tools sidebar on this page, on the Mac. Haven’t tried it on the iPad. Michael Z. 2014-03-21 13:46 z
Upload them to the commons:Main Page. If you're on an iPad, there's even an app for that. --WikiTiki89 01:19, 21 March 2014 (UTC)
Thanks, I never tried. --Anatoli (обсудить/вклад) 01:40, 21 March 2014 (UTC)

Numbers 3, 5 and 6 look okay; the others are broken. --Anatoli (обсудить/вклад) 13:19, 21 March 2014 (UTC)

Formatted Urdu is also disconnected. --Anatoli (обсудить/вклад) 13:22, 21 March 2014 (UTC)
Then we can rule out the fonts, because number 6 uses the same fonts as the CSS, but explicitly. Try these:
  1. عربي (my expectation: working)
  2. عربي (my expectation: working)
  3. عربي (my expectation: working)
  4. عربي (my expectation: no idea, particularly interested in this)
  5. عربي (my expectation: no idea, particularly interested in this)
  6. عربي (my expectation: no idea, particularly interested in this)
The latter three, which I am particularly interested in, test the CSS unicode-bidi property; I don't know much about it, but it is definitely related to how right-to-left languages are displayed, and is basically the only thing left in MediaWiki:Common.css that could possibly be causing this. --WikiTiki89 17:57, 21 March 2014 (UTC)
All 6 are working! --Anatoli (обсудить/вклад) 20:46, 21 March 2014 (UTC)
Then I'm stumped. It seems to be the "Arab" class itself that's causing the problem, and not the properties defined in it. --WikiTiki89 00:53, 22 March 2014 (UTC)
Thanks for trying, anyway. :) --Anatoli (обсудить/вклад) 01:46, 22 March 2014 (UTC)
It might still help if you could figure out how to view the HTML source. A secondary solution would be to fake-view the source code like this, which might also help. --WikiTiki89 06:06, 22 March 2014 (UTC)
Thanks. That worked, although the source looks messier than in other browsers and there's no search in the latest Safari on iPad:
<p><strong class="Arab headword" lang="ar" xml:lang="ar">عَرَبِي</strong> <a href="/wiki/Wiktionary:Arabic_transliteration" title="Wiktionary:Arabic transliteration" class="mw-redirect">•</a> (<span lang="" xml:lang="">ʿarabī</span>) <span class="gender"><abbr title="masculine gender">m</abbr></span>, <b><span class="Arab" lang="ar" xml:lang="ar">عَرَبِيَّة</span></b> (ʿarabíyya) <span class="gender"><abbr title="feminine gender">f</abbr></span>, <b><span class="Arab" lang="ar" xml:lang="ar">عَرَب</span></b> (ʿarab) <span class="gender"><abbr title="plural number">pl</abbr></span></p>
--Anatoli (обсудить/вклад) 07:11, 22 March 2014 (UTC)

Lua Regex

I was wondering if anyone knows whether there is any access to a proper regex in our Scribunto. My research points to no, but I thought I'd check. -Atelaes λάλει ἐμοί 05:21, 20 March 2014 (UTC)

What do you need to do that can't be done with Lua patterns? --WikiTiki89 05:22, 20 March 2014 (UTC)
"Or" and "And" that can be applied to groups of characters. -Atelaes λάλει ἐμοί 08:31, 20 March 2014 (UTC)

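Lua patterns have no alternation operator, so an "or" over groups of characters has to be emulated by trying each alternative separately. The following is a minimal sketch of that workaround, written in Python for illustration; it deliberately avoids `|` to mirror what repeated `string.find` calls would do in Lua. The function name is hypothetical.

```python
import re


def find_first_alternative(text, alternatives):
    """Emulate (a|b|c): search for each sub-pattern and keep the leftmost match.

    When two alternatives match at the same position, the one listed first wins.
    """
    best = None
    for pattern in alternatives:
        match = re.search(pattern, text)
        if match and (best is None or match.start() < best.start()):
            best = match
    return best
```

This costs one scan of the text per alternative, which is usually acceptable for entry-sized strings but is not a full substitute for a regex engine.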
CirrusSearch

I see that the Italian Wiktionary now uses CirrusSearch instead of the old LuceneSearch. Are we planning on doing the same? SemperBlotto (talk) 08:34, 20 March 2014 (UTC)

We have it as a beta feature. Keφr 09:26, 20 March 2014 (UTC)

Template:en-plural noun

Appears to unintentionally bold "plural" when a term is formatted to be both countable and uncountable. See Mallwart for a current example. -Cloudcuckoolander (talk) 21:51, 21 March 2014 (UTC)

Did you mean {{en-proper noun}}? Fixed. Keφr 07:25, 22 March 2014 (UTC)
Yes, sorry. Thanks for taking care of the issue. -Cloudcuckoolander (talk) 11:35, 24 March 2014 (UTC)

Linking Pages Between Languages

I created a page for the Spanish word "bizquear" in the English Wiktionary (https://en.wiktionary.org/wiki/bizquear), and when I was searching for etymological information, I found out that a page already existed in the Spanish, Polish, Malagasy, Dutch, and Mandarin Chinese Wiktionaries (https://es.wiktionary.org/wiki/bizquear). The page that I created was not automatically linked to the existing pages (they don't list English in their left column, and mine doesn't list the other languages). How can I link the entries? Is there a way to automatically search through all language Wiktionaries to find out if there is an entry in another language?

I think you're looking for Help:Interwiki linking. There should be a bot that handles this automatically (eventually). Equinox 21:34, 22 March 2014 (UTC)
I think there are several bots that do occasional runs; after a week or so the links would have been added by bot. But better yet, interwiki links should be handled automatically by Wikidata, as they are at Wikipedia. —Aɴɢʀ (talk) 16:57, 23 March 2014 (UTC)
My impression, based on numerous threads on wikidata:Wikidata talk:Wiktionary, is that the Wikidata-ers are not only not interested in handling Wiktionary's interwiki links, but are interested in not handling Wiktionary's interwiki links. Some Wikidata-ers periodically express an interest in creating elaborate systems to transclude definitions into multiple entries and such, and then people shoot those plans full of holes because words are almost never exactly synonymous across languages. But interwiki linking of en:die ↔ de:die, etc., which is the one thing Wikidata already possesses the technical capacity to do, and which is the one thing the majority of the Wiktionarians who've commented on that page have said they want, seems to be the one thing Wikidata-ers aren't going to implement.
Please, chime in: wikidata:Wikidata talk:Wiktionary#Moving_inter-Wiktionary_links_to_Wikidata.2C_redux.
- -sche (discuss) 18:44, 23 March 2014 (UTC)
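For context, until something like Wikidata takes over, interwiki links are plain `[[code:title]]` lines in the entry's wikitext, and the title is the same word in every Wiktionary. A minimal Python sketch of what a bot would generate once it knows which editions have the page; the function name is made up, and discovering which editions actually have the entry is left out.

```python
def interwiki_links(title, language_codes):
    """Return one [[code:title]] line per edition, sorted by language code."""
    return "\n".join(f"[[{code}:{title}]]" for code in sorted(set(language_codes)))


# Editions mentioned in the question above (Spanish, Polish, Malagasy, Dutch, Chinese).
print(interwiki_links("bizquear", ["es", "pl", "mg", "nl", "zh"]))
```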

Bokmål/Nynorsk regional labels

Moved from Wiktionary talk:Votes/pl-2014-03/Unified Norwegian.
Is it possible to categorise Norwegian entries into Category:Norwegian Bokmål and Category:Norwegian Nynorsk by changing Module:labels/data? I have tried, but the effect wasn't as desired: I was getting "Norwegian Bokmål Norwegian".

Test phrasebook entries with labels {{cx|Bokmål}} and {{cx|Nynorsk}}:

  1. jeg elsker deg/eg elskar deg
  2. jeg vet ikke/eg veit ikkje

--Anatoli (обсудить/вклад) 06:41, 25 March 2014 (UTC)

You should move this discussion to the Grease Pit, this has nothing to do with the vote. --WikiTiki89 06:44, 25 March 2014 (UTC)
It's a bit counter-intuitive; as you've now noticed, you have to use "plain_categories" ... it did take me a while to figure that out, back when I was trying to add an Oxford British spelling label. (What threw me off was that it says plain_categories "does not support language-specific categories". I think I'll try to clarify that documentation a bit...) - -sche (discuss) 06:49, 25 March 2014 (UTC)
Thanks for fixing it! --Anatoli (обсудить/вклад) 06:52, 25 March 2014 (UTC)
I do think that we should only use these labels if specific senses are used only in Bokmål/Nynorsk, not if the word as a whole is. —CodeCat 13:25, 25 March 2014 (UTC)
Why is that? --WikiTiki89 16:34, 25 March 2014 (UTC)
Because that's what context labels are for. They're for labelling senses. It hardly seems practical if we have to label all 10 senses of some word with the same "Bokmål" label. —CodeCat 18:21, 25 March 2014 (UTC)
@CodeCat:. I don't see how we then can use the unified approach. The readers will assume that words are both Bokmål and Nynorsk. An additional parameter in the header would help. --Anatoli (обсудить/вклад) 21:49, 25 March 2014 (UTC)
I have nothing against putting a label in the headword line. In fact, I wonder if we shouldn't start doing that for more words generally. It would solve the problem we've had in the past of distinguishing archaic terms from terms with archaic senses, and other similar issues. If we assume that context labels always indicate sense-specific things, and the headword line is used to cover the term as a whole, then there is no ambiguity anymore. —CodeCat 21:54, 25 March 2014 (UTC)
I think we don't need a vote for putting a label in the headword line, since "Norwegian" ("no") is allowed. The parameter for the other form, e.g. Nynorsk on a Bokmål header would be useful. There's nothing new here, there's Cyrillic on Roman Serbo-Croatian, traditional on a simplified Chinese entry. I'm neutral on your other general suggestion. --Anatoli (обсудить/вклад) 22:26, 25 March 2014 (UTC)
The two Norwegian standards don't differ by script, so there may be terms where the basic form is the same in both standards, but they may have different inflected forms (I think genders too). Trying to put all that in one headword line would make it too cramped, so I propose that, if a term occurs in both standards with the same meaning, we show two headword lines: Bokmål first, then Nynorsk right below it. Something like this:
dag m (Bokmål, inflections go here...)
dag m (Nynorsk, inflections go here...)
How is that? —CodeCat 22:48, 25 March 2014 (UTC)
This looks OK to me, but I don't know if autoformat will allow that (two header lines). What about words without inflected forms, which are shared? Could we have a "both" parameter or something to show that they are applicable to both Bokmål and Nynorsk, to avoid two lines? --Anatoli (обсудить/вклад) 23:01, 25 March 2014 (UTC)
There isn't a reason why a single template couldn't display both headword lines. So both lines above could be displayed by a single call to {{no-noun}} (once we modify it). The parameters might get a little confusing, though, if we want to use numbered parameters. You might end up with something like {{no-noun|both|Bokmål gender|Nynorsk gender|Bokmål plural|Nynorsk plural|...}}. So we may want to use named parameters instead like: {{no-noun|both|g-nb=|g-nn=|pl-nb=|pl-nn=}}. Here, only the first numbered parameter does anything. Alternatively, we could use numbered parameters for Bokmål and additional named ones for Nynorsk. That's POV of course, but it's ok to be POV in template parameters. :) —CodeCat 23:10, 25 March 2014 (UTC)
Perhaps you need to make a few samples with various situations for people to see. I don't have a problem with your suggestion. --Anatoli (обсудить/вклад) 23:21, 25 March 2014 (UTC)
I'm not sure what else to show. Does my example not get the point across well? —CodeCat 23:23, 25 March 2014 (UTC)
Actually it does. I just think there will be many 100% identical forms, for which we won't need two lines - adverbs, proper nouns (same gender), numerals, preposition, etc. or when the variety is unknown, default to just "Norwegian" with a one-line header. --Anatoli (обсудить/вклад) 23:42, 25 March 2014 (UTC)
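A rough sketch of the display logic being discussed, in Python for illustration only. Nothing here is an agreed design: the function and parameter names are hypothetical (loosely modelled on the g-nb/g-nn/pl-nb/pl-nn parameters proposed above), and merging identical data into one line is just one way to address the concern about 100% identical forms.

```python
def headword_lines(term, g_nb, g_nn, pl_nb=None, pl_nn=None):
    """Return one display line per standard, merged when the data is identical."""

    def line(standard, gender, plural):
        parts = [standard]
        if plural:
            parts.append(f"plural {plural}")
        return f"{term} {gender} ({', '.join(parts)})"

    # Same gender and same inflection in both standards: one line is enough.
    if (g_nb, pl_nb) == (g_nn, pl_nn):
        return [line("Bokmål, Nynorsk", g_nb, pl_nb)]
    return [line("Bokmål", g_nb, pl_nb), line("Nynorsk", g_nn, pl_nn)]
```

A real template would also need a way to mark a term as belonging to only one standard, which this sketch does not cover.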

{{rfv-etymology}}

Please replace the current code with the following: {{#switch:{{NAMESPACE}}|<!--Main-->|Appendix|Template|Transwiki={{maintenance line|1=<span id="rfv-etymology-notice-{{{lang|}}}-{{{topic|}}}"/>Can [[Wiktionary:Etymology scriptorium#{{{fragment|{{{section|{{PAGENAME}}}}}}}}|this]]<!-- + Link --><sup class="plainlinks">([{{fullurl:Wiktionary:Etymology scriptorium/{{CURRENTYEAR}}/{{CURRENTMONTHNAME}}|action=edit&section=new&preload=Template:rfv-etymology/preload&preloadtitle=%5B%5B{{urlencode:{{FULLPAGENAME}}}}%23rfv-etymology-notice-{{{lang|}}}-{{{topic|}}}%7c{{urlencode:{{FULLPAGENAME}}}}%5D%5D}} +])</sup><!-- --> etymology be [[Wiktionary:References#Etymologies|sourced]]?}}}}</span><includeonly>{{#ifeq:{{NAMESPACE}}||{{#if:{{{lang|}}}|[[Category:Requests for etymology ({{#invoke:languages/templates|lookup|{{{lang}}}|names}})|{{{sort|{{PAGENAME}}}}}]]|[[Category:Requests for etymology|{{{sort|{{PAGENAME}}}}}]]}}}}</includeonly><!-- --><noinclude>{{documentation}}</noinclude>

--kc_kennylau (talk) 23:34, 25 March 2014 (UTC)

I strongly oppose. This would point to the wrong month when the next month starts. --WikiTiki89 23:38, 25 March 2014 (UTC)
@Wikitiki89: No, the edited area only affects the "plus" sign which should point to the current month. --kc_kennylau (talk) 08:54, 26 March 2014 (UTC)
Done. In the future, please explain what the change does rather than simply pasting a bunch of code. --WikiTiki89 09:03, 26 March 2014 (UTC)
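To make the effect of the change easier to see, here is a rough Python sketch of the URL that the "+" link ends up pointing to. It is only an approximation of what {{fullurl:}} and {{urlencode:}} produce: the index.php base URL and the function name are assumptions, and the exact escaping may differ from MediaWiki's.

```python
from datetime import date
from urllib.parse import urlencode

MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def new_section_url(page, lang="", topic="", today=None):
    """Build the 'start a new section' link for the current month's subpage."""
    today = today or date.today()
    target = f"Wiktionary:Etymology scriptorium/{today.year}/{MONTHS[today.month - 1]}"
    anchor = f"rfv-etymology-notice-{lang}-{topic}"
    query = urlencode({
        "title": target,
        "action": "edit",
        "section": "new",
        "preload": "Template:rfv-etymology/preload",
        # Section heading links back to the notice on the tagged entry.
        "preloadtitle": f"[[{page}#{anchor}|{page}]]",
    })
    # Assumed base URL; {{fullurl:}} decides the real form.
    return "https://en.wiktionary.org/w/index.php?" + query
```

Only this link depends on the current date, so it always opens a new section on the current month's page; the "this" link keeps pointing at Wiktionary:Etymology scriptorium with the section fragment.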

A new Template:context-like template for use on the headword line

I think CodeCat is on to something when she observes above that "putting a label in the headword line [...] would [also] solve the problem we've had in the past of distinguishing archaic terms from terms with archaic senses". In particular, it seems to me that if we created a new {{context}}-like template for use on headword lines, and had it put entries into a different set of categories than sense-line {{context}} (perhaps by expanding Module:labels/data so that it knew which category to use when a label was used in {{context}}, and which to use when the label was used in the headword-context template), that would solve the problem. Then

==English==
===Noun===
{{en-noun}} {{headword-line-context-template|archaic}}
# A foobar.

could put entries into [[Category:English archaic terms]], while

==English==
===Noun===
{{en-noun}} 
# {{context|archaic}} A foobar. {{defdate|until the 18th century}}
# A whatsit. {{defdate|since the 17th century}}

could put entries into [[Category:English terms with archaic senses]]. And we could always make some categories the same in certain cases; for example, we could decide that both American spellings like academize and terms with senses specific to American English should go into [[Category:American English]] or not; this is just an example.
What do you think of this idea? What could the headword-line-context-template be called? - -sche (discuss) 00:53, 26 March 2014 (UTC)

I'm wondering if headword lines wouldn't become too long if we do it like this. We may want to consider placing them on the next line, but indented in some way?
As for implementation, we would have to look closely at which labels we use for which. There are probably labels that don't make much sense when generalised across a whole term, or vice versa. There is also the danger that if at some point we have an entry with a term-level label, someone may add a new sense for which the label doesn't apply, and forget to fix it. —CodeCat 01:43, 26 March 2014 (UTC)
Don't you think we'd be better off calling it {{template-used-similarly-to-context-templates-but-designed-specifically-for-the-headword-line}}? --WikiTiki89 05:12, 26 March 2014 (UTC)
What do you mean? —CodeCat 17:57, 28 March 2014 (UTC)
@CodeCat: I was making a joke about what -sche called the template in the example above. --WikiTiki89 18:56, 28 March 2014 (UTC)
No, that would be too cryptic. How about {{template-designed-to-be-used-in-the-headword-line-specifying-context-in-which-the-word-is-used-also-please-remember-to-remove-it-when-a-sense-to-which-it-does-not-apply-is-added}}? Keφr 18:05, 28 March 2014 (UTC)
Um, ok? —CodeCat 18:12, 28 March 2014 (UTC)
How about {{head-context}}? Whatever we call it, I suggest {{hc}} as a shortcut, in the manner of {{cx}}. Is anyone interested in designing it? With my limited coding skills, I could copy the contents of {{context}} to {{head-context}} and modify it to call on a separate set of label-to-category correspondences stored in e.g. Module:labels/data/2, but I think it would be more elegant and simpler to update (add/remove labels, change categories, etc) if {{context}} and {{head-context}} both stored their label-to-category correspondences in Module:labels/data. I think that requires modifying Module:labels/data to have two sets of category values, perhaps called "sense_plain_categories" and "head_plain_categories" (or plain_category_correspondence_­for_the_template_­designed_to_be_used_­in_the_headword_line_­specifying_context_­in_which_the_word_is_used_­but_please_remember_to_remove_it_­when_adding_a_sense_­to_which_it_does_not_apply), and then modifying {{context}} to know that it calls the sense-line categories, while {{head-context}} calls the headword-line categories. - -sche (discuss) 18:42, 28 March 2014 (UTC)
That sounds fine. --WikiTiki89 18:56, 28 March 2014 (UTC)
What about something that matches {{label}} instead? —CodeCat 18:57, 28 March 2014 (UTC)
Created at {{head-label}}, with syntax identical to {{label}}; categories are pulled from keys suffixed with _head; they fall back to unsuffixed keys. Keφr 18:59, 28 March 2014 (UTC)
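The fallback described here (use the key suffixed with _head if it exists, otherwise the unsuffixed key) can be sketched as below. This is a Python illustration, not the code in Module:labels; the sample data is hypothetical, with "plain_categories" borrowed from the earlier thread as an example key.

```python
# Hypothetical label data, loosely shaped like entries in Module:labels/data.
LABELS = {
    "archaic": {
        "plain_categories": ["English terms with archaic senses"],
        "plain_categories_head": ["English archaic terms"],
    },
    "American": {"plain_categories": ["American English"]},
}


def categories_for(label, key, headword_line):
    """Look up `key` for a label, preferring `key + '_head'` on headword lines."""
    data = LABELS.get(label, {})
    if headword_line and key + "_head" in data:
        return data[key + "_head"]
    return data.get(key, [])
```

Labels without a _head variant behave identically in both places, so only the labels that need different categories have to be touched.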
Way too premature. We haven't even decided how to name it yet, let alone how it's going to work. —CodeCat 19:10, 28 March 2014 (UTC)
Not only that, his edit broke Module:labels. This is a really, really bad way to make changes to modules that are depended on by hundreds of thousands or even millions of entries. Chuck Entz (talk) 19:22, 28 March 2014 (UTC)
I think that rather than having {{context}} and {{head-context}}, we should use {{sense-context}}/{{sense-label}} (the original) and {{term-context}}/{{term-label}} (the new one). That is, we name the templates after their purpose rather than after the place they go. Renaming the old template would also make it clearer. We can keep {{context}}, {{cx}} and {{label}} around as redirects, and we could probably create new ones too like {{scx}} and {{tcx}} if we want. —CodeCat 19:25, 28 March 2014 (UTC)
Actually, if we're going to rename them anyway, I propose we get rid of the old name "context" altogether and use "label" exclusively, along with its parameter format (1st parameter is language code). —CodeCat 19:27, 28 March 2014 (UTC)
Yes, I like the idea of a {{scx}} → {{sense-context}} (and/or {{slb}}/{{slbl}} → {{sense-label}}), {{tcx}} → {{term-context}} (and/or {{tlb}}/{{tlbl}} → {{term-label}}) parallel naming scheme. The question of whether the language code should be the first parameter or set by lang= is a separate question, which means that for now, if someone creates {{sense-context}} it should mimic {{context}}, and/or if someone creates {{sense-label}} it should mimic {{label}}, IMO. - -sche (discuss) 21:09, 28 March 2014 (UTC)
A lot of people will complain. We can have the old {{context}} templates redirect to the new {{sense-context}} templates. --WikiTiki89 22:38, 28 March 2014 (UTC)
We can keep the original name around as a redirect, but we should convert existing entries to use the new names. —CodeCat 22:50, 28 March 2014 (UTC)
(@Wikitiki:) Yes, of course {{context}} and {{cx}} should still work (as redirects). - -sche (discuss) 04:32, 29 March 2014 (UTC)
(@CodeCat:) We should be aware that some entries currently use {{context}} on the headword line (where we ultimately want {{tcx}}), and others use it in translations tables (where we want {{qualifier}}). I am cleaning up some of those misuses with AWB, by temporarily redirecting {{tcx}} to {{context}} [sic] and changing headword-line instances of {{context}} to {{tcx}}. My intention is that what the entries display and how they are categorised is not changed at this time, but they will all update automatically (as the job queue gets to them) once we create {{term-context}} and re-redirect {{tcx}} to it. - -sche (discuss) 06:24, 29 March 2014 (UTC)
A number of headword-line instances of {{context}} which I updated to {{tcx}} were instances of the template being used to impart transitivity or reflexivity information about Italian verbs. Reflexivity probably is an "all-term" (headword-line-worthy) feature, just like {{tcx|American spelling}}, but transitivity information should perhaps be moved en masse to the sense line at some point. - -sche (discuss) 03:20, 4 April 2014 (UTC)
Until Module:labels can be expanded to handle and distinguish sense-line ("sense-only") and headword-line ("all-term") templates and labels, I have set {{term-context}} (shortcut: {{tcx}}) to use Module:labels2, as I suggested above. You can now see it in action on a number of pages, like [[center of gravity]] and [[adrad]], as well as [[不著調]] and [[abolisht]]. Thinking about that last one: should we create a dedicated form of template à la {{obsolete spelling of}} for obsolete past tense forms, and third-person singular present tense forms, of verbs? - -sche (discuss) 03:20, 4 April 2014 (UTC)
It is already quite obvious that Module:labels can be expanded; we just have not agreed on how to do it. Whatever happened to entia non sunt multiplicanda præter necessitatem? Keφr 05:25, 4 April 2014 (UTC)
Now that I think about it: since the pattern for these categories seems to be "Legalese terms with X senses" vs "Legalese X terms", why not store only "X" in the data page? Keφr 06:24, 4 April 2014 (UTC)
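The suggestion to store only "X" and derive both category names from it could look roughly like this. It is a Python sketch of the idea rather than a proposal for the actual Lua; the function and parameter names are made up.

```python
def category_for(language, label, whole_term):
    """Build a category name from a bare label such as 'archaic'.

    whole_term=True  -> the label applies to the term as a whole (headword line)
    whole_term=False -> the label applies to one sense only (sense line)
    """
    if whole_term:
        return f"{language} {label} terms"
    return f"{language} terms with {label} senses"
```

For example, `category_for("English", "archaic", True)` gives "English archaic terms", and with `False` it gives "English terms with archaic senses", matching the two categories proposed at the top of this thread. Labels whose categories do not follow the pattern (such as "American English") would still need explicit entries.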

selecting targeted languages

"selecting targeted languages" no longer works in my "Vector" skin - is it something I've done - does the new Beta option interfere? Saltmarsh (talk) 07:16, 26 March 2014 (UTC)

It’s working for me, but the icons that used to appear next to the language names are invisible. — Ungoliant (falai) 07:44, 26 March 2014 (UTC)
Well my previously selected languages no longer appeared in the table heading - and while the cursor changes to indicate something is where the icons should be, clicking seems to have no effect. Saltmarsh (talk) 14:29, 26 March 2014 (UTC)
Fixed (I think). --Yair rand (talk) 15:31, 26 March 2014 (UTC)
Working well - thanks Saltmarsh (talk) 18:51, 26 March 2014 (UTC)

Bot account needed for manual "form of" creation with AWB?

This is the AWB plugin that I'd like to use: w:Wikipedia:CSVLoader/Walkthrough, with CSV files that look like this: User talk:Neitrāls vārds/CSV.

Wikipedia says that you need a bot account only in autosave mode (not when you are manually clicking "save" or "skip"), but it doesn't appear to be doing anything; it just says "reattempting in 20 seconds," giving the impression that it isn't allowed to create a new page. So on Wiktionary you need a bot account for new page creation with AWB in all cases, is that right?

(Actually I did want a bot flagged account because even if I did it manually, in larger quantities recent changes would be seriously flooded.) Neitrāls vārds (talk) 09:21, 26 March 2014 (UTC)

I had the path to the article wrong (it wasn't even supposed to be a path, just the new article's name). Fixing that didn't quite get rid of the "reattempting" message, but after playing with it some more, something did (I'm not even sure what). So, in manual mode you can indeed create new entries.
Sadly, it appears to force upper case (Wikipedia-style) on any newly created entries, so for the time being I'm stuck with country and town names (although I do have quite a few of those). I'll try to ask the plugin's author if he can make the pre-upper-case version downloadable. Neitrāls vārds (talk) 11:09, 26 March 2014 (UTC)

Problem with display of Russian verb conjugations

Today we noticed that it is not possible to see the conjugations of Russian verbs in Wiktionary. Normally, a 'show' button appears, and you can click on that to see the full conjugations. However, the show button does not appear, and you cannot see the conjugations. This is in Safari, Firefox, and Google Chrome. It seems there must be a server problem. I use this with my students on a daily basis so if there is a way to fix it, that would be great! —This unsigned comment was added by Brooke2 (talkcontribs).

  • Works OK for me (Google Chrome) but I often have to scroll sideways to see it all. SemperBlotto (talk) 12:34, 28 March 2014 (UTC)


It still does not work for me in Safari, Chrome or Firefox. It worked on a couple of words, but most of the words have lost their link to declensions and conjugations. This is a huge problem for anyone studying Russian. My students use this function every day to look up conjugations of words. How do we ask someone to turn this function back on? It is very strange that it is coming and going. (If you go to the right, there is no 'show' button as there used to be.) —This unsigned comment was added by Brooke2 (talkcontribs).

Fundamentally this is a problem with en.Wiktionary choosing to not use Mediawiki's default mw-collapsible mw-collapsed element classes. This project developed its own solutions before the rest of Mediawiki did (<1.18). Perhaps you can convince the local templaters to update, but it is a very large task you would be asking them to accomplish. - Amgine/ t·e 17:04, 28 March 2014 (UTC)
I agree it would be better if we used the standard method instead of our own. A major problem with our current solution is that it involves putting a table inside a div. This makes it impossible to scale the table according to its contents; all tables have to have a fixed width. The standard solution doesn't have this limitation as far as I know. —CodeCat 17:59, 28 March 2014 (UTC)

So what is the solution? This is a huge problem for people who have become dependent on Wiktionary as the source for Russian verb conjugations and declensions, and it is a huge loss of information. It was working fine just last week. —This unsigned comment was added by 96.4.165.223 (talk).

Fixed already. Clear your cache and see: купить (kupitʹ). It was my careless coding while working on Wiktionary:Preferences/V2 which caused it. Keφr 18:36, 28 March 2014 (UTC)
Actually, not. The local collapsible solution does not work in several browsers depending on the additional security applied. (This is only a small reason why it should be replaced. The bigger reasons are standardization and maintenance.) - Amgine/ t·e 22:52, 28 March 2014 (UTC)
Collapsibles hide only when JavaScript is allowed to be run on Wiktionary. The button to uncollapse them is set up by MediaWiki:Common.js. What kind of schizophrenic security settings cause that? Keφr 06:42, 29 March 2014 (UTC)
If js were not running on Wiktionary, neither collapsible table would be collapsed. The mw-collapsible table also has a visible Expand button, while the Wiktionary collapsible does not. This is a fairly vanilla install of Tor Browser Bundle, better known as FireFox using Tor with maximum privacy settings except js enabled. - Amgine/ t·e 18:11, 29 March 2014 (UTC)

Thank you! I will not pretend to understand the issues with standardization and maintenance, but all of the words that did not work for me this morning do work now. Thank you SO much!!! —This unsigned comment was added by 75.64.165.216 (talk).

Template:context or Template:label?

Currently, we have two different templates for the same thing. The only difference is in the language parameter: {{context|label1|label2|lang=code}}, {{label|code|label1|label2}}. In the discussion about term-level context labels above, it wasn't really clear which of the two we should standardise on. So I'd like to ask about this now. This shouldn't depend on the outcome of the discussion above, so even if the proposal of using separate term-wide labels fails, this should still be considered on its own. But I do think we should decide this before carrying out that proposal, because otherwise it will cause a wide proliferation of label templates, we'd need at least 4 of them to cover all possibilities. —CodeCat 17:29, 29 March 2014 (UTC)

Use {{label}}. Quicker to type, even when compared to {{cx}}, and still descriptive, especially when compared to {{cx}}.
On a related note, when can we start deleting orphaned context label templates? Do we have to run it through the bureaucracy? Keφr 17:58, 29 March 2014 (UTC)
Um... you can delete them right now if you want to. I have been doing it for a while, off and on. There's just so many and it's not easy to automate it because not all of them are orphaned yet. There are still a lot of redirects among them that need to be converted into aliases in the data module.
As for being quicker to type, {{label}} also has the shortcuts {{lb}} and {{lbl}}. —CodeCat 18:04, 29 March 2014 (UTC)
Don't delete the regional context labels, unless you successfully amend template:eye dialect of and template:alternative spelling of, which use them, to do what they do by some other means. Thanks! (Pinging user:Kephir and user:CodeCat, since this is a late reply.)​—msh210 (talk) 06:16, 7 April 2014 (UTC)
{{whatever|code|label1|label2}} is obviously better than {{whatever|label1|label2|lang=code}}, and "label" is not a bad title, so {{label}} looks better. I don't like the shortcuts for {{label}}: "lb" reminds me of something else, and "lbl" is just 2 letters shorter. Let's keep it descriptive. --Z 18:46, 29 March 2014 (UTC)
Well, some editors (you know who I mean) whine about even one or two letters... —CodeCat 18:49, 29 March 2014 (UTC)
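
To illustrate the parameter difference described at the top of this thread, here is a minimal Python sketch of rewriting {{context|label1|label2|lang=code}} into {{label|code|label1|label2}}. It is only an illustration, not the code of any existing bot: the function name context_to_label is hypothetical, nested templates are not handled, and calls with no lang= or with other named parameters are deliberately left untouched.

```python
import re

# Matches a {{context|...}} call that contains no nested braces.
CONTEXT_RE = re.compile(r"\{\{context\|([^{}]*)\}\}")


def context_to_label(wikitext):
    """Rewrite simple {{context|...|lang=xx}} calls as {{label|xx|...}}."""

    def convert(match):
        lang = None
        labels = []
        for part in match.group(1).split("|"):
            if part.startswith("lang="):
                lang = part[len("lang="):]
            elif "=" in part:
                # Unknown named parameter: leave the whole call alone.
                return match.group(0)
            else:
                labels.append(part)
        if not lang:
            # No language code given: nothing safe to do in this sketch.
            return match.group(0)
        return "{{label|" + "|".join([lang] + labels) + "}}"

    return CONTEXT_RE.sub(convert, wikitext)
```

For example, context_to_label("{{context|label1|label2|lang=code}}") gives "{{label|code|label1|label2}}", while "{{context|informal}}" is returned unchanged.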

Module:labels/data again

I've made an innocent edit and suddenly got errors on line 1662. Who can fix it? Ignatus (talk) 18:42, 30 March 2014 (UTC)

It's not your fault. We have a certain clumsy editor who likes making changes without previewing the code, and who may be blocked if he continues to do so (let's hope he takes the hint). --WikiTiki89 19:22, 30 March 2014 (UTC)
In this case I don't think we can blame him. He's migrating over the remainder of our context labels, and occasionally a typo may slip through. To ask him to preview every single edit is going to frustrate it... —CodeCat 03:07, 31 March 2014 (UTC)

For editors only CSS

I seem to recall that some time ago someone had proposed a CSS class which only displayed to folks who opted in or only displayed in editing/preview mode. The idea was for notes, error messages and the like to be displayed to editors, but not to readers. Has something like this been implemented yet? If not, would someone be willing to craft it? My CSS is rubbish. -Atelaes λάλει ἐμοί 02:58, 31 March 2014 (UTC)

There have been a few small things like this already, but nothing that covers all mistakes. —CodeCat 03:05, 31 March 2014 (UTC)
Huh? Covers all mistakes? -Atelaes λάλει ἐμοί 03:10, 31 March 2014 (UTC)
Well, you're talking about a single CSS class. So presumably that single class would cover the display of all mistakes/problems in entries. Right now, the few classes we have for that purpose are targeted towards individual types of problems, like "foreign words lacking a language code". —CodeCat 03:25, 31 March 2014 (UTC)
The body element on every page has a class action-something, e.g. view, edit, or submit, matching the action in the query part of the URL. Thus you can style a box with some class, say editing, and use CSS like .action-view .editing {display:none}. (And JavaScript can be used to show it even on action=view but only to certain users (e.g., logged in, autoconfirmed, or admin), which IIRC is what we do with the "News for editors" link atop each page.) Note, though, that including something in a page and using CSS to hide it is poor Web design: better (for us, who can't control the server) would be to generate the box using JS only for such users as should see it.​—msh210 (talk) 06:13, 7 April 2014 (UTC)

A problem with Template:circumfix

Hello all. Please note this discussion between CodeCat and me, which I have copied from Thread:User talk:CodeCat/A problem with Template:circumfix:

Hi CodeCat. Currently, when {{circumfix}} is used, it generates three links, one for the circumfixed antecedent term, and one each for the preceding and following parts of the circumfix; this results in links to pages that may be utterly irrelevant or even non-existent. Take the example of the Georgian word უბრალო (ubralo), in the entry for which {{circumfix|უ- -ო|უ|ბრალი|ო|lang=ka}} currently displays this:
*უ- (*u-) doesn't exist and -ო (-o) contains nothing relevant; both the links on either side of ბრალი (brali) should link to უ- -ო (u- -o). {{circumfix|უ- -ო|უ|ბრალი|ო|lang=ka}} should display this:
Unless I'm missing something, all that should be necessary is to pipe the links for the first and third terms, so that they link to the proper page, for the circumfix. Would you be able and willing to edit {{circumfix}} to correct this bug, please? Thanks for your time.
 — I.S.M.E.T.A. 02:46, 31 March 2014 (UTC)
The template already has alt1= and alt3= parameters. What should be done with those?
CodeCat 02:48, 31 March 2014 (UTC)
I don't suppose that's a problem, or is it? They just change what's displayed, rather than the link, right?
 — I.S.M.E.T.A. 03:06, 31 March 2014 (UTC)
Currently, the template takes four main parameters: the circumfix itself, the first part, the middle word, and the second part. On top of that, it takes alt1, alt2 and alt3 parameters to override the display of each part. If I change the template so that it uses the first parameter to figure out what to link to, then the 2nd and 4th parameters only really change the display, and alt1 and alt3 parameters don't really have a distinct use anymore.
CodeCat 03:11, 31 March 2014 (UTC)
Oh, I see. So, should we get rid of them? I suppose before we do, we'd need to fix any current uses of them. Is there a way to find out which entries, if any, use either or both of these alt1 and alt3 parameters?
 — I.S.M.E.T.A. 03:16, 31 March 2014 (UTC)
What I'm more curious about is why there is that first parameter to begin with. It seems redundant to the 2nd and 4th, which could presumably be used to reconstruct the "whole" circumfix from its two parts.
But since this is a rather specific template that not many languages need, I can't really judge the merit of how it was made to work. Maybe you should ask at the GP, so that those who actually work with this template can explain.
CodeCat 03:19, 31 March 2014 (UTC)
My guess is that the first parameter is necessary for autogenerating the category name, which mentions the circumfix. I'll get to posting in the Grease Pit now.
 — I.S.M.E.T.A. 03:51, 31 March 2014 (UTC)

Is there some benefit to the current set-up that I'm not seeing? And if not, are there any objections to making the change I've described? — I.S.M.E.T.A. 03:53, 31 March 2014 (UTC)

I remembered this template as counterintuitive and hard to use when I created ketuanan. There's a redundant parameter 1, and strangely it doesn't link to it (the circumfix page), only to the prefix and suffix pages. Wyang (talk) 04:33, 31 March 2014 (UTC)
@Wyang: Do you believe that this (the linking "to the prefix and suffix pages") should be changed? — I.S.M.E.T.A. 05:03, 31 March 2014 (UTC)
Yes. Wyang (talk) 05:04, 31 March 2014 (UTC)
Template:circumfix is named and creates categories as if it were designed to be used when a circumfix has been applied to a word, but it creates links in a way that suggests it was designed to be used when a prefix and a suffix have simultaneously been applied to a word (which is properly the domain of Template:confix, I think). It's a strange template and I agree it needs to be cleaned up. - -sche (discuss) 05:36, 31 March 2014 (UTC)
Yes, AFAICT, except for categorising, {{circumfix}} and {{confix}} are virtually identical. It seems that there is general agreement in favour of making this change. Before we do that, is there anyone I should ping to chip in? — I.S.M.E.T.A. 01:24, 1 April 2014 (UTC)
Apparently not. - -sche (discuss) 03:31, 4 April 2014 (UTC)
First, we'd need to make sure that any cases where the first parameter does not equal what we intend to make it into, are tracked down and fixed. I know there are a few where the two parts are shown as a--b with no space between. —CodeCat 20:24, 8 April 2014 (UTC)
Ok, here is a list of entries where there is a mismatch. The category for these entries should presumably be renamed, to have a space between the two hyphens. abatatar, anoratana, isoratana, ividiana, keadaan, kejohanan, ketuanan, mengenai, pengovuman, perempuan, مڠناءي, ڤرمڤوان, ڤڠوۏومن. —CodeCat 20:53, 8 April 2014 (UTC)
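
The check described above (does the first parameter equal the page name rebuilt from the two parts?) can be sketched in Python as follows. This assumes, as in the უ- -ო example earlier in the thread, that circumfix pages are named "first- -last" with a space between the two hyphens, and that the 2nd and 4th parameters are given without hyphens. The function names are hypothetical, and this is not the code that produced the list above.

```python
def circumfix_page(first_part, last_part):
    """Rebuild the circumfix page name from its two parts (assumed format)."""
    return f"{first_part}- -{last_part}"


def is_mismatch(param1, first_part, last_part):
    """True if the first template parameter differs from the rebuilt name."""
    return param1 != circumfix_page(first_part, last_part)
```

With the Georgian example, circumfix_page("უ", "ო") rebuilds "უ- -ო", so is_mismatch("უ- -ო", "უ", "ო") is false, whereas a first parameter written as "a--b" with parts "a" and "b" is reported as a mismatch.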

"eye dialect" script errors

Can anyone figure out and fix whatever it is that has suddenly put dozens of entries with "{{context|eye dialect" in Category:Pages with script errors? Chuck Entz (talk) 23:38, 31 March 2014 (UTC)

This is one of the problems with the "old" context labels. Remember how we would always run into problems when labels conflicted with existing templates? Well that's what's happening here too. The context templates actually get transcluded by {{context}} to see if they are valid. But now that we have Lua, the template {{eye dialect}} triggers a script error when it's transcluded with the wrong parameters. And there you go... Luckily Kephir is working on deleting the last few remaining context templates, and then we can remove the transclusion code from the module. —CodeCat 00:04, 1 April 2014 (UTC)
Wait, what? Deleting regional context templates? How then will {{eye dialect of}} and {{alternative spelling of}} use the from parameters?​—msh210 (talk) 06:04, 7 April 2014 (UTC)
That would be part of fixing {{context labelcat}}, which we haven't done yet. But it hasn't been a high priority because it's not used that much. —CodeCat 12:51, 7 April 2014 (UTC)
Yeah... so we can't delete regional context templates at this point.—msh210℠ on a public computer 22:03, 7 April 2014 (UTC)
(In case this wasn't known,) there are some other templates besides alternative spelling of which take from= parameters in (apparently) the same way as alternative spelling of, such as {{standard spelling of}}. - -sche (discuss) 22:29, 7 April 2014 (UTC)
To expand on what CodeCat said: Template:eye dialect was (and is) a redirect to Template:eye dialect of, so when {{context}} looked there, it found something that wasn't a context label ... and Module:labels/data apparently did not contain "eye dialect" until I added it just now(?!), so when {{context}} looked there, it likewise did not find "eye dialect" to be a context label. But why that didn't cause script errors until now, I don't know. A separate issue is that some (and possibly all) uses of {{context|eye dialect}} should really be {{eye dialect of}}. I am looking over such uses now. - -sche (discuss) 00:13, 1 April 2014 (UTC)
I made some changes to {{eye dialect of}} recently, that would probably be it. There's nothing wrong with the template, it's {{context}} that's the problem. —CodeCat 00:18, 1 April 2014 (UTC)
The transclusion fallback can be deleted already. There are still some transclusions of context labels, but none are coming from Module:labels. And thanks User:Msh210 for mentioning {{eye dialect of}} and {{alternative spelling of}}, because I too have been wondering about these. Keφr 06:20, 7 April 2014 (UTC)
I wonder why these templates use context labels in the first place. It seems a bit like misuse to me. —CodeCat 23:38, 7 April 2014 (UTC)
Because regional context templates were created to match a list of regions whose dialects we consider, and that's a good match for these templates.​—msh210 (talk) 02:39, 8 April 2014 (UTC)
What's wrong with {{cx|region}} {{eye dialect of|word}}? --WikiTiki89 23:45, 7 April 2014 (UTC)
That'd indicate the spelling is used in that region. {{eye dialect of|word|from=region}} means the spelling represents the dialect of that region.​—msh210 (talk) 02:39, 8 April 2014 (UTC)
Then I don't see how context labels are going to be of any use there. All the regional labels we have are used to categorise terms used in specific dialects, not terms that imitate dialectal speech. We don't want a term used to imitate Scots to appear in Category:Scottish English. —CodeCat 02:55, 8 April 2014 (UTC)
(Even if not, we do want to use the regional context templates in {{standard spelling of}} and {{alternative spelling of}}. But) I think we do wish to so categorize eye-dialect entries. Massa is currently defined as "Eye dialect spelling of master, representing African American Vernacular English". In other words, the term is used by writers who write in any dialect, but in quoted speech where the speaker speaks AAVE (or pretends to). The quoted speech thus transcribed is indeed AAVE, so I think it makes sense to so categorize it.​—msh210 (talk) 05:24, 8 April 2014 (UTC)
So the kinds of things that white performers used to say in blackface are supposed to be categorized as AAVE? Pray tell, what dialect of "Injun" should we use to categorize "me smoke'um peace pipe"? Chuck Entz (talk) 06:07, 8 April 2014 (UTC)
I was thinking more of speech by AAVE-speaking characters in novels written by non-AAVE-speaking people. Do you not think they should be AAVE-categorized? Do you also think British speech in a novel by an American should not be British-categorized?​—msh210 (talk) 17:39, 8 April 2014 (UTC)
Something that is not attestable by an actual British speaker should not be treated as British. So if an American writer writes something in imitation of British speech, then that should not count as an attestation for British English usage. I mean, if I say some random gibberish and call it Hindi, that doesn't make it attestable Hindi. —CodeCat 17:47, 8 April 2014 (UTC)
Right, that's why we'd use # {{eye dialect of|foo|from=Wherever}} and not # {{context|Wherever}} [[foo]]. But I don't see anything wrong with categorizing it as Wherever. I recommend this be a BP discussion: whether to categorize eye dialect as the dialect. If not, {{eye dialect of}} can be emended.​—msh210 (talk) 03:47, 9 April 2014 (UTC)

April 2014

Typography update

The typography update went live today. If anyone notices any problems on Wiktionary related to the update, please ping me. One issue I'm aware of is that some combining diacritics and tie characters may be incorrectly positioned in Firefox on MacOS or Linux (due to lack of proper glyph positioning data in the Liberation Sans and Helvetica Neue fonts). The problem does not seem to occur in other browsers, however. If this problem is significant, the following can be added to MediaWiki:Vector.css:

html,
body {
    font-family: Arimo, Helvetica, Arial, sans-serif;
}

...to override the new font stack of...

html,
body {
    font-family: Arimo, "Liberation Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
}

Kaldari (talk) 20:01, 1 April 2014 (UTC)

Recent layout changes

Moved from Wiktionary talk:News for editors

Since a couple of hours ago, the headings for words and languages are given in a different typeface than they were before. At least in my browser. Now that's fine, of course. But I'm wondering if it has anything to do with the strange phenomenon that homophones are somehow not clickable anymore. Example: Homophone: hello. The word "hello" should be blue, but it's not. At least in my browser, again. Kolmiel (talk) 22:45, 1 April 2014 (UTC)

I don't think the layout changes themselves are causing the problem, but it's whatever other changes have been made during the update that seem to have caused it. A rather heavily used template, {{isValidPageName}}, which is used by the {{homophones}} template, is now broken. This code should display "valid", but instead it shows nothing: {{isValidPageName|hello}}. I don't know why it's not working anymore, although some further investigating would probably find the problem.
In any case, I wonder why {{homophones}} needs this template to begin with. We could bypass the problem if we removed it from there altogether. {{isValidPageName}} is listed as deprecated, so if we could orphan it we wouldn't even need to put any effort into fixing it. —CodeCat 22:57, 1 April 2014 (UTC)
I've created Module:User:CodeCat/isValidPageName as a temporary stopgap measure, and changed {{isValidPageName}} to use it. So {{homophones}} should work again as it did before. —CodeCat 23:16, 1 April 2014 (UTC)
Yeah, it's all back to blue :) Thanks. Kolmiel (talk) 21:05, 3 April 2014 (UTC)

Search exact matches only

With the new update, I've noticed that when I want to go to a page that doesn't exist, instead of giving me the "create this page" message, it sometimes takes me to a page with a similar name. For example if I type "looma" in the search bar and go, it takes me straight to "lööma" instead. Is there a way to prevent that? —CodeCat 22:22, 2 April 2014 (UTC)

We need to have an exact-match function, but near matches should also be found. E.g. searching for этимолог (etimolog) should also give results for этимо́лог (etimólog) (with accent), but it doesn't: etymologist#Translations has an accented Russian translation, which is not picked up in the search results. --Anatoli (обсудить/вклад) 23:07, 2 April 2014 (UTC)
  • It's a feature. I personally don't particularly like it. But you can get to the literal page by typing in the URL. --WikiTiki89 23:12, 2 April 2014 (UTC)
(E/C)Yes, when "looma" is typed "lööma" appears in the search window, just below it, there's another line containing... looma. I never had any problem with this, actually but not being able to find accented terms gives me some grief, "looma" produces "lööma" but "этимолог" doesn't produce "этимо́лог". Arabic/Hebrew accents and hamza, Hindi nuqta terms should be mutually searchable. Hamza doesn't cause any problems but Arabic accents do, e.g. "هٰذا" (accented) doesn't produce "هذا" and "फ़िल्म" (with nuqta) doesn't produce "फिल्म". --Anatoli (обсудить/вклад) 23:24, 2 April 2014 (UTC)
Workaround: in a browser that lets you perform custom searches from the address bar (most of them these days), you can set up various shortcuts to avoid wasting time with the search box. In my setup, "k dog" looks up dog on Wikipedia; "d dog" looks it up on Wiktionary; and "dd dog" starts editing it as a new Wiktionary entry (if not already existing). Equinox 23:23, 2 April 2014 (UTC)
Try searching for the string 'category:latin verbs'. Whatever was changed with the most recent update, this is completely unacceptable. Who can we complain to? DTLHS (talk) 18:17, 3 April 2014 (UTC)
Interestingly, you do land on the search page if there's more than one page with diacritics. Thus while coln is a red link, cołn, Cöln, cōln, cōłn, and čoln are all blue, and if you type coln into the search box and hit return, it takes you to "create this page". —Aɴɢʀ (talk) 10:59, 8 April 2014 (UTC)
I'm the one you can complain to. What you are seeing is Cirrus, which recently stopped being a BetaFeature and started being the primary search backend for all wikis other than wikipedias and another handful. This particular functionality has lots of history that I won't bore you with, but the upshot is that the old behavior wasn't well documented and in some cases outright crazy, and it didn't work properly with the new search backend. So I had to replace it when migrating search. I took a shot at the right thing to do and it worked for most wikis except English Wiktionary. I did some work with a BetaFeature user on this wiki a few months ago and we ended up on the behavior that you see now. Of course, when we talked about it we weren't talking about the "create this page" links. We were talking about searches like those Angr mentioned above that find multiple terms after accent squashing. Anyway, tell me what you want to do.
One option is to disable accent squashing entirely. You'd be the only English wiki to do this so you'd be weird in that respect but given that your titles are in tons of languages this might make sense. Theoretically this'd make finding "lööma" harder for someone who doesn't think in accents. Disabling per user is much more difficult from a technical standpoint.
Another is to disable accent squashing on near matches. It's a smaller deviation from the rest of the English wikis but it is closer to the old behavior. On the other hand it yields a confusing thing in the search box where you type, wait for the prefix search to complete, see that the first result is the word you are looking for, then hit enter. You'd only arrive on the first result if your word matched with accents. Otherwise you'd end up on the search results page.
I can't think of any other options that don't fall into the "change your workflow" category. Things like instead of hitting enter in the search box you wait for the suggestions to spring back and then hit up, then hit enter. That'll skip the near match searching and go straight to the results page.
Sorry for the long-winded reply. I'm happy to design what it should do for you here or in bugzilla:63682 which I've created to track this.
NEverett (WMF) (talk) 16:20, 8 April 2014 (UTC)
Maybe we could add back the trusty old "Go" and "Search" buttons that don't rely on the search suggestions box popping up? --WikiTiki89 18:09, 8 April 2014 (UTC)
I'm not certain, but I believe that's a skin issue. I'm using Cologne blue which has the two buttons (and, incidentally, is not affected by the Typography refresh.) - Amgine/ t·e 18:17, 8 April 2014 (UTC)
Let me rephrase: Maybe we could add them to the Vector skin. --WikiTiki89 18:35, 8 April 2014 (UTC)
That's a skins question rather than a search question, so that's outside Nik's remit. We can talk about this separate problem later, after we've addressed this one. --Dan Garry, Wikimedia Foundation (talk) 20:43, 8 April 2014 (UTC)
Well it would solve this issue as well. --WikiTiki89 00:17, 9 April 2014 (UTC)
I definitely don't want to disable accent squashing on near matches (or at all); it's the best way of finding terms to add to {{also}} templates at the tops of pages. —Aɴɢʀ (talk) 19:29, 8 April 2014 (UTC)
I think that if the user is directed to a page that is not the same as what the user typed, then it should be considered equivalent to a redirect, and displayed as such. Just blindly sending the user to another page, like now, is confusing and deceiving, and it's even worse when I actually want to go to the nonexistent page. —CodeCat 19:38, 8 April 2014 (UTC)
Let me see if I can do this. If the near match mechanism decides to bounce you to a page and the page doesn't exactly match what you typed, then you get the same kind of thing you'd get on a redirect. I think for wikis whose titles are forced into title case we'd force the comparison to title case (but not here). NEverett (WMF) (talk) 20:17, 8 April 2014 (UTC)
In the past, we had this same kind of silent redirecting behaviour for differences in case too. But even for those, it was occasionally annoying and it would be useful at times to have a way to get to the intended casing. If this is changed so that silent redirects are shown more explicitly like real redirects are, then it would be much appreciated. It probably shouldn't display exactly the same thing though; instead of "redirected from" the text should say something else so that it's clear that the redirect was performed automatically by the software, not by an actual redirect page. —CodeCat 20:22, 8 April 2014 (UTC)
Certainly. Do you think the link should be to the non-existent page or to a search for the page? I'm leaning towards a search for the page because that is what you were trying to do when you got there anyway. Also, it has the advantage of dropping you on the page with the helpful page creation buttons in this wiki. NEverett (WMF) (talk) 20:34, 8 April 2014 (UTC)

@NEverett (WMF): Please don't disable near-match searches as per my first post. We should be able to do both - find the exact match and the near match. The near match should be improved to include searches with/without accent marks (Cyrillic, Arabic, Hebrew, Hindi). As I said, e.g. there is a Russian translation of "etymologist" - "этимолог" but it's written with the stress mark, as it should be in dictionaries and encyclopedias: "этимо́лог". People searching for "этимолог" (without accent) won't find it. --Anatoli (обсудить/вклад) 00:36, 9 April 2014 (UTC)

As for the original request, as I also said, it's not a big issue, (I do it all the time), if you reread my first post. When you type "looma", don't click on "lööma", which appears first but linger a bit and select "containing... looma" below, you get the page Search results for "looma" - Wiktionary and looma appears in red, ready to be clicked on. --Anatoli (обсудить/вклад) 00:44, 9 April 2014 (UTC)
@Atitarev: OK. A few days brainstorming the right solution won't hurt, I think/hope. NEverett (WMF) (talk) 14:55, 9 April 2014 (UTC)
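
For readers unfamiliar with "accent squashing", the behaviour discussed in this thread can be approximated with the following Python sketch. It is an assumption-laden illustration, not how Cirrus is implemented: the function names and the in-memory set of titles are hypothetical, and stripping every combining mark is cruder than what a real search backend does (for instance, ł has no Unicode decomposition and is not matched here, and in Devanagari this would also strip the virama, which is not an accent).

```python
import unicodedata


def squash_accents(text):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def go(query, titles):
    """Decide where the "Go" action lands for a query.

    Returns ("exact", title), ("near", title) or ("search", None).
    A "near" result is flagged so that a notice can be shown instead of
    silently sending the reader to a different page.
    """
    if query in titles:
        return ("exact", query)
    key = squash_accents(query)
    near = sorted(t for t in titles if squash_accents(t) == key)
    if len(near) == 1:
        return ("near", near[0])
    # No near match, or several candidates: fall back to the results page.
    return ("search", None)
```

With only lööma among the titles, go("looma", ...) reports a near match on lööma; with both cōln and čoln present, go("coln", ...) falls through to the search page, mirroring the behaviour Angr describes. Returning the "near" flag rather than just the title is what would let the interface show a redirect-style notice, as CodeCat suggests.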

No support for lang code cejm?

I was in the process of adding JA term チャリンコ (charinko, bicycle), when I discovered that we don't support lang code cejm for the w:Jeju language. A number of Japanese terms could make use of this lang code in calls to {{etyl}}. How hard would it be to add this? Would anyone object to adding this? Or would the proposed ISO 639-3 code jjm be preferable? (That one's also not currently supported.) ‑‑ Eiríkr Útlendi │ Tala við mig 00:27, 3 April 2014 (UTC)

Etyl: languages don’t need to have a code. You can add it as "Jeju" if you prefer. — Ungoliant (falai) 00:36, 3 April 2014 (UTC)
To expand on Ungoliant's comment: if there won't be Jeju entries (with ==Jeju== L2s), and it'll only be cited in etymologies, then it can be added to the last section ("Other lects") of Module:etymology language/data under any unambiguous (unique) code. You could use "Jeju" as its code, but given that an ISO 639-3 code has been proposed and we are wont to switch from exceptional codes to ISO 639-3 codes whenever the latter are or become available, it might save some work in the long run to just code it as "jjm" now. - -sche (discuss) 03:29, 3 April 2014 (UTC)
  • Japanese term チャリンコ (charinko, bicycle) appears to derive from Jeju 자륜거 (jaryun-geo). This is definitely not a standard Korean term, from what I've been able to find. Korean for bicycle is 자전거 (jajeon-geo), deriving from the same Sinitic term 自轉車 as the Japanese simplified 自転車 (though it might originally be a Meiji-era Japanese coinage from Western contact, cf. these two links in Japanese, haven't verified though). The Jeju 자륜거 (jaryun-geo) derives instead from 自輪車 (different middle character -- 輪 (wheel) instead of 轉 (revolve)).
My current understanding is that I should enter {{term|lang=jjm|자륜거|tr=jaryun-geo}}. However, although the jjm lang code now works with {{etyl}}, it doesn't for {{term}}.
Is it kosher to use the ko lang code instead? That doesn't seem quite right, but at least it won't generate any Module Error warnings. Or should I just not link this term at all? That doesn't seem quite right either, given the stated WT mission of all words in all languages. We probably won't have much Jeju content over the short term, but it's conceivable, and ultimately I think desirable, that we could build up a Jeju terms corpus here. ‑‑ Eiríkr Útlendi │ Tala við mig 17:38, 3 April 2014 (UTC)
For an etymology-only language, you use the language's own code in {{etyl}} and the parent language's code in {{term}}, yes. So, {{etyl|jjm|ja}} {{term|foo|lang=ko}}, as you suspect. (I wonder where that should be documented.)
Is Jeju separate enough from Korean to merit its own entries? WP says "many Koreans, including those who speak Jeju, consider Jeju a dialect of Korean, [but] it can be considered a separate language because it is nearly mutually unintelligible with Korean dialects of the mainland". Does that pertain to the spoken form, or the written form? There is a vote underway to merge Chinese lects that are mutually unintelligible when spoken, but intelligible when written.
If Jeju does merit its own entries, jjm will need to be removed from Module:etymology language/data and a different code will need to be added to Module:languages/datax (until such time as Jeju has an ISO code and can go in a different part of Module:languages). But here we run into a technical problem. The way datax exceptional codes are named is "(the language's family's three-letter ISO 639-5 code)-(three letters representing the language itself)". The Koreanic languages do not seem to have an ISO 639-5 family code. On RFM, we have run into the same problem with Lencan. We could use the second half of exceptional codes we've granted the Lencan and Koreanic families as the family prefixes of the languages' exceptional codes (so Jeju might be "kor-jjm"), but that would be problematic if the ISO ever assigned those strings to different families. We could use "qfa" or "und" as the prefix (so Jeju might be "qfa-jjm" or "und-jjm"), but I'm not sure which of those would be better. Hmm, probably "und". So, "und-jjm"... - -sche (discuss) 18:11, 3 April 2014 (UTC)
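
The naming pattern described above for exceptional codes, "(family code)-(three letters)", can be sketched in Python. This only illustrates the string format; it is not how Module:languages/datax is actually maintained. The function name is hypothetical, and "und" is simply the prefix tentatively suggested above.

```python
import re

THREE_LETTERS = re.compile(r"[a-z]{3}")


def exceptional_code(family_prefix, language_part):
    """Build an exceptional code of the form 'xxx-yyy'."""
    for piece in (family_prefix, language_part):
        if not THREE_LETTERS.fullmatch(piece):
            raise ValueError(f"expected three lowercase letters, got {piece!r}")
    return f"{family_prefix}-{language_part}"
```

Under these assumptions, exceptional_code("und", "jjm") yields "und-jjm", and exceptional_code("qfa", "jjm") yields the alternative "qfa-jjm".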
  • Re: WP says "many Koreans, including those who speak Jeju, consider Jeju a dialect of Korean, [but] it can be considered a separate language because it is nearly mutually unintelligible with Korean dialects of the mainland". Does that pertain to the spoken form, or the written form?
I strongly suspect both. Chinese uses an ideographic writing system, where radically different readings / pronunciations can apply to the same spelling. I can therefore read certain strings of Chinese in a way that only a Japanese speaker would understand. Korean, meanwhile, uses an alphabetic writing system, where the sounds and the glyphs are much more closely tied. As such, I would be surprised if written Jeju is all that much closer to standard Korean than spoken Jeju, or vice versa.
That said, there are two ISO four-letter codes, cejm for Jeju in general, and chjm for spoken Jeju, so perhaps there is substantial variance. Then again, the ISO codes have been a bit odd sometimes, such as including separate codes for varieties of Levantine Arabic that apparently are nearly fully mutually intelligible, while missing separate codes for lects that aren't mutually intelligible (for instance, full-bore Tōhoku-ben in w:Iwate Prefecture left me absolutely befuddled, with different verb endings and different nouns than standard Japanese). Given this inconsistency, I cannot tell if the existence of these two codes necessarily indicates any significant difference between spoken and written Jeju.
My understanding from [[w:Jeju language]] is that Jeju already has two ISO codes: cejm and chjm. The three-letter jjm code is currently only proposed (possibly just proposed this year, if the “2014” listed here indicates the year), but the four-letter codes are official, as far as I can tell. For instance, enter “Cheju” for Language Reference Name at http://www.geolang.com/iso639-6/ and you'll get both of these codes.
FWIW, searching http://www-01.sil.org/iso639-3/iso-639-3_Name_Index.tab for “Korean” indicates a three-letter code of kor, leading me to think it would be unlikely for kor to be reassigned to anything other than Korean. ‑‑ Eiríkr Útlendi │ Tala við mig 18:53, 3 April 2014 (UTC)
ISO doesn't actually matter. Wiktionary doesn't follow ISO, it follows the BCP 47 subtag registry. —CodeCat 19:31, 3 April 2014 (UTC)
@Eirikr: I mean a three-letter ISO code; we never use four-letter language codes (you may have noticed). And the "kor" you find is for the Korean language, not the family.
@CodeCat: Even within the past couple of months, we've incorporated new three-letter ISO codes without anyone checking the BCP registry. So, following the BCP registry may be a goal some users periodically remember to check if we're meeting, but it's not wrong to observe that in day-to-day practice, the ISO is what gets consulted. - -sche (discuss) 19:58, 3 April 2014 (UTC)
  • In light of the above then, it sounds like the best course would be to add und-jjm to Module:languages/datax for the time being. Do I understand that correctly? Or, do we seek more input on whether Jeju is different enough to merit a code? ‑‑ Eiríkr Útlendi │ Tala við mig 23:31, 3 April 2014 (UTC)
    • We normally create new codes by adding something made up to an existing family code. If the family code itself doesn't exist, the same process is applied. We use fiu-fin-pro for Proto-Finnic for example. —CodeCat 00:09, 4 April 2014 (UTC)
  • Whether this was by design or merely happened this way, proto-languages fit a slightly different naming scheme than other languages: "-pro" is added to the entire code of the family the proto-language is the ancestor of, whether that code is three characters (e.g. an ISO code like "gem" : "gem-pro", "Proto-Germanic") or seven characters (e.g. the example you give, "fiu-fin" : "fiu-fin-pro", "Proto-Finnic"). I'm not aware of a non-proto language that does that, i.e. that uses a code of more than seven characters. It is an interesting idea, though — applying the proto-languages' naming scheme to non-proto-languages to get 'qfa-kor-jjm', as opposed to 'und-jjm'. And although I don't expect we'd ever reach the limit of codes that would be possible under a 'und-xxx' naming scheme, the 'qfa-yyy-xxx' scheme would give us the maximum flexibility to select characters for the last three places (the '-xxx') that clearly represented the language's name. (By which I mean, if every exceptional code for a language whose family lacked an ISO code were prefixed with 'und-', and we needed to grant codes to a Lencan Foobaar language, a Koreanic Foobahr language and a Keresan Foobr language, we'd have to encode them in more creative and thus potentially less memorable ways like 'und-fob', 'und-fbh' and 'und-fbr', whereas under the 'qfa-yyy-xxx' scheme each could be 'qfa-(whatever)-foo' or such.) So I suppose 'qfa-kor-jjm' is a better idea than 'und-jjm'... - -sche (discuss) 22:34, 4 April 2014 (UTC)
    • I object to "und-jjm" mainly because "und" is not a family code. However, we could decide to assign family names from the private use area. That would avoid excessively long names, and Finnic could become just "qfi", Koreanic could get "qko". We could abandon the prefix "qfa" too if we decide to, replacing the few families that use it with something else. Doing the same for languages too might not be so good because there are too many of them to fit into the private use area. There's also w:List of ISO 639-3 language codes reserved for local use, which shows that many private use codes have been used by Linguist List. We could adopt their codes, or ignore them if we want to. —CodeCat 23:10, 4 April 2014 (UTC)
  • I went ahead and added qfa-kor-jjm to Module:languages/datax. Please whack me with the cluebat if that was in error. ‑‑ Eiríkr Útlendi │ Tala við mig 17:24, 7 April 2014 (UTC)
    • An error, no. But premature, given the alternative I suggested, yes. So... *gives slight nudges with bat*. —CodeCat 17:29, 7 April 2014 (UTC)
      • Apologies, I didn't understand that you were proposing an alternative. I also saw qfa-kor-pro in the list, leading me to think that the qfa-kor- prefix was already accepted. But to be honest, I didn't go through the history to see when this was added and who added it. ...Though now that I do, I see it showed up in diff back in November, apparently being moved over from Module:languages/alldata, where this code appears in the first version of the page in diff.
      Somewhat confused, ‑‑ Eiríkr Útlendi │ Tala við mig 17:51, 7 April 2014 (UTC)
      I meant the alternative right above. Using private-use codes for families lacking a code. —CodeCat 18:14, 7 April 2014 (UTC)
      On balance, I think "qfa-kor-jjm" is better than "qko-jjm". It's good for codes like Jeju's to be formed in the same way as proto-languages' codes, rather than in a new way — it's good to avoid proliferation of naming schemes. And forming proto-languages' codes by adding "-pro" to the family is a clear, straightforward scheme, as opposed to having the family be "qfa-kor" but the proto-language be "qko-pro". And changing the family code to "qko", i.e. switching from three-selectable-character family codes (qfa-___) to two-selectable-character codes (q__) would be unwise, IMO, if not infeasible. There are so many families and subfamilies of those families which the ISO has not granted codes to that we would quickly run out of combinations of letters which memorably/intelligibly represented those families' names, if we were limited to two-[selectable-]letter codes. (Module:families/data already includes 40 ISO-code-less families and their subfamilies, and may one day — as it becomes more complete and up-to-date — include at least four times that many, in my estimation.) - -sche (discuss) 22:16, 7 April 2014 (UTC)
      I think codes made out of three parts are too long. And we have 520 possible private use codes available, so I'm sure we can fit all the families we need into that. I'm not sure what you mean about proto-language codes. If we change the family codes, they would change too. So we would have qfi-pro for Proto-Finnic, qko-pro for Proto-Korean, qbs-pro for Proto-Balto-Slavic and so on. I really prefer that to the codes we have now. —CodeCat 23:22, 7 April 2014 (UTC)
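To make the arithmetic in this thread concrete, here is a minimal Python sketch. It assumes the ISO 639 private-use range qaa through qtz, which is where the figure of 520 comes from, and it builds codes under the two competing schemes. The helper names are invented for illustration and do not correspond to anything in Module:languages.

```python
import string

# ISO 639-3 reserves qaa through qtz for private use:
# first letter fixed to "q", second letter a-t, third letter a-z.
SECOND_LETTERS = string.ascii_lowercase[: string.ascii_lowercase.index("t") + 1]


def private_use_codes():
    """Return every code in the qaa-qtz private-use range, in order."""
    return ["q" + second + third
            for second in SECOND_LETTERS
            for third in string.ascii_lowercase]


def exceptional_code(family_code, language_part):
    """Join a family code and a three-letter language part (hypothetical helper)."""
    return family_code + "-" + language_part


def proto_code(family_code):
    """Proto-language codes append "-pro" to the whole family code."""
    return family_code + "-pro"


codes = private_use_codes()
print(len(codes))                          # 520
print(exceptional_code("qfa-kor", "jjm"))  # qfa-kor-jjm
print(exceptional_code("qko", "jjm"))      # qko-jjm
print(proto_code("fiu-fin"))               # fiu-fin-pro
```

The count shows why both positions are defensible: 520 codes comfortably cover a few hundred families, but only two of the three letters are free to hint at the family name.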

Translation boxes only for English terms?

For example, I can see a translation box in the entry for connive, but not in the entry for the Italian word amare. Is there a rule that translation boxes with multiple language translations are added only to English terms? --Spiros71 (talk) 09:24, 3 April 2014 (UTC)

Yes, only English terms have translation sections. Foreign-language terms are translated only to English. —Stephen (Talk) 09:33, 3 April 2014 (UTC)
Some translingual entries have translation sections as well. — Ungoliant (falai) 10:56, 3 April 2014 (UTC)

Where is the part of speech/type tag list page?

Hi, I'm looking for the complete list of part-of-speech/type tags: ====Noun====, adverb, initialism, etc. I didn't find it in Special pages. Thanks! —This unsigned comment was added by 201.212.5.12 (talk).

WT:Entry layout explained/POS headers#Headers in use. Keφr 20:17, 4 April 2014 (UTC)
That page is a bit outdated though. —CodeCat 20:55, 4 April 2014 (UTC)

Taming topic cat

Now that we have Lua, can we get rid of {{topic cat}}'s parameters? It should be a simple matter of parsing the page name to determine if it's a well-formed topical category name with a valid language code followed by a colon followed by something that could be a topic, and, if applicable, followed by a valid script name, then feeding the parts to a back end that could even be the unchanged code from the current topic cat. Later on, we should despaghettify the backend- but this seems like it could be implemented in an hour or two by anyone who knows what they're doing. The nice part is that it won't need any parameters, so it can just ignore those in all the current uses- no botting required.

I've been doing a lot of category stuff lately, and it's really annoying typing in all the necessary stuff, only to have it scream at you in ugly red because you didn't precisely match the category name in the precise format required.

The other side of the coin would be taken care of by an accelerated category adder that would know the language code for the section it was in and let you choose from a menu of existing categories- but that's for later. Chuck Entz (talk) 05:27, 5 April 2014 (UTC)

I was working on a replacement for {{catboiler}} (which powers {{poscatboiler}} and such) a while ago, see Module:User:CodeCat/category boilerplate. It was sort of finished and it's actually used in a few categories already, but it didn't have all the functionality that was necessary to completely replace it yet. Presumably, we would want to use this same module to handle {{topic cat}}, to avoid the proliferation of different pieces of code that all do more or less the same thing. —CodeCat 13:05, 5 April 2014 (UTC)
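The page-name parsing proposed at the top of this section could look roughly like the following Python sketch. It is only an illustration of the idea, not the Lua that would actually be written: the set of known language codes is a hypothetical stand-in for the data in Module:languages, and the optional script-name suffix is deliberately left out.

```python
import re

# Hypothetical stand-in for the real language data in Module:languages.
KNOWN_LANGUAGE_CODES = {"de", "en", "pt", "xpr"}

# Category:<language code>:<topic>. Script-suffix handling is omitted here.
TOPIC_CATEGORY = re.compile(r"^Category:([a-z]{2,3}(?:-[a-z]{3})*):(.+)$")


def parse_topic_category(page_name):
    """Return (language_code, topic), or None if the name is not well formed."""
    match = TOPIC_CATEGORY.match(page_name)
    if match is None:
        return None
    code, topic = match.groups()
    if code not in KNOWN_LANGUAGE_CODES:
        return None
    return code, topic


print(parse_topic_category("Category:de:Biochemistry"))  # ('de', 'Biochemistry')
print(parse_topic_category("Category:English nouns"))    # None
```

Because everything is derived from the page name, existing parameters on current uses of the template could simply be ignored, which is the "no botting required" point made above.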

Something strange my bot did

diff. It should have added "pt" instead. I have checked my code several times and I still have no idea why it did that. So I don't really know how to fix it, but at least I'm reporting it so that it's known. —CodeCat 23:18, 5 April 2014 (UTC)

Well if you're expecting us to help fix it, you'll have to show the code. The most recent L2 before that line is most certainly ==Portuguese== and not ==Old Norse==. --WikiTiki89 23:21, 5 April 2014 (UTC)
That's what's confusing me... —CodeCat 23:50, 5 April 2014 (UTC)
Ok, here is the code for the part that does the page edit. I'm using the pywikipedia framework and the mwparserfromhell library.
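# Assumes `page` is a pywikipedia Page, `text` is its wikitext already parsed with
# mwparserfromhell.parse(), and `blib` is the bot author's own helper module.
if page.namespace() == 0:
	for langsection in text.get_sections([2]):  # one chunk per level-2 (language) section
		name = unicode(langsection.get(0).title).strip()  # heading text, e.g. "Portuguese"
		code = None  # looked up lazily, at most once per section

		for template in langsection.filter_templates():
			if template.name.matches("audio") and not template.has("lang", False):
				code = code if code else blib.get_language_code(name)  # this translates the name to its code using [[Module:languages]]
				template.add("lang", code)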
CodeCat 00:09, 6 April 2014 (UTC)
So text.get_sections() is a mwparserfromhell thing right? In that case the bug is probably in their code not yours. --WikiTiki89 00:12, 6 April 2014 (UTC)
It does seem so. I removed all the excess code and it looks like it's grouping the Old Norse, Old Portuguese and Portuguese sections as one. So it's a bug in their parser most likely. I've reported it now. —CodeCat 00:30, 6 April 2014 (UTC)
They replied and said that it's caused by incomplete markup on the page. In this case it's a '' that isn't closed properly. So it's part of a larger problem, and they're working on fixing it. —CodeCat 12:07, 6 April 2014 (UTC)
Well I use manual searching for L2 header so I won't have such problem. --kc_kennylau (talk) 12:12, 6 April 2014 (UTC)
You will if you ever come across a commented-out L2. --WikiTiki89 17:32, 6 April 2014 (UTC)
/^==([^=]+)==/ won't find a commented-out L2. - Amgine/ t·e 22:09, 6 April 2014 (UTC)
The comment doesn't have to be on the same line. --WikiTiki89 22:17, 6 April 2014 (UTC)
Ah, true. Hadn't considered that. - Amgine/ t·e 22:20, 6 April 2014 (UTC)
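The point about multi-line comments can be illustrated with a small Python sketch of the "manual search" approach: strip HTML comments first, then look for level-2 headers line by line. This is an assumption about how such a bot might do it, not anyone's actual code, and it still ignores nowiki and pre blocks as well as comments that are never closed.

```python
import re

# Non-greedy and DOTALL, so a comment spanning several lines is removed whole.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# Exactly two equals signs on each side; "===Noun===" does not match.
L2_HEADER = re.compile(r"^==([^=]+)==[ \t]*$", re.MULTILINE)


def level2_headers(wikitext):
    """Return the visible level-2 header titles, in page order."""
    visible = HTML_COMMENT.sub("", wikitext)
    return [m.group(1).strip() for m in L2_HEADER.finditer(visible)]


sample = (
    "==Old Norse==\n"
    "text\n"
    "<!--\n"
    "==Hidden==\n"
    "-->\n"
    "==Portuguese==\n"
    "===Noun===\n"
)
print(level2_headers(sample))  # ['Old Norse', 'Portuguese']
```

A per-line regular expression alone would have reported "Hidden" as a language section, which is exactly the failure described above.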
Yes, let's all roll our own wiki markup parsers, this surely isn't wasted effort. DTLHS (talk) 22:15, 6 April 2014 (UTC)
<curious look> Why wouldn't we? It's hardly as much time wasted as trying to debug someone else's code, since WMF still refuses to release an index parser. - Amgine/ t·e 22:20, 6 April 2014 (UTC)
Because it's better to have a common set of tools that behaves in a known way, that is tested by many people. Because it's much less of a barrier for newcomers to overcome if they don't have to write a template parser if they want to do anything interesting. DTLHS (talk) 02:42, 7 April 2014 (UTC)
Something like
// e.g. $templateAndArguments = '{{audio|En-us-Bhojpuri.ogg|Bhojpuri|lang=en}}';
$parsedTemplate = file_get_contents( 'https://en.wiktionary.org/w/api.php?action=expandtemplates&text=' . urlencode( $templateAndArguments ) );
or were you talking about some other, more complicated re-invention of the wheel? did you really suggest pywikipediabot is a newb-oriented software? (Not everyone writes python. Or should.) - Amgine/ t·e 04:15, 7 April 2014 (UTC)
mw:API:Parse with the generatexml option satisfies most of my parsing needs. There are some limitations, though. Keφr 05:46, 7 April 2014 (UTC)
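For anyone following along in Python rather than PHP, a request URL for the expandtemplates API module can be built as in the sketch below. It only constructs the URL and performs no network call. The /w/api.php endpoint path and the choice of format=json and prop=wikitext are assumptions about how one would normally call the API today, not something stated in this thread.

```python
from urllib.parse import urlencode

# Assumed endpoint; Wikimedia wikis normally serve the API under /w/api.php.
API_ENDPOINT = "https://en.wiktionary.org/w/api.php"


def expandtemplates_url(wikitext):
    """Build a URL asking the MediaWiki API to expand the given wikitext."""
    params = {
        "action": "expandtemplates",
        "format": "json",
        "prop": "wikitext",
        "text": wikitext,
    }
    # urlencode escapes only the parameter values, never the URL as a whole.
    return API_ENDPOINT + "?" + urlencode(params)


url = expandtemplates_url("{{audio|En-us-Bhojpuri.ogg|Bhojpuri|lang=en}}")
print(url)
```

Escaping only the parameter values matters: percent-encoding the entire URL would also mangle the scheme and the query separators.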

Automatic links to words in multiple-word entries?

Is it possible to include some lines in Module:headword so that I don't need to do this every time I see some multiple-word entries? --kc_kennylau (talk) 01:24, 6 April 2014 (UTC)

Not everything with a space is composed of two words that actually exist. DTLHS (talk) 01:25, 6 April 2014 (UTC)
And furthermore, complicating matters, there are some strings of words like "A B C" which are composed of (and which we prefer to link as) "A B" + "C", not "A" + "B" + "C". - -sche (discuss) 01:31, 6 April 2014 (UTC)
It could be made to default to the common case, with an explicit "head=" statement overriding the default. DCDuring TALK 01:37, 6 April 2014 (UTC)
Or we could even have the default be no auto-linking, but allow a special head=+ case to enable auto-linking. --WikiTiki89 07:11, 6 April 2014 (UTC)
What if the head really is +? --kc_kennylau (talk) 07:40, 6 April 2014 (UTC)
You can write it as &#43;. Though personally, I really dislike inventing ad-hoc special-case syntax. It is a first step towards a large unmaintainable mess. Keφr 07:55, 6 April 2014 (UTC)
If we are going to have automated linking, making linking the default which head suppresses or alters (as DCDuring suggests) seems preferable to making it so that users turn on linking by filling in head= with something other than the linking they want. (If you're filling in head=, just go ahead and fill in what head equals.) - -sche (discuss) 08:45, 6 April 2014 (UTC)
+1 —RuakhTALK 19:37, 6 April 2014 (UTC)
But it should be ignored if someone literally writes head= with no value, right? —CodeCat 19:47, 6 April 2014 (UTC)
I want the behavior described by DCDuring. Will make editing easier. --Vahag (talk) 08:47, 6 April 2014 (UTC)
I like this. It would definitely be useful in cases like yêu nhiều thì ốm, ôm nhiều thì yếu. Wyang (talk) 07:03, 7 April 2014 (UTC)
I've thought about implementing something like this, but there are some problems. It's not always desirable to split the terms. For example, it's not so useful to link to the parts of an inflected form of a multi-word verb, like for example gave up. There are probably other cases where splitting doesn't make sense either. —CodeCat 14:24, 10 April 2014 (UTC)
In the case of gave up, I think it is useful to link to the parts. --WikiTiki89 15:12, 10 April 2014 (UTC)
I support the feature, even if the links may be to non-lemma forms and many languages don't have inflected form entries in Wiktionary. "[[gave]] [[up]]" can be converted to "[[give|gave]] [[up]]" manually. It's easier than inserting each pair of square bracket manually. --Anatoli (обсудить/вклад) 23:27, 10 April 2014 (UTC)
@-sche, Atitarev, CodeCat, Wikitiki89, SemperBlotto, Ruakh, Vahagn Petrosyan, Wyang, DTLHS: well. --kc_kennylau (talk) 18:26, 19 April 2014 (UTC)
I've made the change in Module:headword. It only splits on spaces, just to be safe for now, and it only applies it if no headword was already provided. This means that it will explicitly not work with any template that gives something like |head={{{head|{{PAGENAME}}}}}, because the {{PAGENAME}} will override the default. —CodeCat 14:14, 23 April 2014 (UTC)
For something like give up the ghost, I think we would like to link it as give up the ghost. I don't think we want to link all the "of" and "the" instances in headers, although I imagine those could be excluded by default. bd2412 T 15:14, 23 April 2014 (UTC)
So, in cases such as that, the linking should be specified (not left to default). SemperBlotto (talk) 15:17, 23 April 2014 (UTC)
Would we have to fix all existing headwords containing common prepositions or definite articles to specify this? I think it would be easier from a maintenance standpoint (though obviously not from a programming standpoint) to exclude such terms in the first place. bd2412 T 15:20, 23 April 2014 (UTC)
You're looking at it from the standpoint of someone who already has a good command of the language, and knows such common words already. But it's very conceivable that someone who has no knowledge of Italian will wonder what il means and a link to it would certainly be helpful to them. —CodeCat 15:44, 23 April 2014 (UTC)
Purely from an aesthetic standpoint, it looks really ugly to me when small words are not linked, making the headword look like it's not all one piece. give up the ghost looks much better as a headword than give up the ghost. --WikiTiki89 15:54, 23 April 2014 (UTC)
You need to consider punctuation (see yêu nhiều thì ốm, ôm nhiều thì yếu as mentioned above) DTLHS (talk) 15:49, 23 April 2014 (UTC)
Could someone give a list of cross-language punctuation? Also, I'm not sure how to split the words while preserving the punctuation. Scribunto doesn't seem to provide a function for that; it always throws the punctuation away when it splits. —CodeCat 16:12, 23 April 2014 (UTC)
Don't we already have a list of punctuation in one of the linking modules? (I don't remember which one.) DTLHS (talk) 16:14, 23 April 2014 (UTC)
Module:languages#Language:makeEntryName is probably what you mean, but that one only removes some punctuation. It doesn't include periods, commas, hyphens, colons etc. Colons and hyphens in particular should not necessarily be unlinked. In Finnish, the colon is used to separate abbreviations from their case ending, and the hyphen is used for the same purpose in Slovene. Dutch uses the apostrophe for that purpose. It's very hard to come up with cross-linguistic rules. —CodeCat 16:22, 23 April 2014 (UTC)
In that case I guess you should just create a tracking category and review everything in it. DTLHS (talk) 16:23, 23 April 2014 (UTC)
local name = "testing, testing, and testing."
local words = mw.text.split(name, ' ')
for i, word in ipairs(words) do
	local last = ""
	-- "%-" escapes the hyphen; an unescaped ".-_" would be read as a character range
	if mw.ustring.match(word, '[,.%-_?!\'"()%[%]{}@*#$%%^&]$') then
		local len = mw.ustring.len(word) -- not using #word because #word counts bytes
		last = mw.ustring.sub(word, len, len)
		word = mw.ustring.sub(word, 1, len - 1)
	end
	words[i] = "[[" .. word .. "]]" .. last
end
name = table.concat(words, ' ')

(not tested, punctuation list not complete, this is just a demonstration) --kc_kennylau (talk) 16:46, 23 April 2014 (UTC)

Placeholder words like one, one's, oneself, someone/somebody, and possibly 's should either link to an appendix on the use we make of placeholders or to definitions in the entries for the words written with a view toward this kind of use.
It would also be useful if links to MWEs had some kind of faint underlining to convey that the entire MWE rather than the constituent terms were the object of the link. DCDuring TALK 20:25, 23 April 2014 (UTC)
Something like that would be very complex to do for every single language, so I don't think it's a good idea. And I'm actually surprised that you're suggesting it, as you normally seem opposed to complexity. —CodeCat 20:28, 23 April 2014 (UTC)
There needs to be a way to explicitly disable this without retyping the whole headword. --WikiTiki89 21:10, 23 April 2014 (UTC)
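The behaviour discussed in this section (link by default, let an explicit head override it, keep trailing punctuation outside the links) can be summarised in a short Python sketch. This is not the code in Module:headword. The function name is invented, the punctuation list is deliberately tiny, and treating an empty head= as "not provided" is an assumption about the open question raised above.

```python
import re

# Illustrative, incomplete list of trailing punctuation to keep outside the link.
TRAILING_PUNCTUATION = re.compile(r"^(.*?)([,.;!?]*)$", re.DOTALL)


def default_head(page_name, head=None):
    """Return the explicit head if given, else link each space-separated word."""
    if head:  # a missing or empty head= falls back to the default
        return head
    if " " not in page_name:
        return page_name  # single-word titles are left alone
    linked = []
    for word in page_name.split(" "):
        core, tail = TRAILING_PUNCTUATION.match(word).groups()
        linked.append("[[" + core + "]]" + tail if core else tail)
    return " ".join(linked)


print(default_head("give up the ghost"))
# [[give]] [[up]] [[the]] [[ghost]]
print(default_head("gave up", head="[[give|gave]] [[up]]"))
# [[give|gave]] [[up]]
```

Splitting only on spaces mirrors the cautious first version described above. Hyphens, colons and apostrophes are left inside the words because their role differs between languages.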

Can we import pronunciations?

As someone who started learning French without knowing anything about the pronunciation, I find myself having to go to fr.wikt pretty often for pronunciation information. Is it possible to import these pronunciations to en.wikt by bot? I imagine it could be helpful as fr.wikt is pretty good about having them for French words, and other wiktionaries are as well (I think de.wikt, for example). Ultimateria (talk) 03:45, 7 April 2014 (UTC)

I think we might have to keep the attribution, which could be difficult. Perhaps the bot could sift through the history and add a link in the edit summary to the original editor who added the pronunciation... --Yair rand (talk) 03:48, 7 April 2014 (UTC)
The copyright notice when you submit an edit says, "You agree that a hyperlink or URL is sufficient attribution". I guess that's a bit vague now that I think about it, but I've always taken it to mean, a link to the page that's being copied from, whose edit-history then identifies you. (I think the only reason that Transwikis import the edit history is that it's expected that the source-page might then be deleted, which would destroy the edit-history and therefore the attribution.) (Actually, I guess you're probably talking about plagiarism rather than copyright, since the pronunciation information itself is not copyrightable, and we would not be copying the expression of it; but there as well, I think that linking to the source-page is sufficient attribution.) —RuakhTALK 05:38, 7 April 2014 (UTC)
Tbot (talkcontribs) used to do this, and mindlessly imported pronunciations from the Korean Wiktionary (Category:Tbot entries (Korean)), which uses a different set of IPA symbol conventions from the Korean editors here. There are still hundreds of pages in that category uncleaned-up. Wyang (talk) 03:54, 7 April 2014 (UTC)
We could use the audio-file link, though, where-ever it exists. Not sure, if that's what the original question meant. --Anatoli (обсудить/вклад) 04:13, 7 April 2014 (UTC)
But, couldn't we do it....not blindly? I mean, couldn't we analyze the French Wiktionary's IPA standards, and see if they are compatible with ours, and then proceed if everything's kosher? It seems like the addition of large amounts of content would be worth examining, right? Yair's thoughts on attribution are important to consider, but, as they state, probably surmountable. -Atelaes λάλει ἐμοί 05:08, 7 April 2014 (UTC)
Agreed. —RuakhTALK 05:38, 7 April 2014 (UTC)
I think DerbethBot already imports all audio files. --Yair rand (talk) 05:25, 7 April 2014 (UTC)

Make module text output call template

Hi all. Is it possible to let the text output '{{temp|aaaa}}: {{temp|aaaaa}}' from a module be displayed as '{{aaaa}}: {{aaaaa}}' rather than just a string '{{temp|aaaa}}: {{temp|aaaaa}}'? Thanks in advance, Wyang (talk) 06:27, 7 April 2014 (UTC)

Short answer: no. Templates cannot be called from the output of a Module. --WikiTiki89 06:33, 7 April 2014 (UTC)
But Module:it-conj can build a table? --kc_kennylau (talk) 06:47, 7 April 2014 (UTC)
A table is not a template, it's just special wiki syntax. --WikiTiki89 06:49, 7 April 2014 (UTC)
OK... I have changed the code (Module:vi-pron) to avoid this. Thanks. Wyang (talk) 07:03, 7 April 2014 (UTC)
@Wyang:{{aaaa}}: {{aaaaa}}--kc_kennylau (talk) 08:48, 7 April 2014 (UTC)
What about other templates, such as '{{ko-conj-adj|stem1=노랗|stem2=노래|stem3=노라|haet=노랬|hal=노랄|ham=노람|han=노란|stem1_r=nora|stem1a_r=norat|stem2_r=norae|cstem=ㅎ}}'? Wyang (talk) 12:22, 7 April 2014 (UTC)
Templates can be called inside modules. Look up the "expandTemplate" function in the Scribunto documentation. —CodeCat 12:57, 7 April 2014 (UTC)
Just to point out, I didn't say they can't. I said they cannot be called in the output of a Module. They need to be expanded within the module but it is not such a good idea and should be avoided if possible. --WikiTiki89 16:04, 7 April 2014 (UTC)
But what is the "output" of a module? To me, it's the text that the module returns as its result. So it's not something anything can be "in". —CodeCat 16:07, 7 April 2014 (UTC)
The text is "in" it, with wiki syntax and everything. People are used to calling templates in renderable text and often don't know the difference between templates and wiki syntax. --WikiTiki89 16:09, 7 April 2014 (UTC)
I think it applies to anything with { } in it. So that includes templates, magic words ({{PAGENAME}} and such) and parser functions like #if and #invoke. —CodeCat 16:16, 7 April 2014 (UTC)
Those are essentially templates. Wiki syntax is also tables, markup, etc. --WikiTiki89 16:36, 7 April 2014 (UTC)
@Wikitiki89: Well you could just place in the content of the template just like what I did. --kc_kennylau (talk) 18:28, 19 April 2014 (UTC)
But then it's not a template anymore. --WikiTiki89 18:43, 19 April 2014 (UTC)

How do I build testing modules?

Module:User:kc_kennylau/sandbox? Is this allowed? --kc_kennylau (talk) 07:00, 7 April 2014 (UTC)

Sure. All the same rules that apply to other Username-space content would apply, namely that it has something vaguely to do with building a dictionary, and that you're not using it as a free alternative to MySpace. -Atelaes λάλει ἐμοί 07:14, 7 April 2014 (UTC)
Thank goodness. My free alternative to Twitter is safe. Keφr 08:32, 7 April 2014 (UTC)

Template:ttbccatboiler needs minor fix

This edit: (diff) is no doubt an improvement, but it left things kind of messy (see Category:Translations to be checked (German)). Chuck Entz (talk) 14:07, 7 April 2014 (UTC)

Lua replacement for Template:langrev running out of memory

I tried replacing calls to {{langrev}} in {{ttbc}} and {{trreq}} to use the Lua equivalent, which is Module:languages/templates#getLanguageByCanonicalName, which is a wrapper around Module:languages#getLanguageByCanonicalName. Unfortunately when I did that, many pages using these templates started showing server errors. Apparently Lua ran out of memory so whoever wrote it thought it would be best to just let the whole page crash. Not so good.

In any case, that means that we're not able to replace this template with its Lua equivalent just yet. We need to find a better way to do it. Does anyone have suggestions? —CodeCat 17:18, 7 April 2014 (UTC)

Create a module which computes a reverse-lookup table, mw.loadData it, and index that. That is, do what I did at Module:User:Kephir/test1 and Module:User:Kephir/test2. This might be lighter on memory. Just a conjecture, though. Keφr 16:28, 8 April 2014 (UTC)
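The reverse-lookup idea can be shown in miniature with Python. The table below is a hypothetical three-entry stand-in for the code-to-data table in Module:languages, and the field name is an assumption. In Lua the precomputed table would live in its own data module and be loaded with mw.loadData.

```python
# Hypothetical miniature of the code -> data table kept in Module:languages.
LANGUAGES = {
    "de": {"names": ["German"]},
    "ko": {"names": ["Korean"]},
    "pt": {"names": ["Portuguese"]},
}


def build_reverse_table(languages):
    """Map each canonical (first-listed) name to its code, in one pass."""
    return {data["names"][0]: code for code, data in languages.items()}


# Built once; afterwards every lookup is a single dictionary access
# instead of a scan over the whole language table.
CODE_BY_NAME = build_reverse_table(LANGUAGES)


def code_for_name(name):
    """Return the language code for a canonical name, or None if unknown."""
    return CODE_BY_NAME.get(name)


print(code_for_name("Portuguese"))  # pt
```

Whether this actually lowers memory use on a page with many lookups is, as noted above, a conjecture that would need measuring.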

Editing subpages

The [edit] button at the top of each section in the tea room seems to have disappeared - now if I want to make a comment I have to go to the monthly subpage and edit there instead. But this is only happening in the tea room - not in the grease pit, the beer parlour, or anywhere else that I know of. What's going on? —Mr. Granger (talkcontribs) 21:38, 7 April 2014 (UTC)

That must have something to do with the permissions, which I changed recently. Until we find a way around it, I'll revert it back. --WikiTiki89 21:42, 7 April 2014 (UTC)

Does anyone use CSVLoader

I had a problem: the current version forces capitalization. I thought I would simply ask its creator, w:User:Ganeshk, for the earlier version, but I doubt he will even see my message, as he is apparently on a break and his talk page is being spammed by the Signpost thing and auto-archived (a couple more days and in the trash my message goes...)

Do any of you guys happen to have a version of CSVLoader that works on Wiktionary? Neitrāls vārds (talk) 06:53, 8 April 2014 (UTC)

A new type of collapsible content

I've recently thrown together a little javascript snippet to do some collapsing on {{grc-pron}}, which can be seen at User:Atelaes/viewSwitching.js. I was wondering what folks would think about adding it to our Common.js. Essentially, what it does is switch between two different representations of the pronunciation of an Ancient Greek word, one which is more compact, taking up only a single line, and one which takes up more space, but has more detail. My goal was to try and reduce the amount of pre-definition space, while retaining our current level of detail for those who want it. It, of course, integrates with Conrad's hiding infrastructure. I also tried to make the javascript fairly general, such that other templates could make use of it. Any feedback is appreciated. Thanks. -Atelaes λάλει ἐμοί 21:34, 8 April 2014 (UTC)

I don't think your addition is bad as such, but there was recently some discussion about migrating away from our in-house collapsing code and using the built-in MediaWiki code instead. I do think that's something we should look into, so that would affect your code as well. —CodeCat 21:38, 8 April 2014 (UTC)
I only half-tried, but I wasn't able to find any detailed documentation of the MW collapsing content. Could it do what my code does? -Atelaes λάλει ἐμοί 21:44, 8 April 2014 (UTC)
I haven't really worked with it at all, so I don't know. All I know is that, for collapsible tables, it collapses the table itself instead of the surrounding div. That alone is a big advantage, I think. —CodeCat 21:46, 8 April 2014 (UTC)
Ok, after looking at the manual, and the source code, it doesn't look like Mediawiki's built-in can do switching, only hiding and unhiding, so I'm going to persist in championing my code. Additionally, I wonder if the built-in code is really ideal for our purposes at all. As so often happens, the code is really built with Wikipedia in mind, not Wiktionary (something I can't fault them for, they are certainly more important than we are). Specifically, it's slow and not centrally controllable, unlike ours. That's fine if you want to open one of the two hidden tables at the end of a Wikipedia article, but not fine if you want to blow up twenty consecutive inflection or translation tables, something I find myself doing from time to time. It's also not fine if there's a specific class of content you always want shown or hidden, which is a genuine use-case on our project. That being said, if we decide not to use the built-in's here, there may well be improvements we could make to our home-brew stuff. -Atelaes λάλει ἐμοί 22:41, 8 April 2014 (UTC)
Definitely, yes. Not having to wrap tables in divs would be a good start. —CodeCat 22:51, 8 April 2014 (UTC)

We can use templates for tracking things too

We've been using tracking categories so far, which are quite useful. But they are awkward to use from within modules. I tried out something else instead: using templates for tracking. Modules are able to transclude templates, but they're not necessarily required to use the output of that template in any way. However, the act of transcluding, in itself, causes the page to appear as a transclusion for that template. So this can be used in much the same way as tracking categories are. So instead of creating categories and adding entries to them, you create empty templates and transclude them. We don't have any proper system for this yet but I propose we create Template:tracking and use subtemplates of that, as necessary. —CodeCat 14:21, 9 April 2014 (UTC)

I fail to see the advantage over tracking categories. --WikiTiki89 14:23, 9 April 2014 (UTC)
Like I said, they can be used from modules, which is much harder for categories. I'm also not proposing that we choose one or the other, it's more like I'm making it known that this alternative method exists, and we can use it too when necessary. —CodeCat 14:24, 9 April 2014 (UTC)
Oh I see what you mean. You have to output categories, but not templates. That seems like a bad workaround for something that we should just ask the developers to develop. --WikiTiki89 14:30, 9 April 2014 (UTC)
Yes, exactly. The nice part about template transclusion is that it works "outside" the invoke/result system, so it doesn't disrupt the normal functioning of a module or template. Tracking categories and tracking templates are both workarounds of course, but they work, so we might as well use them. —CodeCat 14:34, 9 April 2014 (UTC)
I don't think a tracking category is a workaround; I think categories are exactly what should be tracking things. Maybe we can get the developers to add a way to include categories without outputting anything. For now though, I see no problem with using template transclusions as a workaround. --WikiTiki89 14:38, 9 April 2014 (UTC)
I suppose that's true. But then it could be argued that the "what links here" feature should actually be a category. Furthermore, categories are distinguished in that they are actual wiki pages and can have content other than the entries listed there. I don't think categories were ever intended to be used in the way we use them when Wikipedia was first made. —CodeCat 14:47, 9 April 2014 (UTC)
It doesn't matter what was intended. What matters is does it make sense to use categories for that? And I think it does make sense. --WikiTiki89 15:11, 9 April 2014 (UTC)
One could argue that because of the above-mentioned impracticality, it does not. Keφr 15:21, 9 April 2014 (UTC)
Well I mean it conceptually makes sense, which is why we've been using tracking categories since well before Lua came around. Using template transclusions to track things makes much less conceptual sense, but as you point out it currently makes more practical sense. Although, conceptual sense is probably even more subjective, so feel free to disagree. --WikiTiki89 15:27, 9 April 2014 (UTC)
Can you make an example of this? — Ungoliant (falai) 14:29, 9 April 2014 (UTC)
Look at what I did in Module:languages/templates. Something like that would be impossible to do with categories. —CodeCat 14:30, 9 April 2014 (UTC)
Looks good. I support the proposition. — Ungoliant (falai) 14:39, 9 April 2014 (UTC)
But categories should be preferred, as they are easier to navigate and to find than what-links-here pages. — Ungoliant (falai) 16:45, 9 April 2014 (UTC)
@CodeCat: There is a flaw in this plan. You tried to use the frame object without knowing what it is. Quote from mw:Lua reference manual#Frame object: The frame object is the interface to the parameters passed to {{#invoke:}}, and to the parser. Thus, it can not just be used from any module, it needs to have the frame passed to it from the originally invoked module. --WikiTiki89 16:25, 19 April 2014 (UTC)
But what about mw.getCurrentFrame? —CodeCat 16:38, 19 April 2014 (UTC)
Ok, I guess you can use that too. But in the situation below, that was not done. --WikiTiki89 16:42, 19 April 2014 (UTC)

What should Wikimedia link templates do with "invalid" language codes?

There are a number of pages showing errors right now, because they use a Wikimedia link template like {{wikipedia}} with a language code that Wiktionary doesn't recognise. For example, hr (Croatian, considered part of sh here), simple (Simple English, considered part of en) and so on. We certainly do want to be able to link to these Wikipedias, but their non-standard (from our perspective) codes are a bit of a problem. How would this best be solved? —CodeCat 13:39, 11 April 2014 (UTC)

Why not an explicit list of exceptions for use in the kind of templates where such use is known to be valid? Occasional processing of the dumps could find new exception-template combinations, because it doesn't seem worthwhile to waste more categories on it, though it might be.
I never thought that we could have a perfectly accurate and complete list of language codes for all conceivable purposes, especially as the variable used by such codes seems to tempt folks to use it for other purposes, to which temptation some folks inevitably will succumb.
Would this even be a problem if our modules weren't so designed to throw ugly error indications at the failure of a variable to be on a defined, finite, but very large list? DCDuring TALK 14:00, 11 April 2014 (UTC)
I suppose we could make something along the lines of {{wikimedia language}}, but in reverse. —CodeCat 14:31, 11 April 2014 (UTC)
Wikimedia project language codes aren't the same as our language codes, so they shouldn't be processed by the same code that processes our language codes. Why even bother checking against our language data modules, since there are so many of our codes that don't have projects, and several projects with codes that don't match ours? We should check the language codes in Wikimedia link templates against lists of valid Wikimedia project language codes, after first converting the WT language codes that are different into their WM equivalents. Chuck Entz (talk) 21:49, 12 April 2014 (UTC)
The problem is that some templates like {{wikipedia}} apply language-specific formatting to the text. For example, the entry Загреб in Serbo-Croatian needs to tag the actual name of the article in the box as Serbo-Croatian and apply Cyrillic styling to it. Language/script tagging goes via {{lang}}, which uses our usual script detection routine to figure out the script based on the given text. And that, in turn, requires looking up the language and knowing what scripts it uses. That's where it fails. —CodeCat 22:03, 12 April 2014 (UTC)
And why exactly has this suddenly started to be a problem? Chuck Entz (talk) 23:15, 12 April 2014 (UTC)
Because the supporting templates, like {{lang}}, were converted to use Lua. That doesn't mean there wasn't a problem before, of course. It just means that now it's more obvious that there is one. Things changed specifically with this edit, which switched from {{script helper}} to {{lang}}. {{script helper}} has no Lua support, it just outputs whatever language and script you give to it (it's what all our script code templates use underlyingly). However, you can see that there was also some Lua in the old version: it retrieved the first script of the given language code. It only did that if there wasn't already a script specified, so an error was avoided back then by specifying the script in the entry, like at Загреб. So that means that linking with the code "sr" or "hr" with no sc= parameter would have triggered an error anyway, which is hardly proper behaviour for the template. The proper solution here could be for the template (and any like it) to recognise that "sr" is not a valid code on Wiktionary, and convert it to one that is valid before it's given to {{lang}}.
To confuse the matter, though, there's already a conversion step for the language code in that template, using {{wikimedia language}}. That template does the exact opposite: it takes a Wiktionary code and translates it into a Wikimedia code, like for example nan > zh-min-nan or nb > no. This conversion step is also used for other external linking templates, like the interwiki links in our translation template {{t+}}. That means that, in principle, the lang= parameter on {{wikipedia}} and {{t+}} specifies the Wiktionary-internal code, and not the Wikimedia code, and this lets us write {{t+|nb|word}} and generate word (no) with the link "corrected" to point to no.wiktionary instead.
Such a translation step works fine if you can uniquely determine the Wikimedia code from the Wiktionary code. But it fails when there's more than one wiki in what we consider the same language, like for English/simple or for the three varieties of Serbo-Croatian. {{wikipedia}} isn't the only template with this problem. {{t+}} also has the same issue; it's currently impossible to link to the Bosnian, Serbian or Croatian Wiktionaries using {{t+}}. {{t+|sh|something}} only links to the Serbo-Croatian Wiktionary: something (sh).
So really, the issue is not in the changes I've made. They've only exposed a deeper flaw in the thought process that has gone into these templates and how they're meant to work. In the end, we need to decide, is the lang= parameter supposed to specify a Wiktionary code, and if so, how do we deal with cases where that code does not uniquely define the Wikimedia code to link to? This is something we shouldn't just answer for {{wikipedia}} and the likes, but for {{t+}} as well. —CodeCat 00:06, 13 April 2014 (UTC)
User:Kephir has gone ahead and created a template that does the reverse conversion. Unfortunately, it's creating more errors than we already had, and the number is still climbing. It also doesn't address the underlying problem that I described above. —CodeCat 13:09, 15 April 2014 (UTC)
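The one-to-many problem described in this thread can be illustrated with a small Python sketch. The tables below contain only the pairs mentioned above (nan > zh-min-nan, nb > no, the three Serbo-Croatian wikis, and Simple English). The function names and the table layout are hypothetical and do not reflect how {{wikimedia language}} or any module is actually implemented.

```python
# Hypothetical illustration only: names and table layout are assumptions,
# not the actual data behind {{wikimedia language}}.

# Wiktionary code -> Wikimedia project code, for codes that differ.
WT_TO_WM = {
    "nan": "zh-min-nan",
    "nb": "no",
}

# Wikimedia project code -> Wiktionary code, for projects whose language
# Wiktionary treats as part of another language.
WM_TO_WT = {
    "hr": "sh",
    "sr": "sh",
    "bs": "sh",
    "simple": "en",
}


def to_wikimedia(wt_code):
    """Forward step: each Wiktionary code yields exactly one project code."""
    return WT_TO_WM.get(wt_code, wt_code)


def to_wiktionary(wm_code):
    """Reverse step: each project code yields exactly one Wiktionary code."""
    return WM_TO_WT.get(wm_code, wm_code)


def projects_for(wt_code):
    """All project codes that Wiktionary would treat as the given language."""
    found = {wm for wm, wt in WM_TO_WT.items() if wt == wt_code}
    found.add(to_wikimedia(wt_code))
    return sorted(found)
```

Under these assumptions, `to_wiktionary("hr")` gives `"sh"`, which is the code {{lang}} would need for script detection, while `projects_for("sh")` returns four candidates. A lang=sh parameter on its own therefore cannot say which wiki to link to, which is the open question for both {{wikipedia}} and {{t+}}.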

Tabbed languages problem

Can anyone figure out what's making tabbed languages break at tej? Lower Sorbian and Polish are being treated as subheaders of Hungarian, but I can't figure out why. —Aɴɢʀ (talk) 18:42, 11 April 2014 (UTC)

Columns templates. Fixed. Keφr 19:11, 11 April 2014 (UTC)
Thanks. I wasn't the one who used a line-initial ";" for formatting, but I wasn't aware it isn't compatible with tabbed languages. —Aɴɢʀ (talk) 19:18, 11 April 2014 (UTC)
I don't think that was the problem. The problem was in {{top4}}, which Kephir fixed. The line-initial semicolon was just bad formatting. -Atelaes λάλει ἐμοί 19:22, 11 April 2014 (UTC)
Oh, okay. Here's another problem: {{R:lv:LEV}} adds a category to the pages where it's transcluded, but it puts the category in the top (usually English) section instead of in the section where it's transcluded (usually Latvian). Anything we can do about that? —Aɴɢʀ (talk) 22:14, 11 April 2014 (UTC)
Well, I'm pretty sure the problem is that it ends up being the first category, which I'm assuming is because it uses {{catlangname}} instead of simply creating a category manually. Yair's done a lot of work to tabbed languages since I worked on it, and they were the one who wrote the category sorting in my incarnation, so I'm not 100% sure on this, but I believe that tabbed languages does not ignore the order in which categories appear. This is useful because not all categories are easily placed in a language just by reading their name. Would it be a significant detriment to simply drop the use of {{catlangname}} and simply code the category manually? -Atelaes λάλει ἐμοί 22:46, 11 April 2014 (UTC)
{{catlangname}} shouldn't be causing the problem here. It expands to the category wikicode, as if you had typed it manually, but it does some processing so it's "smarter". I'd be very surprised if changing it back to a manual category fixes the problem. @Angr: Could you give an example of an entry that has the problem? —CodeCat 23:17, 11 April 2014 (UTC)
vasara is the example I looked at. In this case the category appears correctly in the Latvian tab, but so do all the Finnish categories. The problem is that it comes before them. Why would that be? -Atelaes λάλει ἐμοί 23:36, 11 April 2014 (UTC)
I suspect it's because of the <ref> tags. They're handled specially, and apparently the wiki software lists the categories listed in references before it lists the categories in the rest of the page. —CodeCat 00:06, 12 April 2014 (UTC)
Ah. I did not notice that. That would make sense. -Atelaes λάλει ἐμοί 00:23, 12 April 2014 (UTC)
The page I found the problem on was actually ass, and using {{catlangname}} isn't the problem because it was adding a category directly until I changed it to use {{catlangname}}, which I did in hopes that that would solve the problem. (It didn't.) —Aɴɢʀ (talk) 07:34, 12 April 2014 (UTC)

Small bug in orange links gadget

I found a small bug in the gadget that turns links orange when the page exists, but doesn't have a language section of the same name. On Appendix:Proto-Germanic/furi, the Dutch link to veur#Dutch is blue, but the actual page only contains a "Dutch Low Saxon" section, none for "Dutch". Is it only looking at the first part of the name? —CodeCat 01:38, 13 April 2014 (UTC)

I believe that's exactly what's happening. My jQuery is crap (something I need to remedy at some point), but it looks like the code is simply using .inArray(), which, according to the documentation page, functions identically to JS's native .indexOf(). So, it's simply testing whether the target hash matches the beginning of an anchor on the page. I imagine that this produces false positives rather rarely, but obviously not never. Does simply using Yair rand (talk • contribs) ping them? I'm not comfortable enough with the code to make the changes myself, both because of my jQuery impotence, and because there might be a reason for the sloppiness that I'm not thinking of. -Atelaes λάλει ἐμοί 02:20, 13 April 2014 (UTC)
Fixed, I think. --Yair rand (talk) 07:18, 24 April 2014 (UTC)
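The suspected cause, a prefix test where an exact match was wanted, can be shown in a few lines of Python. This is not the gadget's code (which is JavaScript and was not quoted in this thread); the function names and the sample section list are made up for illustration.

```python
def has_section_prefix(sections, target):
    """Loose check: true if any section name merely starts with target."""
    return any(name.startswith(target) for name in sections)


def has_section_exact(sections, target):
    """Strict check: true only if a section is named exactly target."""
    return target in sections


# Language sections on a page like the one described for veur.
sections = ["Dutch Low Saxon"]
```

With this sample list, the loose check reports a "Dutch" section (so the link would stay blue), while the strict check does not (so the link would turn orange).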

Loophole, sort of

It wouldn't let me edit User:Mglovesfun/to do/French/verb forms needing attention because I'm editing another user's page. But it would allow me to move it, thus bypassing the issue altogether. I wouldn't call this a bug because it's expected behavior. Can we call it a loophole? If so, does it actually need closing or can we just undo any bad moves? Renard Migrant (talk) 11:40, 14 April 2014 (UTC)

If I recall correctly, the original purpose of this edit filter was to prevent vandals from thrashing user pages and feeding bots with bad data. Nobody actually thought about moving pages to bypass that, including vandals, so it was not put in. If we want to maintain this filter, I guess we have to close this loophole now that the cat is out of the bag.
Though personally, I would drop the filter altogether, if only to counter the "vandals must be stopped at literally any cost" mentality I sometimes perceive in regulars, and to remove the impression that userspace is a sacred property of the user. Most of the time I hear about this filter, it is preventing someone from doing something useful. The abuse log also does not seem to contain evidence of any spectacular vandalism prevented. Keφr 12:17, 14 April 2014 (UTC)
I believe it was prompted by a rash of spam left by bots on inactive users' pages. Most of the actual vandalism is against admins, who can protect their own pages. Chuck Entz (talk) 13:18, 14 April 2014 (UTC)

Broken Swedish declension template

Why is Template:sv-noun-reg-ar broken? All eight fields are blank. See for example biltvätt. Can one do "related changes" on a template and see all related changes in low-level templates that it calls? --LA2 (talk) 20:09, 17 April 2014 (UTC)

The forms are only shown if {{isValidPageName|{{{sg-nom-indef|}}}}} and so on are true. But those parameters (sg-nom-indef and such) are normally empty, so they are not valid page names and the test fails. —CodeCat 20:49, 17 April 2014 (UTC)
It used to work fine. What has changed? LA2 (talk) 19:59, 18 April 2014 (UTC)
Perhaps your edit in isValidPageName on April 2 is guilty? So either you can change it back (it wasn't broken, it worked fine) or you can fix the Swedish templates that use this template? LA2 (talk) 20:04, 18 April 2014 (UTC)
It wasn't at all fine, and reverting those edits would certainly break a lot of things. Read the "Recent layout changes" thread above. —CodeCat 21:48, 18 April 2014 (UTC)
I was not involved in designing these templates, and I have no idea why they use isValidPageName in the first place. My involvement is limited to adding Swedish words. I have no intention of getting involved in overall layout or template architecture politics. I'll leave it as broken as it is and possibly abandon Wiktionary altogether if things are going to become broken in this way. It was fun while it lasted. --LA2 (talk) 21:54, 18 April 2014 (UTC)
Repeat after me: templates are not data and I should really just relax. DTLHS (talk) 22:26, 18 April 2014 (UTC)
Guys I've fixed it --kc_kennylau (talk) 02:26, 19 April 2014 (UTC)
You fixed nothing, you just cheated around the problem by making {{isValidPageName}} treat empty strings as valid page names, which they're obviously not. I've reverted. —CodeCat 02:29, 19 April 2014 (UTC)
The template works that way, treating empty strings as valid pagename. --kc_kennylau (talk) 03:17, 19 April 2014 (UTC)
Why don't you actually fix Template:sv-noun-reg-ar? That's the template that's really broken, Template:isValidPageName is fine. —CodeCat 12:32, 19 April 2014 (UTC)
I've fixed the template, apparently because everyone else is too busy arguing with me. —CodeCat 12:40, 19 April 2014 (UTC)
@CodeCat: You have only fixed one template. All these templates use the "bug" that isValidPageName with an empty string returns valid. Moreover, {{isValidPageName|aksdjfkasjglkjas}} returns valid. --kc_kennylau (talk) 13:39, 19 April 2014 (UTC) deleted my line after checking --kc_kennylau (talk) 13:43, 19 April 2014 (UTC)
{{isValidPageName}} is meant to be used when a particular page name is allowed, not when it exists. For the latter we already have {{#ifexist:. The template is marked as deprecated because in most cases where it has been used in the past, it was to allow people to include wikilinks in a parameter without breaking things. {{l}}, {{head}} and other Lua-enabled templates can now handle such cases flexibly, so this template isn't really needed anymore then. —CodeCat 13:44, 19 April 2014 (UTC)
Maybe I was deceived by my pride. --kc_kennylau (talk) 13:49, 19 April 2014 (UTC)
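The disagreement above comes down to where the empty-parameter case is handled. The Python sketch below shows one possible arrangement and is not the actual fix applied to Template:sv-noun-reg-ar. The forbidden-character set is a deliberately simplified assumption (MediaWiki's real title rules are longer), and `form_to_show`, its `override`/`default` arguments, and the sample word are hypothetical.

```python
# Characters MediaWiki does not allow in page titles. The real rules are
# longer; this simplified set is an assumption made for the sketch.
FORBIDDEN = set("#<>[]|{}")


def is_valid_page_name(name):
    """Validity check in which an empty string is NOT a valid page name."""
    return bool(name) and not any(ch in FORBIDDEN for ch in name)


def form_to_show(override, default):
    """Pick the text for one table cell.

    Use the editor's override if one was given, otherwise the generated
    default, and only then test validity. Testing the raw override, which
    is usually empty, would leave every cell blank.
    """
    candidate = override or default
    return candidate if is_valid_page_name(candidate) else ""
```

In this arrangement the validity check stays strict, and the calling template is the one that decides what an omitted parameter means.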

Various sections at

Something's gone wrong with language and other sections starting at Mandarin section and below.--Anatoli (обсудить/вклад) 09:48, 18 April 2014 (UTC)

  • An incompatible mixture of template types had been used in the derived terms section. Fixed. SemperBlotto (talk) 10:03, 18 April 2014 (UTC)
Thank you! --Anatoli (обсудить/вклад) 10:20, 18 April 2014 (UTC)

Module errors in Proto-Germanic entries

I'm not sure why but I think this [[3]] is causing module errors in some Proto-Germanic entries. See Appendix:Proto-Germanic/abraz and Appendix:Proto-Germanic/agluz for examples. Anglom (talk) 15:19, 19 April 2014 (UTC)

@Anglom: Look at the whole function and you'll know why: the variable "frame" is not even defined in the function! Change line 51 to "return export.tag_text(frame, text, lang, sc, face)" and line 55 to "function export.tag_text(frame, text, lang, sc, face)" to solve the problem. --kc_kennylau (talk) 15:56, 19 April 2014 (UTC)
I can't edit it. I would appreciate it if someone else would, though. Anglom (talk) 16:09, 19 April 2014 (UTC)
@-sche, Angr, CodeCat, SemperBlotto, Kephir, Wikitiki89: sure. --kc_kennylau (talk) 16:14, 19 April 2014 (UTC)
The problem is something that CodeCat overlooked in trying to create tracking templates. --WikiTiki89 16:22, 19 April 2014 (UTC)
What do you mean by overlooked? What's the problem created? Moreover, would the edit that I suggested solve the problem? (I mean using the preview function in the module) --kc_kennylau (talk) 16:25, 19 April 2014 (UTC)
Never mind, I found the answer above. --kc_kennylau (talk) 16:27, 19 April 2014 (UTC)

Diacritic differentiation in search box

Recently, perhaps as part of the Cirrus update, Greek letters with diacritics have come to be treated as separate letters in the search box as well as in the page search proper. E.g. typing "ά" into the search box will now show as completions words that begin with "ά" only, and not "α ἀ ἁ ᾶ ᾳ ᾅ" &c. Likewise, searching for "ὅρος" yields only "ὅρος", and not "ὄρος" or "ὀρός". I find the old behaviour preferable, for two reasons.

First, typing Ancient Greek diacritics is pretty much universally more difficult than typing the plain letters. I do a fair amount of contributing on an iPad (not by choice, it's the only tool I have access to for much of the day) and typing anything but a tonos when not editing requires copying it from elsewhere. Before, I could just type in the word sans diacritics in the search box, then click the right suggestion. Now the fastest way to navigate is to make a Google search.

Second, it's quite common to forget the exact diacritics or accent placement of an Ancient Greek word, and this change makes it significantly more difficult to navigate.

I can understand the advantage of such a change, but I really don't think we have enough entries in either Ancient or Modern Greek for such an advantage to be worth the problems I have specified.

Conclusion (tldr): The search tool now treats "α ἀ ἁ ᾶ ᾳ ᾅ" etc. as separate letters (as opposed to its treatment of "e ë é" etc.), which makes navigation much more difficult. Is it possible to return to the original behavior? ObsequiousNewt (ἔβαζα|ἐτλέλεσα) 15:31, 19 April 2014 (UTC)

As a sidenote, I personally don't like the new search engine AT ALL, because when I use my mobile to view this website, I can't go to a page directly using the search box, I have to go through a searching page first. --kc_kennylau (talk) 15:50, 19 April 2014 (UTC)
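The old behaviour asked for above amounts to comparing strings after diacritic folding. A minimal Python sketch of that idea follows; it uses only the standard `unicodedata` module and says nothing about how the search engine itself is configured. The function name `fold_diacritics` is made up for this example.

```python
import unicodedata


def fold_diacritics(text):
    """Strip combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

With this folding, "ά ἀ ἁ ᾶ ᾳ ᾅ" all reduce to plain "α", and "ὅρος", "ὄρος" and "ὀρός" all reduce to the same unaccented string, in the same way that "ë" and "é" reduce to "e".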

term (and generic question) about named/positional param ordering

I'm looking at the term template specifically, but this could apply to any template that has precedence but no standard written. I'll look at simple (maybe old?) term templates and see term|lang=<lang>|<gloss>, but this is not normal, in fact out of hundreds of thousands of term entries that I've parsed only 6000+ have this ordering.

It seems that the correct ordering is term|<term>|<gloss>|lang, and if <gloss> is not there, that's ok, but it should be represented like term|<term>||lang. Is this correct?

I'm asking because I've been parsing data out of a recent XML dump for a personal project, and the non-standard usages of term are driving me crazy. Rather than code a whole section of workarounds, I'd rather fix this at the source. I know that named parameters appear after the positional parameter they are supposed to modify, but that's not the case above; lang comes directly after term.

It doesn't render incorrectly, which makes me think there is no standard for where the lang position should go. Is this correct? Or should I go through and change all these term entries as I come across them? I know it's not a priority and won't affect anything, but it's breaking my parser and causing me lots of headaches. Thanks! —This comment was unsigned.

To give you a quick answer before someone who actually is good at templating gives the full story, here is my understanding:
{{term}} has three positional parameters: 1: the term itself, 2: a piped alternative which appears instead of the term itself, 3: a gloss. I suppose there could be more but I haven't seen them. lang (language), sc (script), tr (transcription) are the three most common named parameters, but there might be more of them too. Named parameters can appear anywhere. Position 1 for the positional parameters refers to the first parameter slot that does not have a "name=" in it. The full story can be found at Mediawiki Help:Templates.
For {{term}} the second slot is very commonly not used, but the empty position is mandatory to make sure the correct interpretation is given to the gloss (3rd) parameter. DCDuring TALK 22:32, 20 April 2014 (UTC)
{{term}} also has two other parameters: "pos" for grammatical part of speech (aka, word class) and "lit", for literal gloss of a term that presumably also has a non-transparent gloss. Also, whether we need it or not most of the operative code is in Lua modules, as is the case with {{term}}. Just a little something we do, together with CSS and JS to help make the whole thing even less transparent. DCDuring TALK 22:44, 20 April 2014 (UTC)
That's what I thought; the named parameters don't give me grief, it's just the positional parameters when they conflict with named params :) I've been using this page as a reference - https://en.wiktionary.org/wiki/Template:term. Would it be appropriate for me to change this {{term|lang=mul||ᚫ|tr=a|ansuz}} to this {{term||ᚫ|tr=a|ansuz|lang=mul}} (where lang is the final param)? I'd rather fix it here than code around it; it's a rare case to see it like this. Panikal (talk) 22:58, 20 April 2014 (UTC)
Named parameters (with a name and an equals sign) are not ordered, they can be placed anywhere among the others. Even this is valid: {{term|tr=a|3=ansuz|lang=mul|2=ᚫ}}. So there is no reason to change what you propose, it's simply part of the wiki to allow different ordering of parameters. —CodeCat 23:41, 20 April 2014 (UTC)
Ah ha! That was the hint I needed. Thanks! I've looked around for that sort of universal rule but hadn't found it - the pattern I was seeing was all the pages reflect all the template help pages, so I assumed it was just a general rule with some workarounds to render bad templates. ;) I'll change my approach for this then. Thanks again! Panikal (talk) 23:53, 20 April 2014 (UTC) Edit - Only six thousand individual lines of the latest articles dump have 'bad templates' by what I thought was the definition, so it's almost a de-facto standard.... :)
Often templates have certain orderings of parameters that are more common, but it's up to the editor to decide what they want to use and nothing is really standardised, nor does it need to be. Usually lang= is put last, but I often find myself writing {{IPA|lang=...|/word/}} by force of habit. If you're looking to parse wiki code, you may want to give mwparserfromhell (for Python) a try. It's very good. —CodeCat 00:08, 21 April 2014 (UTC)
*Not* having to worry about 'exceptions' of the named params being in place of the positional params allowed me to change my approach and reduce the complexity of that part of the code by about 50%. I hope it doesn't change. ;) Thanks again for your help! Panikal (talk) 01:13, 21 April 2014 (UTC)
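The rule discussed in this thread (named parameters may appear anywhere, and only unnamed parameters consume positional numbers) can be sketched in Python as follows. This is a naive illustration that assumes a flat template call with no nested templates, links or other pipes inside values; `parse_template` is a made-up name, not part of MediaWiki or mwparserfromhell, and for real dumps a proper parser is the safer choice.

```python
def parse_template(wikitext):
    """Split a flat template call into its name and a parameter dict.

    Naive sketch: assumes no nested templates, links or other pipes
    inside parameter values.
    """
    inner = wikitext.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("not a template call")
    name, *parts = inner[2:-2].split("|")
    params = {}
    position = 0
    for part in parts:
        key, sep, value = part.partition("=")
        if sep:
            # Named (or explicitly numbered) parameter: where it appears
            # in the call does not matter.
            params[key.strip()] = value.strip()
        else:
            # Unnamed parameter: takes the next positional number.
            position += 1
            params[str(position)] = part
    return name.strip(), params
```

Under these assumptions, {{term|lang=mul||ᚫ|tr=a|ansuz}} and {{term||ᚫ|tr=a|ansuz|lang=mul}} parse to the same dictionary, and {{term|tr=a|3=ansuz|lang=mul|2=ᚫ}} differs only in that parameter 1 is absent rather than empty.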

Middle Vietnamese

I'd like to add some entries for common words in Middle Vietnamese. They'll all have citations ranging from the 17th century to the early 19th century. Before I begin, I have a couple questions:

  1. Should I put these words under a new "Middle Vietnamese" section and category? There doesn't seem to be an ISO 639 code for Middle Vietnamese, yet it doesn't feel right to put these words under a Vietnamese section, even with an "archaic" label, because there are substantial differences in orthography, grammar, and vocabulary.
  2. What should I do about words that can't be represented in Unicode? For example, I can fake the u+apex+tilde in "cu᷄̃" (cũng) with U+1DC4. And "bbĕào" (vào) uses a letter slated for the next release of Unicode.

 – Minh Nguyễn (talk, contribs) 07:13, 21 April 2014 (UTC)

We can certainly create our own code for Middle Vietnamese, as we have for many other languages that don't have an ISO 639 code. I'd suggest mkh-mvi. Until Unicode can accommodate Middle Vietnamese, I suppose approximations like U+1DC4 are the way to go, but we can't use images as substitutes, since they obviously can't be accommodated in page names. Until the "b with flourish" is part of Unicode, I'd suggest using some existing Unicode character like ƀ as a substitute. —Aɴɢʀ (talk) 10:51, 21 April 2014 (UTC)
U+A797 ꞗ Latin small letter B with flourish is available in Unicode 7.0 Beta, which is expected to be released in July. With the release so close, maybe we might as well use the actual character. (I included it in a font a couple years ago.) – Minh Nguyễn (talk, contribs) 09:33, 22 April 2014 (UTC)
It'd be great to have these. Presumably, they will be in Quốc Ngữ script, as attested since Rhodes' dictionary? How would you decide on the boundary between Middle Vietnamese and Modern Vietnamese? Wyang (talk) 09:39, 22 April 2014 (UTC)

Yes. Of course the primary script in that time was still chữ Nôm, but it shouldn't be a problem logistically until we start finding examples of archaic Nôm usage.

I probably won't quote de Rhodes's dictionary directly, per Wiktionary guidelines. However, his 1651 Catechism has a wealth of material. I've been citing it in places like Citations:bánh. The other major source I have is Philipphê Bỉnh's handwritten Sách sổ sang chép mọi việc (1822). You can find scans of the first couple pages here. Bỉnh interestingly sticks to de Rhodes's orthography (minus the B with flourish) even as his contemporaries have moved to something largely identical to today's Vietnamese alphabet.

Come to think of it, it may be a stretch to represent Bỉnh's early 19th century Vietnamese as "Middle Vietnamese", even if he's using a Middle Vietnamese orthography. Do you think it'd be better to treat Middle Vietnamese as a chronolect (as something like "vi-mid" in Module:etymology language/data)?

 – Minh Nguyễn (talk, contribs) 06:41, 24 April 2014 (UTC)

Template:ja-usex issue

Can someone please help fix this problem with the usex. It falls over exactly on the string "お聞き"/"おきき"

あのう,ちょっとお聞()きしたいんですが。駅(えき)はどこでしょう。

Anō, chotto okiki shitai n desu ga. Eki wa doko deshō.
Excuse me, but could you tell me where the station is?

It's commented out at あのう. --Anatoli (обсудить/вклад) 12:48, 23 April 2014 (UTC)

Fixed. Keφr 15:54, 23 April 2014 (UTC)
Dziękuję bardzo! --Anatoli (обсудить/вклад) 22:48, 23 April 2014 (UTC)

Protecting word of the day

What happened to the word of the day being protected from non-admin edits? --WikiTiki89 21:36, 24 April 2014 (UTC)

I unprotected the pages mainly because it makes bot maintenance more difficult. They're still semi-protected though. I don't know if full protection is really necessary... it's kind of overused on Wiktionary. —CodeCat 22:06, 24 April 2014 (UTC)