Wiktionary:Grease pit

Welcome to the Grease pit!

This is an area to complement the Beer parlour and Tea room. Its purpose is specifically for discussing the future development of the English Wiktionary, both as a dictionary and as a website.

The Grease pit is a place to discuss technical issues such as templates, CSS, JavaScript, the MediaWiki software, extensions to it, the toolserver, etc. It is also a place to think in non-technical ways about how to make the best free and open online dictionary of "all words in all languages".

It is said that while the classic beer parlour is a place for people from all walks of life to talk about politics, news, sports, and picking up chicks, the grease pit is a place for mechanics, engineers, and technicians to talk about nuts and bolts, engine overhauls, fancy paint jobs, lumpy cams, and fat exhausts. That may or may not make things clearer... Others have understood this page to explain the "how" of things, while the Beer parlour addresses the "why".

Permanent notice

  • Tips and tricks about customization or personalization of CSS and JS files are listed at WT:CUSTOM.
  • Other tips and tricks are at WT:TAT.
  • Everyone is encouraged to expand both pages, or to come up with more such stuff. Other known pages with "tips-n-tricks" are to be listed here as well.

March 2013

WT:CSS

I started a new page WT:CSS, because it was pointed out that our main style sheet is not documented. Michael Z. 2013-03-01 19:34 z

translation tables slow

I think starting with the introduction of Web fonts (but I'm not sure), it now takes a long time for translation tables to become drop-down-able. (Newest Firefox for Mac; newest Firefox for Windows.) Is there anything to be done about this? —msh210 (talk) 06:42, 4 March 2013 (UTC)

I suspect that it's not the translation tables, per se, but just general slowness. I could be wrong, but I believe that pretty much nothing clickable becomes active until the page is finished being drawn. I've noticed slowness everywhere- not just on pages with web fonts. Chuck Entz (talk) 06:59, 4 March 2013 (UTC)
Over the years this Wiki has become slower and slower. Every time somebody adds a bit more cleverness, or adds complications to a template, or replaces simple text with a template allowing users to change its appearance, it gets a little bit slower. I think we should stamp down on added cleverness, and maybe roll back some we already have. KISS SemperBlotto (talk) 08:11, 4 March 2013 (UTC)
I wonder too how much past cleverness might have simpler and more elegant solutions, now that folks have a clearer idea what we want for the site. I know that some processes I've encountered at various job sites have fossilized from years and years of past half-formed ideas about what was required, and sitting down and looking at specific inputs and required outputs can often lead to a much more streamlined way of doing things. -- Eiríkr Útlendi │ Tala við mig 16:19, 4 March 2013 (UTC)
Aren't {{t}} and its relatives reasonable suspects for performance problems? Can't someone figure out a way for Lua to improve performance? The number of large and very large translation tables seems to be growing faster than average page size. DCDuring TALK 18:23, 4 March 2013 (UTC)
If Scribunto really can make templates like {{t}}, {{context}} and other big templates load faster, it could be a very useful tool for improving the usability of our larger pages. Mglovesfun (talk) 18:55, 4 March 2013 (UTC)
Have you tried turning off WebFonts in the preferences? -- Liliana 18:27, 4 March 2013 (UTC)
Good point. Okay, two more bits of information: (1) I think it only happens the first time I load an entry (any entry, not the specific one I'm looking at) in a cacheless browser. (2) I just tried it with WebFonts off and it didn't happen. (Or if it did then it was small enough a delay that I didn't notice it, which was not my experience previous times.)​—msh210 (talk) 19:50, 4 March 2013 (UTC)
Just what I suspected: downloading all these fonts takes a long time and the browser is effectively frozen while it happens. This makes WebFonts a real nuisance. -- Liliana 20:26, 4 March 2013 (UTC)
How do we make web fonts opt-in, on a per-language basis? My browser downloads 1.6 MB of web fonts on this site, but needs zero of them to display all of the languages on the Main Page. This is irresponsible for any website, much less one that should be accessible to people with poor network bandwidth. Michael Z. 2013-03-04 20:42 z
Wow, I thought the whole webfonts feature only kicked in if the browser was missing a font required to display a given page. I had no idea it was causing downloads even when not needed. That's not good. -- Eiríkr Útlendi │ Tala við mig 20:49, 4 March 2013 (UTC)
It only loads a font when there's some text on the page that is set to use a specific font and the user doesn't already have that particular font on their computer, even if they already have some other font that could display the text, I think. --Yair rand (talk) 21:08, 4 March 2013 (UTC)
I've been having this issue since December. No idea what's causing it. --Yair rand (talk) 21:04, 4 March 2013 (UTC)

So who chose the web font files for Wiktionary? Are they the same fonts that we are specifying in MediaWiki:Common.css? Michael Z. 2013-03-05 00:15 z

The WebFonts default settings, which we're currently using and can get changed through bugzilla, apply certain fonts to certain languages. Since this doesn't cover all of our uses, there are also some set from Common.css. --Yair rand (talk) 01:28, 5 March 2013 (UTC)
What are the default settings (the docs[1] only list all supported languages)? Which fonts set from Common.css? Where is our documentation for this? Which specific browser or OS inadequacies are we serving fonts for? Michael Z. 2013-03-05 01:58 z
"Supported languages" seems to mean languages which fonts are served for by default. The only fonts set from Common.css are for {{Bugi}}, {{Ethi}}, and {{Mymr}}, I think. --Yair rand (talk) 02:05, 5 March 2013 (UTC)
How do I find out which ones? How were the requirements determined? I hope we don’t add another 300 kB to the page load just because one editor requests their favourite font. I can’t afford to have my mobile plan notch up a tier just because I visit Wiktionary five times during a month. Michael Z. 2013-03-05 02:24 z

Modules can now be documented

There has been an update to Scribunto, which now automatically transcludes the documentation subpage onto the top of the module page. Documentation subpages can be used to provide nicely formatted documentation of the module, and also allow you to categorise it (put the category in <includeonly> tags). Documentation subpages are treated as "special" by the software: unlike subpages with other names, they are not interpreted as modules. The documentation subpage's name can be changed by editing MediaWiki:scribunto-doc-subpage-name. Its default value is "doc", but I've changed it to "documentation" per WT:RFM#Documentation subpages to /documentation. —CodeCat 22:19, 6 March 2013 (UTC)

A recent update to Scribunto [2] has changed the way documentation pages are handled; the setting is now at MediaWiki:Scribunto-doc-page-name instead. I've updated it accordingly. —CodeCat 16:33, 15 March 2013 (UTC)

updates to User:Conrad.Irwin/creation.js

This script generates sense-lines of the form {{form of|[[lemma]]|lang=foo}}. Unless I'm mistaken, it no longer needs to explicitly wikilink the terms, because the templates create links automatically and our page counter no longer relies on the presence of square brackets. Also: we could discuss whether to update it to use {{head|foo|partofspeech?}} rather than '''pagename'''. - -sche (discuss) 18:01, 9 March 2013 (UTC)

I would definitely agree with using {{head}}, although I'm not sure if a PoS is needed, since many form-of templates themselves already add categories. I don't know if that is desirable, but that's a separate question. Also, I think it would be a good idea to replace all existing cases where such raw links are still in use. Could someone make a list of all templates that still allow such usage? I can then add a cleanup category to them, and run a bot script to update all the usages so that we can finally abandon this "legacy". —CodeCat 18:08, 9 March 2013 (UTC)
What is the advantage, apart from uniformity, to having {{head}} instead of using PAGENAME for, let's say, English? Why would we want to have such a vast number of transclusions of a single template? DCDuring TALK 18:47, 9 March 2013 (UTC)
For English, the advantage is that it is consistent with our intended coding of headwords elsewhere. There is somewhat of a consensus to move towards more CSS-based formatting combined with making better use of semantic HTML and classes rather than hard-coded formatting. One of those things is to write headwords as <strong class="headword" lang="foo">word</strong>, which we've already started doing for several templates and modules and which I would definitely consider a good thing. However, if we use plain bold text for English, then that would make English inconsistent with all other languages. —CodeCat 19:04, 9 March 2013 (UTC)
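The headword markup quoted above can be illustrated with a minimal sketch. This is Python rather than the wikitext or Lua that {{head}} actually uses, and the function name `headword_html` is hypothetical; it only shows the target HTML shape from the comment, with the inputs escaped.

```python
from html import escape

def headword_html(word, lang):
    """Build the semantic headword markup quoted in the discussion.

    Hypothetical helper: the real {{head}} template does more than this
    (categories, script handling); only the <strong> wrapper is shown.
    """
    return (
        f'<strong class="headword" lang="{escape(lang, quote=True)}">'
        f"{escape(word)}</strong>"
    )
```

Because the formatting lives in the `headword` class rather than in hard-coded bold text, the same call would serve English and every other language alike.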
Is it worth calling {{head}} on more pages for this reason? Doesn't the extra template call to a relatively large template slow down the loading of pages? Mglovesfun (talk) 20:04, 9 March 2013 (UTC)
It's not really a very large template, and when it's converted to Lua it will be quite a bit faster because Lua can easily support any number of optional parameters the way {{head}} uses them, without any significant slowdown. And anyway, {{head}} isn't really called that often per page... {{l}} is called, on average, more often within any single language section than {{head}} is called on any given page (to put that differently: most entries have more links than pages have entries). —CodeCat 20:25, 9 March 2013 (UTC)
Yes, please get rid of square brackets. User:Mglovesfun/vector.js has a line (more than one in fact) to get rid of square brackets from templates that do literally nothing. Mglovesfun (talk) 20:30, 9 March 2013 (UTC)
Ok, then I would like to have a list of all the templates that currently contain code to allow raw-linking in their parameter. I have already noted {{form of}}, which is used by many other templates as well; it now adds entries to Category:Entries using form-of templates with a raw link. You can recognise the templates because they use {{isValidPageName}}. Come to think of it... are there any other uses for that template at all? —CodeCat 14:05, 11 March 2013 (UTC)

Improving how module documentation currently displays

Currently, when a module needs documentation, it shows a link, like on Module:User:CodeCat. But most of the time, we only want/need the documentation page to put the module in a category, so once we create it, it ends up transcluding an empty page and looks like this: Module:eo-conj. I wonder if that could be improved, because it seems like a problem in a few ways. Firstly, there is no indication that anything at all has been transcluded, unlike what {{documentation}} displays. Secondly, there is no link to the documentation page itself; this would be fixed by fixing the previous problem, but a tab like we have on Template: pages would also be a good idea. And finally, it seems rather pointless for Scribunto to think that it has transcluded documentation. But all it has really transcluded is a category, so it ends up showing a horizontal rule with nothing above it, which leaves you to guess about what it means. —CodeCat 18:04, 9 March 2013 (UTC)

List request.

Hi. I'm trying to ensure that all English plurals belonging in the categories Category:English plurals ending in "-ies", Category:English plurals ending in "-es", and Category:English irregular plurals ending in "-ves" are properly categorized. However, as we currently have 115,950 English plurals, weeding through that list is proving to be excessive. Can someone with the technical knowhow generate individual lists of all English plurals ending in "ies", "ses", "xes", "ches", "shes", and "ves", preferably limited to terms which are not already in the aforementioned categories? I will then plow through the lists and fix the ones which need to be categorized. (I suppose this could be automated entirely if someone could make a bot that understood that plurals like "waves" are normal formations while plurals like "pelves" are an "-es" formation and plurals like "wives" and "wolves" are a "-ves" formation). Cheers! bd2412 T 03:22, 11 March 2013 (UTC)

I noticed that you've been adding that category to many pages. However, when {{en-noun}} is converted to Lua, that will all become redundant, because Lua can easily perform the categorization itself, automatically. —CodeCat 13:44, 11 March 2013 (UTC)
I'm afraid I don't know what Lua is, or how it would perform such categorization. Although some of these pluralizations are predictable, it would need to know for example that "leaf" becomes "leaves" while "waif" becomes "waifs". bd2412 T 01:54, 12 March 2013 (UTC)
Re: what Lua is: See Wiktionary:Scribunto. Re: knowing that "leaf" becomes "leaves" while "waif" becomes "waifs": Well, technically, that information is already embedded in the templates; [[leaves]] contains {{plural of|leaf}}, for example. But I'm not sure how useful that fact is, since {{plural of}} is not English-specific, so we wouldn't really want to "contaminate" it with this sort of categorization information. (Though to be honest, I'm not sure these categories should exist, anyway; wouldn't it be better for [[leaf]] to be in Category:English nouns with irregular plurals in "-ves"? The latter, in addition to being preferable in general IMHO, is also doable by Luacizing {{en-noun}}.) —RuakhTALK 02:25, 12 March 2013 (UTC)
I don't see a conflict between having leaf in a category for nouns having a certain kind of irregular plural, and having leaves in a category for nouns being that kind of irregular plural. I think the categorization would be particularly useful, given that leafs exists (as a form of the verb, to leaf), and that similar instances occur of words existing that readers might mistakenly assume to be the regular plural form of words with irregular plurals. If someone would be so kind as to generate the aforementioned lists, I will gladly effect this categorization in a matter of hours. bd2412 T 02:52, 12 March 2013 (UTC)
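The list request in this thread could be approached roughly as follows. This is only a sketch in Python: it assumes the titles in the English plurals category and the members of the three target categories have already been exported to plain lists, and the function name and sample words are hypothetical. It groups titles by ending and nothing more; telling "waves" from "wives" would still need the manual review offered above.

```python
from collections import defaultdict

# Endings named in the request. None of them is a suffix of another,
# so the order in which they are tried does not matter.
ENDINGS = ("ies", "ses", "xes", "ches", "shes", "ves")

def group_uncategorized_plurals(plurals, already_categorized):
    """Group plural titles by ending, skipping ones already categorized.

    `plurals` and `already_categorized` are assumed to be pre-exported
    lists of page titles; this sketch does not query the wiki itself.
    """
    done = set(already_categorized)
    groups = defaultdict(list)
    for title in sorted(plurals):
        if title in done:
            continue
        for ending in ENDINGS:
            if title.endswith(ending):
                groups[ending].append(title)
                break  # each title lands in at most one list
    return dict(groups)
```

Calling `group_uncategorized_plurals(all_plurals, categorized)` would return one list per ending, ready to be pasted onto a user subpage for review.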

Module:lang/legacy

I think the question of how we should really handle language-codes (etc.) is incredibly complex, because languages are incredibly complex, and there are a lot of just-slightly-independent dimensions (e.g. WMF language prefix vs. ISO language code vs. HTML language tag); but I don't think we should wait until we've hammered that stuff out (or even started hammering it out) before we start taking advantage of Scribunto.

So, how to take advantage of Scribunto, without hammering out the issues surrounding language codes?

One option is to require that language-manipulation be handled in template-space, before invoking Lua; so, for example, Template:context would call {{languagex}} to get the language-name for a given code, and would pass that in to the Scribunto module it uses. The problem with this option — or at least, one problem with this option — is that {{languagex}} is exactly the sort of expensive template that Scribunto is supposed to help us move away from.

Another option is just to create Module:lang now, with the intent of improving it later. The problem with this option is that any real improvements will probably require fundamental changes that will break everything that uses the module.

So instead, I'd like to suggest that we create Module:lang/legacy ("legacy" being a software-engineering term describing an old system that's still in use but does things in ways that are now considered less than ideal), with a more-or-less direct translation of what we've got now. It would then be pretty straightforward to Luacize existing templates without making any breaking changes to them; and then, at some glorious future date when Module:lang is ready, we can slowly modify these templates to take advantage of its luminous beauty.

Are people O.K. with that general approach? If so, I'll set about creating Module:lang/legacy, and will post back here for further feedback before we actually start using it.

RuakhTALK 05:09, 11 March 2013 (UTC)

Isn't that more or less what Module:languages already does? It is pretty much a direct import of the language code templates, and I haven't made any other changes. —CodeCat 13:41, 11 March 2013 (UTC)
Yeah, I noticed that module later. (And I noticed that you hadn't started using it yet, presumably because you wanted to gather input first? If so, I appreciate your caution.) So basically what I'm proposing is (1) that Module:languages be moved to Module:lang/legacy (or Module:languages/legacy if you prefer); (2) that it be changed to match our current structure more precisely (e.g., proto: and so on); and (3) that it be a table of functions (corresponding to existing templates like {{langnamex}}) rather than of raw data. (The raw data could still be exported as p.data or something, but the current approach has the module only include raw data, which is unfortunate.) —RuakhTALK 15:12, 11 March 2013 (UTC)
I did post about it on the BP or GP (I don't remember which). And I haven't started using it because of the speed issues it has, which are discussed on the talk pages. However, the good news is that they've added a new function specifically for this case. It imports data as read-only, but allows it to be shared by all invocations on a page. So while a single use of that module is still somewhat expensive, it would never be imported more than once per page so it is not a problem. I'm not sure what the use would be of your proposal though. I realise that it would be for compatibility reasons, but even then I don't see the purpose of converting it into a table of functions. Also, one of the caveats with the read-only import is that the imported table can't contain functions, only raw data. —CodeCat 16:50, 11 March 2013 (UTC)
Re: "I did post about it on the BP or GP (I don't remember which)": I'm almost positive that you didn't. You did post about User:CodeCat/Module:lang, though, which may be what you're thinking of. Re: read-only import: Well then, the data can go in Module:lang/legacy/data. :-)   —RuakhTALK 02:34, 12 March 2013 (UTC)
Ok, after thinking about it a bit more I think I understand. You are asking for a kind of "glue" module between old code and the language data. But in the case of {{languagex}} I don't see much of a point. After all, a Lua call like languages_legacy.languagex("fr") would just translate to languages["fr"][1]. There is an alternative though, if you like the idea of wrapper functions around raw data. Lua supports so-called metatables, which are tables that really have accessor functions behind them. Metatables, being functions, can't be included in a read-only module though. —CodeCat 16:56, 11 March 2013 (UTC)
But languages_legacy.languagex("gem-pro") would translate to languages["proto:gem-pro"][1], because of the {{langprefix}} ugliness. (I'm quite seriously proposing that we reproduce exactly what we have now, including the stuff that no one likes, because there is still no agreement on how to improve that stuff. What I'm proposing is that we create a clearly-demarcated "legacy" area that allows us to migrate existing templates to Lua without breaking them.) —RuakhTALK 02:34, 12 March 2013 (UTC)
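The difference between a plain data lookup and the proposed legacy wrapper can be sketched as follows. This is Python rather than Lua (so the name sits at index 0 where the Lua examples use [1]), the two-entry table is a hypothetical stand-in for the real language data, and the "-pro" suffix rule is an assumption made purely for illustration, since the actual {{langprefix}} logic is not shown in this thread.

```python
# Hypothetical stand-in for the shared, read-only language data table.
LANGUAGES = {
    "fr": ["French"],
    "proto:gem-pro": ["Proto-Germanic"],
}

def langprefix(code):
    """Mimic the {{langprefix}} mapping (assumed rule, for illustration only)."""
    return "proto:" if code.endswith("-pro") else ""

def languagex(code):
    """Legacy-style lookup: apply the prefix, then return the language name."""
    return LANGUAGES[langprefix(code) + code][0]
```

Here `languagex("fr")` is just a table read, while `languagex("gem-pro")` has to go through the prefix mapping first, which is the behaviour the "legacy" module would preserve for existing entries.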
  • Any comments? —RuakhTALK 04:31, 22 March 2013 (UTC)
    It looks eminently wise to me, but I'm obviously so poor in wiseness reserves that I can't be trusted. Seriously, though, when you post about something in the GP and nobody really complains too much, it means that you might as well create it (I mean obviously you should post again before we actually use it, but that's another story). —Μετάknowledgediscuss/deeds 15:45, 25 March 2013 (UTC)
    I have an alternative proposal which is somewhat similar. I don't think it's wise to call it "legacy" because we'll never really be able to get rid of it entirely. Certain gadgets and templates rely on being able to subst: language templates. Therefore, I propose that we create an extra module that acts as a glue between wiki-space and module-space. Something like {{subst:en}} would become {{subst:#invoke:languages/invoke|language_name|en}}. Templates like {{languagex}} and {{family}} would then simply contain such an invocation to "forward" the request to Lua. —CodeCat 15:57, 25 March 2013 (UTC)
  • I agree with your proposal, but I see it as complementing mine rather than as an alternative to it.   ·   I agree that we'll probably always have certain language templates (or at least, that we don't currently intend to ever eliminate them all), but firstly, a lot of the details will hopefully change (do we really intend to keep {{langprefix}} forever?), and secondly, the underlying Lua support for them will really hopefully change. What is "legacy" here is the first pass at the Lua implementation: I hope that we will create a better Module:lang within the next year or two. Module:lang/legacy is the stopgap, the temporary glue that lets us migrate safely without sacrificing the long term. —RuakhTALK 14:38, 27 March 2013 (UTC)
I'm not really sure what you're saying, though. In what backwards-incompatible way do you think Module:languages would need to be changed? —CodeCat 14:59, 27 March 2013 (UTC)
I don't have very coherent thoughts yet, but our current system has a lot of inflexibility, and it has difficulty dealing with cases like als (which means "Tosk Albanian" when it's an HTML tag but "Alemannisch" when it's a subdomain of wikipedia.org) and Bosnian (which we mostly treat as part of Serbo-Croatian). I think we should consider decoupling some things that our current system wrongly assumes align one-to-one. (Some of these incorrect assumptions, we might decide to keep anyway, as valuable simplifications. But so far we haven't even really examined them very hard, because our system was too inflexible to make them seem conceivable.) —RuakhTALK 04:39, 28 March 2013 (UTC)
I can't think of anything like that, except maybe language families. Do we want languages to be able to belong to more than one family? The issue with Tosk/Alemannic doesn't really seem like an issue, any more than using "zh" as the subdomain when we have no such code has been an issue so far. I realise that you want to be cautious about it, but I get the feeling there really aren't any serious problems and you're trying to look for problems that don't exist "just in case". —CodeCat 18:23, 28 March 2013 (UTC)
I'm confused. I gave two specific examples, and you added another one (zh vs. cmn), so I don't understand what you mean when you say that you "can't think of anything like that". And I agree+disagree with you about Tosk/Alemannic: I agree that it's been exactly as much of an issue as zh vs. cmn, but I conclude that it is an issue rather than that it isn't one. More broadly — I think there are plenty of problems, and I can't imagine that you're blind to them. (You yourself have tried to make proposals that attempt to address some of these issues, and in fact, we've recently had arguments about them!) The only questions are (1) whether those issues can be fixed without breaking things, and (2) if so, whether some sort of migration strategy is needed; and in that respect, yes, I'm being cautious. I think that, thanks to the flexibility of Lua, we'll probably find that some of these problems won't require a migration strategy (e.g., if they can be fixed by adding additional data-points and tacking on or-expressions), and that some of them will. (This is not a matter of "trying to look for problems that don't exist", it's a matter of trying to prepare for solutions to acknowledged problems.) —RuakhTALK 19:24, 31 March 2013 (UTC)
I'm not blind to them, I just don't see how our current approach is so bad. How do we currently handle the zh-vs-cmn problem? We use a template that translates a Wiktionary code to a Wikimedia subdomain. This problem only really surfaces in cases where links need to be automatically generated and there's only a few templates that need that. —CodeCat 19:57, 31 March 2013 (UTC)
I'm not saying "our current approach is so bad", I'm just saying it's a bit broken, and should be a bit fixed. ;-)   Also, I guess we're avoiding talking about the elephant in the room. Neither of us is happy with {{langprefix}}, so why are you so reticent to view it as legacy code (at least in Lua)? —RuakhTALK 20:36, 31 March 2013 (UTC)
I didn't mention langprefix because to me it was self evident that it wasn't going to be implemented in Lua. —CodeCat 20:39, 31 March 2013 (UTC)
Oh, sorry, then I think most of this discussion has been at cross-purposes. To be absolutely clear: My proposal is that we port our current behavior to Lua, but in a module that is clearly marked "legacy". (I actually thought I'd been clear about that — repeatedly — but apparently not.) My rationale for porting our current behavior is as follows: (1) I think we should start taking advantage of Lua ASAP for templates like {{context}}; (2) I think there are some aspects of our current behavior that we can all agree should be changed (e.g. {{langprefix}}); and (3) I think it will take some time to reach consensus on how to change them. My rationale for marking it "legacy" is as follows: (4) to CodeCat "it was self evident that [langprefix] wasn't going to be implemented in Lua" except, of course, in modules marked "legacy". :-)   —RuakhTALK 20:48, 31 March 2013 (UTC)
And that is where I ask, again, what the point of such a module is if we're going to change it anyway. Why do all the extra work in porting something we know is legacy? I really don't understand. —CodeCat 21:04, 31 March 2013 (UTC)
I have explicitly said what I think the point is. (It's in the sentence that begins, "My rationale for porting our current behavior is as follows".) If you don't understand one of my premises, please say which one. If you disagree with one of my premises, please say which one and why. If you don't agree that my premises lead to my conclusion, please say so. (I mean, I suppose if you just want me to copy-and-paste my previous comment into a new one, I'm willing to do that, but it seems a bit strange.) —RuakhTALK 05:07, 1 April 2013 (UTC)
I still don't understand what you're saying. First, you say "we all agree that langprefix should go" and then you said "let's discuss possible ways to make it go". That makes no sense to me. If it goes, doesn't it just... well, go? Disappear? My confusion is that your statements appear to bring conclusions that contradict themselves. Either it goes or it stays in some form. And if we agree that it should go, what more details do we need to work out? —CodeCat 13:45, 1 April 2013 (UTC)
There are currently a whole bunch of entries that use things like lang=gem-pro, but the actual language template-name is {{proto:gem-pro}}. Without langprefix to correct the mapping, all of those entries will break. You have your own view about how this should be fixed: you think that proto:gem-pro should be scrapped in favor of gem-pro. I do not agree with that view. (I don't think this is news.) —RuakhTALK 14:04, 1 April 2013 (UTC)
So instead you filibuster the whole thing by requiring us to add "legacy" modules, which are really only there for your own satisfaction? *sighs* It looks like once again this is an issue that is between you and me, and that nobody else actually finds interesting enough to comment on... —CodeCat 16:43, 1 April 2013 (UTC)
WTF?? You are such a hypocrite. You could just as well say that I'm proposing the "legacy" module for your satisfaction, because the alternative that I'd propose is one that you would dislike. I am offering to do all the work to preserve the status quo until a consensus is demonstrated. But no, you think that your own view is somehow magically perfect, and everyone would somehow magically agree with it, if only I would get out of your way. Sheesh, grow up. —RuakhTALK 20:37, 1 April 2013 (UTC)
Du calme, du calme. Ruakh, I happen to (with my highly limited knowledge) agree with you — but calling people hypocrites and the like won't help. (Sorry if that sounds patronising. I just think that having a heated argument over something this minor is only detrimental to Wiktionary.) —Μετάknowledgediscuss/deeds 00:58, 2 April 2013 (UTC)
From my point of view (i.e. fr.wiktionarist), you really should focus on getting rid of hacks like this langprefix thing if you want to be able to use the full power of Lua. You just need a module with language functions (get_name(s), get_script, get_type, get_family) with an associated data module (instead of thousands of inefficiently disseminated data in templates and subtemplates). Dakdada (talk) 17:38, 1 April 2013 (UTC)

form of template bug

{{feminine of|calmo#Adjective|calmo}}

displays

  1. feminine form of calmo#Adjective

For historical purposes when that gets fixed, it is:

  1. feminine form of calmo#Adjective

This syntax used to work, and I'm not sure why it doesn't. I guess... calmo#Adjective isn't a valid page name. Is my guess right? Mglovesfun (talk) 11:43, 11 March 2013 (UTC)

Oddly, I think it's actually the code that allows putting raw links into form-of templates that is the cause of this. You're right, it's not a valid page name, and that's what that code goes by to determine whether something is a raw link or just a page name. So it treats its parameter as if it were a raw link, except it's not a link. However, once that code is removed, it should work. On the other hand, the template is missing a language parameter, so that still needs to be fixed. Another thing to consider is that there are probably several #Adjective sections on any given page, so the current approach doesn't actually do what it's intended to do. What you really want is to link to the adjective section of whatever language it is, but I don't think that is currently possible. I think if we have to choose between linking to #Adjective and linking to #language, the latter is preferable. —CodeCat 13:49, 11 March 2013 (UTC)
Special:WhatLinksHere/calmo#Adjective seems to be valid, mind you. Mglovesfun (talk) 19:47, 11 March 2013 (UTC)
Yes, but in a very sneaky way. Notice that when you actually visit the page, it shows links to "calmo" alone. When your browser sees that URL, it actually strips off the # part, so the webserver never sees #Adjective. If you ever actually sent a request for "calmo#Adjective" to the server, it would probably shout at you for providing an invalid request. :) —CodeCat 20:43, 11 March 2013 (UTC)
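Both points in this exchange can be checked with a short Python sketch. The URL split uses the standard library; the page-name check is a simplified assumption about what {{isValidPageName}} tests (it only looks for the characters MediaWiki disallows in titles), not a copy of the template.

```python
from urllib.parse import urlsplit

# Characters MediaWiki does not allow in page titles (simplified check).
INVALID_TITLE_CHARS = set("#<>[]|{}")

def is_valid_page_name(text):
    """Rough stand-in for {{isValidPageName}}: non-empty, no illegal characters."""
    return bool(text) and not (set(text) & INVALID_TITLE_CHARS)

def server_visible_part(url):
    """Return (path sent to the server, fragment kept by the browser)."""
    parts = urlsplit(url)
    return parts.path, parts.fragment

path, fragment = server_visible_part(
    "https://en.wiktionary.org/wiki/Special:WhatLinksHere/calmo#Adjective"
)
```

The path stops at "calmo" and "Adjective" ends up only in the fragment, which is why the special page behaves as if it were asked about "calmo" alone, while a form-of template handed the string "calmo#Adjective" sees an invalid page name.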

Javascript to tackle 404-errors

I previously posted it in the beer corner, but I figured the grease pit might be more appropriate: I rewrote my example userscript which, upon hitting a 404 error page, scans other wiktionaries to see if the word exists there and, if so, displays them as interwiki links.
Enable the userscript at User:Stratoprutser/404_native.js and test it out with klompvoet, danim, or genuinely nonexistent words. -- Stratoprutser (talk) 13:34, 11 March 2013 (UTC)
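The idea behind such a script can be sketched in Python. This is not Stratoprutser's userscript (which is JavaScript and is not reproduced here); the function names are hypothetical, and the sketch only builds the standard MediaWiki `action=query` request for one language edition and interprets an already-decoded JSON response, without making any network call.

```python
from urllib.parse import urlencode

def existence_query_url(lang, title):
    """Build a MediaWiki API URL asking one Wiktionary edition about a title."""
    query = urlencode({"action": "query", "titles": title, "format": "json"})
    return f"https://{lang}.wiktionary.org/w/api.php?{query}"

def page_exists(api_response):
    """True if a decoded action=query response contains a page that exists.

    Missing or invalid titles are flagged with a "missing" or "invalid"
    key in the page record; anything without those keys counts as found.
    """
    pages = api_response.get("query", {}).get("pages", {})
    return any(
        "missing" not in page and "invalid" not in page
        for page in pages.values()
    )
```

A script along these lines would loop over a chosen list of language codes, fetch each URL, and show an interwiki link for every edition where `page_exists` returns true.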

No bot owner template?

I miss this template from the English Wikipedia, and it seems hard to introduce here. --Njardarlogar (talk) 17:54, 11 March 2013 (UTC)

{{bot owner}} should be fine. Mglovesfun (talk) 19:47, 11 March 2013 (UTC)
Or “I operate NjardarBot (talk · contribs).” No need for a userbox. :-)   (If you really want a userbox, by the way, then that's a policy matter, not a technical question, and belongs at BP, not here.) —RuakhTALK 02:36, 12 March 2013 (UTC)
We don't need babel boxes; personal fluency levels could be included in the text with more detailed specifications. We don't need user pages either; we could include all that information on Wiktionary:Stasi files.
We should have {{bot owner}} because it standardises and makes more accessible highly relevant user information. --Njardarlogar (talk) 08:43, 12 March 2013 (UTC)

Etymology trees

A lot of proto- language entries duplicate some of the descendants content. For example, if a Proto-Germanic word is descended from a PIE word, the PG descendants are duplicated in the PIE entry. These often get out of sync, and require many edits to synchronize. Some entries (most?) even just don't duplicate them at all, and require the reader to click the link to find out the further descendants. Couldn't this be fixed by putting the entire tree into a standalone wiki page (maybe in the appendix or template space) and having lua scripts run through the things to pull out the relevant parts? Is this feasible? --Yair rand (talk) 21:31, 12 March 2013 (UTC)

That could work, but it could turn rather nasty in itself if we have to deal with sub-descendants and sub-sub... For example, part of the Germanic tree would be duplicated on an Old Dutch entry, and part of its tree would in turn go in a Middle Dutch entry. So while it's a good idea, we should be very clear about when it should be applied and when not. Also, another point to consider is that a single "line" in the PIE descendants might have several words in it, each of which might have a separate entry and a list of descendants of its own; see *bʰerǵʰ- for an example. If we go with your approach, those would have to be split into several lines. —CodeCat 21:40, 12 March 2013 (UTC)
If we're using Lua I assume we would be able to give it instructions as to which parts of the tree to display (for example, in a Dutch entry you may want to not go back all the way to the PIE root). So I don't see duplication as a problem with this approach.
I don't understand your point about multiple "lines" for one entry- can you restate it? DTLHS (talk) 23:41, 12 March 2013 (UTC)
At Appendix:Proto-Indo-European/bʰerǵʰ-, the Germanic line lists two separate Proto-Germanic forms. Both of these forms are derived from the same PIE etymon, but they're separate forms, and have separate descendants. So the descendants of the PIE etymon form a tree in multiple dimensions: not necessarily just one branch per daughter language. —RuakhTALK 06:58, 13 March 2013 (UTC)
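To make the proposal a bit more concrete, here is a minimal sketch of a descendants tree kept in one place, from which any entry pulls only its own subtree, optionally cut off at a given depth. It is written in Python only for illustration (the real thing would presumably be a Lua module), and the language labels and forms are placeholders, not real data. Note that one parent may have several children in the same daughter language, which covers the case of two separate Proto-Germanic forms under one PIE etymon.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    lang: str
    term: str
    children: List["Node"] = field(default_factory=list)


def find(node: Node, lang: str, term: str) -> Optional[Node]:
    """Depth-first search for the node an entry wants to start from."""
    if node.lang == lang and node.term == term:
        return node
    for child in node.children:
        hit = find(child, lang, term)
        if hit is not None:
            return hit
    return None


def render(node: Node, max_depth: Optional[int] = None, depth: int = 0) -> List[str]:
    """Return indented lines for the subtree, stopping below max_depth."""
    lines = ["  " * depth + f"{node.lang}: {node.term}"]
    if max_depth is None or depth < max_depth:
        for child in node.children:
            lines.extend(render(child, max_depth, depth + 1))
    return lines


# Placeholder data: one root with two forms in the same daughter language.
tree = Node("PIE", "root", [
    Node("PG", "form-1", [Node("OE", "a")]),
    Node("PG", "form-2", [Node("ODu", "b", [Node("MDu", "c")])]),
])

print("\n".join(render(find(tree, "PG", "form-2"))))  # one branch only
print("\n".join(render(tree, max_depth=1)))           # root plus direct children
```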

404 errors?

I keep getting this error randomly when I visit pages:

Not Found
The requested URL /w/index.php was not found on this server.
Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.

Is anyone else getting that too? It's very annoying... —CodeCat 00:23, 13 March 2013 (UTC)

Me too. DCDuring TALK 00:51, 13 March 2013 (UTC)
On 'pedia too. Which is a good thing, because that means somebody will actually care and if it's a fixable problem, it'll be fixed soon. —Μετάknowledgediscuss/deeds 01:02, 13 March 2013 (UTC)

Genitive of proper nouns

The template {{genitive of}} puts words into the appropriate "... noun forms" category. Is that appropriate for proper nouns? See Kleinasiens as an example. SemperBlotto (talk) 11:35, 13 March 2013 (UTC)

pos=proper noun. Mglovesfun (talk) 11:42, 13 March 2013 (UTC)
OK - that puts it into both cats (presumably intentionally). SemperBlotto (talk) 11:46, 13 March 2013 (UTC)
I would prefer the forms of proper nouns to be in the normal "noun forms" category. There is already some disagreement on whether proper nouns are as distinct from nouns as we consider them to be, and making that same distinction in forms is a bit overboard. I can't really think of a good reason why someone would want to look up a list of proper noun forms specifically. —CodeCat 14:56, 13 March 2013 (UTC)
Yes, I tend to agree with you. Actually, I have often wondered if our users ever use any of our massive range of categories at all - does anyone have any evidence that they do? I think that their main use is for editors to see what words we have, and especially what similar words may be missing. SemperBlotto (talk) 15:58, 13 March 2013 (UTC)
I usually treat the form-of categories as kind of a "because it has to be in at least one category" thing. So I don't usually make any further subdivisions. —CodeCat 16:01, 13 March 2013 (UTC)

Meetup & videostream tomorrow - focus on Lua

Tomorrow's meetup at Wikimedia Foundation headquarters in San Francisco focuses on how Lua as a templating/scripting language improves our sites, and includes a brief introduction to Lua. It'll also be streamed live on the web, and the video will be posted afterwards. Please feel free to visit or watch! Sumana Harihareswara, Wikimedia Foundation Engineering Community Manager (talk) 15:46, 13 March 2013 (UTC)

conjugation template for German reflexive verbs

Have we got a conjugation table template for German verbs that are reflexive? What about for verbs that are both reflexive and separable, e.g. fremdschämen (which can also be inseparable, and so really needs two tables), which conjugates like "ich schäme mich fremd" (and "ich fremdschäme mich")? - -sche (discuss) 21:56, 13 March 2013 (UTC)

I have always preferred not to have separate entries for reflexive verbs if they are formed using separate words or clitics in a language. That especially applies to languages like Dutch or German where the word order may be vastly different. So different, in fact, that any entries we create for inflected forms will be almost useless. Just consider in how many different ways the reflexive pronoun may be arranged in a few typical German sentences. Add a separable verb into the mix and it becomes even worse. For that reason, I prefer to redirect reflexive verbs to their non-reflexive entries, and add {{reflexive}} to the specific senses. I have already done this for Dutch. —CodeCat 01:03, 14 March 2013 (UTC)
Alright, but that doesn't answer my question. de.Wikt doesn't have entries for e.g. de:sich fremdschämen, de:sich benehmen, etc, but the tables in de:fremdschämen, de:benehmen, etc include "sich". Do we have tables that do likewise yet? If not, I can set about creating some (though I might need help). - -sche (discuss) 02:03, 14 March 2013 (UTC)
What I am saying is that we probably shouldn't have such tables. Consider a verb like irren, which has some reflexive and some non-reflexive senses. Should that entry have two conjugation tables, both containing the exact same conjugated verb forms, but one with the reflexive pronoun and one without? I don't think it should. —CodeCat 02:15, 14 March 2013 (UTC)

A standard location for Lua transliterations

One of the obvious advantages of Lua is the ability to automatically transliterate words into Latin script. It is definitely something we'd want to add to templates like {{l}}, {{term}}, {{head}} and {{t}}. However, for that to work, there has to be a single common scheme for the functions that do the transliteration. The problem is that every language could have its own transliteration scheme, so just putting them all into one module will eventually run into speed issues because that module would eventually become too large. Therefore I propose that we form a single common scheme, an "interface" so to say, that transliteration functions have to adhere to so that they are interoperable with one another. Compare it to the way all of our script templates work the same way and are therefore interchangeable with one another. Is there a way we can do this for transliterations too? —CodeCat 00:58, 14 March 2013 (UTC)

I think invoking them from Module:foo-translit is the most logical location, if that's what you mean by "scheme", but I don't really mind if people would rather have it at Module:foo-common, invoked as tr. I'd like to go on record that transliteration modules should be language-based, not script-based, to reduce the complexity (and sheer size) of individual modules. —Μετάknowledgediscuss/deeds 01:03, 14 March 2013 (UTC)
I know, and that is kind of what I had in mind. However, if we do it for every language, how do we handle cases where there is no transliteration module for a language yet? Is Scribunto capable of handling a failed module import gracefully? —CodeCat 01:05, 14 March 2013 (UTC)
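One way to picture the "common interface plus graceful failure" idea, sketched in Python for illustration only: every transliterator is a function from string to string, registered under its language code, and a lookup for a language with no transliterator simply yields nothing instead of an error. How Scribunto itself reacts to a missing module is not settled here; the registry, the function names and the codes "xx" and "zz" are all hypothetical.

```python
from typing import Callable, Dict, Optional

# The shared interface: take the text in native script, return Latin script.
Transliterator = Callable[[str], str]

_REGISTRY: Dict[str, Transliterator] = {}


def register(lang_code: str, func: Transliterator) -> None:
    """Make a transliterator available under its language code."""
    _REGISTRY[lang_code] = func


def transliterate(lang_code: str, text: str) -> Optional[str]:
    """Return the transliteration, or None if the language has no module yet."""
    func = _REGISTRY.get(lang_code)
    if func is None:
        return None  # callers fall back to showing no transliteration
    return func(text)


# Stand-in transliterator for a made-up language code.
register("xx", str.upper)

print(transliterate("xx", "abc"))  # handled by the registered function
print(transliterate("zz", "abc"))  # no module registered: None, no error
```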
No idea. But first of all, which location do you prefer? I ask because I'm planning on creating a bunch of these soon. —Μετάknowledgediscuss/deeds 01:52, 14 March 2013 (UTC)
I would prefer keeping it separate, in Module:foo-transliteration. But I just thought of something else we could try. As far as I know, transliteration isn't context-dependent: the same letter always becomes the same Latin letter(s) regardless of how it appears in the word. That means we may not even need whole functions to do this; we could just store a list of letter-pairs. And since that would consist of only data, it would be possible to add it to Module:languages (which may not contain any functions). —CodeCat 01:58, 14 March 2013 (UTC)
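The data-only approach could look roughly like the following Python sketch: a table of letter pairs and one generic loop, with anything not in the table passed through unchanged. The four-letter Cyrillic table is a tiny illustrative excerpt, and the names PAIRS and translit_by_table are invented here. As the sketch makes obvious, it cannot express any rule that depends on neighbouring letters.

```python
# Pure data: one source letter maps to one Latin string.
PAIRS = {
    "д": "d",
    "а": "a",
    "т": "t",
    "м": "m",
}


def translit_by_table(text: str, pairs: dict) -> str:
    """Replace each character independently; unknown characters are kept."""
    return "".join(pairs.get(ch, ch) for ch in text)


print(translit_by_table("да", PAIRS))
print(translit_by_table("там!", PAIRS))
```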
That's a really bad idea IMO. For one thing, the premise is wrong (example: Korean) and for another Module:languages is already too big for me to even load it in a reasonable amount of time, last I checked, let alone edit it. —Μετάknowledgediscuss/deeds 02:03, 14 March 2013 (UTC)
The time for you to load it is a lot longer than the time Lua takes to load it. A recent Scribunto update actually added a function specifically for loading such large modules containing data. So the size is really not a problem, at least not if we are to believe the developers. As for the premise... when does it not apply to Korean? I had the impression that Korean was actually rather regular. Can you give an example of a single Korean letter or syllabic that can be transliterated in several different ways? Also, just to make it clear, this idea isn't meant to be able to transliterate every language, it would be hopeless to attempt it for the likes of Han characters. —CodeCat 02:10, 14 March 2013 (UTC)
I know that... but there's still the problem of me wanting to edit it! If it gets too big, it's a real problem for editors. Anyway, my point with Korean is that if you just take the letters ㅇ, ㅗ, and ㄱ, if you combine them in one order you get 공 (gong) but in the opposite order you get 옥 (ok). Can Lua handle that? —Μετάknowledgediscuss/deeds 02:14, 14 March 2013 (UTC)
If our browsers can tell the difference, why can't Lua? From what I can tell in w:Korean language and computers, Hangul is encoded by combining all three individual letters into a single character. I presume that means that from a transliteration perspective, Hangul behaves as a syllabary and "gong" and "ok" are two different characters, each with a single Unicode codepoint, like Chinese characters or Kana are. —CodeCat 02:21, 14 March 2013 (UTC)
Oh and just to clarify, exceptions like Japanese "ha" being pronounced as "wa" can simply be explicitly overridden with a tr= parameter like we have now. Automatic transliteration is simply meant to provide a useful default transliteration, but it should be possible to override it when it's wrong, just like we could override an irregular plural form. —CodeCat 02:25, 14 March 2013 (UTC)
Sometimes Unicode is really weird... does that mean that Module:ko-translit will be gigantic? (Yes, I'd rather foo-translit over foo-transliteration, so I will be using that as the standard now unless you have a good reason not to do so.) I agree on the exceptions, although for languages like Kyrgyz where there don't appear to be any exceptions, overrides aren't necessary. —Μετάknowledgediscuss/deeds 02:29, 14 March 2013 (UTC)
For Korean you need a formula to decompose hangeul characters into individual jamo. I wrote a transliteration tool a while ago in C#. I've got it somewhere at home, and I'm happy to share it if someone wants to write a transliteration tool. I wonder what Google Translate uses to transliterate Mandarin and Japanese (often wrong, especially Japanese!). --Anatoli (обсудить/вклад) 02:39, 14 March 2013 (UTC)
It could become pretty large, yes. Which is kind of unfortunate considering that Hangul itself is so well-structured. Hangul could be easily transliterated if we could piece apart individual code points like Anatoli said, but that would require more than we can put into a simple data table like Module:languages. On the other hand, the module we will presumably create to handle automatic transliteration, Module:transliteration, could simply be hand-coded with an exception specific to Hangul. The function could work like this: if the script is Hangul, then do some fancy code-point processing in Unicode, else use the pair-wise table. —CodeCat 02:53, 14 March 2013 (UTC)
The logic for decomposing is simple; JavaScript can handle this. I will get my code when I have a chance and post the logic somewhere. The complete program had some flaws, as it didn't take into account some consonant changes, which should be reflected as per Revised Romanisation. --Anatoli (обсудить/вклад) 03:07, 14 March 2013 (UTC)
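For reference, the decomposition being discussed can be sketched in a few lines. Precomposed Hangul syllables start at U+AC00 and are laid out as (lead × 21 + vowel) × 28 + tail, so integer division recovers the three jamo indices. The sketch below is Python, not the C# tool mentioned above, and the romanisation tables are a simplified assumption: each final is given one representative sound and no consonant changes across syllables are applied, so it falls short of full Revised Romanization.

```python
HANGUL_BASE = 0xAC00  # first precomposed syllable
HANGUL_LAST = 0xD7A3  # last precomposed syllable

LEADS = ["g", "kk", "n", "d", "tt", "r", "m", "b", "pp", "s",
         "ss", "", "j", "jj", "ch", "k", "t", "p", "h"]
VOWELS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
          "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"]
# Index 0 means "no final consonant".
TAILS = ["", "k", "k", "k", "n", "n", "n", "t", "l", "k", "m", "l", "l", "l",
         "p", "l", "m", "p", "p", "t", "t", "ng", "t", "t", "k", "t", "p", "t"]


def decompose(syllable: str) -> tuple:
    """Return (lead, vowel, tail) indices for one precomposed syllable."""
    index = ord(syllable) - HANGUL_BASE
    lead, rest = divmod(index, 21 * 28)
    vowel, tail = divmod(rest, 28)
    return lead, vowel, tail


def romanize(text: str) -> str:
    """Romanise syllable by syllable; other characters pass through."""
    out = []
    for ch in text:
        if HANGUL_BASE <= ord(ch) <= HANGUL_LAST:
            lead, vowel, tail = decompose(ch)
            out.append(LEADS[lead] + VOWELS[vowel] + TAILS[tail])
        else:
            out.append(ch)
    return "".join(out)


print(romanize("공"), romanize("옥"))
```

The same three letters in a different order produce different code points, so 공 and 옥 come out differently without any special handling.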
I don't understand exactly what's going on, but it sounds interesting.
For languages without manageable automatic transliteration this module should be skipped but it would be useful if people could add missing sounds or correct them, e.g. if Arabic "ظهر" were automatically transliterated as "ẓhr", an editor would edit to make it "ẓuhr" (insert the unwritten vowel). --Anatoli (обсудить/вклад) 02:05, 14 March 2013 (UTC)
Well, for languages like Arabic, automatic transliteration wouldn't be terribly helpful if we use the basic page name for it. But I think that transliterating the fully vowel-marked version of the word could work? We already add vowels to the head= parameter, so the template/module could be written to use this instead of the page name. —CodeCat 02:10, 14 March 2013 (UTC)
I meant whether Lua could be used when editing or adding translations (e.g. in preview), not in finished entries. Fully vowelled Arabic (not sure about Hebrew) could be transliterated, but I'm not sure whether this could be made perfect (without errors); perhaps it can, if strict spelling rules are followed (e.g. hamza is written when it's appropriate, and ى and ه are not used instead of ي and ة). --Anatoli (обсудить/вклад) 02:27, 14 March 2013 (UTC)
Hebrew is impossible because WT:HE TR requires marking vowel stress, which Hebrew doesn't do. Yes, Arabic would require strict spelling rules to be followed which translations currently do not (but I think entries usually do). —Μετάknowledgediscuss/deeds 02:42, 14 March 2013 (UTC)
I suggest using it only if a transliteration is missing; a specific transliteration should override Lua. We have SO many translations and entries with no translit. Lua transliteration may carry a warning advising people that it can be incorrect (for selected languages?). Also note my reply re: Korean above. I can spend some time with whoever works on the Korean transliteration. --Anatoli (обсудить/вклад) 02:49, 14 March 2013 (UTC)
Well, at least for Greek and Cyrillic, and generally any fully alphabetic script, the transliteration could be made flawless (but it would not include stress marks). I don't see lack of stress marks as a reason to avoid automatic transliteration altogether. A transliteration without them may not be complete, but it won't be wrong either, so it may be usable for Hebrew too. Devanagari and the other Indic scripts are encoded as alphabets in Unicode (the consonants and vowels are separate), but they need special treatment because the transliteration of the consonants depends on whether a vowel character follows ("devanāgarī" is actually encoded as "d-e-v-n-ā-g-r-ī"), so a simple pair-table would need to be supplemented by a function that suppresses the inherent vowel of a consonant when necessary. Such a function could, however, probably work for all indic scripts as long as we tell it which letters in a given script are consonants and which are vowels. —CodeCat 02:53, 14 March 2013 (UTC)
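The inherent-vowel rule described here might be sketched as follows, in Python for illustration. A consonant gets an "a" appended unless a vowel sign or a virama follows it. The tables cover only the letters of "devanāgarī" and are not a complete scheme; the function name is invented. This simple version also keeps the inherent vowel at the end of a word, which suits Sanskrit-style transliteration but would need another rule for Hindi.

```python
# Excerpt only: just the letters needed for the example word.
CONSONANTS = {
    "\u0926": "d",  # द
    "\u0935": "v",  # व
    "\u0928": "n",  # न
    "\u0917": "g",  # ग
    "\u0930": "r",  # र
}
VOWEL_SIGNS = {
    "\u093E": "ā",  # ा
    "\u093F": "i",  # ि
    "\u0940": "ī",  # ी
    "\u0947": "e",  # े
}
VIRAMA = "\u094D"  # explicitly suppresses the inherent vowel


def translit_indic(word: str) -> str:
    out = []
    for i, ch in enumerate(word):
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = word[i + 1] if i + 1 < len(word) else ""
            # Inherent "a" unless a vowel sign or virama follows.
            if nxt not in VOWEL_SIGNS and nxt != VIRAMA:
                out.append("a")
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
        elif ch == VIRAMA:
            continue  # nothing to write; the "a" was already withheld
        else:
            out.append(ch)
    return "".join(out)


word = "\u0926\u0947\u0935\u0928\u093E\u0917\u0930\u0940"  # देवनागरी
print(translit_indic(word))
```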
Actually, Cyrillic and Greek can be flawless even with stress marks :) See Module:ru-translit. —Μετάknowledgediscuss/deeds 02:57, 14 March 2013 (UTC)
That is a lot of code. Somehow I think that it could be a lot simpler, but I don't really know what a lot of it does (specifically, what purpose does it serve, why is it there?). How much of it is actually specific to Cyrillic? —CodeCat 03:00, 14 March 2013 (UTC)
The scheme used for Russian is a mix of transliteration with arbitrary exceptions where parts of words are phonemically transcribed. Not a good exemplar. Michael Z. 2013-03-14 04:41 z
The developer, also a Russian, did what he felt was right for the Russian language and what is our policy. It has described exceptions. E.g. adjective endings -ого/-его are transliterated as -ovo/-(j)evo, not -ogo/-(j)ego, it's standard. The code needs to cater for these where possible. --Anatoli (обсудить/вклад) 04:49, 14 March 2013 (UTC)
Yes, but it is quite different from any other transliteration scheme, and is far from a typical example or prototype for any transliteration code. Michael Z. 2013-03-14 05:34 z
Not so sure about "any other". The Japanese particles は and へ, for example, are transliterated phonemically as "wa" and "e", not as their usual hiragana readings "ha" and "he". Catering for these exceptions may be a big hurdle in some cases. The cases you yourself mentioned below (transliterating letters depending on their position) and the ones CodeCat mentioned for Indic languages will make automatic transliteration harder and will require more sophisticated code. Russian may turn out to be an easy example. --Anatoli (обсудить/вклад) 05:44, 14 March 2013 (UTC)
Yes, romanizing logographic and syllabic scripts is more complicated than for the Cyrillic alphabet. Usually. Michael Z. 2013-03-14 05:58 z
I don't think automated transliteration should cater to such exceptions. If the default is wrong, it should be overridden just like we do with any other template that generates a default form (such as {{en-noun}}'s plural). See my comments further down. —CodeCat 14:09, 14 March 2013 (UTC)
(In response to Metaknowledge 02:42, 14 March 2013 (UTC).) In Hebrew, not only stress is the problem. חָכְמָה, for example, can be chochmá or chach'má (two different words). Basically, even if we wouldn't mark stress, any word with U+05B0 HEBREW POINT SHEVA and many with U+05B8 HEBREW POINT QAMATS would be ambiguous — and that's a large proportion of all Hebrew words. (That's just re general problems of automating Hebrew transliteration. I haven't been following this discussion at all, and don't understand, e.g., its first post.)​—msh210 (talk) 06:37, 14 March 2013 (UTC)
For Russian there is currently no single entry missing transliteration, translations miss it sometimes, which I have been fixing. Still see some use for Russian in the future. Agreed about Indic. Uyghur is fully vowelised, even if it's Arabic abjad based. Armenian, Georgian seem easy. Thai, Khmer, Lao, Burmese would be complex but possible. Khmer might be the easiest, check with Stephen G. Brown.--Anatoli (обсудить/вклад) 03:07, 14 March 2013 (UTC)

If romanization is automated, then there could be multiple schemes per language. Perhaps the displayed scheme for a language could be a user pref, or an entry could show several commonly-used romanizations. It would be reasonable to add a BGN/PCGN transliteration for a geographic name, for example, as that’s what would be seen on many maps. In the future we could decide to include other transliterations, e.g., Cyrillizations of Latin or Chinese script. The framework should leave room for this. We might also use LOC transliteration for foreign-language titles in citations, as this is used in library catalogues and bibliographies.

Romanizations aren’t necessarily straight table lookups. Some important ones include exceptions for occurrences at the beginning or end of a word, or after a vowel or consonant. But we could start by implementing ones that are straight lookups.
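A position-dependent rule of this kind needs only a little more than a table lookup. The Python sketch below applies one such rule for Cyrillic "е": "je" at the start of a word or after a vowel, plain "e" elsewhere. The letter table is a small excerpt, stress is not marked, and the names are invented for the example.

```python
CYRILLIC_VOWELS = set("аеёиоуыэюя")

# Excerpt of a plain one-to-one table; "е" is handled by the rule below.
BASE = {"п": "p", "о": "o", "з": "z", "д": "d", "н": "n", "б": "b"}


def translit_positional(word: str) -> str:
    out = []
    prev = ""  # empty string marks the start of the word
    for ch in word.lower():
        if ch == "е":
            # Word-initial or after a vowel: "je"; after a consonant: "e".
            out.append("je" if prev == "" or prev in CYRILLIC_VOWELS else "e")
        else:
            out.append(BASE.get(ch, ch))
        prev = ch
    return "".join(out)


print(translit_positional("поезд"))
print(translit_positional("небо"))
```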

ISO language tags have a standard representation for transformed text, although the tags can get lengthy. This might be useful for cataloguing our schemes. Michael Z. 2013-03-14 04:43 z

While that would be nice, I think it kind of misses the point. The point of automated transliteration, to me, is that it can provide a sensible default where it has not yet been provided. So, for example, if someone types {{head|ru|noun}} on an entry without a transliteration, automated transliteration could make one itself. But it would still be necessary and desirable to check it and to override it if necessary, so it's not meant as a substitute for manual transliterations. —CodeCat 14:04, 14 March 2013 (UTC)
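The "sensible default with manual override" behaviour is small enough to sketch directly. In this Python illustration an explicit tr value always wins and the automatic result is used only when none is given; the function names and the one-entry table are hypothetical, and the Japanese particle is the exception mentioned earlier in this discussion.

```python
from typing import Callable, Optional


def shown_transliteration(
    text: str,
    auto: Callable[[str], Optional[str]],
    tr: Optional[str] = None,
) -> Optional[str]:
    """Prefer the editor-supplied tr; otherwise fall back to the automatic one."""
    if tr:
        return tr
    return auto(text)


def toy_auto(text: str) -> Optional[str]:
    """Stand-in for an automatic transliterator with a one-entry table."""
    table = {"は": "ha"}
    return "".join(table.get(ch, ch) for ch in text)


print(shown_transliteration("は", toy_auto))           # automatic default
print(shown_transliteration("は", toy_auto, tr="wa"))  # manual override wins
```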
I disagree. Your position makes sense for some languages, like Russian. But we've had a real problem with some smaller languages, like Telugu, where the contributors use various transliteration schemes, ranging from ASCII to ad hoc, and often neglect transliteration altogether. The only person cleaning that mess up has been Stephen G. Brown. An automated transliteration system for Telugu will be more reliable than what users give as a transliteration value, and thus it definitely would be a substitute for manual transliteration. —Μετάknowledgediscuss/deeds 22:51, 14 March 2013 (UTC)
I agree with CodeCat on this one. Humans know or should know better. We should have the ability, perhaps, to give both manual and automated transliterations; then someone with knowledge of the standards could fix a non-standard transliteration. So, to give a Russian example, no point in having automated transliteration of "что" as "čto" (incorrect); I'd prefer a manual "shto" (non-standard, but correctly showing the non-standard reading), because then I know that I have to put "što" to make it standard. I stumbled across similar problems with Bengali and Thai. There are many exceptions in readings in various languages. I already mentioned the Japanese particles は and へ. --Anatoli (обсудить/вклад) 23:24, 14 March 2013 (UTC)
But that only makes sense if exceptions exist; if we don't know, then we should ask our local experts like Stephen or just bring the issue up at language forums. In many languages, like Greek, our system specifically says that it is trying to reproduce the orthographical conventions rather than the sounds, so there will never be exceptions.—Μετάknowledgediscuss/deeds 23:31, 14 March 2013 (UTC)
We can have exception-free languages, e.g. most Cyrillic-based ones, except for Russian. I don't know Telugu, but in Hindi (as in Arabic) there is strict and relaxed spelling: ग़रीब (ġarīb) can be spelled (casually, but all too commonly) "गरीब", without a "nuqta", the dot under the letter: ग़ (ġa) -> ग (ga). It's still "ġarīb", not "garīb". Manual transliteration should provide the correct pronunciation, as the nuqta is often ignored in Hindi (8 Devanagari letters can use it).
I personally disagree with WT:EL TR, they totally ignore the way foreign words with "b" and "d" are transliterated "μπ" /b/, "ντ" /d/, etc. --Anatoli (обсудить/вклад) 00:19, 15 March 2013 (UTC)
no point in having automated transliteration of "что" as "čto" (incorrect)
Anatoli, there’s exactly a point of having a transliteration of the word spelled č-t-o be čto. Otherwise, it is not a transliteration. Why are you putting pronunciation into the place for transliteration? Michael Z. 2013-03-15 01:05 z
I agree with Michael on that point. Consider what would happen if we started "transliterating" English the same way. We'd end up with something resembling enPR wouldn't we? —CodeCat 01:15, 15 March 2013 (UTC)
That's what Lua will produce: "čto". I want to override it with "što" because "čto" is misleading and doesn't help anybody; foreigners and even some uneducated Russians still read out "что" as "čto" when it should be "što". It is a practice accepted and used over the years by editors working with Russian. This exception is not predictable like akanye, and knowledge of Russian phonology and sound changes doesn't help to arrive at the correct pronunciation of the word, so it has to be specifically explained. IPA is not sufficient: many people dislike or don't understand it, and IPA is not used in translations. --Anatoli (обсудить/вклад) 01:21, 15 March 2013 (UTC)
I think you need to consider what transliterations are for. The purpose is to allow someone to read the word when they don't know the script. It's not meant to tell them how to pronounce the word, that's what the pronunciation section is for. Moreover, if someone is able to read Cyrillic, they shouldn't need the transliteration, should they? So if you think about it, someone who doesn't need the transliteration will end up reading the Cyrillic letters что (čto) while someone who does need it will find što instead. That is just inconsistent, and it's rather strange that the transliteration (which is meant as a reading aid) gives different information. I think the transliteration should only have information in it that can also be deduced from the original script (in combination with the regular phonology/orthography of the language). If you really want to show that что is to be read as što, then that should be written in addition to the regular transliteration čto, not replacing it. —CodeCat 01:32, 15 March 2013 (UTC)
I think the present situation is fine, but if you want to have this conversation, move it to the BP. —Μετάknowledgediscuss/deeds 01:38, 15 March 2013 (UTC)
MK, what is relevant here is that transliteration schemes that meet the criteria for transliteration schemes also tend to be suitable for mechanical transliteration (whether it be by machine or by a non-native reader). Substituting a complex, proprietary, ambiguously-defined, phonemic transcription system is a loss for readers, editors, and openness, as well as for automated transliteration. We should define some baseline standards for transliteration. Michael Z. 2013-03-15 14:42 z
Anatoli, one objective of having both transliteration and pronunciation is exactly to show when the two differ. By not transliterating the word (which requires a respect for its letters), you are obscuring that very information, potentially contributing to the problem you describe. If accessibility of the pronunciation is lacking, then improve the pronunciation, as you have done in some entries, instead of destroying the transliteration. Michael Z. 2013-03-15 14:42 z
Think about what would happen if someone who just started learning Cyrillic comes across что. They have just learned that ч is č or ch or some variety. Yet here they suddenly see what they think is a "wrong" transliteration, so they will correct it. That's what I'd probably do too if I found this. —CodeCat 15:05, 15 March 2013 (UTC)
Relevant: #Criteria for romanization systems Michael Z. 2013-03-15 20:41 z
People who start learning Japanese and learn hiragana see the phrase これはなんですか. If they only know hiragana, they will read it as "kore ha nan desu ka?". An automatic transliterator would also romanise it so. You need a person who knows Japanese to correct it and say it's "kore wa nan desu ka?". This is how Japanese is transliterated. There's no difference with the Russian "что это?", which is "što éto?", not "čto éto?". It's not just a matter of understanding the writing system. Transliterating letter by letter ("čto eto") is just unhelpful in this case. Foreign users can ask if they think it's "wrong"; native users understand exactly why it's transliterated that way. People who know Cyrillic but don't know the exception will misread the word. I'm OK with Lua transliterating the default way (taking into account some basic rules of change, like поезд (pójezd) but небо (nébo), "е" = je/e; берёза (berjóza) but жёлтый (žóltyj), "ё" = jo/o), but it's up to editors with the knowledge to override the default and correct it.
If anyone wants to check the complicated rules of the standard Korean transliteration, read w:Revised Romanization of Korean; there are many consonant changes, like ㅂ + ㄴ (b + n = mn) and ㄹ + ㄴ (l + n = ll, nn). @Michael, please just stop talking about "destruction" of the transliteration. --Anatoli (обсудить/вклад) 10:42, 16 March 2013 (UTC)
Sorry, Anatoli, but you’re still, apparently willfully, missing the point of transliterating Russian Cyrillic when you talk about “correcting” it into something that fails most objective criteria for transliteration. Russian is not Japanese or Korean. Also, a small number of editors are lording it over your own little empire and disregarding the needs of the readers when you create a proprietary “system” full of vagaries, refuse to clarify them, and insist that everyone else doesn’t understand it well enough to have a valid opinion about it. Michael Z. 2013-03-23 17:39 z
  • An automated transliteration of Burmese would generate the ALA-LC system very easily, and could probably be made to generate the MLCTS as well, but four years ago Stephen and I reached the compromise that Burmese entries would show four romanization systems (two that are orthography-faithful transliterations and two that are pronunciation-faithful transcriptions), while Burmese words mentioned on other pages (e.g. in Etymology and Translations sections) would just use the pronunciation-faithful BGN/PCGN transcription. —Angr 11:02, 16 March 2013 (UTC)
For multiple transliteration methods, we could use multiple modules named in the form:
  • {{ my-translit | မြန်မာဘာသာ }} – default romanization, e.g. BGN
  • {{ my-alaloc-translit | မြန်မာဘာသာ }} – other romanization, e.g. ALA-LC
Or a single module with a transliteration, method, or scheme argument for the non-default methods:
  • {{ my-translit | မြန်မာဘာသာ }}
  • {{ my-translit | မြန်မာဘာသာ | method=alaloc }}
Standard tags for ISO t extension are
  • alaloc – American Library Association-Library of Congress
  • bgn – US Board on Geographic Names
  • buckwalt – Buckwalter Arabic transliteration system
  • din – Deutsches Institut für Normung
  • gost – Euro-Asian Council for Standardization, Metrology and Certification
  • iso – International Organization for Standardization
  • mcst – Korean Ministry of Culture, Sports and Tourism
  • satts – Standard Arabic Technical Transliteration System
  • ungegn – United Nations Group of Experts on Geographical Names
Specific versions are typically tagged like ungegn-2012. Non-standard methods would be tagged with an “x” private-use code, e.g., x-wikt. Michael Z. 2013-03-17 20:54 z
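The second option above, a single entry point with a method argument, could be sketched like this in Python. The scheme labels reuse two of the tags from the list; the tables are three-letter excerpts chosen only because the two schemes differ on "ч", and the function and variable names are invented. An unknown method raises an error rather than silently falling back, which is one possible design choice among several.

```python
# Excerpts only: enough letters to show two schemes disagreeing.
SCHEMES = {
    "iso": {"ч": "č", "т": "t", "о": "o"},
    "alaloc": {"ч": "ch", "т": "t", "о": "o"},
}
DEFAULT_METHOD = "iso"


def translit(text: str, method: str = DEFAULT_METHOD) -> str:
    """Transliterate with the named scheme; reject schemes we do not know."""
    try:
        table = SCHEMES[method]
    except KeyError:
        raise ValueError(f"unknown transliteration method: {method!r}") from None
    return "".join(table.get(ch, ch) for ch in text)


print(translit("что"))                   # default scheme
print(translit("что", method="alaloc"))  # explicitly requested scheme
```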
But the problem is that only two of the systems in use here are predictable from the spelling—the other two (including the one used outside the Burmese pages themselves) are not (always) predictable from the spelling. Though I suppose the BGN/PCGN transcription is predictable often enough that it will be OK as long as it's possible to manually override the automatic transcription, e.g. via {{my-translit|မြန်မာစကား |tr=myanmazăga:}} to prevent the template from automatically generating myanmasăka:. —Angr 10:20, 19 March 2013 (UTC)
Does that correspond to note 1 on page 3 of this standard, or lines 1 and 4 of the first table in this one? It looks like it might be predictable, but requiring some more-complicated programming. A manual override like you describe sounds like a good compromise, until and if that programming can be added. Michael Z. 2013-03-22 19:04 z

dump grep request: Hebrew section SGML comments

Can someone please generate a list of pages, each of which has a ==Hebrew== section containing <!?​—msh210 (talk) 06:52, 14 March 2013 (UTC)

את . שוקולד . אגרוף . אגרוף תאילנדי . אינדונזי . אנגלית . מים . טוב . אלוהים . בן . דרום . היה . גדל . מצא . פן . עם . ילד . עור . אהרן . ויקרא . אחת . לבן . תוכי . כי . ז־כ־ר . כדור . אח . מת . אי . שם . מספר . נזהר . ישן . כוס . תת . ציבור . זה . יותר . ירדן . טבע . איזה . צילם . מתמטיקה . הפעיל . האיר . אָ . הבא . פקד . בער . ־ון . בא . קשת . קורס . נשא . שלח . עאכ״ו . פחות . באשר . שימוש . כפה . אלהים . צפון . ־ים . הבין . סדין . נפלא . מאפיה . התקלח . פסח . י־ל־ד . מעות חטים . ־ה . זיין . קעקע . גת . יום טוב . הזיע . י״ט . הארץ . הטהר . הצטנן . השתדל . השתכנע . הסתעף . השתמר . השתכר . התלכלך . התקמט . התחבא . התלבש . התקשר . השתתף . התנכר . תיכנת . ארצה . רעש . הביא . מלח לימון . חומצת לימון . חשמן . חרש . כרית . מחמד . ארוחת עשר . בנים . נ־כ־ר . ילדים . לבד . גילה . מרדכי . גלעד . פרו . ישרצו . כהה . מלכי־צדק . זכור . תמלא . יאמר . עמו . הבה . נתחכמה . ירבה . בנו . ישימו . משוגע . תיראן . מצה . תחיין . יראו . תחיון . לכי . תכה . שמך . רעך . להרגני . נודע . אסרה . אראה . כדי . להעלתו . העלה . בני ישראל . ואמרו . אלי . מכרה . ושמעו . נלכה . שלשת . ושלחתי . הכה . עיני . תשליך . ידו . והיה . ולקחת . פיך . והוריתיך . אוצר . שוק שחור . אשובה . מערכת הפעלה . מעבר לים . אינדונזיה . מג״ב . כוח . הקב״ה . האט . חומוס . קטון . ניגש . ויאמן . וישמעו . עניםRuakhTALK 03:37, 15 March 2013 (UTC)
Many thanks.​—msh210 (talk) 15:37, 15 March 2013 (UTC)

Edittools?

Anyone else having trouble with Edittools in Chrome? Using Chrome on Win 7. Edittools were working fine this morning, but I get back from lunch and they completely fail to load, not even the default ones... -- Eiríkr Útlendi │ Tala við mig 20:23, 14 March 2013 (UTC)

Downloadtools

Is there any kind of API, or database, where I can download some ogg files from wiktionary???

Best regards --77.47.30.210 21:26, 14 March 2013 (UTC)

A function to convert Korean hangeul to Roman letters (basic) in C#

This is the code I promised to share for converting Korean hangeul to Roman letters. The code breaks up hangeul blocks into jamo components, e.g. 한 (han) = ㅎ (h), ㅏ (a), and ㄴ (n).

I can give the full code in C# as well for the graphical program (includes Cyrillisation of Korean). You just need a C# compiler (csc.exe).

The code also handles ㄹ (l/r) but doesn't cover all cases.

                // Assumes the enclosing class declares two bool fields that are not shown
                // here: useSyllableDelimiter and keepOriginal.
                private string romanize(string stringToConvert)
                {
                        const int firstSyllable = 44032; // U+AC00, first precomposed hangeul block
                        const int lastSyllable = 55203;  // U+D7A3, last precomposed hangeul block

                        string [] rLeads = {"g", "gg", "n", "d", "dd", "r", "m", "b", "bb", "s", "ss", "", "j", "jj", "ch", "k", "t", "p", "h"};
                        string [] rVowels = {"a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "oa", "oae", "oi", "yo", "u", "ueo", "ue", "ui", "yu", "eu", "eui", "i"};
                        string [] rTails = {"g", "gg", "gs", "n", "nj", "nh", "d", "l", "lg", "lm", "lb", "ls", "lt", "lp", "lh", "m", "b", "bs", "s", "ss", "ng", "j", "c", "k", "t", "p", "h"};
                        string result = "";
                        bool wasVowel = false;      // did the previous syllable end in a vowel? (not reset by non-Korean characters)
                        bool prevWasKorean = false; // was the previous character a hangeul block?

                        foreach (char currentChar in stringToConvert)
                        {
                                if ((currentChar >= firstSyllable) && (currentChar <= lastSyllable))
                                {
                                        // block = firstSyllable + (lead * 21 + vowel) * 28 + tail, and 21 * 28 = 588
                                        int offset = currentChar - firstSyllable;
                                        int tailIndex = offset % 28; // 0 means the block has no final consonant

                                        string l = rLeads[offset / 588];
                                        //convert R to L if after a consonant
                                        if ((l == "r") && (!wasVowel))
                                                l = "l";
                                        string v = rVowels[(offset % 588) / 28];
                                        string t = (tailIndex == 0) ? "" : rTails[tailIndex - 1];

                                        // no rTails entry ends in a vowel letter and every rVowels entry does,
                                        // so the syllable ends in a vowel exactly when it has no final
                                        wasVowel = (tailIndex == 0);

                                        // put the dash between adjacent Korean syllables only, so that no
                                        // dash has to be trimmed afterwards and literal dashes are kept
                                        if (useSyllableDelimiter && prevWasKorean)
                                                result = result + "-";
                                        result = result + l + v + t;
                                        prevWasKorean = true;
                                }
                                else
                                {
                                        result = result + currentChar;
                                        prevWasKorean = false;
                                }
                        }

                        if (keepOriginal)
                                return stringToConvert + "\n" + result;
                        else
                                return result;
                }

Hopefully someone gets interested in making a transliteration tool for Korean. The above code is basic; it converts almost the way Google Translate does, except that Google's finals are "k", "p" and "t", which is more standard, rather than "g", "b" and "d". It doesn't take into account the changes required by Revised Romanisation (the current standard in South Korea), but if you're able to start, then I'll help to get the rules, which are not too complex. --Anatoli (обсудить/вклад) 04:35, 15 March 2013 (UTC)
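The index arithmetic in the function above can be checked in isolation. What follows is a minimal Python sketch, not the posted C# and not the Lua module: the names `decompose` and `romanize_basic` are made up for illustration, the tables are copied from the C# arrays, and the r/l adjustment is deliberately left out.

```python
# Tables copied from the C# post; names below are illustrative only.
LEADS = ["g", "gg", "n", "d", "dd", "r", "m", "b", "bb", "s", "ss", "",
         "j", "jj", "ch", "k", "t", "p", "h"]
VOWELS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "oa", "oae",
          "oi", "yo", "u", "ueo", "ue", "ui", "yu", "eu", "eui", "i"]
# Index 0 stands for "no final consonant", so no -1 offset is needed.
TAILS = ["", "g", "gg", "gs", "n", "nj", "nh", "d", "l", "lg", "lm", "lb",
         "ls", "lt", "lp", "lh", "m", "b", "bs", "s", "ss", "ng", "j", "c",
         "k", "t", "p", "h"]

FIRST, LAST = 0xAC00, 0xD7A3  # 44032 and 55203, the precomposed syllable range


def decompose(block):
    """Return (lead, vowel, tail) indices for one precomposed hangeul block."""
    offset = ord(block) - FIRST
    if not 0 <= offset <= LAST - FIRST:
        raise ValueError("not a precomposed hangeul block: %r" % block)
    lead, rest = divmod(offset, 588)   # 588 = 21 vowels * 28 tail slots
    vowel, tail = divmod(rest, 28)
    return lead, vowel, tail


def romanize_basic(text):
    """Table lookup only; the r/l rule of the C# code is not applied."""
    out = []
    for ch in text:
        if FIRST <= ord(ch) <= LAST:
            lead, vowel, tail = decompose(ch)
            out.append(LEADS[lead] + VOWELS[vowel] + TAILS[tail])
        else:
            out.append(ch)  # pass anything else through unchanged
    return "".join(out)


print(decompose("한"))           # (18, 0, 4)
print(romanize_basic("한국어"))  # hangugeo
```

Giving the tail table an empty entry at index 0 removes the need for the exception handling that the "no final" case required in the original.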

Example conversion of a text from Korean Wikipedia:
Source:
한국어(韓國語)는 주로 한반도에서 쓰이는 언어로, 대한민국에서는 한국어, 한국말이라고 부른다. 조선민주주의인민공화국에서는 조선어(朝鮮語), 중국(조선족 위주)에서도 조선어(朝鮮語)로 불린다. 카자흐스탄 등 구 소련의 고려인들 사이에서는 고려말(高麗말)로 불린다.
19세기 이후 한반도와 주변 국가의 정치 사회상 변화에 따라 중국(특히 옌볜 조선족 자치주), 일본, 러시아(특히 연해주와 사할린), 우즈베키스탄, 카자흐스탄, 미국, 캐나다 등에 한민족(韓民族)이 이주하면서 이들 지역에서도 한국어가 쓰이고 있다. 한국어 사용 인구는 전 세계를 통틀어 약 8천200만 명으로 추산된다.[1] 일제 강점기에는 일본 제국의 문화 말살 정책으로 상당한 핍박을 받았다.
Converted text (needs tweaking, I know):

hangugeo(韓國語)neun juro hanbandoeseo sseuineun eoneoro, daehanmingugeseoneun hangugeo, hangugmalirago bureunda. joseonminjujueuiinmingonghoagugeseoneun joseoneo(朝鮮語), junggug(joseonjog uiju)eseodo joseoneo(朝鮮語)ro bullinda. kajaheuseutan deung gu soryeoneui goryeoindeul saieseoneun goryeomal(高麗mal)lo bullinda.

19segi ihu hanbandooa jubyeon guggaeui jeongchi sahoisang byeonhoae ddara junggug(teughi yenbyen joseonjog jachiju), ilbon, leosia(teughi yeonhaejuoa sahallin), ujeubekiseutan, kajaheuseutan, migug, kaenada deunge hanminjog(韓民族)i ijuhamyeonseo ideul jiyeogeseodo hangugeoga sseuigo issda. hangugeo sayong inguneun jeon segyereul tongteuleo yag 8cheon200man myeongeuro chusandoinda.[1] ilje gangjeomgieneun ilbon jegugeui munhoa malsal jeongchaegeuro sangdanghan pibbageul badassda.

--Anatoli (обсудить/вклад) 04:46, 15 March 2013 (UTC)
I have Luacized that function, cleaned it up slightly (IMHO; YMMV), and put it at Module:ko-utilities. —RuakhTALK 03:11, 17 March 2013 (UTC)
@Anatoli: Which cases doesn't it cover?
@Ruakh: I'm going to put that at Module:ko-translit with the function being named rv (to match Korean template parameters). Just thought I'd let you know; if there's a problem with me doing that you can move it back. —Μετάknowledgediscuss/deeds 03:29, 17 March 2013 (UTC)
Can you give it a longer name? "rv" doesn't really mean much. —CodeCat 03:31, 17 March 2013 (UTC)
Decided not to change the function's name for now. The reason for rv is that there are multiple transliteration systems for Korean. Wiktionary primarily uses Revised Romanization, but entries often use {{ko-pron}} to show three more methods, one of which cannot be reliably deduced from the hangeul alone (nor can the IPA, for that matter). We should Luacize all possible methods used on Wiktionary. —Μετάknowledgediscuss/deeds 03:36, 17 March 2013 (UTC)
I was actually hoping for something like "revised_romanization" or maybe shorter "revised_rom" if you want. —CodeCat 03:40, 17 March 2013 (UTC)
Ruakh, thanks for the efforts, but do you have a working version so far? (The current module was renamed to Module:ko-translit, which requires Module:ko-hangul. I tried to call it but it didn't work.) Not sure if you're in the middle of development.
@Metaknowledge, before we can start tweaking the details, we need to get the basic functionality to work. --Anatoli (обсудить/вклад) 11:13, 17 March 2013 (UTC)
It works just fine, you just don't understand how to use Scribunto modules. Please read Wiktionary:Scribunto. —RuakhTALK 16:13, 17 March 2013 (UTC)

Lua loops?

Does anyone know what happens if you put a never-ending loop into a Lua module? Does it stop the entire wiki? SemperBlotto (talk) 18:23, 16 March 2013 (UTC)

It wouldn't stop everything as far as I know, there is a time limit. Why not try it? —CodeCat 18:30, 16 March 2013 (UTC)
I somehow doubt that the servers would give exclusive access to one process from one instance of one page, let alone have no time limit on it. If they did, the system programmers should be fired as grossly incompetent. The worst that might happen would be that the page would freeze up for the person viewing the page. Chuck Entz (talk) 19:15, 16 March 2013 (UTC)

Interlanguage links

I have a question unrelated to Wiktionary and hope someone can point me in the right direction.

For a small wiki I sometimes contribute to, I want to introduce other-language versions. The wiki is small, though, so we don't want the overhead of multiple wikis. I'm trying to come up with a solution for the wikimaster, but I don't understand the configuration aspects very much.

My idea is to have the language links at the left link to a subdirectory. For example, if you are on "thisPage.html" and click "Spanish" in the language list at left, it would go to my.wiki.org/es/questaPagina.html.

I've found articles like mw:Manual:$wgInterwikiMagic, but nothing that addresses something exactly like this. Any suggestions welcome.

--BB12 (talk) 20:44, 16 March 2013 (UTC)

I don't get it, anyone? Mglovesfun (talk) 21:12, 16 March 2013 (UTC)
I'm happy to explain it differently. What don't you get? --BB12 (talk) 21:13, 16 March 2013 (UTC)
How about this: What's the easiest way to have a multilingual wiki in a case where the URL is wiki.myweb.org (so I can't have es.myweb.org, etc.)? --BB12 (talk) 22:16, 16 March 2013 (UTC)
(e/c) On this wiki, a page with the absolute URL http://en.wiktionary.org/wiki/this might contain the interwiki link [[fr:this]], which is a link to http://fr.wiktionary.org/wiki/this. If I understand correctly, BB wants it to be a link like http://en.wiktionary.org/wiki/fr/this instead (but on his wiki, not on Wiktionary). - -sche (discuss) 22:18, 16 March 2013 (UTC)
Yes, that seems, to me, to be the easiest way to make a wiki multilingual. I would think this is a really simple tweak in the settings, but I haven't gotten anywhere with the wikimaster, so I was wondering if someone here could point me where to go or suggest what should be done. --BB12 (talk) 00:49, 17 March 2013 (UTC)
I think that such a thing could be done by creative use of the interwiki-map (e.g., mapping es to //my.wiki.org/es/$1.html), but it seems messy and potentially fragile. For example, I could easily imagine getting everything working so that [[thisPage]] links just fine to its Spanish counterpart, but then having no way for that Spanish counterpart to link back to the English.
Instead, I'd suggest that you do something similar to how en.wikt produces sidebar links to Wikipedia when you use e.g. {{projectlink|pedia}}. The way that works is, the template produces wikitext like <span class="interProject">[[w:...|Wikipedia]]</span>, which results in HTML like <span class="interProject"><a href="//en.wikipedia.org/wiki/..." class="extiw" title="w:...">Wikipedia</a></span>. We then use CSS to prevent that link from being displayed normally, and we use JS to move it into the sidebar. In your case, you'd presumably add interwiki-links via a template like {{interwikis|es=questaPagina|fr=cettePage}} or whatnot.
You'd probably also want to use mod_rewrite to implicitly add uselang=es to Spanish pages, so that the whole interface is in Spanish, rather than just the content.
RuakhTALK 02:13, 17 March 2013 (UTC)
Thank you for the suggestion. I have passed that on to the wikimaster! --BB12 (talk) 17:44, 17 March 2013 (UTC)
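The interwiki-map idea mentioned above amounts to a prefix lookup followed by substituting the page title for $1. A rough Python sketch of that lookup, under stated assumptions: the map contents are hypothetical (the es pattern is the one quoted in the reply, the fr entry is invented by analogy), and `resolve_interwiki` is an illustrative name, not a MediaWiki function.

```python
from urllib.parse import quote

# Hypothetical interwiki map for the small wiki; "$1" marks where the title goes.
INTERWIKI_MAP = {
    "es": "//my.wiki.org/es/$1.html",
    "fr": "//my.wiki.org/fr/$1.html",  # assumed by analogy with the es entry
}


def resolve_interwiki(link, interwiki_map=INTERWIKI_MAP):
    """Turn 'es:questaPagina' into a URL, or return None for a local link."""
    prefix, sep, title = link.partition(":")
    if not sep or prefix not in interwiki_map:
        return None  # no known prefix, so this is an ordinary local page title
    # Spaces become underscores and unsafe characters are percent-encoded.
    return interwiki_map[prefix].replace("$1", quote(title.replace(" ", "_")))


print(resolve_interwiki("es:questaPagina"))  # //my.wiki.org/es/questaPagina.html
print(resolve_interwiki("thisPage"))         # None
```

As the reply notes, this only covers the outgoing direction; a way for the Spanish page to link back to the English one would still have to be provided separately.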

Some Latin templates now only ever require one parameter -- How about making it all of them?

{{l/la}} and {{la-decl-1st}} can now be passed a single parameter with macrons and the templates will automatically generate the macronless version of the word.

e.g. While you can still generate an inflection table with:

  • {{la-decl-1st|stell|stēll}}

now you can instead simply use:

  • {{la-decl-1st|stēll}}

The magic happens in Module:Latin, written in Lua. I'd recommend also using the same logic in {{l|la|...}} and making the requirement for two versions of Latin words a thing of the past.

Hopefully I haven't broken anything. Pengo (talk) 14:59, 17 March 2013 (UTC)
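For readers who have not looked at the module, the single-parameter behaviour can be approximated as follows. This is a Python sketch of the idea only, not the Lua in the module; `strip_macrons` and `link_parts` are illustrative names, and Unicode decomposition is assumed here as the way to find macrons, which may differ from how the module does it.

```python
import unicodedata

COMBINING_MACRON = "\u0304"


def strip_macrons(word):
    """Remove macrons by decomposing, dropping U+0304, and recomposing."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(COMBINING_MACRON, ""))


def link_parts(first, second=None):
    """Return (page title, displayed text) for a one- or two-parameter call.

    With two parameters both are used as given and nothing is stripped;
    with one parameter the page title is derived from the displayed form.
    """
    if second is None:
        return strip_macrons(first), first
    return first, second


print(link_parts("stēll"))           # ('stell', 'stēll')
print(link_parts("stell", "stēll"))  # ('stell', 'stēll')
```

Leaving the two-parameter form untouched keeps every existing call working while still allowing a page title to differ from the stripped form.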

I have changed the name to Module:la-utilities, and I changed {{l/la}} to reflect that. I think the next obvious step with Latin templates is to merge {{la-decl-2nd}} and {{la-decl-2nd-N}}, and {{la-decl-2nd-ER}} (they should all eventually be a redirect to the first one), because we could just add a function to the Module:la-utilities that outputs the last two characters of a string; for example, if it's um it takes the neuter declension and if it's us or er it takes the masculine declension. —Μετάknowledgediscuss/deeds 16:05, 17 March 2013 (UTC)
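The ending test described above could look roughly like this. It is a Python sketch rather than Lua, `second_declension_gender` is a hypothetical name, and only the three endings named in the comment are handled.

```python
def second_declension_gender(nominative):
    """Pick a declension table from the last two characters of the lemma."""
    ending = nominative[-2:]
    if ending == "um":
        return "neuter"
    if ending in ("us", "er"):
        return "masculine"
    return None  # not an ending this sketch knows about


print(second_declension_gender("bellum"))   # neuter
print(second_declension_gender("dominus"))  # masculine
print(second_declension_gender("puer"))     # masculine
```

Returning None for anything else leaves room for a manual override instead of guessing.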
Just to make it clear, {{l/la}} and its relatives were created before Lua came around, and were intended to be faster than {{l}}. However, now that Lua is here, they may well be redundant because {{l}} would presumably be quite a bit faster when Lua-cised. So it's better not to change or use those specialised link templates at all until we know for sure whether they are still needed. —CodeCat 17:52, 17 March 2013 (UTC)
Well, in the mean time I think Pengo killed two birds with one stone by improving {{l/la}} and providing a way for us to edit one template and change which module is invoked in all the other templates that need macron-stripping (eventually, all of them). If you ever want to finish figuring out the best/fastest way to {{l}}-ify, with subpages or not, then we can make a copy of {{l/la}} and replace all uses of it in the template namespace with the copy. But it doesn't look like it'll be worked out anytime soon, so IMO there's no point preserving it as is. —Μετάknowledgediscuss/deeds 18:05, 17 March 2013 (UTC)
What I'm worried about is backwards compatibility. If we extend {{l/la}} with this extra functionality, it will no longer be possible to replace it with {{l|la}} as easily, if and when the time comes. I strongly recommend that for the time being, the specialised templates should not have extra abilities that the general {{l}} does not also have. —CodeCat 18:16, 17 March 2013 (UTC)
But when the time comes, {{l}} should have lang-specific functions like this. Where else would we put this kind of template? —Μετάknowledgediscuss/deeds 18:20, 17 March 2013 (UTC)
I mostly agree with CodeCat. Language-specific functionality belongs in language-specific templates; in this case, I suppose that would be {{la-l}} or {{la-onym}}. {{l/la}} is intended to be a hackish variant of {{l|la}}, part of a family of templates with identical behavior, and it should conform to the requirements of that family. —RuakhTALK 18:48, 17 March 2013 (UTC)
I agree with Metaknowledge on that point though. If {{l}} can be made to automatically strip diacritics, why not? It can probably be made to work the same as automatic transliteration (in effect, it's the same thing). —CodeCat 19:15, 17 March 2013 (UTC)
Because all editors use {{l}}. Editors who don't usually work on Latin understand that they need to look at the documentation for (say) {{la-noun}} before using it, and that they can't just assume that it works the same way as {{en-noun}} or {{fr-noun}}; but they should be able to expect that {{l}} works the same way they're used to.
Also, there are a whole bunch of problems with that Lua module. Each of those problems could, in principle, be fixed, but I think it's reasonable to expect that clever language-specific code will always have little problems and inconsistencies, for two reasons: (1) none of us is perfect (our cleverness is in finite supply); and (2) such code almost always does, and should, optimize for the 99% case, such that it's sometimes inapplicable to rare edge cases (e.g., Latin entries that really should have macrons for whatever reason). Do we really want all of those problems to be in {{l}}? Currently, when the language-specific code is in a language-specific template, we can always fall back on using a generic template that imposes fewer requirements (e.g. using {{head}} for pluralia tantum because of a language-specific noun-headword template that "knows" that the noun lemma is a singular form); but if it's the generic template itself that has the problematic language-specific code, we're SOL.
RuakhTALK 20:00, 17 March 2013 (UTC)
But there are no rare cases. AFAICT, it's 100%, not 99%. In the end, I don't really mind what you do, as long as you don't break stuff. For example, don't edit {{l/la}} without editing {{la-decl-1st}}. I would replace it, but it looks like Module talk:la-utilities/tests is currently failing, so I'm going to revert the changes to {{la-decl-1st}} for now. —Μετάknowledgediscuss/deeds 20:30, 17 March 2013 (UTC)
Re: {{la-decl-1st}}: Thanks.   Re: there being no rare cases: I think there are always rare edge cases, or at least, that we always want to leave the door open to rare edge cases. Maybe people who send SMSes in Latin treat ō_ō and o_o as two distinct emoticons? Maybe we get a Perseus dump of 10,000 entries with macrons in their titles, and want (temporarily) to be able to link to those entries (instead of having them be enforcedly orphaned until they're all properly fixed and merged)? Maybe we'll want {{l|la||bār}} to work? I have no idea. It just seems rather extreme to impose macronlessness as a technical restriction in 100.000% of cases. —RuakhTALK 21:05, 17 March 2013 (UTC)
Things can be used in ways we couldn't have foreseen, and interact in ways we would never expect, so that we may need an out for reasons unconnected to the unreal and relatively tidy universe of Latin morphological rules. I firmly believe that having an override should always be the default, and that it should be removed only where experience shows it's unnecessary, and where there are compelling reasons such as performance or usability. It just seems a good idea on principle not to design things around our alleged omniscience and infallibility. Chuck Entz (talk) 22:16, 17 March 2013 (UTC)
If you insist. What really matters to me right now is that, judging by Module talk:la-utilities/tests, the module isn't working correctly yet. (PS: When I text in Latin, I never use macra. If I really need to distinguish, I use an underscore following the letter. But that's just a bit of trivia I thought I'd share.) —Μετάknowledgediscuss/deeds 23:25, 17 March 2013 (UTC)
It was working when I saved it. There have been many improvements made in this short time, but also someone broke it while trying to fix something I did that probably breaks conventions. As it says at the top of Module:la-utilities, to test while editing, "Preview page with this template" with: Module_talk:la-utilities/tests . I've fixed it for now, but it probably needs some work to be correct. Pengo (talk) 00:19, 18 March 2013 (UTC)
The edge cases aren't really an issue as it is: if you use two parameters it uses the old behaviour. I've been as conservative as possible with the code, so if two parameters are given, they're still both used and no macron stripping occurs (I did this originally in anticipation of performance concerns). It means for New Latin emoticons, you can still use {{l/la|ō_ō|ō_ō}}, which is a syntax that could be guessed or worked out by any user of the template in this unlikely situation. I didn't document it explicitly because I didn't think it would ever be necessary, and I'd expect them to simply use [[ō_ō]], but I'll add it to the test cases. Note, I didn't make {{la-decl-1st}} as conservative (for simplicity's sake), but it could easily be made so. Pengo (talk) 00:03, 18 March 2013 (UTC)
Maybe one of the parameters could be set to - to suppress the automatic stripping. Which of the two would be more intuitive, I don't know. —CodeCat 00:20, 18 March 2013 (UTC)
That would be easy enough to do, but I don't think it's at all necessary, unless it's to fit in with behaviour of other {{l}} languages. And I really don't see the controversy. It's hardly surprising behaviour that a link to the Latin ācer should link to the actual entry, acer#Latin, and not to the non-existent page, ācer#Latin, as it currently does. All other existing behaviour stays the same -- {{l/la|zebra}} still links to zebra#Latin, and {{l/la|elegans|ēlegāns}} still does what it did too, and if you really want to override the macron stripping behaviour you just use two arguments, e.g. {{l/la|ō_ō|ō_ō}} although I'm yet to see a real-world example of where this would be necessary. By the way, {{l/la}} is only transcluded by a handful of pages (11 all up, while it's not being used by {{la-decl-1st}}), so the rush to protect it seems a little unwarranted. Pengo (talk) 00:37, 19 March 2013 (UTC)
At the time that I protected it, it was widely transcluded; I had no way of telling that almost all the transclusions were via {{la-decl-1st}}. Thanks for the note; I'll correct that. (BTW, regarding your earlier comment that "someone broke [the module] while trying to fix something I did that probably breaks conventions" — nope, it was just a stupid mistake on my part. Some of the unit-tests were already broken even before my changes, so when my changes broke a few more, I didn't catch on at first that I'd messed up. Sorry about that.) —RuakhTALK 05:35, 19 March 2013 (UTC)
Fair enough, no worries. I think half of what I thought was broken code was from some other templates/pages being reverted. Anyway, any idea how to get those last two tests to pass? Would be nice if it would accept html entities, though not sure if it's needed. Pengo (talk) 11:08, 19 March 2013 (UTC)

Language table in Lua

With Lua it seems that it would be better (and easier) to have all language information (mainly code=names mapping) in a single page/module. This is what is being worked on in Module:languages here, and I'm working on a similar thing on fr.wikt with fr:Module:langues (the actual data table is in fr:Module:langues/data).

It looks like it may be a much better way to handle languages, instead of creating several templates for every language like currently (i.e. thousands of templates in the end).

However, someone on fr.wikt asked a question about performance. Although using such a module may be more efficient for a given page, what would happen if someone changes the data table, just to add a single language? How would this impact all the pages that use this module (in this case, potentially all articles)? I asked this question at mw:Talk:Lua scripting#Lua changes and Job queue and I believe you may be interested to have this answered as well. Dakdada (talk) 15:18, 17 March 2013 (UTC)

Do we know about #mw.language.fetchLanguageName? In the lua debug console:
=mw.language.fetchLanguageName("ar")
العربية
=mw.language.fetchLanguageName("ar", "en")
Arabic
=mw.language.fetchLanguageName("ar-Arab")
 
Uses ISO 639 language codes, of course.  Michael Z. 2013-03-21 17:01 z
That's what I used at first when my initial table (on fr) was incomplete. There are two major issues with this:
  1. Some languages are missing, some codes are not standard (e.g. als) and the name may differ from the ones on Wiktionaries.
  2. It is slow when there are several names to retrieve. Loading the table in Module:languages is way more efficient (easily more than 10 times faster).
So in the last version of the module in fr, we completed the table with our 4500 current language codes and I ditched this function (although it can still be used in a secondary module). But I'm still concerned with the job queue impact, so for now we can't use this module (but several other modules are being tested with it). Dakdada (talk) 18:55, 21 March 2013 (UTC)
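The arrangement described here, a local table consulted first with the generic function kept only as a secondary source, can be sketched as below. This is Python rather than Lua; the four-entry table is a stand-in for the real data, and `language_name` and its `fallback` argument are hypothetical names (the fallback plays the role that mw.language.fetchLanguageName would have in a module).

```python
# Stand-in for the data table; the real one holds thousands of codes.
LANGUAGE_NAMES = {
    "ar": "Arabic",
    "en": "English",
    "fr": "French",
    "la": "Latin",
}


def language_name(code, fallback=None):
    """Look a code up in the local table, then in an optional slower source."""
    name = LANGUAGE_NAMES.get(code)
    if name is None and fallback is not None:
        name = fallback(code)  # may return "" for unknown codes
    return name or None


print(language_name("ar"))     # Arabic
print(language_name("zz-zz"))  # None
```

A dictionary lookup costs the same however many names a page needs, which is consistent with the speed difference reported above, though the sketch itself measures nothing.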
Ouch. Might be worth revisiting some time. I presume (ha ha) that a native function might get optimized to perform better than anything we could write in a scripting language. Also, it might be configurable to use Wiktionary codes or names.[3] Michael Z. 2013-03-21 19:48 z
Obviously the function was not made to be queried hundreds of times in a row. If this issue is solved then we may consider switching. Dakdada (talk) 20:51, 21 March 2013 (UTC)

How to move all of Category:Templates with /doc subpage?

Following WT:RFM#Documentation subpages to /documentation, I've added this category to all templates that still use the "old" name. Modules already use /documentation exclusively. How can these be moved automatically? I don't think bots can do moves, can they? Also, the tab at the top of the page should be changed as well (and if possible, one should be added to Modules too). —CodeCat 17:50, 17 March 2013 (UTC)

Re: "I don't think bots can do moves, can they?": Sure they can; search /w/api.php for action=move, or check out e.g. mw:Manual:Pywikipediabot/movepages.py. But I don't know if there's any page-move analogue to the concept of a "bot edit", so it may flood recent-changes unless done very slowly. —RuakhTALK 17:57, 17 March 2013 (UTC)
I just realised that regular accounts can't move pages without leaving a redirect. So whichever bot is used for this, it would need administrator rights... —CodeCat 22:00, 18 March 2013 (UTC)
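To make the action=move suggestion concrete, here is a Python sketch that only builds the request parameters for one page and sends nothing. The parameter names (from, to, reason, noredirect, token) are those of the MediaWiki move API; the helper names and the edit summary are invented for illustration, and obtaining a token and throttling the requests are left out.

```python
OLD_SUFFIX = "/doc"
NEW_SUFFIX = "/documentation"


def documentation_title(old_title):
    """Map 'Template:x/doc' to 'Template:x/documentation', else None."""
    if not old_title.endswith(OLD_SUFFIX):
        return None
    return old_title[: -len(OLD_SUFFIX)] + NEW_SUFFIX


def move_params(old_title, token, leave_redirect=True):
    """Build the POST parameters for one api.php?action=move request."""
    new_title = documentation_title(old_title)
    if new_title is None:
        raise ValueError("not a /doc subpage: %r" % old_title)
    params = {
        "action": "move",
        "from": old_title,
        "to": new_title,
        "reason": "Move documentation subpage to /documentation",  # illustrative summary
        "token": token,
        "format": "json",
    }
    if not leave_redirect:
        # Suppressing the redirect needs extra rights, as noted above.
        params["noredirect"] = "1"
    return params


print(documentation_title("Template:la-decl-1st/doc"))
# Template:la-decl-1st/documentation
```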

Latin first declensions in a single template

I've mashed all of Latin's first declension templates into one: {{la-decl-first}}. See the documentation for how it works and examples. It largely replaces eight similar templates, which is possible because Lua can look at what the last few characters of a parameter are. For example:

I can't see any problems with using it as is, but some might want to wait for the dust to settle, or perhaps until second and third declension templates are done too, when we can be more certain they'll have a consistent format and parameters, or perhaps until a super-declension-template is made that encompasses them all.

The guts of the code is in Module:la-utilities. I've tried to keep presentation code separate from other code, and also tried to leave it flexible enough to accommodate the addition of future declension tables relatively easily, or other uses. It largely still uses an existing empty-table template for presentation, but someone might feel like making it build the tables from scratch internally.

I'm far from a native Lua or Latin speaker, so please let me know if there are any errors, issues or corner cases I may have missed. See the template's documentation for more information. Pengo (talk) 09:54, 19 March 2013 (UTC)

Simplification of romaji entries

Like Mandarin pinyin at some stage, Japanese rōmaji entries need to be converted to soft redirects to hiragana and katakana entries (not directly to kanji, as hiragana serves as disambiguation for multiple Japanese homophones). This is the outcome of the discussion we had on Wiktionary:Beer_parlour/2013/February#Stripping_extra_info_from_Japanese_romaji.

I wonder if it's doable via a bot. There are too many entries in Category:Japanese romaji, which have PoS headers and don't use {{ja-romaji}} template. Generating new ones is perhaps straightforward but not conversion.

This is how the romaji entries will look (the only category they belong to is Category:Japanese romaji). Copying from Wiktionary:About_Japanese#Romaji_entries:

A hiragana only example: "tsuku"

==Japanese==

===Romanization===
{{ja-romaji|hira=つく}}

A katakana only example: "rūto"

==Japanese==

===Romanization===
{{ja-romaji|kata=ルート}}

A hiragana and katakana example: "ringo"

==Japanese==

===Romanization===
{{ja-romaji|hira=りんご|kata=リンゴ}}

--Anatoli (обсудить/вклад) 04:46, 20 March 2013 (UTC)
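Generating new entries in this format is the straightforward part. A small Python sketch of it follows; `romaji_entry` is a hypothetical helper, the layout is copied from the three examples above, and converting existing entries that have PoS headers is not attempted.

```python
def romaji_entry(hira=None, kata=None):
    """Build the wikitext of a romaji soft-redirect entry."""
    args = []
    if hira:
        args.append("hira=" + hira)
    if kata:
        args.append("kata=" + kata)
    if not args:
        raise ValueError("at least one of hira or kata is required")
    return (
        "==Japanese==\n"
        "\n"
        "===Romanization===\n"
        "{{ja-romaji|" + "|".join(args) + "}}\n"
    )


print(romaji_entry(hira="りんご", kata="リンゴ"))
```

The call above prints the wikitext of the "ringo" example.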

For comparison, Japanese rōmaji will work similarly to Category:Mandarin pinyin. The debate about the Japanese rōmaji was resolved without a vote (see Wiktionary:Votes/2011-07/Pinyin entries for the vote on Mandarin pinyin). The vote actually prescribed NOT to add any definitions, but some entries, especially old monosyllabic ones, have definitions. With Japanese rōmaji we decided not to have any definitions at all, only soft redirects. --Anatoli (обсудить/вклад) 04:53, 20 March 2013 (UTC)
Is there potentially information in romaji entries that would be lost if a bot went through and deleted everything? DTLHS (talk) 05:33, 20 March 2013 (UTC)
In theory, no, as all the information on romaji entries is essentially duplicated in the corresponding kana entries. This was a large part of the decision to simplify, since romaji entries have basically just been disambiguation pages created as dupes of the kana pages to aid users who don't yet read kana.
In practice, there may be cases where the romaji entry was developed but the kana entry has not been. Provided the romaji entry information is good, I think that wikicode can just be copy-pasted to the corresponding kana entry, and Bob's your uncle. -- Eiríkr Útlendi │ Tala við mig 05:41, 20 March 2013 (UTC)
On the pinyin vote we also had a rule not to add any pinyin entry if hanzi didn't exist. This rule is followed. There are some entries in Category:Mandarin pinyin entries without Hanzi with both blue and red links but no "just red". It's a good idea not to create rōmaji before a real Japanese entry exists. I don't know if this rule should be enforced, but what's the point of a redirect to nothing, or of spending time adding all definitions and other info to a transliteration entry? The converted entries can be viewed in the history, if anything valuable is lost. Fine by me. Let's encourage work on real Japanese and save time. --Anatoli (обсудить/вклад) 05:50, 20 March 2013 (UTC)
While cleaning up some categories (suffixes, counters: done), I found wa-ga without kana (わが), although the kanji (我が) exists. I will convert/create this one, but there is no need to worry if some are lost. Pity the creator didn't bother to create a hiragana entry. --Anatoli (обсудить/вклад) 05:55, 20 March 2013 (UTC)
wa-ga should be waga anyway...  :) -- Eiríkr Útlendi │ Tala við mig 06:08, 20 March 2013 (UTC)

Sorry, whatever you proposed doesn't work. Entries must have definitions, otherwise AutoFormat will go and tag them as having no definition. -- Liliana 16:08, 20 March 2013 (UTC)

Really? What about the thousands of Category:Mandarin pinyin entries? To avoid your KassadBot picking them up, "# See ..." on a new line is used.
@Eirikr. I made waga as well. --Anatoli (обсудить/вклад) 20:12, 20 March 2013 (UTC)
I agree with Liliana that each Romaji entry should have a line starting with "#" in the wiki code, which is currently not the case at tsuku. Unlike tsuku, Pinyin biǎomiàn does have a line starting with "#": # {{pinyin reading of|表面}} surface. With Romaji, you had better follow the model of Pinyin as closely as possible rather than introducing a different format that uses "See also". Moreover, this dramatic change of treatment of Romaji should go through a vote. I oppose making this dramatic change without a vote. --Dan Polansky (talk) 22:23, 20 March 2013 (UTC)
The new line and # at the beginning are generated by the template. Mandarin, Gothic romanisation entries follow exactly the same patterns - they are soft redirects. The topic has been in the Beer Parlour for a long time with {{look}} to attract input and the most active Japanese editors - User:Haplology and User:Eirikr responded positively and are already using it. The rationale was explained but I repeat briefly:
  1. The structure of using Romaji as an index (soft redirect) follows the structure of Japanese dictionaries. Users use "tsuku" to get to "つく". There is no duplication of information.
  2. All the information in the rōmaji entries is contained in hiragana and katakana entries, only one click away.
  3. Roman script is not the correct script for the Japanese language, it's only romanisation. No need to mislead users that romaji is a replacement for the Japanese writing system.
  4. Currently, Japanese romanisation is the only exception (to my knowledge) from other languages. All languages have entries in their native scripts only, if they are not used in other scripts - i.e. Russian is only in Cyrillic, Arabic - only in Arabic. Romanisation entries are helpful only to find entries in their proper form, they are not nouns, verbs, they are romanisation.
  5. Maintenance hell, mismatch between entries, missing Japanese entries when romanisation entries exist.
Dan, if you wish, set up a vote but since Japanese editors agreed to this method, I don't see a reason. When you opposed the vote on Mandarin pinyin you used Japanese romaji as a reason to vote against it, what's your reason this time? You're not going to maintain Japanese romaji entries, are you? --Anatoli (обсудить/вклад) 23:03, 20 March 2013 (UTC)
Re Mandarin pinyin entries, the vote on pinyin explicitly disallowed any definitions (i.e. English translations in the entries), only links to hanzi (Chinese characters) - "a pinyin entry have only the modicum of information needed to allow readers to get to a traditional-characters or simplified-characters entry". (I was neutral on this rule). See "yánlì", which was used for the vote. This rule wasn't strictly followed in some cases but if it's causing confusion, then we might need to remove all English translations from Mandarin romanisation entries. Anyway, removing definitions was suggested by Eirikr, supported by Haplology and I agreed.
"yánlì" entry from pinyin vote:
==Mandarin==

===Romanization===
{{cmn-pinyin}}

# {{pinyin reading of|trad=嚴厲|simp=严厉|lang=cmn}}
# {{pinyin reading of|trad=妍麗|simp=妍丽|lang=cmn}}
# {{pinyin reading of|trad=沿例|simp=沿例|lang=cmn}}
# {{pinyin reading of|trad=岩櫟|simp=岩栎|lang=cmn}}
# {{pinyin reading of|trad=沿歷|simp=沿历|lang=cmn}}

--Anatoli (обсудить/вклад) 23:10, 20 March 2013 (UTC)

I am saying that if you plan to do sweeping content-removing changes in all Romaji entries, as you do, you should provide evidence of consensus of all interested editors rather than just those active on Japanese entries, in the form of a vote. In this discussion, a particular proposal on formatting has met opposition from an editor who does not edit Japanese for the most part. As I do not know what your proposal entails exactly, I cannot create the vote for you. If you plan to forbid definitions from Romaji entries, that should very clearly be stated in the vote, not the implicit way it was done in the Pinyin vote; the actual practice in Mandarin romanization does not actually remove all definitions, as you have pointed out. Lack of express opposition in Beer parlour is not good enough evidence of consensual support of drastic sweeping changes. Letting the topic sit in Beer parlour for a long time is just a waste of time. {{look}} hardly ever attracts any input, as you should know by now, being an experienced Wiktionary editor; the template could as well be deleted as far as I am concerned. Furthermore, I am far from sure you are correct in your estimate of what has and has not reached support of the most active editors of Japanese entries, as I have found the following statement made by Eiríkr Útlendi: "I must therefore strongly oppose any move to strip romaji and / or kana entries of POS and gloss information." --Dan Polansky (talk) 10:30, 23 March 2013 (UTC)
If you had actually read through the discussion instead of merely looking for ammunition, you would know that Eiríkr was reacting to his original perception of an earlier form of the proposal. After clarifications, discussions and further development, he came to be a proponent of the resulting concept. I'm not saying anything about your main point, just your misuse of Eiríkr's comment. Chuck Entz (talk) 14:48, 23 March 2013 (UTC)
Sure, I should carefully read through the whole discussion to find out what is actually being proposed, and who supported what at what points of time, and what changes of opinion occurred, while you cannot be bothered to write up a clear proposal and produce evidence of its being supported. --Dan Polansky (talk) 00:18, 24 March 2013 (UTC)
Here again, a quick scan through the posts would have shown that it was my only post on this topic. It's not my proposal, so I have no obligation to write it up. My only connection to it consists of having read everything as it was posted, and being mildly annoyed at your jumping in and assuming things without checking. Chuck Entz (talk) 01:12, 24 March 2013 (UTC)
I admit that I could have been reading more carefully, skimming more slowly, and being more attentive overall. Nonetheless, I still think that I should not have to wade through a fairly long discussion to see whether there is or there is not a consensus, on what the consensus is, and how many people have been involved in the discussion. --Dan Polansky (talk) 01:58, 24 March 2013 (UTC)
  • Um, I usually skip to the bottom and read the last few paras to find out how things turned out. I know my wife does this with novels. Had you done so, you would have seen my comment:

@Anatoli, the new {{ja-romaji}} looks great from an editor and user usability standpoint. Barring any concerns voiced by other editors, I think this thread has reached a successful conclusion.

Even without reading anything else, "this thread has reached a successful conclusion" might be a hint that consensus, or at least broad agreement, had been found... -- Eiríkr Útlendi │ Tala við mig 04:40, 24 March 2013 (UTC)

Module:IPA

I made a very basic IPA > X-SAMPA transliterator at Module:IPA. Needs work.

Also relevant: Wiktionary:Beer_parlour/2013/January#(X)SAMPA Michael Z. 2013-03-20 21:05 z

I didn't know you could write table keys in that way... —CodeCat 21:37, 20 March 2013 (UTC)
CodeCat : it's right here mw:Extension:Scribunto/Lua_reference_manual#table.
By the way, is the usefulness of X-SAMPA accepted here on en.wikt ? On fr.wikt we chose to move everything in a gadget (even then I don't think anyone uses it).
But if you need a list, check out the gadget list here : fr:MediaWiki:Gadget-APIversXSAMPA.js (not sure if it is complete though). Dakdada (talk) 21:43, 20 March 2013 (UTC)
I think Lua is preferred to a gadget, though, because it runs on the server. —CodeCat 22:03, 20 March 2013 (UTC)
Putting X-SAMPA in a Lua module would have a cost, as it would be loaded in every page with IPA. I'm not sure it is worth it, given very few people actually use it (if any). Gadgets are a good way to give users the IPA to X-SAMPA conversion, since only the people who want to use it would load the gadget from the site. Although that way we assume that the people who absolutely want to read ASCII pronunciations have JavaScript enabled... Dakdada (talk) 23:16, 20 March 2013 (UTC)
Has anyone figured out how to import a table with mw.loadData? This would let the server load the transliteration table once only, in read-only mode, even if there were many instances of IPA on a page. I couldn’t get it to load a table with Unicode data.
I did use it but I have to admit that I did not compare it to a simple require to see if the data was really cached. Dakdada (talk) 10:27, 21 March 2013 (UTC)
Where can I see your code? Michael Z. 2013-03-21 15:22 z
The module is here (sorry it's in French): fr:Module:langues, with the table in fr:Module:langues/data. As an example, the page fr:Utilisateur:Darkdadaah/eau/Pamputt can be created within 0.5s with mw.loadData. When I replace it by require, the page is built in 8 seconds, with twice as much memory used. Dakdada (talk) 18:45, 21 March 2013 (UTC)
X-SAMPA could be incorporated into {{IPA}}. I’d like to see a gadget that shows only IPA by default, and lets the reader toggle IPA/X-SAMPA display, or copy X-SAMPA. Less clutter on the page for the 99.999% of us who have no use for X-SAMPA. Michael Z. 2013-03-21 01:00 z
It would be easier if both IPA and X-SAMPA were created with a single template. Right now it's something like {{IPA|}}, {{X-SAMPA|}} so just hiding one or the other would leave an ugly comma. Dakdada (talk) 10:27, 21 March 2013 (UTC)
Yes, exactly. If X-SAMPA can be reliably derived from IPA, then it can be there every time, and no need for a separate template. But seeing as we know of zero users of X-SAMPA, there’s no need to show it to everyone at all. Any ideas for an unobtrusive interface? Michael Z. 2013-03-21 15:22 z
I suggest to create a JavaScript tool based on Module:IPA (which is pretty easy), and remove Template:X-SAMPA, then anyone who wants to see X-SAMPA pronunciation in entries may activate the JS code. --Z 02:22, 1 April 2013 (UTC)
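A rough idea of what such a JavaScript tool might look like, as a minimal sketch only. The table below is a small illustrative subset of well-known IPA/X-SAMPA pairs, not the contents of Module:IPA, and the names `IPA_TO_XSAMPA` and `ipaToXsampa` are made up for this example. It maps one character at a time, so tie bars, combining diacritics and other multi-character sequences are not handled.

```javascript
// Illustrative subset only; the real table in Module:IPA is not reproduced here.
const IPA_TO_XSAMPA = new Map([
  ["ə", "@"], ["ʃ", "S"], ["ʒ", "Z"], ["θ", "T"], ["ð", "D"], ["ŋ", "N"],
  ["ɪ", "I"], ["ʊ", "U"], ["ɛ", "E"], ["ɔ", "O"], ["æ", "{"], ["ɑ", "A"],
  ["ː", ":"], ["ˈ", "\""], ["ˌ", "%"],
]);

// Replace each IPA character that has a table entry; pass everything else through.
function ipaToXsampa(ipa) {
  return Array.from(ipa, (ch) =>
    IPA_TO_XSAMPA.has(ch) ? IPA_TO_XSAMPA.get(ch) : ch
  ).join("");
}
```

A gadget along these lines would presumably find the IPA spans on the page and show the converted string on request, so that only readers who enable it pay for loading the table.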

Error when moving a page?

I'm trying to move avantpaísos to avantpaïsos without leaving a redirect. But when I try, I get an error like this: [6560d38b] 2013-03-20 21:31:29: Fatal exception of type MWException. Is anyone else able to do the move? —CodeCat 21:32, 20 March 2013 (UTC)

Apparently I'm getting the same error with other pages I try to move. —CodeCat 21:35, 20 March 2013 (UTC)

Not me; I've tried and failed. Mglovesfun (talk) 21:53, 20 March 2013 (UTC)
Same here. SemperBlotto (talk) 22:07, 20 March 2013 (UTC)
avantpaís is displaying a script error. This needs fixing urgently. Mglovesfun (talk) 22:43, 20 March 2013 (UTC)

Bot to add {{head|en}} to Category:English plurals

Bot to do this:

==English==

===Noun===
'''crossings'''

# {{plural of|crossing}}

to

==English==

===Noun===
{{head|en}}

# {{plural of|crossing}}

The regex is pretty simple. I can do it using the regex function on AWB but AWB also cuts off at 25,000 for categories so I could only go as far as that. Perhaps MewBot (talkcontribs) would like to take this one? Nevertheless, it could be done for other languages and also for verb forms, adjective forms and so on. Mglovesfun (talk) 11:08, 22 March 2013 (UTC)
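For illustration, the substitution described above could be sketched roughly as follows. This is not the AWB rule or any bot's actual code; `addHeadTemplate` is a hypothetical helper, and the pattern assumes the exact layout shown in the example (a bolded headword directly under ===Noun===, followed by a `# {{plural of|` sense line). It does not check which language section it is in, so a real bot would need to restrict it to the English section.

```javascript
// Hypothetical helper: swap a bare bolded headword line for {{head|en}},
// but only where the Noun section's first sense line is {{plural of|...}}.
function addHeadTemplate(wikitext) {
  const pattern = /(===Noun===\n)'''[^'\n]+'''(\n+# \{\{plural of\|)/g;
  return wikitext.replace(pattern, "$1{{head|en}}$2");
}

// Example input in the layout shown above.
const before = "==English==\n\n===Noun===\n'''crossings'''\n\n# {{plural of|crossing}}\n";
const after = addHeadTemplate(before);
```

Entries that already use a headword-line template, or that have a lemma inflection line above a plural sense, do not match the pattern and are left unchanged.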

I would prefer another approach, which I was just about to suggest when I saw this. It's my preference that form-of templates like {{plural of}} don't add part-of-speech categories to the entries. It makes sense to me because we already have, as a rule, headword-line templates that add PoS categories, so this makes it more consistent. But there are other reasons as well. In many cases, the form-of templates end up being added to other kinds of entries and other languages, but in those cases it may not be appropriate to have a category. With {{plural of}} this is particularly noticeable because the category it places words in, Category:English plurals, isn't very clearly named because it doesn't say plurals of what. In a language like Catalan, such a name would not be appropriate, because Catalan also has plural adjectives and plural verbs. Yesterday I cleaned out Category:Catalan plurals, which (not surprisingly) contained several adjective plural forms as well. Some templates, including this one, allow you to suppress the category or change its name, but that seems like putting the cart before the horse. Catalan already has a {{ca-noun-form}} template which places the entry in the most appropriate category, so why would we need to add nocat=1 every time we use {{plural of}} for Catalan? That seems backwards. Therefore, I propose this replacement instead:
==English==

===Noun===
{{en-noun-form}} [or {{en-noun-plural}}]

# {{plural of|crossing|lang=en|nocat=1}}
The headword-line template, which we would need to create, would add the plural category instead. So nocat=1 is added to suppress the category of {{plural of}}, which in turn would make it easier for us to find out how many entries still rely on its categorisation. It is my hope that once all instances of {{plural of}} have this parameter, we can remove the categorisation code from the template safely. —CodeCat 13:51, 22 March 2013 (UTC)
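As a sketch of how one might count the entries that still rely on the categorisation of {{plural of}}, assuming the page wikitext is already available as a string: the function name `usesWithoutNocat` is invented for this example, and the pattern ignores calls that contain nested templates in their arguments.

```javascript
// Return every {{plural of|...}} call that does not carry nocat=1.
// Calls with nested templates in their arguments are not matched at all.
function usesWithoutNocat(wikitext) {
  const calls = wikitext.match(/\{\{plural of\|[^{}]*\}\}/g) || [];
  return calls.filter(
    (call) => !/\|\s*nocat\s*=\s*1\s*(\||\}\})/.test(call)
  );
}

const sample =
  "# {{plural of|crossing|lang=en|nocat=1}}\n# {{plural of|gat|lang=ca}}";
const remaining = usesWithoutNocat(sample);
```

Here `remaining` holds only the Catalan call, which is the kind of entry that would still need attention before the categorisation code could be removed from the template.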
What is the advantage of doing either of these over the current situation of a plain wikitext inflection line and categorization by {{plural of}} (presumably eventually to be replaced by {{en-plural of}})? Uniformity? That seems like a positive hazard as it seems to lead folks to believe that they know how to make changes to English entries when the evidence leads me to believe they don't.
There are quite a few cases where the inflection line is for a lemma and {{plural of}} does categorization at the sense line level. DCDuring TALK 15:39, 22 March 2013 (UTC)
Why would you want to do this? Why replace simple code with a template that does nothing? Are you trying to make the wiki run even slower? SemperBlotto (talk) 15:41, 22 March 2013 (UTC)
I can't see any advantages. Intentional redundancy? CodeCat, you're normally the first to want to get rid of redundancy (even before me). Mglovesfun (talk) 16:37, 22 March 2013 (UTC)
Redundancy isn't really an issue here, it's about what is workable. If thousands and thousands of uses of a template need a nocat=1 parameter just to stop it from doing something, then that seems like bad design. And when people come across something that is badly designed, they're going to try to work around it, which may make things worse. For example, I've seen lots and lots of entries that have tried to avoid the categorisation of {{plural of}} by instead using {{form of|plural}}. While others, like I mentioned, ignored the category with the result that at least for Catalan entries, Category:Catalan noun forms and Category:Catalan plurals contained almost the exact same entries. The only differences between them were either a few entries that lacked {{ca-noun-form}}, or entries that used {{plural of}} for adjectives (which is totally intuitive; it's the category that's wrong!). That can't be a good thing. My proposal helps to make things consistent by sticking to a simple rule that most non-form entries already adhere to: the headword-line template is responsible for the PoS category. {{head}} already works that way, as do the many language-specific templates like {{en-noun}}. I think that is a very simple rule, and if we can achieve a situation where our templates follow it, it will make things easier to understand because editors will know exactly which templates they can expect to add an entry to a category and which not, which avoids errors due to uncertainty. I mean, think about this yourself... would you rather have to remember for each template whether it categorises or not, or would you prefer learning a simple rule? —CodeCat 17:28, 22 March 2013 (UTC)
I have never used "nocat=1" (it is not obvious to me what it is supposed to do), so I have never had to remember what to do. I have no idea which English, French, Italian, Latin or German templates allow such a keyword. SemperBlotto (talk) 17:33, 22 March 2013 (UTC)
That is exactly what I am talking about above. Consistency in how similar templates work is good. It means that once we learn to expect certain behaviour, we can extend that expectation to new templates with reasonably safe knowledge that it will do as we think. Consider another example, overriding the headword of a headword-line template. The majority of our templates use head= for that, so many of us (myself included) would just use head= without even thinking about it. We expect it to work. Similar for linking templates, which take a second parameter to change the displayed link text. Nobody thinks about it, everyone just expects it to work. And that is a good thing because it lessens the mental burden of remembering how all the templates work. My proposal is intended to be just one step towards that. —CodeCat 17:41, 22 March 2013 (UTC)

[Aside: here’s a working link to this section: Bot to add {{head|en}} to Category:English plurals. Michael Z. 2013-03-22 18:43 z]

Why not modify {{en-noun}} so it works for plurals? Something like {{en-noun|pl}}. Every time I create an English plural entry, I spend five minutes previewing {{en-noun|-}}, {{en-noun|!}}, {{en-noun|?}}, read the docs again, and then give up and leave it for someone else to clean up. As an editor, I don’t care which template adds the category.
Why replace simple code with a template that does nothing? Code that is completely inconsistent with every other noun entry is not simple, it is obscure. Michael Z. 2013-03-22 20:16 z
There would need to be a way to distinguish a plurale tantum/plural-only noun (a lemma that happens to be plural) from a plural form of a regular singular noun. We wouldn't want pants categorised as a noun plural form, I think? I think adding such functionality to {{en-noun}} is dangerous, because with misuse we could end up with plurals categorised in Category:English nouns. Having a separate template seems like a safer option, and it also fits with the general idea that each part of speech has its own template (for categorisation purposes, noun forms are their own part of speech, distinct from nouns). Of course, just writing {{head|en|plural}} or {{head|en|noun form}} is a possibility too. —CodeCat 21:07, 22 March 2013 (UTC)
Please don't mess with {{en-noun}}. A few templaters have said that {{en-noun}} was one that was sufficiently complicated already so that they didn't want to add features. We have {{en-plural noun}} already as an inflection-line template. It is also inappropriate in the cases mentioned above in which the same headword is both a plural-only and a simple plural. {{head|en}} and hard categorization seem adequate for that case and other exceptional cases that arise.
I see some advantage to creating an English-specific direct sense-line replacement for {{plural of}}. If all other languages want to use an inflection-line approach or a language-specific sense-line approach, then by all means let there be language-specific and generic templates to do so. As the Little Red Book said, let a thousand flowers bloom!!! DCDuring TALK 22:27, 22 March 2013 (UTC)
Maybe this is an opportunity to try out a single template for headword and sense line(s), incorporating HTML dfn and dl. Michael Z. 2013-03-22 23:04 z
Yes. It's probably long past time to kill off this idea of wiki-style participation here. I say let there be an apprenticeship period, no edits from non-whitelisted users without approval, etc, qualifying exams for would be template writers, HTML and CSS qualifying exams for adminship. DCDuring TALK 23:34, 22 March 2013 (UTC)
You don’t think one well-designed template could be made more accessible for editors than two vaguely unrelated templates? Michael Z. 2013-03-23 01:10 z
Not for the relevant group of editors for a language or a related group of languages. There is apparently a typical, cognitively economical way of presenting an inflection line for a given language or language family or larger grouping, based on the characteristics of the language, its script, and the PoS in question. Ordinary wikitext should work for formatting almost all languages that use Latin script. The content portion of such templates seems to be a matter of language knowledge. The uniformitarian urge to template-standardize annoys me no end and seems completely contradictory to what a wiki is supposed to be. DCDuring TALK 01:49, 23 March 2013 (UTC)
What puzzles me is why you think using fewer templates is inherently better. What makes '''word''' or [[word#English|word]] better, in your opinion, than {{en-noun}} and {{l|en|word}}? In particular, what I am confused about is why you make such a sharp distinction between "ordinary wikitext" and "templates". To me, they're the same thing, one is part of the other, and they have to be learned as a single whole. —CodeCat 02:01, 23 March 2013 (UTC)
Overhead. I disagree with him here, but I see his point: when all you want is the headword in italics, it's a touch Rube Goldbergish to have a bunch of templates calling other templates to end up with the same result. It's very easy (especially now, as we're marching into the brave new world of Lua modules) to become enamored of our template cleverness. Templates are merely a tool for producing the right output on the web page. They are often extremely handy, and do wonderful things, but they also have costs. Otherwise, why not have a single template called {{template}}? It would have 47 positional parameters, and 73,953 named parameters, half of which would be for turning other parameters on or off and/or feeding them ad-hoc faked inputs. Maybe we can talk Daniel Carrero into designing it for us. Chuck Entz (talk) 03:19, 23 March 2013 (UTC)
Re: "Templates are merely a tool for producing the right output on the web page": That's not true. They're also a tool for marking up the wikitext itself with semantics usable by mirrors, bots, and other external tools.   Re: "when all you want is the headword in [boldface], [] ": But is that all we want? Mzajac and CodeCat actually want the headword to have richer styling than that. (You talk about "the right output on the web page", but of course, the output consists of more than just what you see when you open your Web browser. It also consists of what your browser sees, what search-engines see, what screen-readers see, and so on. None of this requires templates — anything that can be put in a template could also be put directly in the entry's own wikitext — but templates are often simpler for everyone.) —RuakhTALK 17:21, 23 March 2013 (UTC)
By the way, this has long since gone from a Grease Pit discussion to a Beer Parlor one. Chuck Entz (talk) 03:22, 23 March 2013 (UTC)
That's true, but that's also how programs tend to be written these days, it's the whole idea behind encapsulation and object-oriented programming (specifically, the idea of "hiding" details that programmers should not concern themselves with). Writing raw wikitext everywhere seems to me like writing in raw assembly language; yes it works, yes it's fast, and it's clear what everything does right down to the finest details, but it's anything but practical, it offers no consistency and it's a big problem to maintain it. My goal is to use templates as a tool to both get the result we desire and to make things more intuitive and easy to use. To me, "always use a headword-line template" is more intuitive than "use a headword-line template for non-English, use bolded text with an explicit category for English, except when there's already a headword-line template... oh and you can use bolded text for non-English too but that's really wrong", and "the headword-line adds the PoS category" is more intuitive than "the headword-line sometimes adds the category, but not always, and sometimes the form-of template does, so be sure to check whether one of the templates adds the category you want". I also think "language is always required" is more intuitive than "language is required except usually for English" (the best evidence for that unintuitiveness is the fact that there are countless context labels that put foreign entries into English categories). —CodeCat 15:08, 23 March 2013 (UTC)
@CodeCat: There are three types of potential bad consequences of using templates in cases where there is no clear functional benefit. First, there are performance effects. Each template contributes to latency. Complex templates that call other templates to confirm that a language uses Latin script and are multiply transcluded on a page should be an embarrassment. Second, there are the template-editing consequences. A template that is transcluded more than a hundred thousand times (we have 72 of them) is not easily changed without bad consequences, especially as we don't have any good test case suite AFAICT. The most minimal change in one of the million-transclusion templates (we have 7) can prevent other template changes from going into effect for hours. Third, and worse, is the effect on contributors. I expect that most users are simply intimidated by the complexity of our template system. It is hard enough just to use them correctly, let alone use all the features, let alone amend them. Instead of having best practices documented so any earnest contributors could create a template likely to be useful, we attempt to impose uniformity.
The relative lack of good documentation for our overall system makes it extremely likely that the loss of a few technically skilled editors will bring this wiki down. DCDuring TALK 16:30, 23 March 2013 (UTC)
A lot of good points here. Also as yet unmentioned is the fact that we are making web pages that aspire to conform to the HTML standard of structured text representation. Wikitext, when used with disregard for the HTML it produces, is worse than MS Word. It makes this discussion-point you are reading a <dd> in 15 nested one-item <dl>’s. When we update our internals, we can take advantage of the opportunity to try to improve the horrific tag soup generated by this website. Michael Z. 2013-03-23 17:20 z

Reminder of Lua help session in a few hours

Hi! This is a reminder: today at 1800 UTC, in about three hours, there's a Lua/Scribunto help session on IRC; please see the IRC office hours page on meta for details. Thanks! Sharihareswara (WMF) (talk) 15:05, 22 March 2013 (UTC)

Soft Keyboard, Please?

What happened to the soft keyboard? I'm missing it very much. ---- Lo Ximiendo (talk) 03:26, 23 March 2013 (UTC)

I have also noticed some weirdness. — Ungoliant (Falai) 03:52, 23 March 2013 (UTC)
  • What is a soft keyboard? SemperBlotto (talk) 08:16, 23 March 2013 (UTC)
    • I thought it was one of those keyboards made of a soft silicone that could be rolled up (like this one), but if Lo Ximiendo is missing hers, she would hardly be asking us about it. Perhaps she means a virtual keyboard? —Angr 10:02, 23 March 2013 (UTC)
“Software keyboard,” by elision. Michael Z. 2013-03-23 17:22 z
I assume that she means that row of icons and text immediately above the edit window (does anyone ever use it?). I notice today that something flashes up temporarily just before it is displayed. SemperBlotto (talk) 17:27, 23 March 2013 (UTC)
I mean that I'm unable to type either Cyrillic or Arabic letters with the virtual keyboard that's provided to us (now in an unusable shambles, to me). ---- Lo Ximiendo (talk) 19:19, 23 March 2013 (UTC)
I think you mean in the Edit window, just above the window where it says, >Advanced, >Special characters, >Help. When you click on Special characters, you can select from various keyboards, such as IPA, Arabic, or Cyrillic. For me, it seems to be working fine. I don’t know if the skin has anything to do with it, but I still use the Monobook skin. —Stephen (Talk) 21:50, 25 March 2013 (UTC)
Oddly enough, I can still type with the deformed virtual keyboard. It's just that I miss the old appearance of the edit window. (Can I make a screenshot of my situation?) --Lo Ximiendo (talk) 22:24, 25 March 2013 (UTC)
This works fine for me too on en.wiktionary.org, using Firefox 18.0.2 on Linux. Is the problem similar to this screenshot? If so it might be the same problem as bugzilla:46401/bugzilla:46575. If it's not, could you please tell us which browser software (and version) you use and on which operating system? If your browser has a "JavaScript Console" or "JavaScript Debug Window", is any output created in that console when loading the "Edit" page and also when you try to insert Arabic or Cyrillic letters? And yes, a screenshot would also be helpful in that case. :) Thanks! --AKlapper (WMF) (talk) 22:02, 26 March 2013 (UTC)
The problem I'm having (and it's apparently gone, maybe for now at least) is similar to the screenshot you showed (and I'm using version 11 or 12 of Firefox). Thanks for showing me a Bugzilla entry! :) --Lo Ximiendo (talk) 22:10, 26 March 2013 (UTC)

Printing pages with PoScatboiler

I am getting grossly non-wysiwyg printer results from printing Category:English phrasal verbs. The text and templates embedded in {{poscatboiler}} for that page print the url for the edit link and for each letter in the index box. Thus there is a largely extraneous first page. "Printable version" creates the same appearance on the screen.

Similar problems appear on principal namespace pages that have urls in {{quote-book}}. I haven't checked further. DCDuring TALK 19:41, 23 March 2013 (UTC)

Do you think you could show an... um... "screenshot"? —CodeCat 21:08, 23 March 2013 (UTC)
It seems to be {{categoryTOC}}:

categoryTOCprinted.pdf Chuck Entz (talk) 21:33, 23 March 2013 (UTC)

But also urls contained in {{quote-book}} and even just inside single square brackets in the same way. This is probably considered desirable behavior for {{quote-book}} and single square brackets, but it is not for the index box. DCDuring TALK 21:41, 23 March 2013 (UTC)
True, but we have to start with a specific example so there's something to look at. My guess is that there's code to hide the URLs that doesn't work when printing. Chuck Entz (talk) 21:55, 23 March 2013 (UTC)
You are probably right. At WP "printable version" displays urls that are in a square bracket. BTW, if it only occurs for printing, it is unlikely to be urgent for anyone. DCDuring TALK 22:34, 23 March 2013 (UTC)
The MW print style sheet contains the following:
#content a.external.text:after,
#content a.external.autonumber:after {
    /* Expand URLs for printing */
    content: " (" attr(href) ") ";
}
Presumably this could be overridden by some CSS in the offending template (which I couldn't find using our documentation) or better in the offending class of templates. I assume it is necessary to have the full url to link to a specific portion of a category, but is it? DCDuring TALK 13:20, 25 March 2013 (UTC)
It looks like that table of contents is generated by {{en-categoryTOC}}, and perhaps other members of category:TOC templates. It makes those links with the fullurl parser function. No idea why that should give it the class .external.
I guess we can override that with something like #toc a.external.text:after { content:"" } in the right stylesheet. (Is this a general print stylesheet, or is it specific to Vector?). Michael Z. 2013-03-25 16:48 z
I have the problem in Monobook. I stuck the line in my common.css and it did the trick, selecting what needed to be removed, but not the other urls. For my purposes, I rarely (never?) need the urls. I don't know who else prints this kind of thing. I only do it when I need a medium/small list as a checklist. DCDuring TALK 17:58, 25 March 2013 (UTC)
It doesn't seem to work with 2 non-Latin script category ToCs that I tried (one was Sanskrit), so it is not a general solution. But it suits me fine. DCDuring TALK 18:05, 25 March 2013 (UTC)
Which templates, DCD? We may as well fix this for everyone. Michael Z. 2013-03-25 18:54 z
Okay, I see that {{categoryTOC-Devanagari}} is an example. This template is not built on postcatboiler, but is custom-built and inserted into a category page. There may well be dozens of templates with their own variation of code. We should standardize the HTML and id or class names for such T’s of C. Michael Z. 2013-03-25 19:00 z
I see that these TOCs use class="plainlinks" to prevent the little “external link” arrows from showing up. This is probably a good indicator that a URL should not be printed either. This CSS might be more generally applicable: .plainlinks a.external:after { content:""; }. For good measure, it should probably be in an @media print { } block. Needs testing to see if the rule is specific enough to override the other. Michael Z. 2013-03-25 20:28 z
This seems like a practical, if perhaps tedious, demonstration of the benefits of the use of CSS (and its allies) and of compliance. I rest your case. DCDuring TALK 21:38, 25 March 2013 (UTC)
[Nodding sagely.] Michael Z. 2013-03-25 22:24 z

I think I may have done it, using MediaWiki:Print.css (I keep finding more style sheets). Please reload and confirm. Michael Z. 2013-03-25 22:24 z

It seems to have suppressed the bad urls and not the ones we'd probably want, such as in Citations. But I should try again later to make sure that there isn't cache delay or something (though there shouldn't be as the Sanskrit was not helped by what was on my css page insertion). DCDuring TALK 22:47, 25 March 2013 (UTC)

Random entry (by language) broken

If you click on the Random entry button, the functionality is as usual. However, if you click on (by language) and then on any of the languages listed at http://en.wiktionary.org/wiki/Wiktionary:Random_page , you get the following error:

403: User account expired
The page you requested is hosted by the Toolserver user hippietrail, whose account has expired. Toolserver user accounts are automatically expired if the user is inactive for over six months. To prevent stale pages remaining accessible, we automatically block requests to expired content.
If you think you are receiving this page in error, or you have a question, please contact the owner of this document: hippietrail [at] toolserver [dot] org. (Please do not contact Toolserver administrators about this problem, as we cannot fix it—only the Toolserver account owner may renew their account.)
HTTP server at toolserver.org - ts-admins [at] toolserver [dot] org

Please get this sorted out. Kingturtle (talk) 22:51, 23 March 2013 (UTC)

Comment about this and about other, related/similar things that are broken (as a result of toolserver accounts expiring): if no-one plans on fixing certain gadgets in the near future, the links to them should be removed. - -sche (discuss) 23:06, 23 March 2013 (UTC)
There is a plan to move the tools from the Toolserver to the Wikimedia Labs, where tools should be easier to maintain and manage (no user-defined time limit). We can't migrate the tools just yet, though. Dakdada (talk) 17:10, 25 March 2013 (UTC)
Can you put a note atop http://en.wiktionary.org/wiki/Wiktionary:Random_page explaining that none of those links currently work, but a plan is in the works to get them working again? Kingturtle (talk) 18:47, 28 March 2013 (UTC)
Done - -sche (discuss) 19:20, 28 March 2013 (UTC)

Page linking to itself?

According to Special:WhatLinksHere/နီး, the page နီး links to itself, but I can't find any self-link there. What am I missing? —Angr 16:07, 24 March 2013 (UTC)

I did a null edit and it went away. Probably an old transclusion. Michael Z. 2013-03-24 17:14 z
Thanks! —Angr 19:38, 24 March 2013 (UTC)

Gender tag popups

Since last night, the gender tags like m, f, n, pl have been showing up with little dotted underlines which tell you what they stand for when you mouse over them. Is there some preference I can set or something I can set in my CSS to turn it off? I find it very annoying. —Angr 11:24, 25 March 2013 (UTC)

Fixed. However, there's a lot of caching that goes on. To bypass the caching, you can try visiting //bits.wikimedia.org/en.wiktionary.org/load.php?debug=false&lang=en&modules=site&only=styles&skin=vector&* and performing a "hard" refresh (holding down the Shift key while you refresh) a few times. And even this is not necessarily guaranteed, since some of the caching is server-side rather than client-side. If it doesn't work for you, and this bothers you enough that you want the fix ASAP, you can copy that portion of CSS to your own personal Special:MyPage/common.css, since changes to that page are picked up immediately. —RuakhTALK 13:58, 25 March 2013 (UTC)
Well, the dotted underlines are gone already. The popups are still there, but they aren't so annoying. Thanks, Ruakh! —Angr 14:09, 25 March 2013 (UTC)
The popups have always been there. It's just not very obvious that they are there, so I actually thought the dotted line was kind of helpful, even if it didn't look very good. —CodeCat 14:27, 25 March 2013 (UTC)
I would be content with a way to turn it off for me even if it's left on as a default if other people think it's useful. It's just that to me, the dotted underline says, "Here's something that requires your urgent attention", but the gender tags don't require it. I've been using foreign-language dictionaries for over 35 years now; I know what m, f, n, and pl stand for. —Angr 14:31, 25 March 2013 (UTC)
Oops; I made a change to MediaWiki:Vector.css. I forgot that Firefox adds underlines to abbr’s with titles (I use Safari, which does not). It’s one of the few cases where the major browsers have contrary visual rendering. Michael Z. 2013-03-25 16:18 z
I don't understand this at all; your comment doesn't seem to explain anything. Actually, your comment would be a better explanation for not doing what you did. Maybe you've left out a few steps of your reasoning? (Plus, I don't think your comment is factually accurate, since I use Firefox exclusively, and I don't think I was seeing those dotted-underlines until you made this change. But I wouldn't swear to that.) BTW, if your goal was merely to remove underlines when there's no title, then it would make more sense to use abbr:not([title]), rather than hoping that your explicit abbr[title] rule is restoring a default behavior rather than creating a new one. (Or is that not possible for some reason? I admit, I haven't tested.) But anyway, I've reverted your change for now, since it's obvious that it doesn't have consensus. —RuakhTALK 01:14, 26 March 2013 (UTC)
The Vector skin needlessly adds underlines and the help cursor to all abbr elements lacking titles. I had corrected that in MediaWiki:Vector.css, but inadvertently caused the problem being discussed here and resolved by your edit in MediaWiki:Common.css. Now the underlines and help cursor are back again because you removed my edits. If you use a :not selector, your fix will not work in MSIE 8. Michael Z. 2013-03-26 14:11 z [edited —MZ]
Nope, I still don't get it. As far as I can tell, "the problem being discussed here" is the presence of dotted underlines. Seeing as your change to MediaWiki:Vector.css involved a whole chunk of CSS whose sole purpose was to add those dotted underlines in certain cases, I don't see how that can have been inadvertent . . . —RuakhTALK 05:04, 27 March 2013 (UTC)

Japanese pitch accent template Template:ja-accent-common

I've recently gotten my hands on NHK日本語発音アクセント辞典 (NHK Japanese Pronunciation Accent Dictionary) (in Japanese), Tōkyō: NHK, 1998, ISBN 978-4-14-011112-3.

I'd like to rework {{ja-accent-common}} a bit to change how the information is presented, moving the type of pitch to the front of the line and building in an option to use IPA, as already suggested (but not implemented) by the use of square brackets for the romaji. Would anyone object? I think I'm just about the only one using this template in recent edits. And should I put this thread in WT:BP instead? Note that I am proposing a change in presentation only. -- Eiríkr Útlendi │ Tala við mig 18:51, 25 March 2013 (UTC)

But wouldn’t square brackets look like phonetic IPA, if it appears in the same context as slashes for phonemic IPA? Michael Z. 2013-03-25 22:29 z
  • Sorry, I was stuck in my own head and speaking in shorthand, as it were.
When provided with the correct params, this template currently outputs a string in square-bracket format, suggesting phonetic IPA, but without using the IPA font styling. In uses of it that I've seen entered by editors before me, folks have been using this to input romaji + IPA-style tone diacritics. I've followed suit so far, but this strikes me increasingly as incorrect.
My proposal includes reworking this part of the template to use {{IPAchar}} for proper font formatting, to link to the proper phonology page much as when using {{IPA|lang=ja}}, and to use tone letters on each mora for better visual clarity, since actual phonetic IPA already includes diacritics over the letters that would interfere with the tone diacritics. I'd also explain in the documentation that this is intended for IPA, not romaji. Lastly, I'd see about going back through existing entries and fixing the transcriptions.
If folks are interested, I'll knock up a demo of what I'm thinking. -- Eiríkr Útlendi │ Tala við mig 22:38, 25 March 2013 (UTC)
Maybe you've seen this, but here's the BP discussion about this from two years ago, which links to even more discussion from previous years. The person who started the discussion seems to have left WT since a vote on Pinyin as you can see here Special:Contributions/Vaste. I've never touched pitch myself but it would be awesome to have more of it on here. --Haplology (talk) 06:35, 26 March 2013 (UTC)
  • Cool, thanks for the links, Haplology. I'll have to read those later (hopefully today).
And yeah, pitch is pretty important in JA for distinguishing words that would otherwise be pure homophones. Almost no Japanese dictionaries seem to include this information, be they JA<>JA or JA<>something else, making it difficult for learners to get a handle on. Context can handle a lot of that for us non-native speakers, but I have in the past noticed odd looks on folks' faces when I've used the wrong pitch and they have to re-run what I've said through their internal parsers to make sense of it. After getting a copy of NHK's official pitch accent dictionary, specifying the standard for Japanese broadcasters, I would really like to get that info into as many entries here as possible. Figuring out the best format for the template is part of that.  :) -- Eiríkr Útlendi │ Tala við mig 15:19, 27 March 2013 (UTC)
So after reading those various threads that Haplology linked to, and the threads linked therefrom, I found the following:
  • Various dictionaries and other sources use a number notation.
Most commonly, this appears to be the number of the syllable after which there is a w:downstep. So for 殊に (koto ni), this is marked as [1] in w:Daijirin, and for 言葉 (kotoba), this is marked as [3] in Daijirin. Words with no downstep are either not marked with any number, or marked with [0] if there are any homophones that do have a downstep. Words with multiple possible pitch contours are marked with the most common number first, such as [2][0] for 陽炎 (kagerō).
  • Some dictionaries use a diagonal arrow notation. This includes some entries on the JA WT.
This notation uses ↗ just before the first kana with higher pitch, and ↘ just before a downstep. Note that these characters are also used in IPA for global rise and global fall, which are specifically defined as not being used to distinguish words (see w:Intonation_(linguistics)), and thus should probably not be used here on the EN WT to mark Japanese pitch accent.
  • Some dictionaries use a vertical arrow notation.
This notation mostly just uses ꜜ, the IPA character for a downstep, just before a downstep. Some sources might use the IPA w:upstep character ꜛ just before where the pitch rises.
  • Some dictionaries and other sources use accent diacritics.
This uses the gràvè àccènt to mark low tone, and the ácúté áccént to mark high tone. Downstep seems to be marked either by a change from acute to grave, or (especially if the downstep is at the end of the word) by use of the downstep arrow ꜜ.
  • Japanese accent dictionaries use an overline notation to mark high tone, and a hooked overline notation to mark the downstep where high tone ends. This is true both of NHK's official broadcasting accent dictionary, and Sanseido's accent dictionary (sample page here).
  • IPA provides both accent diacritics and w:tone letters.
Tone letters look like ˥ ˦ ˧ ˨ ˩, and for Japanese purposes, we might only need the ˦ and the ˨.
Between the two, I think tone letters are much more usable, as Japanese phonetic IPA already makes use of other over-letter diacritics (mostly just the tilde to mark nasals, but also the diaeresis for some vowel sounds), and these don't combine very legibly with accent diacritics. Tone letters are given just to the right of each mora, and thus don't interfere with any other over-letter diacritics.
Ultimately, I think we should use overline and hooked overline notation for kana, and IPA with tone letters and the downstep arrow for complete phonetic IPA information. I'd like to avoid using any romaji at all in the pronunciation section. This would allow for combining the IPA and pitch accent into one bulleted line, rather than the two I've been using so far as seen at 墓地.
Would that be acceptable to other editors? Does anyone feel strongly about using another notation, as well as or instead of the above?
(Note that this is so far all for the "standard" accent used in Japanese broadcasting, based on pitch accent patterns in Tokyo. If folks have access to resources describing other Japanese pitch accent patterns, please chime in.) -- Eiríkr Útlendi │ Tala við mig 22:08, 27 March 2013 (UTC)
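As a rough illustration of how the number notation described above could be expanded into tone letters, here is a minimal JavaScript sketch. It is not the code of {{ja-accent-common}}; the function name pitchPattern is hypothetical, and the sketch assumes Tokyo-style patterns in which the first mora is low unless the downstep falls directly after it. Placing the downstep arrow after the tone letter is likewise only one possible convention.

```javascript
// Hypothetical helper: expand a downstep number into per-mora tone letters.
// Assumption: Tokyo-style patterns, where the first mora is low unless the
// downstep falls right after it, and every mora after the downstep is low.
const HIGH = "˦";
const LOW = "˨";
const DOWNSTEP = "ꜜ";

function pitchPattern(morae, downstep) {
  if (!Number.isInteger(downstep) || downstep < 0 || downstep > morae.length) {
    throw new RangeError("downstep must be between 0 and the number of morae");
  }
  return morae
    .map((mora, index) => {
      const position = index + 1;
      let isHigh;
      if (downstep === 0) {
        // No downstep: low first mora, high afterwards.
        isHigh = position > 1;
      } else {
        // High up to the downstep, except for an initial low mora.
        isHigh = position <= downstep && (position > 1 || downstep === 1);
      }
      const mark = position === downstep ? DOWNSTEP : "";
      return mora + (isHigh ? HIGH : LOW) + mark;
    })
    .join("");
}

// Romanized morae are used here only to keep the example readable.
console.log(pitchPattern(["ko", "to", "ni"], 1));
console.log(pitchPattern(["ko", "to", "ba"], 3));
console.log(pitchPattern(["ka", "ge", "ro", "o"], 0));
```

Only two tone levels are used, in line with the suggestion that Japanese might need just ˦ and ˨.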
Personally I agree and much prefer the overhead and overhead hook notation in addition to IPA notation, but maybe that's partly because I'm most familiar with overhead notation. The biggest danger with second language learners is that they will put way too much pitch variation into their pronunciation and this is why Japanese spoken in Hollywood is painful to the ears. *shudder* The overhead line format drives home the fact that it's pretty much flat, especially to foreign ears. --Haplology (talk) 00:52, 28 March 2013 (UTC) (PS I mean spoken by actors who clearly memorized some Japanese for their lines but have no other knowledge of it, not ordinary people living in Hollywood) --Haplology (talk) 02:31, 29 March 2013 (UTC)
I just got the iPod app version of the book you mentioned ($34.99 + it has sound clips) and after using it a while, I think it might even be better to directly combine overhead/overhook and IPA notation as I think they work wonderfully together. I also think that the tone letters may prompt "Hollywood Japanese pronunciation" unless very little variation is visualized (i.e. keep to the top two steps). Syllables should be distinctly denoted for clarity. As for clarifying what the overline/overhook means, I believe that can be provided in the Japanese phonology page. Here's a visual example of what I'm thinking: --Soardra (talk) 20:02, 6 April 2013 (UTC)

Alpha bar not working again

The alpha bar extension (I don't know its real name) isn't working again: it never appears at all. I mean the horizontal row of previous/next entries that used to appear above the headword, delayed after page load, via JavaScript. So if you're at nut, you might see a bar of links like nuclear - nuclearity - nugget - nut - nuts - nutty. Equinox 12:20, 26 March 2013 (UTC)

See #Random entry (by language) broken. I believe Hippietrail is also the author. Dakdada (talk) 16:50, 26 March 2013 (UTC)
Curses. That was one of my favourite features. I wish we'd centralise our scripts so they were a permanent part of the wiki. Equinox 10:00, 28 March 2013 (UTC)
That's partly the purpose of the migration to Wikitech. Dakdada (talk) 14:12, 28 March 2013 (UTC)

Context tags: selective visibility and categorization

Lua-ization may bring us opportunities to make context tags better serve the diverse needs and tastes of our user and contributor populations. In the recent past, we have had disagreements about the desirability of topical (eg tagging all senses of noun, presumably widely understood as grammar) rather than grammatical, regional, timeliness (obsolete etc), and register tags (which would leave at least one sense of noun untagged). Currently there is a dispute about the use of relatively obscure (from a user perspective) linguistic terms (ergative, ambitransitive) in context tags. I have previously wished to be able to classify and tag at the sense level various terms using grammatical and semantic categories and labels. I suspect that the complexity and performance issues of {{context}} would have made it difficult to execute selective display of context tags. In any event, at present, it seems silly to try to force such a thing on {{context}} when, according to Ruakh, it is one of the templates that would most benefit from Lua.

There are questions that will have to be raised at BP, such as:

  • Do others agree that some context tags are best not imposed on normal users?
  • Could we agree on some way of visually distinguishing topical vs usage context categorization or should we just enable selective suppression of topical tags?

For tags best not imposed on normal users, there are many questions about how to implement selective display. Conceivably, as I understand the capabilities of CSS, there are virtually no limits on what could be selectively displayed, while being hidden by default from normal users. What would be required is more custom CSS. But gadgets could allow some common subsets (eg, all tags in a given language or set of languages) to be displayed.

Given my incredibly superficial grasp of the technologies, I start by raising the possibility here so that technical considerations could be reflected in any BP trial balloon or proposal. DCDuring TALK 19:49, 26 March 2013 (UTC)

I think our nomenclature surrounding these templates is misleading. Best to ignore the term “context” here. “Topical” is also easily misconstrued.
There are two kinds of labels: restricted-usage and grammatical. The subject-area labels are a type of usage label indicating a technical term’s or sense’s use chiefly within a specialty, or having a different meaning within a specialty, or having a meaning prescribed by some authority. Arguably noun, the common word for a well-known concept, should not be labelled “grammar”.
CSS and gadgets could certainly be used the way you propose, although I dislike the WT:PREF interface used to make such preferences inaccessible.
But why hide such labels at all? We make masculine, feminine, neuter, singular and plural less obtrusive by abbreviating them m, f, n, s, and pl. We could do the same for the more obscure grammatical labels, and as a result I would learn a thing or two. Michael Z. 2013-03-26 20:28 z
Why not finally (gasp!) migrate to {{label}}? -- Liliana 20:31, 26 March 2013 (UTC)
@Mzajac: I think we need to hide some labels that use terms not part of the general vocabulary of educated non-specialist users to avoid intimidation and needless questions. Also, Ruakh pointed out that some dictionaries seem to use labels in a way that reflects topic rather than usage context. I've thought our problem is that we have no good means to distinguish a label that indicates a topic vs a usage context.
@Liliana: I either follow the herd or would have to depend on documentation. How does that family of templates work? DCDuring TALK 22:21, 26 March 2013 (UTC)
Yeah, what is all that!? I proposed the idea ages ago, but I didn’t get the impression there was any enthusiasm for it. Michael Z. 2013-03-26 22:37 z
My memory for such things isn't so good. In any event, rereading that kind of thing now would probably have a very different effect on me now. I'll see what I can find. What proposals have you made? DCDuring TALK 23:39, 26 March 2013 (UTC)
I was drawn away from editing by other things then, and later forgot how much discussion this prompted. Must catch up on this. Michael Z. 2013-03-27 03:58 z

Edit request

It would be nice if someone could complete this edit request. Norwegian uses nested translations, and the language code no should therefore be rejected. --Njardarlogar (talk) 08:05, 29 March 2013 (UTC)

Hang on, I don't think there's a consensus not to use no yet, is there? Mglovesfun (talk) 09:59, 29 March 2013 (UTC)
You can't both nest and not nest for the same macro language; that won't look any good. Norwegian has been nested since, er, 2009 or something. --Njardarlogar (talk) 10:20, 29 March 2013 (UTC)
Didn't say I opposed it, just it needs more discussion. Mglovesfun (talk) 19:25, 29 March 2013 (UTC)
I view that as a largely separate topic. You have to mark Norwegian entries as either Nynorsk or Bokmål one way or the other, and the code no tells nothing on its own. Nesting, which has been implemented for a while, solves this problem elegantly. Remember that if so desired, the {{t}} template could be rewritten so that nb and nn both pointed to ==Norwegian==. By removing no from the translations, we ensure that all translations are marked properly. Right now, quite a few are not. --Njardarlogar (talk) 19:33, 29 March 2013 (UTC)

Question About Spambots

We seem to be getting a good number of spam user pages lately. I understand the ones that slip a url into innocuous-looking text: the appearance of the link in a high-traffic site like ours makes it seem more important to Google and means it gets listed earlier in the list of results than it would otherwise. I'm a little puzzled by the ones that do the same, but without the url: "blah, blah, blah, [phrase including a brand name], blah, blah, blah". Although that phrase may match the text part of the link in their other spam, it doesn't actually point to the site being promoted. How is this worth spending their bot's time on it? Just curious. Chuck Entz (talk) 01:56, 30 March 2013 (UTC)

I asked Amgine about this on IRC once and I think he said that combinations of unlinked terms on a high-traffic Web page are still capable of influencing Google's PageRank algorithms. Equinox 13:11, 30 March 2013 (UTC)

JavaScript question

How do I check in JavaScript whether the current page exists? I.e. if I am on sdfsdfsdfsfdf I want to know that it does not exist/is not created. --Njardarlogar (talk) 17:08, 30 March 2013 (UTC)

if (wgArticleId) { /* current page exists */ } else { /* it doesn't exist */ } This, that and the other (talk) 09:08, 31 March 2013 (UTC)
Just what I was looking for. Thanks. --Njardarlogar (talk) 09:38, 31 March 2013 (UTC)
I believe we're actually supposed to write mediaWiki.config.get('wgArticleId') rather than simply wgArticleId; as I understand it, the latter is deprecated, and will eventually be removed. —RuakhTALK 19:13, 31 March 2013 (UTC)
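A minimal sketch of the check discussed above, written so that it also runs outside a browser. The helper names pageExists and fakeConfig are hypothetical; on a wiki page the object passed in would be mw.config. The sketch relies on wgArticleId being 0 for a page that has not been created.

```javascript
// Sketch: decide whether the current page exists from its article ID.
// `config` is anything with a get(name) method, such as mw.config on a wiki page.
// Assumption: wgArticleId is 0 for a page that has not been created.
function pageExists(config) {
  return Number(config.get("wgArticleId")) > 0;
}

// Stand-in for mw.config so the sketch can run outside a browser.
// The values passed to it below are made up for illustration.
function fakeConfig(values) {
  return { get: (name) => values[name] };
}

console.log(pageExists(fakeConfig({ wgArticleId: 0 })));
console.log(pageExists(fakeConfig({ wgArticleId: 12345 })));
```

In a user script this would presumably be called as pageExists(mw.config), which avoids the deprecated global variable.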

April 2013

Module:links

Current code can do {{l}}'s job, and {{l}} will use the module instead of its current code soon. The aim of the module is generally handling wikilinks, though -- not just in {{l}}, but in {{term}}, head templates, and other similar templates that create wikilinks.

Some new features have been proposed at Template_talk:l#Lua-ising. The code for the features has been written and tested, we just need to gain official community consensus to implement it.

Any thoughts or suggestions would be welcomed. --Z 04:28, 1 April 2013 (UTC)

Have you tested it to make sure it works in all cases that {{l}} works, and that it doesn't do anything it shouldn't? Also, what is the purpose of Module:useful stuff? "detect_script" in particular doesn't seem like it does anything useful. And the list of languages that have automated transliteration should be in Module:languages. I also warned you not to start adding all kinds of extra code to this until we're sure that it works the way it should. —CodeCat 13:41, 1 April 2013 (UTC)
Assuming "detect_script" does what it seems to based on its name, that would be extremely useful. Several templates for multiscriptal languages like Tatar, Ladino, and Japanese have parameters that require the user to input what script an entry is in. If we can scrap that, that'd be great. —Μετάknowledgediscuss/deeds 14:20, 1 April 2013 (UTC)
But what do you do when the word contains characters in multiple scripts? —CodeCat 16:40, 1 April 2013 (UTC)
That doesn't happen in Tatar or Ladino. It does happen in Japanese, and I'm not sure how it works. For example, アメリカ合衆国 is in both katakana and kanji, but is marked as katakana (in the template, that's kk). We'll have to ask a Japanese editor. —Μετάknowledgediscuss/deeds 00:45, 2 April 2013 (UTC)
Why use kk, the language code for Kazakh? In case it matters, the Japanese ISO script codes are:[4]
  • Hira: Hiragana
  • Kana: Katakana
  • Hrkt: Japanese syllabaries (alias for Hiragana + Katakana)
  • Jpan: Japanese (alias for Han + Hiragana + Katakana)
 Michael Z. 2013-04-03 21:38 z
This is totally off-topic, but I guess it's a valid complaint about the template. The answer is that it's faster to type, just like pl= (code for Polish, means plural in templates) or tr= (code for Turkish, means transliteration in templates). In cases like these, editors' ease definitely outweighs using ISO script codes, because it really doesn't matter which we use. —Μετάknowledgediscuss/deeds 23:45, 3 April 2013 (UTC)
Where is the code for the version with the proposed features?
Automatically detecting script sounds like a very good idea, imo. --Yair rand (talk) 01:22, 4 April 2013 (UTC)
Some are removed by CodeCat and you can find them in older revisions, some were moved to this module, others are in commented part of the code, e.g. recognizing reconstructed terms from "*" and linking to appendix is in prepare_title(). --Z 01:42, 4 April 2013 (UTC)
Can somebody please help me work out how to implement script recognition in {{tt-pos}}? —Μετάknowledgediscuss/deeds 02:31, 4 April 2013 (UTC)
I'm not opposed to these innovations in principle, but I do think that we should first get {{l}} to work with this module, and keep it that way for at least a week or two so that we can be sure there are no unexpected problems. —CodeCat 03:20, 4 April 2013 (UTC)
Why would we want to get {{l}} working with it first? That might could be a while... —Μετάknowledgediscuss/deeds 04:22, 4 April 2013 (UTC)
Currently detect_script() can't be invoked from templates, it's a better idea to rewrite that template in Lua. --Z 03:29, 4 April 2013 (UTC)
Is it? I was hoping to have a model off which I might be able to design more templates with these features. —Μετάknowledgediscuss/deeds 04:22, 4 April 2013 (UTC)
It would be possible after Lua-ization of {{l}} and {{head}} and adding the ability of detecting scripts to them. --Z 05:08, 4 April 2013 (UTC)
Regarding Japanese, I have no idea about how its writing system works, but it's possible to find katakana characters of a word and tag it with Kana class, and other non-katakana characters of the word (if there is any) would be kanji, I assume? If so, it's easy to fix. Does similar thing happen in any other language? --Z 03:29, 4 April 2013 (UTC)
Not that I can think of, but we should assume so just to be safe. —Μετάknowledgediscuss/deeds 04:22, 4 April 2013 (UTC)
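The mixed-script case raised above can be sketched as follows. This is a hypothetical JavaScript illustration, not the logic of detect_script() in Module:useful stuff. It uses the ISO script codes listed earlier in the thread, plus "Hani" for Han characters, which is an addition made for this sketch. The function names scriptsIn and scriptClass are made up.

```javascript
// Hypothetical sketch of script detection for Japanese text.
// Each entry pairs an ISO 15924 code with a Unicode script test.
const SCRIPT_TESTS = [
  ["Hira", /\p{Script=Hiragana}/u],
  ["Kana", /\p{Script=Katakana}/u],
  ["Hani", /\p{Script=Han}/u],
];

// List every script from SCRIPT_TESTS that occurs somewhere in the text.
function scriptsIn(text) {
  return SCRIPT_TESTS.filter(([, pattern]) => pattern.test(text)).map(([code]) => code);
}

// Pick one class for the whole word: a single script keeps its own code,
// kana mixtures become Hrkt, and anything involving Han becomes Jpan.
function scriptClass(text) {
  const found = scriptsIn(text);
  if (found.length === 0) return null;
  if (found.length === 1) return found[0];
  return found.includes("Hani") ? "Jpan" : "Hrkt";
}

console.log(scriptsIn("アメリカ合衆国"));
console.log(scriptClass("アメリカ合衆国"));
```

Returning the alias codes for mixed words is only one option; tagging each run of characters separately, as suggested above, would need a per-character loop instead.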
The module is tested, the only problem is gender/number part -- output of Module:gender and number and those of gender/number templates are not identical. That's not much of a problem though, we can use the gender templates in the module for now. --Z 05:32, 4 April 2013 (UTC)
Just use the module. The output of the module can always be changed if necessary, that isn't a good reason not to use it. —CodeCat 22:59, 5 April 2013 (UTC)
Ok, change it then. I can't even edit the page, you locked it. The gender templates should be used until gender_and_number is fixed.
The outputs of Template:l and Module:l are identical now (don't know why the fourth test fails, they're the same really). I won't have regular internet access after 14th and apparently no one else cares about working on the module, so I would be grateful if Module:l is implemented soon so that I would be able to work on the extra features in this period of time. --Z 05:14, 10 April 2013 (UTC)

Substring module

Thanks, Z! New question: do we have a basic string manipulation module, just to store stuff like taking a substring of a certain length from the end of a word, etc? If not, should I create Module:string or something? —Μετάknowledgediscuss/deeds 00:23, 5 April 2013 (UTC)

NP, no that's not needed, for this certain task you can simply use string.sub(). --Z 00:39, 5 April 2013 (UTC)
But I want to invoke it directly in an un-Luacized template. That's why I reckon there should be a separate module for it. —Μετάknowledgediscuss/deeds 22:51, 5 April 2013 (UTC)
I see a need to decompose words in any script into components, including any diacritics and ligatures. Use: say you want to check how to pronounce a word in a complex script with diacritics - Burmese, Hindi, Bengali, Thai, Arabic, Hebrew, etc. A Devanagari syllable रा (rā) can't be looked up in Wiktionary:Hindi transliteration because it's र + ा and you can't take out the diacritic from रा to look it up. Some word processors allow strings to be broken into parts. So, yes, please. Not just the substring but a break-up.
Module:ko-hangul has the function syllable2Jamo, which in debug mode shows individual jamo for each hangeul (한 (han) = ㅎㅏㄴ (h a n)). Need to make it break up hangeul in the run mode. Tried to do it with "syllable2JamoSep" but didn't work. --Anatoli (обсудить/вклад) 00:41, 5 April 2013 (UTC)
If I understand you correctly, you need mw.ustring.gsub(text, "(.)", "%1 ") (try print(mw.ustring.gsub("रा", "(.)", "%1 ")) in console). --Z 00:58, 5 April 2013 (UTC)
Module:String exists on Wikipedia. Because it doesn't exist here yet, I copied the entire code and added extra bits to it when I wrote Module:bo-translit. Wyang (talk) 01:01, 5 April 2013 (UTC)
Great stuff, thank you! I also tried Thai "เค็ม" = เ ค ็ ม and Arabic "اَلْلُغَةُ ٱلْعَرَبِيَّةُ"‎ = ا َ ل ْ ل ُ غ َ ة ُ ٱ ل ْ ع َ ر َ ب ِ ي َ ّ ة ُ‎. --Anatoli (обсудить/вклад) 01:11, 5 April 2013 (UTC)
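For comparison, the character-by-character split discussed above can be sketched in JavaScript. This only illustrates the idea and is not a replacement for the Lua call; the function name splitCodePoints is hypothetical. Unlike the gsub version, joining with a space leaves no trailing space.

```javascript
// Sketch: split a string into Unicode code points, so that base letters and
// combining marks (vowel signs, tone marks, and so on) come out separately.
// Array.from iterates by code point, not by UTF-16 unit.
function splitCodePoints(text) {
  return Array.from(text);
}

console.log(splitCodePoints("रा").join(" "));
console.log(splitCodePoints("เค็ม").join(" "));
```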
Wyang, it seems you're able to do auto-translit for a few complicated script languages, especially those you're familiar with, you've done Burmese before in the Chinese Wiktionary, haven't you? I'm especially keen if you could do Hindi, Bengali, Thai, Lao, Burmese and Khmer in any order, possibly Sinhalese. Korean module needs to be made better as well. I'm happy to assist with testing and getting/checking for translit. rules. I'm only familiar to some degree with Korean, Hindi and Thai. Angr is our Burmese expert and Stephen G. Brown knows a bunch, including Khmer and Telugu (no tones in Khmer, so it's simpler than some others). ZxxZxxZ has done a great job with Arabic and Persian but you can only do that much with partially phonetic languages. --Anatoli (обсудить/вклад) 01:23, 5 April 2013 (UTC)
Wyang knows a lot about Burmese himself, I doubt he'll need much if any help from me for it! —Angr 10:18, 5 April 2013 (UTC)
OK. Anyway, I hoped that function print(mw.ustring.gsub("လုံချည်", "(.)", "%1 ")) could also be used for reading out individual characters, so that one could look up each character in e.g. WT:MY TR but are we missing some characters in လ ု ံ ခ ျ ည ်? I can't find some characters, e.g. ု. --Anatoli (обсудить/вклад) 10:59, 5 April 2013 (UTC)
ု alone is "u." in MLCTS, ု+ ံ is "um". They are both on the my-translit page. As for the transliteration modules, I'll try to get more familiar with Lua first. While certain string decomposition functions are easier to use than wiki code, some others seem less straightforward, for example the equivalent of {{#switch: (?) Wyang (talk) 11:06, 5 April 2013 (UTC)
What exactly is the purpose of all these string functions when most of them already have more complete Scribunto equivalents? Isn't this just reinventing the wheel? —CodeCat 22:57, 5 April 2013 (UTC)
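For readers more at home in Python than in Lua, the code-point split discussed above can be sketched roughly as follows. This is only an illustration, not code from any Wiktionary module; the function names `split_code_points` and `spaced` are made up for this example.

```python
import unicodedata


def split_code_points(text):
    """Return (code point label, Unicode name, character) for each code point.

    Combining vowel signs and marks come out as separate items, which is
    what makes it possible to look each one up in a transliteration table.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"), ch)
        for ch in text
    ]


def spaced(text):
    """Put a space after every code point, like gsub(text, "(.)", "%1 ")."""
    return "".join(ch + " " for ch in text)


if __name__ == "__main__":
    # Three Myanmar code points written as escapes so nothing is lost in transit.
    for label, name, _ in split_code_points("\u101c\u102f\u1036"):
        print(label, name)
```

Printing the code point label next to each character makes it easier to spot a combining sign that a browser or editor has silently dropped or merged.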

No one really answered my original question. For example, we need to strip leading hyphens from suffixes for sorting purposes. Where should I put the function that does that? —Μετάknowledgediscuss/deeds 05:44, 17 April 2013 (UTC)

Use mw.text.trim(text, "%-"). Almost whatever string-related function you think of is already defined in Lua and Scribunto. --Z 06:42, 17 April 2013 (UTC)
You don't get it. I know that, I just want to know where to put it. We need to be organized before we put stuff in templates. —Μετάknowledgediscuss/deeds 13:53, 17 April 2013 (UTC)
I understood that was just an example, but as I said, basic string-related functions are already defined and I can't think of anything that isn't defined. Maybe I misunderstand what you mean, so let's wait for another person to comment. --Z 14:24, 17 April 2013 (UTC)
Non-basic string functions are usually language-specific, and so far we have put them in Module:xx-common modules, where xx is the language code. Anything else (if there is any) may be put here. --Z 14:28, 17 April 2013 (UTC)
So I should put this in Module:useful stuff? And what should I name the function, suffixSort? —Μετάknowledgediscuss/deeds 14:33, 17 April 2013 (UTC)
Yeah, that looks fine. There was a worthless, time-wasting discussion about what to name that module, and after thinking a lot, I came up with that title, because I think thinking about what to name a module is ridiculous; no one cares what it is named, only that it works. So I named it after the first thing that came to my mind, and I suggest you do that too. --Z 15:08, 17 April 2013 (UTC)
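As a rough illustration of the suffix-sorting helper discussed above, here is a minimal Python sketch. It is not the Lua function that would live in the module; the name `suffix_sort_key` is hypothetical, and it strips only leading hyphens, whereas the `mw.text.trim` call suggested above would, as far as I know, strip them from both ends.

```python
def suffix_sort_key(term):
    """Return the sort key for a term, ignoring any leading hyphens.

    A suffix such as "-ness" then sorts under "ness" instead of being
    grouped with every other entry that starts with a hyphen.
    """
    return term.lstrip("-")


if __name__ == "__main__":
    print(sorted(["-ness", "-able", "ly"], key=suffix_sort_key))
```

With this key, `sorted(["-ness", "-able", "ly"], key=suffix_sort_key)` gives `['-able', 'ly', '-ness']`.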

Documentation subpage tab for templates

As an example, {{head}}, at the top the tab is still Template:head/doc but we're migrating these to /documentation (Template:head/documentation). I guess there's a MediaWiki page somewhere that needs updating. Please. Mglovesfun (talk) 12:23, 5 April 2013 (UTC)

Yes, but currently the majority still uses /doc. Would it be possible for it to support both, until all the subpages have been moved over? {{documentation}} already does this (it shows /doc if it exists, but prefers /documentation otherwise) —CodeCat 14:17, 5 April 2013 (UTC)
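The fallback described above (use /doc while it still exists, otherwise point at /documentation) can be sketched like this. It is a simplified Python illustration, not the actual template or MediaWiki interface code; `documentation_subpage` and the `existing_pages` set are assumptions made for the example.

```python
def documentation_subpage(template_title, existing_pages):
    """Pick the documentation subpage to show for a template.

    existing_pages is a set of page titles known to exist (hypothetical
    input; a real implementation would ask the wiki instead).
    """
    legacy = template_title + "/doc"
    preferred = template_title + "/documentation"
    # Keep showing the legacy subpage until it has been moved over.
    if legacy in existing_pages:
        return legacy
    return preferred
```

Once every /doc subpage has been moved, the first branch is never taken and the function always returns the /documentation title.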

An idea for a new way to make form bots

This idea just kind of came to me but I think it could be very useful. The way that User:MewBot and probably all the other bots currently work is that they parse the invocation of the template, and then try to mimic the template as closely as possible. This is really a lot of work and it's not ideal, because it means all the code has to be duplicated and kept synchronised. So I thought... why not let another template or Lua generate the forms in a machine-readable format? That way, the bot only has to understand the output, but no longer has to duplicate any of the intricacies of the template/module. I have added this to Module:nl-verb and less than 5 minutes later I already have a working example. :) See User:CodeCat/bot example. To use it on any Dutch verb table, just add the parameter bot=1 to the template. It is really easy to do as long as your template or module has a strict separation between the part that generates all the forms and the part that displays the table. By writing a second output function that displays a list of forms instead of a fancy table, you can get this result. It can even be done without Lua at all, but Lua does make it a lot easier.

So what can we do with this? In its current form, a bot script could "find" the inflection template on a page like it does before, but it could then add the bot=1 parameter and expand the template (via the MediaWiki API, which is built into the Python wikibot framework). It can then parse the machine-readable output and use that to create entries for each of the forms as before, instead of having to generate all the forms itself. However, this concept could be taken further. The template or module that generates the machine-readable format could actually generate the full form-of entries itself, so that the bot doesn't even need to parse the output but could just flat out create entries straight from it. This, in turn, would open up one huge door: a single bot that can do any inflection table, in any language, without modifications, as long as the template/module generates the proper output. —CodeCat 01:31, 6 April 2013 (UTC)

I've now made these changes to User:MewBot, and it seems to work ok with the few test entries I've tried it on. —CodeCat 22:01, 6 April 2013 (UTC)
Looks good to me. —RuakhTALK 23:54, 6 April 2013 (UTC)
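To make the idea above concrete, here is a minimal sketch of the bot side: parsing machine-readable output into a mapping the bot can create entries from. The `key=form` line format and the name `parse_forms` are assumptions for illustration only; the real output of Module:nl-verb with bot=1 may look quite different, and fetching the expanded template through the MediaWiki API is left out.

```python
def parse_forms(machine_output):
    """Parse lines of the hypothetical form "key=form" into {form: [keys]}.

    Several keys can share one spelling, so each form maps to a list of
    the grammatical keys it fills.
    """
    forms = {}
    for line in machine_output.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        forms.setdefault(value.strip(), []).append(key.strip())
    return forms


if __name__ == "__main__":
    sample = "pres-1sg=loop\npres-2sg=loopt\npres-3sg=loopt\n"
    for form, keys in parse_forms(sample).items():
        print(form, keys)
```

Grouping by form rather than by key means the bot creates one entry per spelling and lists every inflection it represents, without needing to know anything about how the module derived the forms.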

toolserver

I’ve tried to use several toolserver utilities yesterday and this morning, such as SUL accounts, and I have not been able to make a connection. Is toolserver temporarily down, or is it gone? I remember someone saying that the toolserver tools were being migrated to Wikimedia Labs. If the SUL accounts tool is now at Wikimedia Labs, is there a way to fix the links here (SUL accounts link appears at the bottom of anyone’s contributions page, such as Special:Contributions/Stephen_G._Brown)? —Stephen (Talk) 06:16, 6 April 2013 (UTC)

The toolserver has issues, but it is unrelated to tools migrations. Dakdada (talk) 15:40, 6 April 2013 (UTC)
Thanks. It seems to be working again, although very slow. —Stephen (Talk) 08:48, 7 April 2013 (UTC)

Limits on display of "orange link for missing language section"

If one checks the appropriate box in per-browser preferences, links display in orange instead of blue "if the target language is missing on an existing page" (actually if the specified section of whatever kind is missing), e.g., plate#Latin. plata#Latin displays blue though there is no Latin section at [[plata]]. Does anyone know whether there is some kind of limit on the number of headings that are searched in the operation of this feature? DCDuring TALK 12:28, 6 April 2013 (UTC)

No idea, but the reverse happens too. In the etymology sections of သစ်, 薪#Mandarin, 薪#Cantonese, 新#Middle Chinese, and 新#Cantonese all appear orange even though those sections do exist. —Angr 12:36, 6 April 2013 (UTC)
Angr, that's a different problem. I'm pretty sure that's because none of the sections you linked to has a definition yet. —Μετάknowledgediscuss/deeds 16:45, 6 April 2013 (UTC)
Uncategorized sections come up orange, yes. Mglovesfun (talk) 18:21, 6 April 2013 (UTC)
OK, that's good to know. But what about DCDuring's issue? Why is plata#Latin blue? —Angr 18:28, 6 April 2013 (UTC)
  • I had a hunch.
I searched the plata page for the word "Latin". It appeared in a context note in the second Spanish def.
After deleting that context note, so the word "Latin" doesn't appear on that page, the plata#Latin link now appears in orange.
Yair, that looks like a bug, maybe in page parsing -- would that be hard for you to fix? -- Eiríkr Útlendi │ Tala við mig 19:47, 6 April 2013 (UTC)
  • No, that's the same problem. It's not that you removed Latin from the text, it's that you removed the page from Category:Latin American Spanish. The script works by examining a page's categories; if the page appears in a category that starts with a given language-name, then the script infers that the page has a section for that language. (In particular, the script does not download and parse the page. That would give more accurate results, but would be prohibitively expensive.) —RuakhTALK 00:09, 7 April 2013 (UTC)
    And that answers the lingering question I had about why orange appears when I try to link to a PoS or Etymology section in English. DCDuring TALK 00:40, 7 April 2013 (UTC)
    Does the script not look at hidden categories? has a hidden Cantonese category (definition needed). Why not use it? DCDuring TALK 00:44, 7 April 2013 (UTC)
  • Re: "Does the script not look at hidden categories?": Correct, it doesn't.   Re: "Why not [] ?": Not being the author, I can only speculate, but it sort of makes sense to me: if our only Cantonese category is Category:Cantonese definitions needed, then we arguably don't really have a Cantonese entry. I mean, sure, we've got the ==Cantonese== L2 header, but we don't even identify the POS? —RuakhTALK 02:20, 7 April 2013 (UTC)
I don't know if it would help or even be useful, but we could restrict the script further by requiring that a PoS category have a specific name. The script would have a list of possible PoS names which it could choose from, and it would make the link orange if it finds no match. Since "American Spanish" isn't a PoS category, it would fix the problem above. —CodeCat 02:31, 7 April 2013 (UTC)
That doesn't seem like it should impose much of a performance penalty. (Does Lua/Scribunto help?) If there is any significant performance penalty, we should just learn to live with the problem.
I guess there could be other instances of a name of regional dialect or dialect grouping starting with a language name. DCDuring TALK 04:00, 7 April 2013 (UTC)
No performance penalty, no. (Also, that part of it would be handled entirely on the client-side (your browser), via JavaScript, so even if it did have a performance impact, the considerations would be a bit different than for stuff that runs on the server-side.) —RuakhTALK 05:26, 7 April 2013 (UTC)
have#Etymology also comes up orange (well, yellow to me) because there are no categories starting with the word Etymology. Mglovesfun (talk) 11:33, 7 April 2013 (UTC)
We could apply the same idea to that as well. If the script is able to recognise which categories are valid, maybe it could also tell a valid language from an invalid one. The problem, of course, is that there are hundreds of languages, so putting them all into the script might be overkill. I guess this is yet another example of a case where linking to non-language sections just doesn't work. And I kind of agree with that anyway, because have#Etymology would link to the first Etymology section on the page. That works out right in this case, but what about cantar#Conjugation? And never mind if we want to link to the etymology of another language, then you're out of luck... —CodeCat 13:02, 7 April 2013 (UTC)
It only matters for those who select the preferences box. It doesn't really matter for the users we should care most about: casual users. The simple and cheap part (PoS) may be worth installing in the script, not the language part.
I still have hopes that links to English L2s can be more narrowly directed to the appropriate L2 headers like Etymology n and the PoS headers instead of running the risk of confusing users at our longer entries. DCDuring TALK 14:07, 7 April 2013 (UTC)
I prefer the opposite, that all languages can be treated equally so that people don't get confused when something that works for English doesn't work for any other language. That doesn't mean that we wouldn't be able to link to specific sections, but we should be able to link to the specific section of any language. Currently, Mediawiki equates sections with the actual name of the header, so it assumes that every header will only ever appear once on a page. That is really a rather strange assumption. —CodeCat 14:11, 7 April 2013 (UTC)
English is the host language. It already behaves differently.
The less attractive we make this for normal users the less we will track real English. Plenty of users are confused by the existence of uppercase entries for English words that don't contain English sections, just German or Translingual. Why not finish the job by putting English in its alphabetical position in multilanguage entries and encourage non-English-language discussions for non-English entries? We already run the risk of turning this wiktionary into one that doesn't track current UK or US English, but some blend of what Webster 1913 tracked and Globish. And there will be fewer to challenge the dated, archaic, and obsolete glosses in our FL entries. But it will still be fun for polyglots. DCDuring TALK 14:25, 7 April 2013 (UTC)
"Bla bla bla think of the children, it will be doom if we allow such transgressions". Ok, can you actually read what I am saying and not go off on a panic spree? —CodeCat 16:07, 7 April 2013 (UTC)
The non-L2 header links? So, for a missing header like Gorilla#Etymology at Gorilla#Translingual it is the contributor's responsibility to avoid the reference.
Is plata#Latin fixed using CodeCat's approach? DCDuring TALK 23:06, 7 April 2013 (UTC)
See diff. I set it to ignore categories listed in the exceptionCategories array, which currently only contains Category:Latin American Spanish. Not an ideal solution, but I can't think of a better one that won't come with its own problems.
I think it would be best to leave the 新#Cantonese issue as it is. A link to an empty section like that isn't really much better than a broken one. --Yair rand (talk) 00:10, 8 April 2013 (UTC)
Thanks for making the change and for the helpful explanation. If I see a problem that can't be solved by adding to the array, I'll let you know. DCDuring TALK 00:57, 8 April 2013 (UTC)
  • Um, plata#Latin still shows up as blue for me, despite the lack of any Latin entries on that page. Flushing my browser cache (Chromium on Ubuntu) doesn't seem to have any effect on this issue. -- Eiríkr Útlendi │ Tala við mig 22:19, 8 April 2013 (UTC)
    If you have the per-browser preference box "Color translation links orange instead of blue if the target language is missing on an existing page." checked, I wonder whether it's indeed a browser/OS issue. Do others with Chrome on other OSs have the problem? We don't - and probably can't - have a robust capability for handling less common combinations on our local specials. DCDuring TALK 22:46, 8 April 2013 (UTC)
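The category-based inference discussed in this thread, together with the two proposed fixes, can be sketched as follows. The real script is client-side JavaScript; this is only a Python illustration of the logic as described above, and the function names, the bare category names without the "Category:" prefix, and the PoS list are all assumptions.

```python
# Hypothetical stand-in for the script's exceptionCategories array.
EXCEPTION_CATEGORIES = frozenset({"Latin American Spanish"})


def has_language_section(categories, language, exceptions=EXCEPTION_CATEGORIES):
    """Guess whether a page has a section for `language` from its categories.

    Any category whose name starts with the language name counts, unless
    it is listed as an exception.
    """
    return any(
        name.startswith(language)
        for name in categories
        if name not in exceptions
    )


def has_pos_category(categories, language, pos_names):
    """Stricter variant: require "<language> <PoS name>", e.g. "Latin nouns"."""
    wanted = {f"{language} {pos}" for pos in pos_names}
    return any(name in wanted for name in categories)
```

With the exception list, a page categorized only under "Spanish nouns" and "Latin American Spanish" no longer counts as having a Latin section. The stricter variant reaches the same answer without an exception list, at the cost of maintaining a list of valid PoS category names.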

Very minor bug

Relevant script: MediaWiki:Gadget-PatrollingEnhancements.js

In the recent changes script that provides for admins the red bar at the bottom right, when I type in that bar it scrolls down the page. It's so innocuous I haven't bothered to report it yet. So if I type a really long deletion summary, I then have to scroll back up to click the red 'd' link to delete the offending page. Mglovesfun (talk) 11:24, 7 April 2013 (UTC)

What browser/system are you using? FWIW to whomever troubleshoots/fixes this issue, I can't reproduce that behaviour; I can type lots of text into the deletion-summary bar, and the page doesn't scroll (using Firefox 19 on Windows XP). - -sche (discuss) 22:43, 7 April 2013 (UTC)
Hmm. Can you experiment a bit, and describe the behavior a bit more precisely? For example:
  • If you type a single letter, then wait a moment, then type another letter, do you find that it scrolls a certain distance when you type the first letter, and then scrolls the same distance when you type the second letter?
  • Or do you find that the scrolling only happens when you hit the space bar? (In many browsers, the space bar can be used to scroll down the page, but of course it's not supposed to do that when you type an actual space into an actual text field!)
RuakhTALK 23:03, 7 April 2013 (UTC)
Couldn't duplicate just by typing in the box no matter how many spaces using FF 19.0.2. and Windows, but I didn't undertake any actual deletion. DCDuring TALK 23:13, 7 April 2013 (UTC)
One small downward movement per character typed, not only the space bar, the same length of movement for every character. Using Google Chrome - I hadn't thought of testing it in Explorer and Firefox which I also have but no longer use. Mglovesfun (talk) 12:29, 9 April 2013 (UTC)

Category:Mandarin pinyin

The category needs a bit of cleanup. With or without the main entry, a pinyin entry should remain strictly a link to Chinese characters: one word per line, with traditional/simplified forms (if different) passed as additional parameters. There should be no other definitions, pronunciation or references. (Monosyllabic entries are a bit different.)

The parameters are unnamed. There were only a few entries, which used sim/trad, simp/trad named parameters. They are now gone and could be removed from the Template:pinyin_reading_of template.

Here's an example of a correct entry where trad/simp. are different characters "biànchéng":

==Mandarin==

===Romanization===
{{cmn-pinyin}}

# {{pinyin reading of|變成|变成}}

I wonder what tools we have for repetitive tasks like this other than manual editing: we need to remove all references to dictionaries, short translations and pronunciation sections. --Anatoli (обсудить/вклад) 01:54, 8 April 2013 (UTC)

GSOC Proposal - Pronunciation Recording Extension

Hello Everyone,

On the wikitech-l mailing list, I saw the Pronunciation Recording Tool feature request, so I felt I could give this a try, and I am now planning to undertake it as my GSoC project. Since I am new to open source development, I would be very happy if anyone could help me out or mentor me through the project.

Thanks, Rahul (Rahul_21 on IRC: #mediawiki, #wikimedia-dev)

A quick update on this:
Rahul21 has collected a team, including MDale as code mentor and Lars Aronsson as the community liaison. The Google Summer of Code application draft is firming up with feedback from WMF, Mozilla, Vorbis, and several language editions of Wiktionary. If the English Wiktionary wants to influence the project's goals, now is the time!
- Amgine/ t·e 15:16, 16 April 2013 (UTC)

Question about data normalization and where entry data goes

Previous discussions of somewhat similar ideas: WT:BP#Restructuration_of_foreign_languages, WT:BP#Pages_getting_too_big.3F

I've been chewing on this idea for some time.

Why do we put all data for all languages in one big page? This is extremely messy, from a data organization viewpoint.

Putting each language on its own subpage would resolve many different issues. (Ignore, for the moment, the major amount of work required to refactor existing entries to implement this.)

  • Instead of putting all information for all languages that have a term spelled "ni" on the [[ni]] page, there would be one [[ni]] page with one subpage for each language: [[ni/Navajo]], [[ni/Japanese]], etc. Or, perhaps using the lang codes, giving [[ni/nv]], [[ni/ja]], etc.
  • Each subpage would be transcluded onto the main page, so any reader looking at [[ni]] would perceive no change.
  • One could tell immediately whether a term were missing in a given language, without having to parse the page or check categories. This would resolve, or at least help resolve, issues such as the link color oddities for plata#Latin in the thread further up this page.
  • Special:WhatLinksHere would be much more useful -- one could find out much more easily whether a given template is used by a given language, for instance.
  • Tabbed languages would potentially be simpler to implement.

I'm keen to find out if anyone knows why we are using the current "all languages on one page" format. I suspect it's entirely due to legacy data and momentum, but I'm aware that I may be missing some other big gotcha or limitation of the MediaWiki software that would render the "each language on a subpage" data format unworkable.

Curious, -- Eiríkr Útlendi │ Tala við mig 22:36, 8 April 2013 (UTC)

I would support this idea and have supported it in the past, but it seems to have enough opposition here that it never actually gets any further than an idea. —CodeCat 22:50, 8 April 2013 (UTC)
I somehow like the reverse naming: [[ja/ni]], [[nv/ni]], but I guess this is harder to use. Wyang (talk) 23:13, 8 April 2013 (UTC)
  • Would the parent pages ([[ja]] and [[ni]] in this example) then be the indices for each language?
However, that does run into the problem that [[ja]] and [[ni]] are already existing pages. -- Eiríkr Útlendi │ Tala við mig 23:39, 8 April 2013 (UTC)
All those look like real advantages. It would also mean that templates such as {{l}} would be unnecessary for same-language links and that omitted lang= parameters in {{term}} etc. could default to the same language. It would also dramatically improve load time for non-English terms that were homographs of English terms, especially those with really large translation tables. The load-time problem for English terms with large translation tables would not be helped significantly, unless translation tables, too, were on a separate subpage, not automatically transcluded.
How should Translingual pages (characters, symbols, taxons) be handled in such a regime? DCDuring TALK 23:18, 8 April 2013 (UTC)
  • Translingual entries would presumably be at the */mul subpage, so [[ni/mul]] in this example. My thought is that the top-level term page would only ever be the container, but perhaps some other arrangement might work better. -- Eiríkr Útlendi │ Tala við mig 23:39, 8 April 2013 (UTC)
    Translingual entries are supposed to be useful in many languages. Taxons are usable by biology professionals even in languages using non-Latin script. Characters and symbols have similar broad reach. That's the justification for having them at the top of the page, so that there is no need to repeat the content in every (applicable) language. We would need some kind of obvious way of reminding users of these things and might have to be much more explicit about the precise scope of the Translingual terms and each particular sense thereof. We have largely finessed this point. DCDuring TALK 02:03, 9 April 2013 (UTC)
  • Agreed. I'm not sure how this affects this proposal, however? The idea is that anyone looking at [[ni]] (or any other page) as a reader would see exactly what's already there. -- Eiríkr Útlendi │ Tala við mig 05:08, 9 April 2013 (UTC)
    If all the content is transcluded, then the page-load time improvement doesn't apply. In fact page-loads would be slightly worsened. DCDuring TALK 11:06, 9 April 2013 (UTC)
Both arrangements have advantages. When it comes to templates or modules, it doesn't really matter because we can extract both the base name and the subpage name. One advantage of putting the word first is that it matches our current arrangement somewhat more, because each base page would then have one subpage for every language. On the other hand, the reverse arrangement with the language first is more like how we treat Appendix entries. I don't think either one really has any clear advantages or disadvantages, it's more down to our own preference and logical approach to entries. Another question we need to answer though is whether we use language names or codes in the title. And what to do with the thousands of bare links and uses of {{term}} without a lang= parameter, which will break? —CodeCat 00:46, 9 April 2013 (UTC)
  • Lang codes would be shorter. Lang names would be more human-friendly.
Why not use both? Lang names could redirect to the lang code subpages, or the reverse, as deemed appropriate.
  • {{term}} would presumably just link to the bare entry, [[ni]] in this case, which would be the container into which all of the language pages would be transcluded. [[ni/mul]] would go at the top if it exists, followed then by [[ni/en]], and then all the other langs in alphabetical order. -- Eiríkr Útlendi │ Tala við mig 00:51, 9 April 2013 (UTC)
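The container page described above could be generated mechanically. The following Python sketch builds the transclusion lines for a parent page, with Translingual first, English second, and the rest alphabetical. Sorting the remaining languages by name rather than by code is my assumption, as are the function name and the code-to-name mapping passed in.

```python
def parent_page_wikitext(term, languages):
    """Build the wikitext of a container page that transcludes each subpage.

    languages maps language code -> language name for the subpages that
    exist (hypothetical input). {{:title}} transcludes a mainspace page.
    """
    pinned = ("mul", "en")  # Translingual first, then English
    first = [code for code in pinned if code in languages]
    rest = sorted(
        (code for code in languages if code not in pinned),
        key=lambda code: languages[code],
    )
    return "\n".join("{{:%s/%s}}" % (term, code) for code in first + rest)


if __name__ == "__main__":
    print(parent_page_wikitext(
        "ni",
        {"ja": "Japanese", "en": "English", "nv": "Navajo", "mul": "Translingual"},
    ))
```

A bot or module would still need to keep this list in step with the subpages that actually exist, which is the maintenance burden raised later in the thread.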
The only way to make redirects like that work is to have a redirect for every entry. I don't see that happening... —CodeCat 01:21, 9 April 2013 (UTC)
  • We have bots for basic maintenance stuff, no?
Assuming we're putting the data under the lang code, then a bot would check for each [[ni/langcode]], to see if there is a corresponding [[ni/langname]]. If it's missing, the bot would create it as a redirect to [[ni/langcode]].
But that's even assuming that we'd want both lang code and lang name URLs. -- Eiríkr Útlendi │ Tala við mig 05:12, 9 April 2013 (UTC)
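The redirect-maintenance pass described above might look roughly like this. It is a Python sketch of the planning step only: it works on a plain list of titles and makes no edits, and `missing_redirects` and its code-to-name mapping are assumptions rather than an existing bot's interface.

```python
def missing_redirects(existing_titles, language_names):
    """Return {redirect title: redirect wikitext} for absent name subpages.

    existing_titles: iterable of page titles such as "ni/ja".
    language_names: mapping of language code -> language name (hypothetical).
    """
    existing = set(existing_titles)
    redirects = {}
    for title in sorted(existing):
        # Split on the last slash: "ni/ja" -> term "ni", code "ja".
        term, sep, code = title.rpartition("/")
        if not sep or code not in language_names:
            continue  # not a language-code subpage
        name_title = f"{term}/{language_names[code]}"
        if name_title not in existing:
            redirects[name_title] = f"#REDIRECT [[{title}]]"
    return redirects
```

Keeping the planning separate from the editing means the list of proposed redirects can be reviewed before a bot creates anything.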
  • I don't see any significant possible benefit, and a lot of potential downsides. Anyway, we have no real technical means to do this. --Yair rand (talk) 01:01, 9 April 2013 (UTC)
    Could you add more detail to that? I'm not aware of the downsides, which is partly why I asked.
    I'm also a bit confused about your comment that "we have no real technical means to do this" -- this would be very bot-able, as there's nothing that complicated involved in changing the structure of existing entries, just the tedium of actually doing so. -- Eiríkr Útlendi │ Tala við mig 01:19, 9 April 2013 (UTC)
    We don't have any way that I know of to have each page display the contents of each of its subpages. The downsides: Categories would be severely messed up. The category pages would be packed with language codes, and would link to problematic "part-entries". The entries either wouldn't list categories at all at the bottom, or would be doubled in the category page. Special:WhatLinksHere would also go to the "part-entries", and would additionally contain main entry duplicates for every link. The search bar would be clogged with extra suggestions that wouldn't mean much to the users. --Yair rand (talk) 01:34, 9 April 2013 (UTC)
Splitting these various issues up for reply.
  • "Categories would be severely messed up."
    • "The category pages would be packed with languages codes,"
      Not necessarily a problem. I'd actually prefer it if I could tell at a glance whether a term in a given language were present in a category.
      Is this helpful to our readers? Afaik, all mainspace categories are single-language. Having "/French" added to every link would just cause confusion. --Yair rand (talk) 06:44, 9 April 2013 (UTC)
    • "[The category pages] would link to problematic "part-entries"."
      I think maybe you've misunderstood what I'm proposing? Or maybe I've misunderstood what you mean by "part-entries"? The idea is that the whole Japanese entry would be moved to [[ni/ja]] or [[ni/Japanese]], depending on whether folks prefer lang codes or lang names. There wouldn't be any part-entries.
      If you think that having the entire Japanese entry for ni at [[ni/Japanese]] would be problematic, I don't understand what would be problematic about that. Could you explain?
      I thought you were suggesting that the viewing of entries would still take place through a central non-single-language page so that there could still be quick switching between languages. If so, then links that go directly to single-language entries ("part-entries") as though they were full entries would be problematic. --Yair rand (talk) 06:44, 9 April 2013 (UTC)
      • Unclear what the problem would be. If a reader clicks a link that is intended by both the reader and the editor who added the link to lead the reader to the Romanian entry, then I fail to see any problem at all if the user does not see the Welsh entry. Actually, not seeing the Welsh entry could be argued to be a bonus, rather than problematic.
      If instead you mean that the problem is ease of changing between languages for any given term, what I'm envisioning would ultimately be a combination of a parent page (see sample at [[User:Eirikr/Sandbox3/ni]]) that would be identical to our current format for readers, and the individual language subpages (such as [[User:Eirikr/Sandbox3/ni/sq]] or [[User:Eirikr/Sandbox3/ni/cy]]) which would ultimately be quite similar in appearance to Tabbed Languages. One would provide an all-in-one view, the other would provide just the target language, with links at the top to the others. {{subpages}} already offers a basis from which to create such a header. -- Eiríkr Útlendi │ Tala við mig 02:49, 10 April 2013 (UTC)
    • "The entries either wouldn't list categories at all at the bottom,"
      I just created a test sample at [[User:Eirikr/Sandbox3/ni]]. All cats on the various sub-pages appear on the parent page (excluding those cats where the including template has logic to check the namespace and only include the cat if in the main namespace).
    • "...or [the entries] would be doubled in the category page."
      Category:Swahili_terms_needing_attention is called from the test page. I do see both [[User:Eirikr/Sandbox3/ni]] and [[User:Eirikr/Sandbox3/ni/sw]] listed there now.
      While this is an issue, it seems little more than a minor nuisance, and is not insurmountable. For categories included by templates, the templates could contain logic to limit category inclusion to only lang-specific sub-pages.
      Fixing this problem would cause the one mentioned above, and vice versa. --Yair rand (talk) 06:44, 9 April 2013 (UTC)
    • "Special:WhatLinksHere would also go to the "part-entries","
      See above about "part-entries".
    • "and [Special:WhatLinksHere] would additionally contain main entry duplicates for every link. "
      Yes, this would be an issue, but again, it seems little more than a minor nuisance.
    • "The search bar would be clogged with extra suggestions that wouldn't mean much to the users."
      This could be at least partially addressed by using lang names instead of lang codes in the URLs. If a user searches for [[ni]] in search of the Zulu term, and [[ni/Zulu]] is one of the hits, they know right where to go.
      Past there, the search index hasn't yet been updated to include my test page.
So aside from the "part-entries" bit where I'm not sure what you mean, it looks like the net negative effect would be a couple of minor nuisances.
In terms of positives, just on the surface of it, it would be much easier to tell what languages have an entry for any given term. This does away with a whole class of problems, including the script rejiggering required just earlier this week to handle lang-specific orange links. Does that really qualify as "[not] any significant possible benefit"?
-- Eiríkr Útlendi │ Tala við mig 06:18, 9 April 2013 (UTC)
That issue is bug 16561, which is probably far more likely to be fixed than the necessary changes to split entries into language subpages, largely because identifying broken section links is also a somewhat important issue for a certain large sister project of ours. --Yair rand (talk) 06:44, 9 April 2013 (UTC)
  • I see a lack of any comments in that thread since late 2010. It's also not entirely clear what "certain large sister project of ours" you intend; I assume you mean Wikipedia, but that isn't very clear from the thread. -- Eiríkr Útlendi │ Tala við mig 02:49, 10 April 2013 (UTC)
Sounds like a great idea to me.
But why transclude the entries into the root page? Just put an automatic index there, floated to the right of the multilingual entry. Categories should collect entries okay, but can they display their titles correctly? The only problem I can see is displaying the title at the top of a language entry’s page.
If we’re reorganizing, is there a way to avoid adding a level of subheadings for “Etymology 1,” &c.? Michael Z. 2013-04-09 02:50 z
  • I proposed transclusion simply because that can be done right now, whereas an automatic index would presumably require that someone code one up first. Now that we have Lua, that should be easier to do. It could also have the side benefit of obviating the nuisance issues mentioned above.
However, I'm also keen to avoid any major disruption to readers, and transclusion into the parent page would result in an entry page visually identical to what we already have.
I mean that while rat#Noun is an <h3> heading as expected, root#Noun is, inconsistently, an <h4> because it is pushed down by the “Etymology 1” and “Etymology 2” headings. Bugs me. Michael Z. 2013-04-09 23:02 z
Why is it that WT:ELE has us do it the way we do? Is it a relic of a pre-CSS approach to making an entry's heading look good? DCDuring TALK 23:10, 9 April 2013 (UTC)
I think it’s an information-organization problem at its root. Our page/heading structure is term/language/etymology/p.o.s, e.g., Rat/English/Etymology/Noun. But we omit the Etymology 1 subheading when there is only one. There’s nothing really wrong with this, although it must add a layer of complication to some bots that need to find the subheadings. I think we could style the subheadings consistently by selecting on the IDs of etymology headings, if we can pick a reasonable style for the extra etymology heading itself.
HTML5 adds new elements (<article>, etc) and a new document structure model that could help make sense of this, but MediaWiki doesn’t support this yet. Michael Z. 2013-04-10 01:45 z
And then there are the 1,827 members of Category:Entries_with_Pronunciation_n_headers, which don't conform to WT:ELE and can't be reconciled with it, at least in EP's opinion. DCDuring TALK 01:50, 10 April 2013 (UTC)
I oppose splitting entries into subpages for the same reasons I opposed it the last time it was proposed. - -sche (discuss) 04:17, 9 April 2013 (UTC)
  • Reading that link, it sounds like you would not be opposed to changes provided the reader sees no difference. Is that still the case?
It also sounds like your opposition was partly based on different ideas to what I'm proposing here -- splitting into subpages as I'm imagining it would be based purely on language, and not have anything to do with page size. This is based more on my understanding of how our infrastructure works with terms on a by-language basis, and the kinds of workarounds required because finding out which languages have entries for any given term is more complicated than just looking at the existing URLs. -- Eiríkr Útlendi │ Tala við mig 06:26, 9 April 2013 (UTC)
I wouldn't like it as I would like to read the wikitext of all the entries in one go, and use the auto-formatting properties of User:Mglovesfun/vector.js. However, if there were a majority in favor of it, I'm sure I'd get used to it. Mglovesfun (talk) 12:08, 9 April 2013 (UTC)
I vehemently oppose the change; I think it would be an all-around bad move. It would not only make it more difficult for editors like me and Mg and Meta, who often edit all language sections in one go, it would make it more difficult for newcomers to start editing Wiktionary: they'd click the 'edit' link at the top of the page and see nothing but a few lines encased in curly brackets.
Furthermore, you propose to avoid changing how entries like [[foo]] look. That is good, inasmuch as fragmenting the displayed content would cause major problems separate from those that fragmenting the actual content/wikitext would cause. But ensuring that the display does not change requires double effort on the part of editors, to always edit foo after creating foo/bar, and requires constant vigilance on the part of other editors to ensure no-one forgets to transclude a foo/bar into a foo.
If any part of that vigilance is entrusted to a bot, the bot will have to be reliable enough that nothing slips through the cracks while the bot is down, and smart enough that it can handle or flag creations of [[foo/not-a-real-code]] and not mis-handle naming conflicts...
...because, as Liliana mentioned in the previous discussion, enough words are spelt with slashes that naming conflicts are inevitable. For example, s/he would seem (to a bot) to be a Hebrew entry (missing a Hebrew L2, no less!) that should be transcluded into s; it would also conflict with any real Hebrew entry [[s]], if one ever needed to be created.
This proposal would duplicate every main-namespace page, even that supermajority of NS:0 pages which have only one language section. It would move the first-person plural imperfect active indicative of fugio to fugiebamus/la, but leave a shell at fugiebamus to contain it; likewise it would have arrodillasen/es transcluded into an otherwise empty arrodillasen, with the aforementioned constant effort required to ensure that if [[arrodillasen/foo]] ever were created, it would be transcluded. It would be easier and IMO better to leave the content at fugiebamus and arrodillasen. - -sche (discuss) 03:58, 10 April 2013 (UTC)
Strongly oppose. It's unhelpful technically (as Yair explained above), it doesn't benefit the readers, and it's worse for editors like me who work with several related languages at a time. Looks like a classic lose-lose, and this format is one of the reasons why I don't edit on wiktionaries in other languages I speak. —Μετάknowledgediscuss/deeds 23:39, 9 April 2013 (UTC)
Why doesn't it benefit the readers? When we used to hear from normal users, one common complaint was that they found the presence of many languages on the page confusing. Another common complaint was about the incredible length of the table of contents. Do you have anything to support your assertion? DCDuring TALK 00:03, 10 April 2013 (UTC)
It seems you misunderstand this. The presentation will still be the same. The page ni will still have all the languages and the long TOC. As for the long TOC, TabbedLanguages solves that, and the sooner it is implemented, the better. —Μετάknowledgediscuss/deeds 19:19, 11 April 2013 (UTC)
  • @Metaknowledge, one thing that is currently infeasible with our "all languages on one page" organization is linking reliably to any POS header for a given language.
For instance, the Portuguese noun on the [[ni]] page has an ID of Noun_4. If an additional language is added above the Portuguese entry, this now has an ID of Noun_5, and anything that previously pointed to the Portuguese noun that was at Noun_4 is now pointing at who knows what, quite possibly at the Italian noun instead. The link URL, containing only the obscure numbered target [[ni#Noun_4]], wouldn't give editors or savvy readers any clue as to what language was intended.
By splitting languages into their own subpages, we could have much more reliable linking: [[ni/pt#Noun]] will only ever point to the Portuguese noun, and will never inadvertently point to the Italian noun. Editors and savvy readers can also tell from the target URL what language the link points at.
But this technical benefit is only possible when we don't throw all the data into one undifferentiated bucket. -- Eiríkr Útlendi │ Tala við mig 02:20, 10 April 2013 (UTC)
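
The renumbering problem described above can be sketched in a few lines of Lua. This is only an illustration: it assumes that repeated heading text gets a numbered suffix (_2, _3, ...) in page order, and the function name anchor_for and the sample section lists are hypothetical, not taken from the real [[ni]] entry.

```lua
-- Return the anchor a given language's heading would receive when identical
-- heading text is disambiguated by a numbered suffix in page order
-- (assumed behaviour; the sections tables below are made up).
local function anchor_for(sections, language, heading)
  local base = (heading:gsub(" ", "_"))
  local count = 0
  for _, section in ipairs(sections) do
    if section.heading == heading then
      count = count + 1
      if section.language == language then
        if count == 1 then
          return base
        end
        return base .. "_" .. count
      end
    end
  end
  return nil -- no such heading for that language
end

-- Hypothetical page with two language sections, each with a Noun heading.
local before = {
  { language = "Italian", heading = "Noun" },
  { language = "Portuguese", heading = "Noun" },
}

-- The same page after a new language section is added above both.
local after = {
  { language = "Dutch", heading = "Noun" },
  { language = "Italian", heading = "Noun" },
  { language = "Portuguese", heading = "Noun" },
}

print(anchor_for(before, "Portuguese", "Noun"))
print(anchor_for(after, "Portuguese", "Noun"))
print(anchor_for(after, "Italian", "Noun"))
```

In this sketch the anchor that pointed at the Portuguese noun before the edit points at the Italian noun afterwards, which is the kind of silent link drift being described.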
Why would we want to link to POS sections? I understand wanting to link to specific senses, which we can do with {{senseid}}, but where would it be useful to link specifically to a POS header? --Yair rand (talk) 02:42, 10 April 2013 (UTC)
There is {{anchor}}, which can be used to allow links like foo#Dutch_noun to specific sections in those entries—that relatively minuscule minority of our 3.3 million entries—which have two Noun (or Verb, etc) sections. - -sche (discuss) 04:04, 10 April 2013 (UTC)
Also, e.g., the Swedish word en (meaning “juniper”, not en meaning “one”) is defined at en#Etymology_2_3. Michael Z. 2013-04-10 03:16 z
Eirikr, it seems that the big advantage you are advertising already exists by means of {{anchor}}. Do you have any convincing argument that we don't already have the capabilities for? —Μετάknowledgediscuss/deeds 19:19, 11 April 2013 (UTC)

Etymtree

See Template:etymtree/Module:etymtree (which is horribly written at the moment, but ignoring that for a minute...), as used in Appendix:Proto-Indo-European/wódr̥ and Appendix:Proto-Germanic/watōr. The full tree is stored at Template:etymtree/ine-pro/wódr̥, but the relevant branches are pulled out by the template. Are there any serious downsides to using this system instead of the current system where trees are either duplicated across entries or missing some parts? --Yair rand (talk) 09:03, 9 April 2013 (UTC)

The main problem I see with your naming scheme is that words might have multiple sets of descendants, like *aljaną. How do you keep them apart? —CodeCat 13:05, 9 April 2013 (UTC)
If they're both roots of separate trees, then they could just be called Template:etymtree/gem-pro/aljaną/1 and Template:etymtree/gem-pro/aljaną/2 (or something like that), I guess, and that wouldn't really cause any additional problems.
If there are multiple words in one language with the same spelling on the same etymology tree, however... yeah, I have no idea how to deal with that. Hm... --Yair rand (talk) 22:42, 9 April 2013 (UTC)

Portuguese verb oddity

In the conjugation table for Portuguese -erir verbs (see ferir as an example) it says that investir, revestir and vestir have this conjugation. This is false, but the templates are so convoluted that I can't figure out how to fix it. SemperBlotto (talk) 15:36, 11 April 2013 (UTC)

They do. They are third-conjugation verbs where the e preceding the thematic vowel becomes i in some forms. — Ungoliant (Falai) 16:05, 11 April 2013 (UTC)
    • But they use {{pt-conj|xyz|vestir}}, not {{pt-conj|xyz|erir}}. SemperBlotto (talk) 16:08, 11 April 2013 (UTC)
Ok. I changed them. — Ungoliant (Falai) 18:53, 11 April 2013 (UTC)

A Question on Modules

I've been seeing contributions on modules lately. Could they replace templates, or could they both stay? Besides, I'm wondering, what are modules, and how are they used? --Lo Ximiendo (talk) 08:03, 12 April 2013 (UTC)

See WT:LUA, w:WP:LUA --Z 09:05, 12 April 2013 (UTC)
Short answer: Modules do not replace Templates, but they complement them when complex operations (such as string manipulations) are required. Dakdada (talk) 13:40, 12 April 2013 (UTC)

Double references

In the page herre, why does footnote 1 appear in two places? Isn't each <references/> supposed to list only those footnotes that were added since the last appearance of that tag? Has this changed and what is the cure? --LA2 (talk) 22:38, 12 April 2013 (UTC)

The page mw:Extension:Cite/Cite.php says "In the case of multiple references-tags on a page, each gives the references defined in the ref-tags from the previous references-tag", which is how I remember it used to function. --LA2 (talk) 22:51, 12 April 2013 (UTC)

Oh, my fault, the same <ref>...</ref> tag is indeed repeated in the next section. --LA2 (talk) 22:53, 12 April 2013 (UTC)

GSOC Proposal - DICT api to Wiktionary

There is now a project idea for the 2013 Google Summer of Code to make Wiktionary content available via the DICT protocol. This is in part due to bugzilla #36881.

There are currently more than 15 apps with a couple million downloads between them which use Wiktionary content for dictionary reference, but as far as I am aware each uses old data dumps which have been processed by a 3rd party, some many years ago. If this project is accepted and implemented in MediaWiki, we can expect our content reuse to climb dramatically as apps would be able to retrieve our latest data.

The WMF is looking for a community liaison for this project, just in case a student comes along looking to pick this one up. (This suggests to me there is interest at the foundation to see this implemented in the MediaWiki api, though that is plainly speculation on my part.) So, the first discussion point is: who would be interested in being the go-between with the developers?

- Amgine/ t·e 15:36, 16 April 2013 (UTC)

English category bug

Was just browsing Category:English words prefixed with cyno-. If you hover over the individual entries, they are linked to with an invalid hash anchor {{{{{lang}}}}}. I suppose this is from using {{prefix}} with no language. Mglovesfun (talk) 22:43, 16 April 2013 (UTC)

I've fixed that. But problems like that could probably be prevented in the future by changing this template. The language code should really be the first parameter, so that it's clear that it's mandatory. —CodeCat 23:01, 16 April 2013 (UTC)
Or made to be not mandatory at all, which is hypothetically at least what it does now. Mglovesfun (talk) 23:10, 16 April 2013 (UTC)
I firmly believe that using English as the default is a bad practice, so I can't agree with that. Furthermore, all our other category templates already take the language code as their first parameter. —CodeCat 23:16, 16 April 2013 (UTC)

Documentation tab of templates and modules

We've been slowly moving template documentation from /doc to /documentation. Currently, though, the "documentation" tab at the top of the page still links to /doc. How can this be changed? Also, modules should also have such a tab. —CodeCat 16:44, 18 April 2013 (UTC)

It's in Mediawiki:Common.js, under "Make tabs for citations-pages and template-documentation-pages". --Yair rand (talk) 16:45, 18 April 2013 (UTC)

Using a function in the same module

It may sound like a rather dumb question but how can I achieve this? Say the code is

p = {}
function p.a(f)
 text = mw.ustring.gsub(f.args[1],'.','a')
 return text
end

function p.b(f)
 text = p.a(f.args[1])
 return text
end

return p

Thanks. Wyang (talk) 23:12, 18 April 2013 (UTC)

I'm not sure what you're trying to achieve, but calling p.b won't work because f.args[1].args[1] probably doesn't exist. By passing the specific argument to p.a, the new f is set to the previous f.args[1], and since it doesn't itself have an args[1], it will break. Is text = p.a( f ) what you want to have? (Note that this would be basically the same thing as just setting p.b = p.a.) --Yair rand (talk) 00:19, 19 April 2013 (UTC)
I see. Should be text = p.a(f) in p.b(f). Thanks. Wyang (talk) 00:27, 19 April 2013 (UTC)
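
For reference, a minimal sketch of the corrected pattern, under some assumptions: string.gsub stands in for mw.ustring.gsub so that the code runs outside Scribunto, the helper name replace_all is hypothetical, and the frame is a hand-made mock table rather than a real frame object.

```lua
local p = {}

-- Worker that takes a plain string, so any function in the module can reuse it.
-- string.gsub is a stand-in for mw.ustring.gsub; unlike the latter it works on
-- bytes, so it would mishandle multi-byte characters.
local function replace_all(text)
  return (string.gsub(text, ".", "a"))
end

function p.a(frame)
  return replace_all(frame.args[1])
end

function p.b(frame)
  -- Pass the whole frame along, not frame.args[1]; p.a expects a frame.
  return p.a(frame)
end

-- Mock frame for demonstration only.
local mock_frame = { args = { "xyz" } }
print(p.b(mock_frame)) --> aaa

-- A real module file would end with: return p
```

Keeping the string-level work in a local helper means p.b could also call replace_all directly with any string, without having to build a frame-like table.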

Adding a template to a category

I've added Template:ru-noun-anim-1-unc (will add other templates with -unc suffix) to Category:Russian uncountable nouns. Terms that use the template now show that they belong to Category:Russian uncountable nouns, e.g. Нептун. When I open the category, it just shows 8 terms! What is wrong? Is there a DB delay or something? --Anatoli (обсудить/вклад) 23:46, 18 April 2013 (UTC)

Yes it was a database delay. Problem is, there are some singular-only entries in that category now, like Иисус which isn't uncountable, just singular-only. Unless Russian grammar doesn't make any distinction between these two. Mglovesfun (talk) 14:42, 19 April 2013 (UTC)
Does any grammar make that distinction? —CodeCat 15:21, 19 April 2013 (UTC)
Well English does; you can't say "some Jesus" in the same way you can say "some grain" or "some water". French does too. Mglovesfun (talk) 15:31, 19 April 2013 (UTC)
"Yesterday I met some Jesus who was trying to sell me a watch." —CodeCat 15:43, 19 April 2013 (UTC)
MG said "in the same way you can say 'some grain' or 'some water'". That's a different sense of "some". Chuck Entz (talk) 19:13, 19 April 2013 (UTC)
Is the suggestion that "uncountable" means the same as "something you can have a quantity of"? To me, uncountable means that it can't consist of multiple individual instances. —CodeCat 19:24, 19 April 2013 (UTC)
Thanks for addressing this, guys. I'm having second thoughts about this categorisation, though. The templates with "-unc" suffix are used when there are no plurals. Theoretically, personal names, names of cities, gods, etc. all can have plurals. --Anatoli (обсудить/вклад) 09:49, 20 April 2013 (UTC)

Lua

{{#invoke:a|function|text||}} seems to be treated differently from {{#invoke:a|function|text|}}, as

if f.args[3] == '' then

or

if f.args[3] == nil then

treats the former code as false but the latter as true. Is there a way to solve this? Thanks. Wyang (talk) 05:16, 20 April 2013 (UTC)

It's not a bug; an empty string is simply not the same as a nil value. Just write
if f.args[3] == nil or f.args[3] == '' then
in your code. Dakdada (talk) 12:01, 20 April 2013 (UTC)
I usually write something like this: local param = args[3]; if param == "" then param = nil end —CodeCat 12:37, 20 April 2013 (UTC)
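
Both suggestions can be folded into one helper that is applied once to all arguments. The following is a sketch only: the name clean_args is hypothetical, the two tables imitate what {{#invoke:a|function|text||}} and {{#invoke:a|function|text|}} would pass, and it assumes the argument table can be iterated with pairs.

```lua
-- Copy an argument table, dropping empty strings so that "blank" and
-- "not given" both come out as nil.
local function clean_args(args)
  local cleaned = {}
  for key, value in pairs(args) do
    if value ~= "" then
      cleaned[key] = value
    end
  end
  return cleaned
end

-- Mock argument tables (assumed shapes, for demonstration only).
local two_pipes = clean_args({ "text", "", "" }) -- text||
local one_pipe = clean_args({ "text", "" })      -- text|

if two_pipes[3] == nil and one_pipe[3] == nil then
  print("both calls are now treated the same way")
end
```

After this step the rest of the module only needs to test against nil.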

Ugly font in taxonomic name inflection line

What template or other change led to the use of a hideous, too-small serif font for taxonomic name entries? See [[Datura]] or any other such entry. Who makes such decisions? DCDuring TALK 14:50, 21 April 2013 (UTC)

Looks the same as ever to me... sure it isn't on your end? —Μετάknowledgediscuss/deeds 15:03, 21 April 2013 (UTC)
Could be. Where would I look?
It seems to affect Translingual inflection lines, various other language inflection lines (eg. Greek), certain uses of {{term}}, and various template-sourced text such as the content of a show/hide bar for Greek declension. It appears with both Vector and Monobook skins. DCDuring TALK 15:29, 21 April 2013 (UTC)
Looks normal to me (Monobook under Chrome). SemperBlotto (talk) 15:34, 21 April 2013 (UTC)
I disabled Webfonts. Is that a possibility? [Apparently not].
It occurs with all my per-browser choices at default. I haven't noticed anything different at other websites. DCDuring TALK 15:36, 21 April 2013 (UTC)
It appears whenever I use Template:head in principal namespace or {{term}}: gratis
but not
  1. {{l/en}}: gratis
  2. {{t}}: gratis (en).
Does it have to do with the font selection used by Template:head and {{term}}? A CSS solution to fix my problem for me would help me, but would not be wise in case I am functioning as a miner's canary or the idiot in idiot-proofing. DCDuring TALK 15:59, 21 April 2013 (UTC)
I don’t see the problem in Safari/Mac or Firefox/Mac. Does it affect any of these lines?:
  1. lang="mul"
  2. class="mention-latin"
What browser/version are you using? If Firefox, check your Preferences > Content > Languages > Fonts & Colors > Default Font > Advanced > Fonts for. Make sure that under both “Western” and “Other Languages” you have “Sans-Serif” set to a sans-serif font, preferably the same one. Michael Z. 2013-04-21 16:17 z
Bingo! Thank you very much, MZ. I didn't remember making that change, though I remember visiting the page.
What advantage do we get from letting that kind of user preference affect the display of this website? No other site that I visited seemed to be affected by that mistaken selection, so they must exercise more control over fonts - and do so uniformly. DCDuring TALK 21:17, 21 April 2013 (UTC)
Lucky guess. 🐰
Few users ever touch those settings, especially now that browser support for UTF-8 has obsoleted separate code pages for each language or script. Safari only ever had default and monospace font settings, and has done away with those completely, although it does have a user style sheet. Firefox probably has a hard time dropping archaic features because of workflow.
I think few sites use lang attributes at all, except perhaps on the root html element, and only a tiny proportion even have reason to use things like lang="mul" or lang="und". We are simply more langed up than any other website.
But because we are trying to keep our lang codes correct, the few readers who want control will be able to use their browser’s language preferences and user style sheets. I anticipate browser makers will continue to improve tagged language support. Michael Z. 2013-04-21 22:55 z

Can Lua determine if a template is called from another template?

Lua uses the frame's getParent function to find out the parameters that were passed to the template that invoked it. So, for example, say that {{term}} contains {{#invoke:term cleanup|cleanup}}. Then if {{term}} is called like {{term|word}}, then within Module:term cleanup, frame:getParent().args[1] will equal "word". What I would like to know is... is it possible for the module to determine, in some way, which namespace {{term}} was called from, so that it can tell the difference between, for example, {{term}} being directly in a mainspace entry, and being called from another template? —CodeCat 15:21, 21 April 2013 (UTC)

I don't believe so, no. They don't want us inspecting the whole stack, just the topmost frame. (And TBH, I think that's a good thing. If I write a usage-note template that invokes {{term}}, I should be able to expect that it will behave the same way as if I'd put the usage-note directly in the entry.) —RuakhTALK 17:02, 21 April 2013 (UTC)
I understand that, but it would be very useful (for clearing out erroneous template uses) to be able to find which are being called through another template. Because fixing that template would probably fix many entries at the same time. It's a shame we can't use it... —CodeCat 17:08, 21 April 2013 (UTC)
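
A minimal sketch of what a module can see, assuming Scribunto's frame:getParent() and using hand-made mock frames in place of real ones (the function name first_template_arg is hypothetical):

```lua
local p = {}

-- Return the first argument passed to the template that invoked the module.
-- Only the immediate parent frame is consulted; nothing in this sketch can
-- tell which page or template called that template in turn.
function p.first_template_arg(frame)
  local parent = frame:getParent()
  if parent == nil then
    return nil -- invoked without a wrapping template
  end
  return parent.args[1]
end

-- Mock of {{term|word}}, where {{term}} contains an #invoke with no arguments.
local template_frame = { args = { "word" } }
local invoke_frame = {
  args = {},
  getParent = function(self)
    return template_frame
  end,
}

print(p.first_template_arg(invoke_frame)) --> word

-- A real module file would end with: return p
```

In the mock, template_frame has no getParent of its own, mirroring the point made above that only the topmost frame is available.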

Kassadbot not running?

There are over 4,000 entries in Category:Requests for autoformat. SemperBlotto (talk) 11:12, 23 April 2013 (UTC)

It was blocked due to a dispute regarding Japanese entries. So yeah. -- Liliana 19:56, 23 April 2013 (UTC)
There are some allegations that you, Liliana-60, deliberately changed KassadBot's code to pick up entries in Category:Japanese romaji after a consensus was reached on how to format romaji entries - a strict format, whose definition lines are generated by the template, see kochira:
==Japanese==

===Romanization===
{{ja-romaji|こちら}}
Which produces:

Japanese

kochira

  1. See こちら
KassadBot didn't pick up any single romaji entry before that between 16 March and 7 April (after my edit to add # on a new line in Template:ja-romaji see diff). KassadBot started flagging romaji and adding them to Category:Japanese definitions needed after the 7th of April.
If you're not going to start working on Japanese entries, please consider changing the code back to what it was before 7th April or make an exception for Japanese and Gothic romanisation entries. It's possible and you know it. Editors working with Japanese have already expressed their strong opinion on this and converted all (nearly 7,000) romaji entries to a new style. I apologize in advance if you didn't do anything deliberately but you didn't sound very convincing when you denied changing KassadBot's code. --Anatoli (обсудить/вклад) 22:53, 23 April 2013 (UTC)
That's kind of the problem with negative proof; it's almost impossible to prove that one didn't do a certain thing. I could show the application's timestamp (with a last-modified date sometime in December 2012) but then people would say it's faked. I could say that anyone who runs the bot with the code I provided on this wiki (for a good reason!) would see it perform the same changes, but nobody would try and people would still not believe me. See what a difficult situation this is? -- Liliana 05:59, 24 April 2013 (UTC)
OK, I believe you. We have to move forward but I don't know what we'll do. --Anatoli (обсудить/вклад) 06:19, 24 April 2013 (UTC)

There are now over 7,000 entries in Category:Requests for autoformat. This needs to run. SemperBlotto (talk) 11:01, 1 May 2013 (UTC)

Returning the number of items under a Category

Hello. I'm an administrator in the Spanish Wikcionario and they're trying to create a template for a "language of the month" feature. A user is asking if we could display a sentence like "We currently have X number of entries in" (the language of the month). Does anybody here know if there is a magic word or parameter I can use to return the number of entries in a Category? (That way the template would show the number of items in the Languageofthemonth-Español Category). Thanks in advance for your help. If you could answer in my User talk:Edgefield page here, that would be even better. Best, --Edgefield (talk) 22:41, 23 April 2013 (UTC)

Ah, I just saw this {{PAGESINCATEGORY:categoryname}} word; I'll give it a try. Thanks, --Edgefield (talk) 22:42, 23 April 2013 (UTC)

MediaWiki:Gadget-RegexMenuFramework.js and MediaWiki:Gadget-HotCat.js

My browser refuses to load these two gadgets when browsing through HTTPS. Please copy the code from w:MediaWiki:Gadget-RegexMenuFramework.js and w:MediaWiki:Gadget-HotCat.js. Keφr (talk) 07:44, 24 April 2013 (UTC)

Module:headword

I have created this module as a replacement for {{head}}. Not all of it works yet, only the categories at the moment. I've moved that part over from the template to the module, and things seem to work. —CodeCat 14:14, 25 April 2013 (UTC)

On a side note, please don't forget to add comments in the code. There are already too many uncommented modules out there. Dakdada (talk) 14:36, 25 April 2013 (UTC)
Was this discussed anywhere before it was implemented? --Yair rand (talk) 15:47, 25 April 2013 (UTC)
Yes, here. —CodeCat 15:51, 25 April 2013 (UTC)

New idea about translations

The translation sections currently look like this:

{{trans-top|furniture}}
* Armenian: {{t-|hy|պահարան|tr=paharan}}
* Dutch: {{t+|nl|kast|m|f}}
{{trans-mid}}
* Greek: {{t+|el|ντουλάπι|n|tr=ntoulápi|sc=Grek}}
* Persian: {{t|fa|کمد|tr=komod|sc=fa-Arab}}
{{trans-bottom}}

It is possible to change it to something like this, with Lua (compare this):

{{trans|furniture|
* Armenian: պահարան [-]
* Dutch: kast mf [+]
* Greek: ντουλάπι n [+]
* Persian: کمد (komod)
}}

I want to know if this is considered helpful by the community and is worth it. The code is more readable and is easier to edit for newbies, and the users won't have to know the ISO code of languages to edit. --Z 16:07, 26 April 2013 (UTC)
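
To give a rough idea of the parsing involved, here is a sketch of how one line of the simplified format might be read in Lua. It is not the module linked above: the function name parse_translation and the field names are hypothetical, and the rules (optional [+] or [-] marker, optional transliteration in parentheses, optional gender letters) are guessed from the four example lines only.

```lua
-- Parse one line such as "* Dutch: kast mf [+]".
-- Returns a table, or nil if the line does not have the expected shape.
local function parse_translation(line)
  local language, rest = line:match("^%*%s*([^:]+):%s*(.-)%s*$")
  if not language or rest == "" then
    return nil
  end
  local entry = { language = language }

  -- Optional trailing marker: [+] or [-].
  local body, marker = rest:match("^(.-)%s*%[([%+%-])%]$")
  if marker then
    entry.marker, rest = marker, body
  end

  -- Optional trailing transliteration in parentheses.
  local before_tr, tr = rest:match("^(.-)%s*%(([^%(%)]+)%)$")
  if tr then
    entry.tr, rest = tr, before_tr
  end

  -- Optional trailing gender letters (m, f, n, c) after a space.
  local term, gender = rest:match("^(.-)%s+([mfnc]+)$")
  if gender then
    entry.gender, rest = gender, term
  end

  entry.term = rest
  return entry
end

local dutch = parse_translation("* Dutch: kast mf [+]")
print(dutch.language, dutch.term, dutch.gender, dutch.marker)
```

Even this small sketch has an ambiguity: a multi-word translation whose last word happens to consist only of the letters m, f, n or c would be misread as a gender, which a fixed-parameter template call does not risk.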

Are you saying that the translation should not be wikilinked? You realise that we would lose the link to the translated word in the "foreign" Wiktionary. SemperBlotto (talk) 16:15, 26 April 2013 (UTC)
No, read it again. --Z 16:36, 26 April 2013 (UTC)
It is possible, but would it actually be more efficient? This effectively turns Lua into a parser, which may slow things down rather than speed them up. —CodeCat 16:16, 26 April 2013 (UTC)
That may be correct; we need to test it and see it in practice. I've compared {{l}} to {{l-list}}: l-list was not slower, but a bit faster. Unfortunately my current Internet connection is too slow, so my test may be inaccurate; I would be thankful if someone else tested it too. We can generalize the result to {{t}} vs. the Lua method. --Z 16:36, 26 April 2013 (UTC)
I see lots of potential for problems from different versions (including misspellings) of the language names, varying order of arguments, varying punctuation, etc. It looks simpler than it is: although there's no obvious inline code, everything has to be set up the way the module expects it, or you'll need complex, time-consuming code to allow for all the possibilities. We have bots doing this kind of parsing, but we don't have site visitors waiting for the bots to finish every time they view an entry. There's also the matter of coordinating with changes to the translation-adder and to bots, though that's secondary. Chuck Entz (talk) 17:20, 26 April 2013 (UTC)
Some of these problems, like variant language names, are really easy to fix without making the code that complex and slow. Regarding the order of arguments, that's true: making things easier to work with usually comes at the cost of increasing the risks of using them -- the easier it is to edit and change, the more things will be unintentionally messed up (although a fixed order for arguments has an advantage too: the code will be more similar to the output). The question is, do you think it is worth it overall? --Z 18:12, 26 April 2013 (UTC)

Never mind guys, I'm disappointed. This will, at best, become something like #Module:links which eventually went nowhere, even though it was nothing but improvement. Trying to improve things by changing older ways is just a waste of time here. --Z 18:22, 26 April 2013 (UTC)

Yea, my two cents on it here, it's surely a lot more trouble than it's worth. :/ Certainly I don't think it's worth fiddling around to change stuff that much just to be a little more newbie friendly...the translation adder built in trans tables is pretty good for that IMO. As for having to know ISO codes, people (specifically newbies) should just learn to search ethnologue or whatever it is, or even search on wiktionary for
  1. the entry for the language
  2. a safer bet "Category:X language".
User: PalkiaX50 talk to meh 18:38, 26 April 2013 (UTC)
I support it. Semper's comment doesn't make sense, CodeCat has raised a concern without testing it, and Chuck said the code will be "complex" without giving any real examples of why it will be any more complex than what we already have. Why don't you guys give it a chance or at least bring up a real, concrete problem with it? —Μετάknowledgediscuss/deeds 23:09, 26 April 2013 (UTC)
The complexity and the speed problems will happen because we will have to write a lot of code just to make a Lua module "understand" this new programming language that we will be creating. Essentially, we will be implementing a language within another language. So why should we reinvent the wheel when we already have a language that works and that everyone is familiar with: template code? The presence of the translation editor makes this even more redundant, because users won't even need to interact with the code. As long as there is no guarantee that this will not make things worse, I don't see how you could support it unless you do not actually have a full grasp of the implications of such a project. —CodeCat 23:24, 26 April 2013 (UTC)
"This will, at best, become something like #Module:links which eventually went nowhere" oh come on Lua is really, really new, it's way too early to say 'eventually went nowhere'. Mglovesfun (talk) 23:42, 26 April 2013 (UTC)
I did a quick search for names of languages indigenous to the British Isles to give a very rough idea of the complexity involved:
  • Irish, Erse, Irish Gaelic, Hibernian Gaelic, Gaeilge, Gaelige, Gaedhlag, Gaedhilge, Gaedhilic, Gaeilic, Gaeilig, Gaelic
  • Manx, Manks, Manx Gaelic, Gaelg, Gailck
  • Scots Gaelic, Scotch Gaelic, Scots-Gaelic, Scotch-Gaelic, Caledonian Gaelic, Erse, Gàidhlig, Gaelic
  • Welsh, Welch, Cambrian, Cambric, Cymric, Cymraeg
  • Cornish, Kernowek, Kernewek
  • Old English, Anglo-Saxon, Anglo Saxon, Anglosaxon, Englisc
  • Scots, Scotch, Inglis
True, more than half of these are obscure, and unlikely to be used here- though there are enough works with odd or old usage in places like Google Books that it's hard to be categorical. It's also true there are probably other names I missed, and variants in the Celtic languages due to consonant mutation.
What does the code do when someone puts "Gaelic", or "Scottish", or uses some misspelling that could be interpreted as more than one of the names for the Goidelic languages, which all look very similar? There are several other cases where the same name is used for more than one language. It's true that template code has problems with people using se for sv, lt for lv or la, and so on, but at least template code looks tricky, so people are more likely to look up the correct code. The new format looks like you can put just any old thing and the system will figure it out: in effect, it appears as if it's offering to do the nitpicky part for you. In reality, though, it will probably have its own set of constraints. It kind of reminds me of w:COBOL, and all the talk of how it was going to make coding just like writing English, and make it possible for computer-illiterates to understand it.
All of this can be dealt with, but it will take lots of work to educate both the system and the editors. Even if we keep both old and new going side-by-side to space out the conversion work, there's still going to be cleanup categories with the stuff the module hasn't figured out yet, and someone's going to have to go through them. It would have to be a pretty big improvement to justify the extra work. Chuck Entz (talk) 01:50, 27 April 2013 (UTC)
The part I think you don't get is that this is already a problem, and we already have a solution, just one that isn't utilised enough. Create Template:langrev/Welch and type cy into it (for example), and you will have found out how to solve this. Conrad's tool already relies on this infrastructure (which btw should likely be Luacised, but I have no idea how to do it), and I assume Z's would as well. —Μετάknowledgediscuss/deeds 05:34, 27 April 2013 (UTC)
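
If the reverse lookup were moved into Lua, it might look roughly like the sketch below. This is an assumption about how such a table could be laid out, not a description of Template:langrev or of Conrad's tool; the table name_to_codes and the function resolve_language are hypothetical, and only a handful of names from the list above are included.

```lua
-- Map language names (including variants) to the codes they may stand for.
-- A name listed with more than one code is treated as ambiguous.
local name_to_codes = {
  ["Irish"] = { "ga" },
  ["Irish Gaelic"] = { "ga" },
  ["Manx"] = { "gv" },
  ["Manx Gaelic"] = { "gv" },
  ["Scots Gaelic"] = { "gd" },
  ["Gaelic"] = { "ga", "gd" },
  ["Erse"] = { "ga", "gd" },
  ["Welsh"] = { "cy" },
  ["Welch"] = { "cy" },
  ["Scots"] = { "sco" },
}

-- Return a language code, or nil plus a reason that a cleanup category
-- could be based on.
local function resolve_language(name)
  local codes = name_to_codes[name]
  if codes == nil then
    return nil, "unknown"
  end
  if #codes > 1 then
    return nil, "ambiguous"
  end
  return codes[1]
end

print(resolve_language("Welch"))
print(resolve_language("Gaelic"))
print(resolve_language("Scottish"))
```

Variant spellings resolve cleanly in this layout, while names such as "Gaelic" still need a human decision, so both points raised in this thread show up in the same table.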

My opinion is that it's not a terrible idea, but Conrad's translation adder already has it beat in terms of editor friendliness. Because of this, it seems to me like we would be doing a lot of work for no gain. Also, correct me if I'm wrong, as Lua is still very new to me and I don't fully understand it, but wouldn't Lua have to essentially recompile the whole thing every time someone adds a translation? Conrad's approach has the advantage of once and done, that is, the computation is done once and then hard-coded into the entry, and does not have to be figured out again. -Atelaes λάλει ἐμοί 23:56, 26 April 2013 (UTC)

Every page is reprocessed from scratch whenever it's viewed and the cache is "old". Editing and saving a page forces a refresh, but it's also refreshed after a short time (maybe less than a day but I don't know for sure). —CodeCat 00:00, 27 April 2013 (UTC)
I think he was looking at the creation of the template code as sort of a precompiling into a more machine-friendly format which wouldn't have to go through all the trouble of parsing every time. Of course, the template code has its own overhead, so it probably wouldn't make that much of a difference. Chuck Entz (talk) 01:01, 27 April 2013 (UTC)

For the record, I oppose this proposal. The easiest way for humans to add translations is via Conrad's translation adding tool. The tool requires the editor to enter the language code, but that is easy for an editor working in a single language to pick up, and the tool can be adjusted to help the user find the language code based on the language name. Furthermore, the current markup is unambiguous and well structured. A human can grasp it just from looking at examples of using the {{t}} template. The markup is machine-friendly for all the existing and future reusers of Wiktionary data. Going from a clear and unambiguous template markup to some sort of arbitrary syntax that does not make it clear what is going on is making things worse, IMHO. --Dan Polansky (talk) 09:09, 27 April 2013 (UTC)


I tested something similar on fr.wikt : compare fr:Utilisateur:Pamputt/eau and fr:Utilisateur:Darkdadaah/eau (~3000 translations). The second page is much faster than the first one to generate. Granted, the second one is simplified (and it uses parameters, so less parsing is needed), but there is still a big difference. This is worth considering for template-heavy pages. Dakdada (talk) 16:26, 27 April 2013 (UTC)

Help test the new account creation and login

Hi all,

After many weeks of testing, we (the editor engagement experiments team) are getting close to enabling redesigns of the account creation and login pages. (There's more background about how we got here and why in our blog post.)

Right now we are trying to identify any final bugs before we enable new defaults. This is where we really need your help: for now, we don't want to disrupt these critical functions if there are outstanding bugs or mistranslated interface messages. So for about a week, the new designs are opt-in only for testing purposes, and it would be wonderful if you could give them a try. Here's how:

If you have questions about how to test this or why something might be the way it is, I'd definitely check out our step-by-step testing guide and the general documentation.

Many thanks, Steven (WMF) (talk) 19:48, 26 April 2013 (UTC)

Template:ga-proper noun

Why did this edit result in codespill? I still can't see what stupid little thing I did wrong. —Μετάknowledgediscuss/deeds 23:30, 27 April 2013 (UTC)

The second #invoke is missing its closing brackets. —CodeCat 23:35, 27 April 2013 (UTC)
True, but when you preview it on a page (e.g. Gaeilge) it still produces headword-line codespill even after I add the forgotten curly brackets. —Μετάknowledgediscuss/deeds 23:39, 27 April 2013 (UTC)
Do you have a text editing program? Most higher-end editors (like Notepad++ for Windows or just about anything for Linux) will highlight matching brackets. You can use that feature to find which brackets are missing their counterpart as well. I did that and found the problem right away. —CodeCat 23:48, 27 April 2013 (UTC)
OK, will do next time. I usually don't think to use it for wiki markup. Thank you! —Μετάknowledgediscuss/deeds 00:30, 28 April 2013 (UTC)
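For anyone without a bracket-highlighting editor, the same check can be approximated with a short script. The following is a minimal Python sketch, not an existing tool: `find_unbalanced` is a made-up name, and it assumes plain `{{`/`}}` pairs only (it does not understand nowiki tags or HTML comments).

```python
def find_unbalanced(wikitext, opener="{{", closer="}}"):
    """Return (unclosed_openers, stray_closers) as lists of character offsets."""
    stack = []   # offsets of openers still waiting for a closer
    stray = []   # offsets of closers that had no opener to match
    i = 0
    while i < len(wikitext):
        if wikitext.startswith(opener, i):
            stack.append(i)
            i += len(opener)
        elif wikitext.startswith(closer, i):
            if stack:
                stack.pop()
            else:
                stray.append(i)
            i += len(closer)
        else:
            i += 1
    return stack, stray
```

Calling it on template source with a second #invoke that lacks its closing brackets returns the offset of that #invoke in the first list, which is usually enough to spot the problem.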

Bot request for Ancient Greek edits

I need to make a series of minor edits to a number of Ancient Greek entries (somewhere between 100 and 250 of them). I would be quite grateful if someone who has a general purpose bot and some time could help me avoid a large number of very tedious edits. All the entries listed here need to have the 7th parameter of {{grc-conj-aorist-1}} removed, such that the eighth parameter becomes the seventh, the ninth becomes the eighth, and so on. As close to concurrently as possible, {{grc-conj-aorist-1}} needs to be changed. If someone can do this while I'm online, I'd be quite happy to edit the template while the entry edits are taking place. Otherwise, I could do it beforehand, and then undo it, so that a simple further undo will make it right. Thanks very much. -Atelaes λάλει ἐμοί 22:46, 28 April 2013 (UTC)

Sure, I'll do it. Barring any problems, it should be done within the hour. —RuakhTALK 00:25, 29 April 2013 (UTC)
I've made the changes to the template, and sampled some of your bot's changes, all of which look correct. Thank you so much for your help with this! -Atelaes λάλει ἐμοί 01:09, 29 April 2013 (UTC)
Done. But for future reference — this turned out to be a really terrible way to do this. I ended up having to make a large number of error-prone manual edits. The problem is that the only safe way to run this sort of bot is to make successive conservative passes — you don't want to risk breaking things that don't meet your initial assumptions, so what you do is, you write the bot to validate its assumptions, and only make an edit if its assumptions are satisfied. (For example, in my initial pass I wasn't aware that there were instances of {{grc-conj-aorist-1|7=...|8=...|9=...|...}}, where all the numbers would need to be fixed.) You then examine cases that the bot skipped due to failed validations, and you increase the scope of cases it supports, and you run it again. Except that in this case, there was no straightforward way to distinguish between a template that had already been edited, and one that hadn't, so I had to do a lot of manual bookkeeping, which is naturally error-prone. And the cases where the template occurred multiple times on a page — it's just good luck that I happened to notice, before running the bot, that this was even a possibility. I didn't bother trying to run the bot on those, it would never have worked out right. But manual edits were error-prone, and it's quite likely that I missed a few instances. More generally — chances are very good that there are a few entries that are now broken, and there is simply no way to find them. (If this bothers you, feel free to go through the relevant edits of Ruakh (talk • contribs) and Rukhabot (talk • contribs) and click "rollback" on all of them, and we can try this again, with a more robust migration strategy.) —RuakhTALK 02:06, 29 April 2013 (UTC)
No, that's ok. I'm sure there were plenty of mistakes to begin with. Ancient Greek verb conjugation is hideously complex, and while I try very hard to get them all right, I know that I don't. Fortunately, I'm working on something to validate all of the forms, which will find any relevant errors. It's difficult to say exactly how long it'll take me with my phrenetic, off and on approach to Wiktionary, but I think it should happen eventually. Thanks again and sorry that it ended up requiring a lot of manual work. -Atelaes λάλει ἐμοί 02:16, 29 April 2013 (UTC)
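The "validate assumptions, otherwise skip" approach described above can be illustrated with a small Python sketch. It is not the bot that was actually run: `drop_parameter` is a hypothetical helper that takes the text between `{{` and `}}`, and it assumes the call contains no nested templates or links (such calls are skipped rather than guessed at).

```python
import re

# Matches an explicitly numbered argument such as "7=foo" or " 8 = bar".
NUMBERED = re.compile(r"^(\s*)(\d+)(\s*=)(.*)$", re.S)


def drop_parameter(call, index=7):
    """Return the call without parameter `index`, or None if it should be skipped."""
    if "{{" in call or "[[" in call:
        return None  # nested markup: a plain split on "|" would be unsafe
    name, *args = call.split("|")
    positional_count = sum(1 for a in args if "=" not in a)
    has_numbered = any(NUMBERED.match(a) for a in args)
    if has_numbered and positional_count >= index:
        return None  # ambiguous mix of styles: leave for manual review
    out, seen, found = [], 0, False
    for arg in args:
        m = NUMBERED.match(arg)
        if m:
            n = int(m.group(2))
            if n == index:
                found = True
                continue  # drop the explicitly numbered parameter
            if n > index:
                arg = f"{m.group(1)}{n - 1}{m.group(3)}{m.group(4)}"
        elif "=" not in arg:
            seen += 1
            if seen == index:
                found = True
                continue  # drop the positional parameter; later ones shift down
        out.append(arg)
    return "|".join([name] + out) if found else None
```

A `None` result means "do not edit". Note that this sketch shares the weakness mentioned above: a call that was already converted cannot be told apart from one that still needs converting, so a page-level record of what has been processed would still be required.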

Template:af-personal pronouns

Stupid question, but how do I make the cells for u and dit take up two rows in height, so they don't have to be repeated? —Μετάknowledgediscuss/deeds 04:50, 29 April 2013 (UTC)

Done. I also increased the collapsible box’s width, so it doesn’t require scrolling (feel free to revert). Also, consider using {{l-self}}. — Ungoliant (Falai) 10:29, 29 April 2013 (UTC)
Ah, width-fixing. That's what I forgot. Thank you! —Μετάknowledgediscuss/deeds 01:30, 30 April 2013 (UTC)

Main page spacing oddities

[Screenshot: mispadding.png]

On the main page (using Firefox 18.0.2 on OSX; see screenshot to the right):

  • The vertical space above "Behind the scenes" doesn't match the vertical space above the WOTD and FWOTD headers.
  • There's too little horizontal space to the left of "Community Portal" and "Discussion rooms".
  • There's more space between "Community Portal" (respectively "Discussion rooms") and the line beneath it than between Community Portal's description and "Discussion rooms", which visually associates things with each other incorrectly.

Does anyone know where/how these can be fixed?​—msh210 (talk) 17:46, 30 April 2013 (UTC)

These are caused by the border-collapse:collapse; CSS in the table. --Yair rand (talk) 18:29, 30 April 2013 (UTC)

Pages appearing empty

A few times now I've gone to an entry only to find the entire page empty. The page header would display, as well as the categories, but no actual content. The page source does contain everything, so this seems like a Javascript issue that is hiding everything. —CodeCat 21:58, 30 April 2013 (UTC)

This has happened to me too recently. Reloading the page has always restored the content. —Angr 10:24, 1 May 2013 (UTC)
Yes, reloading does bring it back. I've noticed it also happens on non-content pages like history pages. —CodeCat 15:51, 1 May 2013 (UTC)
It doesn't in Opera. When you reload, the content only appears for a split second, then it's back to headword+categories only. You have to go to the editing window to see what's there. --Thrissel (talk) 11:42, 5 May 2013 (UTC)
Has it ever happened outside the main namespace? --Yair rand (talk) 19:25, 1 May 2013 (UTC)
I haven't noticed it outside the main namespace (including page histories) yet; I'll keep an eye out for that. —Angr 19:36, 1 May 2013 (UTC)
bugzilla:47457 came to my mind (though that's about User pages on Commons), but unfortunately the comment so far is not yet very useful because it does not describe the problem well, e.g. mentioning the used browser and its version, and an example page to reproduce. --AKlapper (WMF) (talk) 10:55, 3 May 2013 (UTC)
I've only seen it happen on en.wiktionary though. Dakdada (talk) 13:03, 3 May 2013 (UTC)
I've also noticed on a few occasions that, if I load more than one page at a time and if one of the loaded pages appears empty, there is a high chance that the others will be empty as well. So there may be some connection there. —CodeCat 19:42, 3 May 2013 (UTC)
In the past ten days I've been keeping track of what namespace it happens in, and it has only been in the mainspace, including viewing diffs and previews in mainspace. I haven't seen it happen in any other namespace. —Angr 14:30, 11 May 2013 (UTC)
Happens to me as well. --Dan Polansky (talk) 19:28, 3 May 2013 (UTC)
This is still happening for me and it's quite annoying. Has anyone been able to find out more? —CodeCat 13:43, 11 May 2013 (UTC)
If someone technically adept could give an analysis of precisely how it's hidden, i.e. what item has been given the CSS property display: none, that would likely help in figuring this out. As with CodeCat, this is occasionally still happening to me as well, and it is damned annoying. I will, of course, post my findings the next time I see it, as I didn't think to do such an analysis previously. -Atelaes λάλει ἐμοί 14:04, 11 May 2013 (UTC)
Using the web console of Firefox, I have found that the empty pages contain the following error: "Exception thrown by ext.gadget.TabbedLanguages: newNode is not defined". I have the tabbed languages gadget enabled. I am using Firefox 20.0.1. --Dan Polansky (talk) 14:36, 11 May 2013 (UTC)
Ah ha! I had a feeling it might be tabbed languages. Whoever thought of that idea should be shot in the face. I'll take a look at the code, and see if I can figure it out. I suspect Yair will do the same when they see this, as might some of our other JS ninjas. Thanks Dan. -Atelaes λάλει ἐμοί 14:46, 11 May 2013 (UTC)
Apparently Common.js isn't always loading. Or perhaps it just loads after it's supposed to. If the former, this is going to cause huge problems regardless of the TL situation, as parts of the content (translations, quotations, etc.) are only accessible after Common.js runs. Either way, I suppose we can't rely on newNode being available everywhere, so I've copied the whole thing into the tabbed languages script. This should fix the page blanking, but not the other issues. Is anyone experiencing broken expandable tables? --Yair rand (talk) 17:36, 12 May 2013 (UTC)
I haven't experienced any, even while I was experiencing empty pages. —Angr 18:38, 12 May 2013 (UTC)

May 2013

Plural gender in User:Conrad.Irwin/creation.js

Currently, when a word is plural and also has a gender, this script generates things like {{head|xx|yyy|g=f|g2=p}}. In other words, it splits the gender in two, treating plural as a gender of its own. This has always been the normal behaviour and normal practice, but now there are also dedicated templates for this, {{f-p}} and so on. I created them in anticipation of the adoption of the replacement, Module:gender and number, which indicates genders in this fashion to avoid ambiguity (Specifically: Does f|p mean feminine and/or plural, or feminine plural? If we use f-p to mean the latter, then f|p is unambiguous.). However, although I was able to fix the translation editor, this script still creates entries the "old" way. And I'm not sure what to do to change it, because it still really confuses me every time I try to change anything. Can someone else help out please? —CodeCat 23:12, 3 May 2013 (UTC)
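The change being asked for amounts to folding a separate "p" into the gender codes before the parameters are written out. The sketch below shows that logic in Python rather than in the script's own JavaScript. `combine_genders` and `head_params` are hypothetical names, and the sketch assumes that a "p" in the old output always means "plural of the genders listed", which is the reading the creation script intends.

```python
def combine_genders(old):
    """Turn old-style codes such as ["f", "p"] into combined codes such as ["f-p"]."""
    genders = [g for g in old if g != "p"]
    if "p" not in old:
        return genders
    # Attach the plural marker to every gender; a bare "p" stays as it is.
    return [g + "-p" for g in genders] or ["p"]


def head_params(genders):
    """Format codes as g=, g2=, g3= ... parameters for a headword template."""
    return "|".join(
        ("g" if i == 0 else f"g{i + 1}") + "=" + g for i, g in enumerate(genders)
    )
```

For the example above, `head_params(combine_genders(["f", "p"]))` yields `g=f-p` instead of `g=f|g2=p`.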

Adding Template:delink to Template:trans-top

Am rather sick of removing links from {{trans-top}}, can we implement delink to it? It's Lua-based so I guess it should be quick, right? Mglovesfun (talk) 20:39, 4 May 2013 (UTC)

Done. See User:Mglovesfun/sandbox, code seems to work for all link generating templates and not just square brackets. Mglovesfun (talk) 09:42, 7 May 2013 (UTC)

Module:si-translit - Sinhalese (Sinhala) transliterator

I've made a simple Sinhalese (Sinhala) transliterator. It does a very simple conversion but not too accurate. E.g. transliteration of සිංහල (siṁhala): {{#invoke:si-translit|tr|සිංහල}}: (currently produces "si ̃hl")

It ignores the inherent vowel "a", which is attached to every consonant (like Hindi and other abugida Indic languages). I have asked User:ZxxZxxZ, who has kindly fixed the module for Hindi Module:hi-translit (it still has some flaws, it uses too many "a"'s) but he is unavailable. User:CodeCat is also busy at the moment.

The transliteration rule is simpler in Sinhalese: consonants are transliterated as they are in the table, with an "a" added unless they are followed by a vowel diacritic or the inherent vowel "a" is "killed" by "hal kirīma". So final consonants (unlike e.g. Hindi) still have an "a".

Could someone with a knowledge of Lua fix the module, so that all consonants have an "a" when not followed by a diacritic or "hal kirīma", please? You don't need to understand the Sinhalese script. Diacritics are a bit hard to work with, though. I used SC Unipad (free download) to work with diacritics - it breaks any string into components.

It's not urgent but would be great if someone could make it work, hopefully in a way so that the logic could be reused for other abugida scripts. Even Google Translate can't handle Sinhalese. A minor issue is anusvara, which marks nasalisation, which works similar to Hindi, see Wiktionary:Hindi_transliteration#Nasalisation. I used ̃ as a temporary solution. --Anatoli (обсудить/вклад) 06:24, 7 May 2013 (UTC)
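The inherent-vowel rule requested above can be sketched as follows. This is a Python illustration of the logic only, not a fix for Module:si-translit: the tables cover just a handful of characters and the names `CONSONANTS`, `VOWEL_SIGNS`, `OTHER` and `transliterate` are made up for the example. The same loop should carry over to Lua once the string is iterated by code point.

```python
# Tiny illustrative subset of the Sinhala block; a real table would be much larger.
CONSONANTS = {
    "\u0d9a": "k", "\u0db1": "n", "\u0db8": "m",
    "\u0dbd": "l", "\u0dc3": "s", "\u0dc4": "h",
}
VOWEL_SIGNS = {"\u0dcf": "\u0101", "\u0dd2": "i", "\u0dd3": "\u012b", "\u0dd4": "u"}
HAL_KIRIMA = "\u0dca"          # sign that suppresses the inherent vowel
OTHER = {"\u0d82": "\u1e41"}   # anusvara, written here as m with dot above


def transliterate(text):
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt == HAL_KIRIMA:
                i += 1                       # vowel killed: emit nothing more
            elif nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])  # explicit vowel replaces "a"
                i += 1
            else:
                out.append("a")              # inherent vowel, also word-finally
        else:
            out.append(OTHER.get(ch, ch))    # pass unknown characters through
        i += 1
    return "".join(out)
```

Because the rule only looks one character ahead, the same structure could be reused for other abugidas by swapping the tables, although scripts that drop the final inherent vowel (as Hindi does) would need an extra rule.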

Merging existing Wiktionary with imported entries

An existing dictionary is being made freely available, and now we need to import its entries into Wiktionary. But how? An example of an entry in XML format is shown at sv:Wiktionary:Teknikvinden#En tillgänglig ordlista i XML-format. It is easy enough to convert that XML to wiki markup. If there is no article, it can just be created. But when Wiktionary already has an article, the result needs to be merged with the existing Wiktionary entry. Are there any good tools or best practices for that? --LA2 (talk) 11:59, 8 May 2013 (UTC)

Probably bots. — Ungoliant (Falai) 14:55, 8 May 2013 (UTC)
Are you proposing adding these to the English wiktionary? I can take a look at it. DTLHS (talk) 02:10, 13 May 2013 (UTC)
In this case, all definitions are given in Swedish, so an import to Swedish Wiktionary is the primary goal. What I'm looking for is experience of merging such databases into Wiktionary. It might be easy to translate all definitions to English, for import here, and then we'd have the problem of merging the resulting list into English Wiktionary. But if nobody has ever done such a thing before, I'll have to develop my own methods from scratch. --LA2 (talk) 20:56, 14 May 2013 (UTC)
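One conservative way to approach such an import is to let the bot decide, per entry, between creating a page, appending a new language section, or leaving the page for a human. The Python sketch below shows only that decision step. It is not an existing tool: `plan_import`, the `pages` dictionary standing in for the wiki, and the section headers used in the example are all hypothetical, and converting the XML to wiki markup is assumed to have happened already.

```python
def plan_import(title, section, header, pages):
    """Decide what to do with one imported entry.

    title   -- page title of the imported headword
    section -- wiki markup generated for it, starting with the language header
    header  -- the language header line to look for, e.g. "==Svenska=="
    pages   -- mapping of existing page titles to their current wikitext
    """
    current = pages.get(title)
    if current is None:
        return ("create", section)
    if any(line.strip() == header for line in current.splitlines()):
        # The language section already exists: merging definitions needs a human.
        return ("review", current)
    return ("append", current.rstrip("\n") + "\n\n" + section)
```

The "review" cases are the real merging problem. A bot run along these lines would only reduce the manual work to those pages, and the sketch ignores details such as where a new section belongs among existing language sections.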

template:etyl black links

Why does {{etyl|kld|-}} generate Gamilaraay, a black link, instead of a link to w:Gamilaraay language? It certainly is not a well-known language. DCDuring TALK 20:31, 8 May 2013 (UTC)

A "black link"? You mean just a word written in black with no link at all? Because a while back the decision was made never to link language names regardless of how well known the language is deemed to be. —Angr 21:33, 8 May 2013 (UTC)
I'm actually wondering whether we shouldn't link the language all the time. I have occasionally come across etymologies that feature a language I haven't heard of, and a link to its definition would have been convenient. —CodeCat 21:46, 8 May 2013 (UTC)
I stole a line of code from Ruakh (I think from User:Ruakh/common.css) to always make them linked. I'd happily have that code work for everyone. Mglovesfun (talk) 13:57, 9 May 2013 (UTC)
I don't think CSS can make links out of nowhere, though. It must have been a JavaScript, but that doesn't seem like a proper solution to apply to the whole wiki. —CodeCat 14:11, 9 May 2013 (UTC)
It's in Wiktionary:Per-browser preferences, and yes, it depends on JavaScript. I've previously raised the possibility of turning it on for everyone (which wouldn't depend on JavaScript), and no one objected, but then I never actually got around to making the change. (Mea culpa.) If you are inclined to do so, please do! —RuakhTALK 05:19, 13 May 2013 (UTC)

it seems there is a bug

I am not a programmer, so I do not know if it is a bug. In thermos, the translation section is strange. When I entered a Uyghur word, instead of the correct چايدان (ug) (chaydan), it shows چايدان (ug) (chaydan){{#switch:ug|ru= ({Turkish --Hahahaha哈 (talk) 14:57, 10 May 2013 (UTC)

I suspect it was this edit to {{t}} that has caused the trouble. —Angr 15:09, 10 May 2013 (UTC)
I seem to have fixed it, but would appreciate a double-check from people who are better at templates than I am. —Angr 15:15, 10 May 2013 (UTC)

User:Hippietrail's recent additions to Template:l

I'm not really sure what motivated this sudden addition of even more parameters to this template. Their explanation was "it's to solve the long-standing problem where you can't associate several orthogonal chinese/japanese/korean variants to the same transliteration in a single entry" but I never considered that to be a problem. Has it been a problem for anyone else? And in any case, I really don't agree with their solution - two more parameters. Especially with Lua, we don't need these kinds of solutions at all, we can do better. —CodeCat 01:20, 13 May 2013 (UTC)

All additions are sudden. I've been annoyed by this missing functionality for a decade. Are you suggesting non-sudden additions such as submitting single character changes one by one? Or is "sudden" some kind of rhetoric or weasel word?
When you say you never considered the ugly workarounds for using {{l}} with Chinese, Japanese, and Korean a problem, does this mean that you have been using {{l}} with these languages at all?
In my talk page you used the terminology "does more harm than good" but you haven't touched on that topic here.
At last some sensible talk. I'm very much in favour of Lua solutions but I haven't started Lua hacking yet. As a programmer, I consider a lot of template stuff to be ugly hacks that could be elegantly solved in Lua, though I am a bit worried about people that want to parse our dump files to use our data in their own projects.
Is there a general consensus to halt all development on templates and move to Lua? I'd love to hear about it.
By the way I'm very keen to have a much less "turn taking" chat about all this stuff in the IRC channel, #wiktionary on irc.freenode.net — hippietrail (talk) 01:31, 13 May 2013 (UTC)
It’s not sudden when you discuss the change with the community. — Ungoliant (Falai) 01:37, 13 May 2013 (UTC)
What I mean is that nobody else has ever had this problem, or at least not that I am aware. The current solution is to use two instances of the template; has that given any problems at all? What I meant by "more harm than good" is that it's better to discuss possible approaches to the problem rather than just picking one that might in the end not be very workable at all. The last thing {{l}} needs is more parameters; it and many other templates are already rather complex because they seem to be designed to cater to every possible situation. {{l}} isn't even the worst case, {{term}} is far worse with its lit= and pos= parameters. There is a general tendency to be like "this template doesn't do quite what I need... let's add another parameter!", but without much regard for whether it makes sense. Short term goals give long term problems. —CodeCat 01:37, 13 May 2013 (UTC)
I consider using two instances of the template to be a problem. I'm sure I'm not the only one that's been annoyed by that though I'm not sure why you might expect to be aware of it. Two instances of the template mean neither the wikitext, the HTML, nor the DOM associate the text as a cohesive unit for anybody trying to use our data. For human users there is confusion over whether to add the transliteration or other parameters to each instance, just the first, just the last, whether to put one inside parentheses, etc. At least one alternative template has been created to work around this for the Korean language, {{ko-inline}}. I'm not sure if you were or should've been aware of that template. I'm personally not sure whether there are other such workaround templates for Chinese, Japanese, or any other language. I wouldn't be surprised and I wouldn't be surprised if they all work differently and render differently and that the same people rarely use more than one of them. A consistent solution would be best.
I also dislike that we have these complex templates. In the early days I argued against using more of them but I lost the argument and became one of the ones implementing templates, way back when, though I haven't hacked them for some time now. Where they have been put into long-standing use though it's better to have capable templates.
I think you have to justify your stance that more parameters is inherently bad. Why is having more "the last thing it needs"? I actually don't find this template very complex, it's mostly inline where many other templates suffer from deep nesting.
I've already mentioned several times that I've been annoyed by this missing functionality for a decade so I find your use of the terminology "short term goals" to be quite disingenuous and you haven't elucidated what these "long term problems" are.
So let me put it to you: are you aiming to simplify templates by removing parameters that have "too many"? or are you advocating halting any further development of templates? Or do you have a Lua proposal to replace this template?
One more important question: Do you believe templates are to give editors less typing, or to bring uniformity to Wiktionary, or both? I personally feel that standardization is best for templates and I've seen them move that way. Parameters with the same name and same function have spread though there are still exceptions for instance.
irc://irc.freenode.net/wiktionary / irc://#wiktionary@irc.freenode.net — hippietrail (talk) 02:01, 13 May 2013 (UTC)
  • @Hippietrail, might I suggest that you use / tweak {{ja-l}}, if your concerns are specific to Japanese? This offloads much of the problem from the much-used (and thus somewhat more complicated-to-change) {{l}}.
If you're looking for something that could be used for Chinese and / or Korean as well, perhaps use the code from {{ja-l}} to create {{cmn-l}} and {{ko-l}}? Cheers, -- Eiríkr Útlendi │ Tala við mig 03:31, 13 May 2013 (UTC)
As I mentioned there is already {{ko-inline}}. Are you really saying that having one template per language with unrelated names and dissimilar usage is a good thing? They certainly impair discoverability. Would you advocate that general purpose templates generally should be avoided in favour of developing separate templates for each language that users must learn? Surely uniform template names, uniform template parameters, uniform template use across languages is going to lower the barrier to entry. Does anybody know if there is a Chinese-specific equivalent to {{l}}, {{ko-inline}}, and {{ja-l}} - or for any other language for that matter? If disparate template names for individual attempts at solving the same problem is a good thing, how do we find those templates for the language we are not familiar with? — hippietrail (talk) 06:44, 13 May 2013 (UTC)
There seems to be a category for templates such as these, Category:Internal link templates - I've gone ahead and added the Korean and Japanese templates to it, I encourage anybody else to add such templates to the category that they know of or discover. — hippietrail (talk) 06:52, 13 May 2013 (UTC)
For consistency with all the other language-specific {{l}} templates, shouldn't {{ko-inline}} and {{ja-l}} be renamed {{l/ko}} and {{l/ja}}, or at least have redirect from those titles? Actually, upon preview I see we already have {{l/ja/Jpan}}; do we really need both that and {{ja-l}}? —Angr 12:15, 13 May 2013 (UTC)
The {{l/ja}} template should be a subset of {{l}}. Specifically, it should be possible to replace {{l/ja}} with {{l}} at any time without breaking anything. —CodeCat 14:17, 13 May 2013 (UTC)
  • @Hippietrail, the issues you're dealing with (as best I understand them) involve special considerations required for CJK languages, and not for any others. Given that your changes to {{l}} in an attempt to code in these considerations have apparently caused some kind of problem, it might make sense to instead have some other related template that builds in these considerations. If you can code something that plays well for all CJK languages, then great; if C or J or K has some specific coding need that makes things difficult, then create something specific for that one. I'm not a big fan of one-size-fits-all, because invariably it doesn't.
{{l}} is complex, and used far and wide, so any changes made to that must not adversely affect those entries using {{l}}. Moreover, perusing the source for {{l}}, it looks like this template would be significantly more processor-intensive than {{ja-l}}. There are various reasons that could warrant creating a different template to handle different specific needs.
Whether that template is called {{ja-l}} or {{l/ja}} or {{cjk-inline}} or what have you, is a different matter entirely, and not one I have any strong opinions about. (Though CodeCat's point about {{l/ja}} certainly makes sense.) Cheers, -- Eiríkr Útlendi │ Tala við mig 20:56, 13 May 2013 (UTC)

Beta Code, Anyone?

I've accumulated a text file with all the character sequences necessary to generate all possible lemma forms used by Perseus for their Liddell, Scott & Jones Greek Lexicon. Eventually I plan to try my hand at Lua programming with a module to replace most of the innards of {{R:LSJ}} with code that translates the headword of a Greek entry directly to the format Perseus expects, without requiring editors to type the sequence in by hand.

In the meanwhile, though, it occurred to me that the encoding they use, w:Beta Code, would make a very handy alternative input method. Beta code was designed to represent all the characters in the Greek polytonic script- including accents, breathings, iota-subscripts, macrons, breves, and other diacritics, in any combination- using only sequences of characters found on every standard keyboard. It's used by the w:Thesaurus Linguae Graecae project, so you know it has to be pretty complete.

For example, the Ancient Greek noun εἰδωλολάτρης (eidōlolátrēs, idol-worshipper) can be entered using "ei)dwlola/trhs", and Ἑβραῖος (hebraîos) using "*(ebrai=os". Once you learn the conventions, it's relatively easy to type in any Greek word without using menus, palettes, etc. Many of the characters used overlap with wiki syntax, but it shouldn't be hard to keep the Beta Code input from contaminating the wikitext. There are also Beta Code versions for other scripts such as Hebrew, but I haven't used them and can't vouch for their usefulness.

I don't have the necessary knowledge of Javascript or of our user interface to create it myself, but I thought I would mention it in case anyone who does might feel so inclined. There's a pretty comprehensive manual linked to from the WP page which should have all the information you need. Thanks! Chuck Entz (talk) 02:15, 13 May 2013 (UTC)

You might want to mention this to the developers of the jQuery IME (mw:Milkshake). --Yair rand (talk) 03:03, 13 May 2013 (UTC)
For starters, I applaud this effort. It would be nice to not have to figure out the code. For what it's worth, if I were to take on this project, I would use javascript, and have it pull up the pagetitle, convert it into beta code, find the edit window, find the empty template, and insert the code into it. This is probably because I'm comfortable with JS, and have no experience with Lua. In any case, I support whatever approach works, and am happy to provide any assistance that I can. -Atelaes λάλει ἐμοί 01:02, 15 May 2013 (UTC)
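As a rough illustration of the input-method direction (Beta Code in, polytonic Greek out), here is a minimal Python sketch. It covers only the basic letters, the common diacritics and the `*` capital marker, decides final sigma with a simple look-ahead, and relies on Unicode NFC normalisation to compose the accented letters. `beta_to_unicode` is a made-up name, and the full Beta Code manual defines many more symbols than are handled here.

```python
import unicodedata

# Basic Beta Code letters and their Greek equivalents (lower case).
LETTERS = dict(zip("abgdezhqiklmncoprstufxyw", "αβγδεζηθικλμνξοπρστυφχψω"))

# Beta Code diacritic signs mapped to Unicode combining characters.
DIACRITICS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex (perispomeni)
    "|": "\u0345",   # iota subscript
    "+": "\u0308",   # diaeresis
}


def beta_to_unicode(beta):
    s = beta.lower()
    out = []
    i, n = 0, len(s)
    while i < n:
        ch = s[i]
        if ch == "*":
            # Capital: diacritics are written between "*" and the letter.
            i += 1
            marks = []
            while i < n and s[i] in DIACRITICS:
                marks.append(DIACRITICS[s[i]])
                i += 1
            if i < n and s[i] in LETTERS:
                out.append(LETTERS[s[i]].upper())
                i += 1
            out.extend(marks)
        elif ch == "s":
            # Medial sigma before another letter, final sigma otherwise.
            nxt = s[i + 1] if i + 1 < n else ""
            out.append("σ" if nxt in LETTERS else "ς")
            i += 1
        elif ch in LETTERS:
            out.append(LETTERS[ch])
            i += 1
        elif ch in DIACRITICS:
            out.append(DIACRITICS[ch])
            i += 1
        else:
            out.append(ch)  # spaces, punctuation, anything unrecognised
            i += 1
    return unicodedata.normalize("NFC", "".join(out))
```

The reverse direction (headword to Beta Code, as needed for the Perseus links) could be built on the same two tables by decomposing the headword with NFD first.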

Transwiki from Wikipedia

Hi there. Way back in October I nominated w:A leopard doesn't change its spots article for deletion, and the result of the discussion was transwiki to Wiktionary. The problem is that the transwiki tag (This page will be copied to Wiktionary using the automated transwiki process) has been on the article ever since, and no transwiki has taken place. I was told at the Village Pump that it is in fact not an automated process, and that it needs to be imported by admins over here. Although there is already a page for it over here, what happens next? Richard BB (talk) 08:50, 14 May 2013 (UTC)

I'd say replace it with w:Template:Wiktionary redirect. Since Wiktionary already has a page for it, we don't need the transwiki. I really wish people at Wikipedia would check whether something is present at Wiktionary before knee-jerk-reacting "Transwiki to Wiktionary" at AFD discussions. Wiktionary isn't Wikipedia's trashcan. —Angr 12:29, 14 May 2013 (UTC)
Of course Wiktionary isn't bound by the result of a Wikipedia AFD. Given the poor quality of the page in question and that we already have it under a more correct title, either delete it or soft redirect. Mglovesfun (talk) 08:30, 15 May 2013 (UTC)

Any MediaWiki hackers here?

I'm trying to help a dictionary project find some technical help in adapting MediaWiki for their use. They specifically asked if I knew anyone with experience on Wiktionary. I know that User:Conrad.Irwin has some knowledge, but instead of spamming him (and a few other likely looking candidates here), I thought I would just ask: Is anyone interested in a small gig like this? (Apologies if this is inappropriate for here.) -- MarkAHershberger(talk) 00:33, 15 May 2013 (UTC)

Lua - help required

I've programmed as a sideline for 35 years (Fortran, BASIC, C (not C++), Turbo Pascal) - but I'll still have the odd syntax problem with Lua! Please can someone tell me why function p.test1 in Module:User:Saltmarsh doesn't work. thanks — Saltmarshαπάντηση 05:46, 15 May 2013 (UTC)

What about it doesn't work exactly? —CodeCat 10:27, 15 May 2013 (UTC)
I received a "script error" message. — Saltmarshαπάντηση 15:58, 15 May 2013 (UTC)
el-translit’s tr function receives a frame as a parameter, not a string. — Ungoliant (Falai) 11:46, 15 May 2013 (UTC)
Thanks - onwards and upwards! — Saltmarshαπάντηση 15:58, 15 May 2013 (UTC)

Special redirects

I created a couple of special redirects that were deleted. It was suggested on my talkpage that I come here to see what folks thought about their utility. You can read about the issue on my talkpage, but essentially, the idea is to track usage of links from Wikipedia to Wiktionary. Currently, most links from WP to WT are via template, and there seems to be no direct way to track number of clicks on that kind of interwiki template. My solution was to create special redirects on WT: i.e., have "brand new (redirect)" redirect to "brand new", and then create a link from the Wikipedia page to the special redirect. In a month, we could look at the traffic stats for "brand new (redirect)", compare it to the stats for "brand new", and know how many people looked up "brand new" via Wikipedia versus other sources. The special redirects are designed to be few in number, temporary in nature (no more than a couple of months just to get the needed data), and essentially invisible to the reader. I know that redirects are highly unfavored at WT, but I thought this might be a worthwhile use of them. Thoughts? Dohn joe (talk) 17:58, 15 May 2013 (UTC)

Does MediaWiki not have web traffic statistics with referrer info? Michael Z. 2013-05-15 18:46 z
Good question. I've never dealt directly with MediaWiki. How would I go about doing so? Dohn joe (talk) 19:05, 15 May 2013 (UTC)
I would certainly like to know about the results of such research for a goodly sample of entries. DCDuring TALK 19:12, 15 May 2013 (UTC)
Here are some links, but I can’t figure out where the logs are:
 Michael Z. 2013-05-15 20:16 z
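
The comparison proposed at the top of this thread comes down to a single ratio. Below is a minimal Python sketch, assuming the traffic statistics count hits on "brand new (redirect)" separately from direct hits on "brand new" (nothing in this thread confirms that). The function name and the view counts are made up for illustration.

```python
def wikipedia_share(redirect_views: int, direct_views: int) -> float:
    """Return the fraction of lookups that arrived via the tracking redirect.

    Assumes the two counts are disjoint: a hit on the redirect title is
    not also counted under the target title.
    """
    if redirect_views < 0 or direct_views < 0:
        raise ValueError("view counts must be non-negative")
    total = redirect_views + direct_views
    if total == 0:
        raise ValueError("no page views recorded")
    return redirect_views / total


# Hypothetical monthly counts, for illustration only.
share = wikipedia_share(redirect_views=120, direct_views=880)
print(f"{share:.1%} of lookups came via Wikipedia")
```

If the statistics tool instead folds redirect hits into the target's total, the denominator should be the target's count alone rather than the sum.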

Merge {{trans-top}}, {{trans-mid}}, {{trans-bottom}}

With Lua it seems like it should be possible to merge these templates into a single template: {{temp|trans-table|(translations go here)}}. This would allow us to automatically do various kinds of validation (language matching, nesting, etc.) and cleanup, and it would eliminate the need to manually balance the columns. Thoughts? DTLHS (talk) 20:07, 15 May 2013 (UTC)

If this is being revised, why not use CSS column-width, and dispense with all of the unnecessary tables and brute-force column balancing? Michael Z. 2013-05-15 20:19 z
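
The validation and column balancing discussed in this thread can be sketched outside MediaWiki. The following Python illustrates the logic only; it is not Lua and not an existing module. `LANGUAGE_NAMES`, `balance_columns` and `mismatched_languages` are hypothetical names, and the entry format (language name, language code, term) is an assumption.

```python
# Hypothetical name table; a real module would draw on Wiktionary's own
# language data instead of a hard-coded dictionary.
LANGUAGE_NAMES = {"fr": "French", "de": "German", "es": "Spanish"}

Entry = tuple[str, str, str]  # (displayed language name, language code, term)


def mismatched_languages(entries: list[Entry]) -> list[Entry]:
    """Return entries whose displayed name does not match their code."""
    return [e for e in entries if LANGUAGE_NAMES.get(e[1]) != e[0]]


def balance_columns(entries: list[Entry], columns: int = 2) -> list[list[Entry]]:
    """Split entries, in order, into columns whose lengths differ by at most one."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    base, extra = divmod(len(entries), columns)
    result = []
    start = 0
    for i in range(columns):
        # The first `extra` columns each take one additional entry.
        size = base + (1 if i < extra else 0)
        result.append(entries[start:start + size])
        start += size
    return result


translations = [
    ("French", "fr", "tout neuf"),
    ("German", "de", "brandneu"),
    ("Spanish", "fr", "flamante"),  # deliberately mislabelled code
]
print(mismatched_languages(translations))
print([len(col) for col in balance_columns(translations)])
```

With CSS columns, as suggested in the reply above, the balancing step would be left to the browser and only the validation would remain for the template.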