Wiktionary:Beer parlour


Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!


October 2015

Templates for place names

I created a few templates for place names; they should be able to generate standardized definitions for them in all languages. I've been using these templates for entries of places in Brazil only, but this system should be usable for other countries by copying and adapting the existing templates.

I chose the format of "municipalities of São Paulo, Brazil" to copy Category:en:Municipalities of China and others. (see Place names and Earth modules for a complete list of place name categories; I should mention I've found some different, inconsistent naming formats that could hopefully be fixed eventually)

The templates I created:

Main template:

Known issue:

  • These templates generate simple standardized definitions like "A municipality in São Paulo, Brazil." They still lack the functionality of linking from states to state capitals, and vice-versa. I plan on implementing that feature soon.

Thoughts? --Daniel Carrero (talk) 13:14, 1 October 2015 (UTC)

I was expecting something more like {{surname}} and {{given name}}. —CodeCat 13:22, 1 October 2015 (UTC)
How so? --Daniel Carrero (talk) 13:26, 1 October 2015 (UTC)
Like {{municipality|São Paulo|Brazil|lang=pt}}. —CodeCat 20:05, 1 October 2015 (UTC)
I did start to develop something like this at User:Daniel Carrero/place (while it is just a stub, it did work perfectly in this revision, see the code). But, in my opinion, I'd rather use hard-wired full definitions (no matter if they are stored in MW code or Lua) like "municipality of São Paulo, Brazil" for this reason: if we use parts of definitions and allow people to use these parts in any way they see fit, then it would be potentially impossible to make the whole system consistent (judging by all the current un-templatized entries for place names, which are inconsistent at various levels, from whether entries are categorized at all to the internal logic of the category naming system itself!):
  1. There would be definitions like {{city|Florianópolis|Brazil}} (basically all second-level subdivisions in Brazil are "municipalities", presumably that's why multiple Wikipedias and Wiktionaries use "municipality" categories; yet I found many of those to be randomly defined as "cities" or "towns", which could mess up the categorization and definitions).
  2. There would be too much freedom to change levels, like {{municipality|São Paulo|Southeastern Region|Brazil}} with a "Southeastern Region" in the middle.
  3. And also just {{municipality|Florianópolis|Brazil}} does not account for the fact that Florianópolis is a state capital unless you add another parameter.
I don't mind changing the system, but since the current system restricts each of the full definitions and associates them with categories individually, it does not have any of the aforementioned limitations, so I would like any other proposed system to be safe from all these problems as well. --Daniel Carrero (talk) 20:32, 1 October 2015 (UTC)

Update: Rather than adding new parameters to the previous templates, I created different templates for capitals because I needed more specific definitions and categories (they use 2 categories each: state capitals of Brazil; and municipalities of each state). I've thought of something along the lines of {{place:capital of São Paulo, Brazil}} or {{place:São Paulo (capital)}}, but that could change. They work well. The only problem I fear is having too many different templates, though that seems manageable. Naturally, I am open to different suggestions, such as using Lua or fewer templates with more parameters, but first I have one thing to say: The current system is simple and intuitive enough (at least, that's my opinion) and very customizable in case other countries have different needs (like provinces instead of states, or more or fewer comma-separated levels). For this reason, I'd suggest waiting some time before attempting to merge the current templates into any more condensed model, because that might not work for all countries. In the meantime, I plan to continue using these templates for new entries. As usual, I also request feedback from other people. --Daniel Carrero (talk) 19:14, 1 October 2015 (UTC)

Update: Using fewer templates for Brazil: I'm deleting state-specific and (oh God.) city-specific templates in favor of {{place:Brazil/municipality}} and {{place:Brazil/state capital}} (the last one I'm going to create in a moment.) --Daniel Carrero (talk) 08:19, 2 October 2015 (UTC)
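As a rough illustration of the hard-wired approach discussed in this thread, the Python sketch below ties each supported kind of place to one definition and a fixed set of categories. The function name brazil_place, the wording of the state-capital definition, and the capitalization of the capitals category are assumptions made for illustration; the actual templates are written in wikitext and may behave differently.

```python
# Hypothetical sketch: one hard-wired definition and category set per kind of place.
# Category names follow the "Municipalities of <state>, Brazil" pattern used on-wiki.

def brazil_place(kind, lang, state):
    """Return (definition, categories) for a place in a Brazilian state."""
    municipalities = f"Category:{lang}:Municipalities of {state}, Brazil"
    if kind == "municipality":
        return f"A municipality in {state}, Brazil.", [municipalities]
    if kind == "state capital":
        # Capitals get two categories: state capitals, plus municipalities of the state.
        capitals = f"Category:{lang}:State capitals of Brazil"
        # The wording of this definition is an assumption.
        return f"The capital of {state}, Brazil.", [capitals, municipalities]
    # Any other kind is rejected, so free-form levels cannot be introduced.
    raise ValueError(f"unsupported kind of place: {kind!r}")
```

Because only two kinds are accepted, a call such as brazil_place("city", "en", "Santa Catarina") fails instead of silently producing an inconsistent definition, which is the property argued for above.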

Should all foreign-language place names have counterparts in English?

I've created most of the 853 entries for municipalities (a.k.a., cities/towns) of Minas Gerais (a state of Brazil) in Portuguese. See Category:pt:Municipalities of Minas Gerais, Brazil. I am thinking of doing the same to fill the English category completely as well. See Category:en:Municipalities of Minas Gerais, Brazil.

Should all place names in foreign languages have counterparts in English? Surely many of those (or all of those?) are citable anyway. I've done a cursory search for small Brazilian towns on Google Books and have so far found all of them to be citable in English. Random example: Comercinho is citable in this book. What do you think? --Daniel Carrero (talk) 08:33, 2 October 2015 (UTC)

Might as well. I doubt anyone would stop you. --Zo3rWer (talk) 12:58, 2 October 2015 (UTC)
How can you be so sure that every single placename in every language has an English translation? Making all FL placenames link to English by default was not a good idea. — Ungoliant (falai) 15:42, 3 October 2015 (UTC)
Of course they have an English name. How else would English speakers refer to it? —CodeCat 15:47, 3 October 2015 (UTC)
I doubt that some little village up in a Chinese mountain has an English name. It will certainly have an English transliteration - but we don't accept those. SemperBlotto (talk) 16:04, 3 October 2015 (UTC)
Attestation in running English text is a prerequisite for an English entry, as per CFI. --Dan Polansky (talk) 16:13, 3 October 2015 (UTC)
Place names are not words, they're designations. If a person comes along and introduces themselves as Katherine, then the other party must use that name to refer to her. And that's true cross-linguistically; speakers of all languages must use that name, because that's what she said her name is. The same would apply to a random village in China. It can be assumed that foreign speakers will adopt the name that locals use, because that is the name of the village. Of course, there's exonyms, but that's a different story: with exonyms, there is a name, but certain speakers decide to use another. When there is no known name, it must be assumed that there is still at least one name. —CodeCat 17:33, 3 October 2015 (UTC)
That sounds almost like an argument to make such entries Translingual rather than pin them down to a specific language. —Aɴɢʀ (talk) 17:54, 3 October 2015 (UTC)
Maybe that's a good idea. —CodeCat 17:59, 3 October 2015 (UTC)
Names as "translingual" would seem to make it pretty difficult to deal with even things like Москва (Moskva)/Moscow, let alone more divergent cases like Köln/Cologne. And in case you're only suggesting this approach for placenames that only exist in one language: suppose that we discover that a tiny Chinese village does have a distinct name in a minority language spoken nearby; would this turn it from translingual back into (Mandarin) Chinese? I'd say treat placenames as particular to the local language, and attestations in other languages as citation loans, unless there's some kind of evidence to the contrary (e.g. for a pronunciation or spelling particular to English). --Tropylium (talk) 18:17, 3 October 2015 (UTC)
Single-word place names (London, not New York) either are words per me and John Stuart Mill or in any case they behave like words: they get written down using an alphabet, get pronounced, inflected and have an etymology. Having them as translingual very often does not work since they get language-specific inflection. But even if place names somehow were not "words", they still get included and regulated by CFI, and the criterion of attestation applies to them. We even had this vote Wiktionary:Votes/pl-2010-05/Placenames with linguistic information 2, so I do not think I am in a minority in thinking they need to be attested. --Dan Polansky (talk) 19:08, 3 October 2015 (UTC)
Then does this mean that many of the articles for places that exist in English Wikipedia actually have titles and describe places in running text, when their names are not words usable in English according to our own standards? I find that a bit bizarre. Let's take w:Tegal Buleud as a random obscure place. The article uses the name of the place twice in running text. Is "Tegal Buleud" not English if it's used in this way? If not, then what would make it English? —CodeCat 22:55, 3 October 2015 (UTC)
It's not that "their names are not words usable in English", it's that their names are not attestable in English by our standards. --WikiTiki89 15:54, 5 October 2015 (UTC)
I understand that, but then this does mean that our mission statement is false. We don't include all words, just the attestable ones. —CodeCat 16:18, 5 October 2015 (UTC)
And that's always been the case. --WikiTiki89 16:23, 5 October 2015 (UTC)
What I do (with Italian placenames) is generate an Italian entry then, if I can find an English translation, I add an English entry for the translation. If I can't find a translation, I have been known to add an English section if it is a well-known place. SemperBlotto (talk) 18:23, 3 October 2015 (UTC)
For Spanish places, I usually add an English and a Spanish, but sometimes just a Spanish or just an English. This is pure laziness on my part, but I suppose in theory we could have the same entry in loads of languages. I know that the French Wiktionnaire often does this - see [1] as an example of (excessive?) repetition. --Zo3rWer (talk) 10:31, 4 October 2015 (UTC)
Place names are designations, but these designations are words. There should be a section for a language only if the word is used in the language: attestations are required. These sections may be very useful, especially for pronunciation (have you ever heard of another dictionary with a pronunciation given for place names from all over the world?), homophones, examples/citations, usage notes, derived words, gentilics, anagrams, etc. In the example given above, most sections are repetition, sure, but these sections will be completed with time. Lmaltier (talk) 20:20, 5 October 2015 (UTC)
  • I don't see any reason to forbid creating English entries for foreign-language place names. I also don't see a need to rush out and create them all immediately. People can create them from time to time, though. Purplebackpack89 22:12, 5 October 2015 (UTC)
    Yes, especially when they see them used in the language. It's the same for Italian place names used in Spanish, etc. Lmaltier (talk) 17:58, 6 October 2015 (UTC)

Place name format: English (non-gloss) vs. foreign language (translation + gloss)

As I said in the discussions above, I created a few placename templates. I'd like to discuss the results as they appear in the entries. See the entry Ouro Preto (a municipality in Brazil). It is currently defined in 3 languages using the same template ({{place:Brazil/municipality}}). I checked on Google Books; it's attestable in all three.

If the language is English, the definition is formatted as a main (non-gloss) definition, and if it's a foreign language, it's formatted as an English translation (linking back to the English entry or section) + {{gloss}}. It looks good IMO in the entry I linked, and it's a consistent system overall, but it also causes a problem: the translation still points back to the English section even if there's no English section to begin with, like in the entry Comercinho. (In my opinion, this is a bad thing that should be fixed some way or other, but it's not extremely harmful. Ultimately, it's just a pointless link back to the same page.) I was just kind of expecting an English section to be present most of the time, as happens in some entries like Laredo and Colorado (each of these entries has definitions for places in multiple countries in the English section).

Note that the template uses the same syntax for all languages (only the language code changes), so it's supposed to make copypasting between languages easy. If you wanted to add a French section, you would just use the same code with "fr".

# {{place:Brazil/municipality|en|state=Minas Gerais}}
# {{place:Brazil/municipality|pt|state=Minas Gerais}}
# {{place:Brazil/municipality|es|state=Minas Gerais}}

This is one reason why I asked directly above "Should all foreign-language place names have counterparts in English?". If we could simply add English entries/sections for all placenames, the problem would be solved. But for cases where there's no language section in English, what should the template do? Should the template allow for non-gloss formatting (first letter capitalized and the period at the end) in foreign language entries? Can we use the Translingual section some way for place names? Most likely, any new functionality would be controlled by new parameters to the templates, so the system would become a bit more complex than it is now.

My favorite proposal is this:

  • For foreign language entries without an English translation, make the template keep the gloss format but without linking the main word [like this: Ouro Preto (municipality in the state of Minas Gerais, Brazil)]. Rationale: Consistent formatting in all entries, and when you translate Ouro Preto into English, you would use the original language name ("Ouro Preto"), even in cases where it's not attestable in permanently recorded media in English.
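The formatting rules described in this section, including the proposed fallback, can be sketched in Python as follows. This is only an illustration: the function name format_place_definition, the has_english_entry flag, and the exact wikitext emitted for the link are assumptions, not the behaviour of {{meta-place}} itself.

```python
# Hypothetical sketch of the definition-line formatting discussed above.

def format_place_definition(lang, name, description, has_english_entry=True):
    """Format a place-name definition line for the given language code."""
    if lang == "en":
        # English: a full non-gloss definition, capitalized and ending in a period.
        return f"A {description}."
    if has_english_entry:
        # Foreign language: link to the English section, then a parenthesized gloss.
        return f"[[{name}#English|{name}]] ({description})"
    # Proposed fallback: same gloss format, but the name is left unlinked.
    return f"{name} ({description})"
```

With description set to "municipality in the state of Minas Gerais, Brazil", a Portuguese entry that has no English section would come out as plain text followed by the gloss, so the pointless self-link disappears while the layout stays the same.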

I plan to continue using the current system for the new entries I am creating. Edit {{meta-place}} if needed, to change how it works. --Daniel Carrero (talk) 08:21, 5 October 2015 (UTC)

Proposal: Extinct unwritten languages should not qualify for inclusion

WT:CFI includes some interesting clauses for the inclusion of terms from languages that are not "well-documented on the internet". Perhaps the most interesting is that entries may be created even on the basis of a single mention, without any attestation or evidence of attestability required.

Let's think a little bit about what are we even doing here. I would not say this criterion is always unreasonable; but obviously it does not exist just as a backdoor to document words even when they are not attestable per the usual standards, since we do not allow this method for creating entries for rare words in well-documented languages.

The impression I get (though this does not appear to be written out anywhere!) is that there's an underlying idea that some languages mainly exist elsewhere than on the Internet: e.g. as old literary languages that are not spoken anymore, or mainly as spoken languages which so far do not have much written material available. And so we assume that if a word has been documented in a scholarly source or the like, it could in principle also be easily attested once the speakers of Pohnpeian get around to hanging out online in greater numbers, or once people have uploaded enough Karakhanid Turkic materials to Wikisource, and so forth. But as long as this is not the case, asking editors to provide attestations would simply be adding to a backlog.

"Mention implies potential attestation" however fails to hold for some languages. I propose that to qualify for inclusion in the main namespace, a language must have at least one of the following:

  1. A surviving written tradition.
  2. Continuing existence (≈ the potential for a written tradition to be established in the future).

This is relatively lenient still. If we consider epigraphic attestation a written tradition (but see below), languages like Oscan would continue to qualify under criterion #1, alongside any more abundantly attested extinct languages, say Hittite, Old Tupi or Ubykh. In the absence of other updates to CFI, any individual words in these languages would also continue to qualify on the basis of mentions alone.
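The two criteria can be read as a simple either-or test. The Python sketch below is only one way of writing that reading down; the Language record and its field names are hypothetical, and deciding what counts as a "surviving written tradition" remains a judgment call that the code does not settle.

```python
from dataclasses import dataclass

@dataclass
class Language:
    name: str
    has_written_tradition: bool  # criterion 1: a surviving written tradition
    is_extinct: bool             # criterion 2 (continuing existence) fails when True

def qualifies_for_mainspace(language):
    """A language qualifies if it meets at least one of the two criteria."""
    return language.has_written_tradition or not language.is_extinct

# Oscan, assuming epigraphic attestation counts as a written tradition.
oscan = Language("Oscan", has_written_tradition=True, is_extinct=True)
# A made-up extinct language known only from word lists in linguistic sources.
unwritten = Language("(hypothetical)", has_written_tradition=False, is_extinct=True)
```

Under this reading, qualifies_for_mainspace(oscan) is true, and only a language that is both extinct and without a written tradition is excluded.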

What this serves to exclude are languages like Crimean Gothic, Pumpokol or Tasmanian. Any material that is known of languages of this sort generally only exists in linguistic sources, in all but exceptional cases comes with glosses attached, and thus seems to fall clearly short of Wiktionary's general rule of inclusion, as stated at WT:CFI:

A term should be included if it's likely that someone would run across it and want to know what it means.

That is: it seems to me like entries in dead-and-buried languages do not exist to fulfill this need. They exist solely for the sake of linguistic curiosity. No one randomly runs across text in Crimean Gothic, and is left wondering what it might mean.

Linguistic curiosity is of course still a need, and Wiktionary is doing a good job at answering it, I believe. Some people might indeed wonder "so how does one count to ten in Crimean Gothic", or "has this Ket word of alleged Yeniseian ancestry been even recorded from the related languages" and want to look it up. Hence I am not proposing flat-out deleting what we currently have in languages that fail the current-or-potential-written-tradition test. A better solution probably would be the inclusion of recorded data from any natural language variety in the Appendix namespace. (Or, perhaps, the creation of a new Extinct namespace?)

You might ask what difference does it make to switch from regular entries to an appendix, other than make the terms not come up by default in search. One thing is that this would also seem to be grounds to diverge from the usual layout requirements. For example:

  • If a language's known corpus is something like twenty or fifty words, we can put them in a single appendix rather than sprinkle them across several stub entries.
    • Individual quotations could in such cases be replaced with a single references section.
  • If a language has only been recorded in phonetic transcription, we could accordingly provide only the pronunciation/transcription, and avoid implying that an orthography exists.
    • If competing transcription schemes exist, we could standardize one of them and cover the others by means of an equivalence table, rather than creating duplicate entries.
  • Missing information such as parts of speech could be left unknown.
  • If glosses are only available in a language other than English, and the precise meaning cannot be verified, we might leave the glosses in the original language and not risk mistranslating things.

--Tropylium (talk) 01:08, 6 October 2015 (UTC)

I support this proposal, if I understand that it essentially means "Mentions may only be used for attestation, if the language's corpus as a whole has actual attestations of other words. Languages whose entire corpus is mentions do not qualify for inclusion." —CodeCat 09:27, 6 October 2015 (UTC)
It's still more lenient than that: "Entries may be based on mentions only if the language's corpus includes actual attestations, or, if the language is not extinct." --Tropylium (talk) 16:35, 6 October 2015 (UTC)


The above argumentation could additionally be extended to exclude from mainspace also two further types of languages:

  • Extinct languages whose known corpus is highly limited and which does not allow directly establishing the language's grammar, pronunciation, the meaning of its words, etc. Often information of this type can be determined via the comparative method (a particularly good example might be Proto-Norse) — but it would seem to me that this is not too much different from entirely unattested reconstructed languages.
  • Moribund languages, for which all available material is linguistic documentation and no revitalization efforts exist. For such languages, we have effectively no foreseeable hope of ever gaining anything resembling a written standard that could be documented according to regular attestation criteria. An example might be Ter Sami.

I would however like to go on record as strongly in favor of continuing to include endangered languages for which even marginal natural transmission can be suspected to remain (e.g. Ishkashimi), or even elementary attempts at formulating a written standard are underway (e.g. Votic). Hence any exclusion of languages by these criteria should be probably "opt-in", i.e. with the burden of proof on the side claiming that a language is indeed poorly-documented enough to not merit inclusion. --Tropylium (talk) 01:08, 6 October 2015 (UTC)

I strongly oppose this. Your "impression" (i.e. assumption) recorded above is inaccurate; we have more lenient criteria for such languages because we would otherwise have no way of documenting them (the use-mention distinction is merely meant as a tool for us to determine whether a word is really used). There is no point to separating off certain natural languages thus, and it would be counterintuitive to users. —Μετάknowledgediscuss/deeds 01:25, 6 October 2015 (UTC)
An interesting interpretation. (And in the absence of explicit policy support, I could ask whether it is also merely an assumption.) It seems to carry fairly strong implications, though.
  • If Wiktionary's purpose is to document any use of language whatsoever, on an equal level;
  • if this includes extinct languages just as well as living ones;
  • and if we're not tied to direct attestation, but are to also allow documentation thru indirect inference such as mentions
then this would actually seem to require that we must also document proto-languages in mainspace, to an extent! The most reliably "entirely reconstructed" words are at least on an equally probable footing as are words in otherwise attested languages that are presumed to have existed based on hapax attestations by non-native speakers, reconstructed semantics, etc.
You might also want to note that the proposals above are not in opposition to the documentation of anything at all, only on what should be presented as regular dictionary entries. Contrary to its name, WT:CFI is not actually the criteria for inclusion on the Wiktionary servers period, but merely the criteria for inclusion in mainspace.
(Also, should I assume that this reply is meant also in opposition to the main proposal, not just to the two subproposals?)
--Tropylium (talk) 16:31, 6 October 2015 (UTC)
I oppose this as well. Attested terms should be included, period. —CodeCat 09:27, 6 October 2015 (UTC)
Fair enough, though this counterargument appears to cover only languages of the Proto-Norse type. With languages of the Ter Sami type, no attestations exist whatsoever. --Tropylium (talk) 16:31, 6 October 2015 (UTC)
What exists for Ter Sami then? —CodeCat 16:58, 6 October 2015 (UTC)
The only major source is a comparative dialect dictionary of Kola Sami (the majority of whose materials are Kildin Sami) rendered in phonetic transcription, i.e. a huge bunch of mentions. Accordingly, what entries we have essentially have been created only as a pronunciation. Consider e.g. лa̭i̭ja ≈ IPA [ɫʌi̝jɑ]. (I doubt if anyone with native knowledge of Ter Sami would recognize either of these written forms, if presented with it.)
It's not only moribund languages that suffer from this problem, though. Tons of endangered languages only have field research materials available so far. You may recall a BP discussion about a new "Languages without a Written Tradition" in February. Hence I draw a difference here not between written and unwritten languages, but on whether Wiktionary inclusion could be expected to actually benefit the speaker community, and whether we could expect native speakers to contribute at some point. --Tropylium (talk) 19:39, 6 October 2015 (UTC)
  • Query: How would this proposal affect proto-languages? Do you suggest that we remove all of our proto-language appendix entries? ‑‑ Eiríkr Útlendi │Tala við mig 08:00, 6 October 2015 (UTC)
    None of this has any effect on reconstructed proto-languages, which are already excluded from mainspace inclusion. --Tropylium (talk) 16:31, 6 October 2015 (UTC)

Away until mid-October.

I will be away until mid-October. Please try to have this project completed by the time I return. Cheers! bd2412 T 15:28, 7 October 2015 (UTC)

You're not my supervisor! WurdSnatcher (talk)
The "get it completed" is a traditional Wiktionary joke that gets funnier every time it is reused. HTH. Equinox 01:29, 8 October 2015 (UTC)
Surely there's not much to do, right? How complete is Wiktionary right now? 90%? 95%? --Daniel Carrero (talk) 01:39, 8 October 2015 (UTC)
Not even 1%. — Ungoliant (falai) 01:42, 8 October 2015 (UTC)
Blasphemy! Everyone knows that Wiktionary is pretty much complete already. Any word we don't have is probably not worth the trouble anyway. --Daniel Carrero (talk) 01:51, 8 October 2015 (UTC)
You're not my supervisor! HTH --Catsidhe (verba, facta) 01:41, 8 October 2015 (UTC)
What's the etymology of the joke? --Dixtosa (talk) 14:34, 11 October 2015 (UTC)

I am stunned to find that in the entire week that I was absent, the collection of all words in all languages was not completed. bd2412 T 01:06, 14 October 2015 (UTC)

I ate them all. --Romanophile (contributions) 01:08, 14 October 2015 (UTC)
That must have given you a sour stomach. bd2412 T 13:21, 14 October 2015 (UTC)

The vote on adding a collocations or phrases namespace or section

Wiktionary:Votes/2015-09/Adding a collocations or phrases namespace or section has opened. The vote was prompted by Wiktionary:Beer parlour/2015/August#Adding_a_collocations_tab_or_section. - -sche (discuss) 17:10, 8 October 2015 (UTC)

WT:NORM and multiple spaces, tabs and indentation

Currently, the first rule says no leading or trailing spaces, while rule 4 says no leading or trailing space in templates. I came across many entries with leading spaces, such as apple. Many of them occurred in uses of the {{quote-book}} template, though presumably it may occur in any template call. apple breaks both rule 1 and rule 4, since line breaks are also leading/trailing space in a template, and furthermore the parameter names have whitespace around them. This whitespace could simply be removed, but is this desirable?

I have also come across many pages that include tabs; some of them at the start or end of a line, others in the middle of a line. The question is the same, what should be done to get rid of them? They could be replaced with a set amount of spaces, but multiple spaces are equivalent to a single space in Wikitext, so this seems a bit pointless. This makes me wonder if there needs to be a rule that says no multiple spaces in a row. Then tabs could just be replaced by a space. What do you think? —CodeCat 19:48, 8 October 2015 (UTC)
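To make the question concrete, here is a minimal Python sketch of what such a cleanup could do to a single line of wikitext: turn tabs into spaces, collapse runs of spaces, and trim both ends. The function name is hypothetical, and the sketch deliberately ignores contexts where whitespace is meaningful (for example preformatted text), so it should not be run blindly over entries.

```python
import re

def normalize_line_whitespace(line):
    """Replace tabs with spaces, collapse runs of spaces, and trim both ends."""
    line = line.replace("\t", " ")       # a tab becomes a single space
    line = re.sub(r" {2,}", " ", line)   # runs of spaces collapse to one
    return line.strip(" ")               # no leading or trailing spaces
```

Replacing a tab with one space and then collapsing has the same effect as replacing it with any fixed number of spaces, which is why a "no multiple spaces" rule would make the choice of tab width irrelevant.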

I think rule 4 should be amended to allow what apple is doing. I think the intention of rule 4 is to disallow things like {{ en-noun | - }} and [[ apple ]], but for {{quote-book}} and other templates that use a large number of parameters with long values, it is more readable to put in line breaks. I don't particularly like the extra spaces on each line of the {{quote-book}} template, but I can't see a reason to ban them. --WikiTiki89 19:57, 8 October 2015 (UTC)
But we already do ban them. | page = 537 goes counter to rule 4 whether you put it on the same line as the template name, or on a line of its own. The rule specifies that it should be compressed to |page=537. I don't think it would be any good to make an exception to this whitespace rule for parameters on their own line.
I have fewer objections to the practice of breaking long template calls up into multiple lines in general, as long as a set format is decided on for those as well. Right now, there are differences in the amount of leading whitespace (apple vs accost). Should the leading whitespace be removed, so that the line starts with |? Then rule 1 would be satisfied. An exception could be added to rule 4 that the | preceding a template parameter can be optionally preceded by a line break.
Also, the practice should be nuanced for positional parameters, because leading and trailing whitespace does not get stripped from them, unless the template does it itself explicitly. Stripping whitespace in a template requires a module, as template code can't modify strings. At the same time, templates that take an indefinite number of positional parameters, like {{compound}}, {{head}} or {{der3}}, should be written as modules anyway, for other reasons. —CodeCat 20:32, 8 October 2015 (UTC)
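The distinction between named and positional parameters can be sketched in Python as below. This is a simplified illustration under stated assumptions: the function name is hypothetical, and it only handles a single flat template call with no nested templates or piped links inside it.

```python
def compress_named_params(call):
    """Strip whitespace around named parameters of one flat template call."""
    if not (call.startswith("{{") and call.endswith("}}")):
        raise ValueError("expected a single template call")
    name, *params = call[2:-2].split("|")
    cleaned = [name.strip()]
    for param in params:
        if "=" in param:
            # Named parameter: whitespace around name and value is not significant.
            key, value = param.split("=", 1)
            cleaned.append(f"{key.strip()}={value.strip()}")
        else:
            # Positional parameter: whitespace is significant, so keep it as written.
            cleaned.append(param)
    return "{{" + "|".join(cleaned) + "}}"
```

Applied to a call like {{quote-book | page = 537 }}, this yields {{quote-book|page=537}}, while the positional arguments of something like {{compound|en| foo |bar}} are left exactly as typed.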
But there is a workaround for stripping whitespace in templates: pass the text as a named parameter to another template that returns the text back (like {{strip|={{{1|}}}}}). Anyway, I guess I would agree that the leading space in  | page = 537 is bad practice, but I think the spaces remaining in | page = 537 are harmless as long as it is consistent within the template. --WikiTiki89 21:09, 8 October 2015 (UTC)
I've now edited apple to conform with WT:NORM at least as far as the whitespace rules are concerned. —CodeCat 21:34, 8 October 2015 (UTC)
Maybe you should have waited for more opinions before doing that. --WikiTiki89 19:25, 9 October 2015 (UTC)
I didn't think following existing rules would be controversial. In fact, I did it to show the outcome of those rules. —CodeCat 20:15, 9 October 2015 (UTC)
Re: "it is more readable to put in line breaks" I strongly disagree and remove them regularly in favor of spaces. We have many instances of the entire edit frame being taken up with a single citation template and many pagedowns being required to proceed from one definition to another. The result is that substantive editing of definitions of highly polysemous, cited, ie, English, terms needs to be done offline and can only be done offline with difficulty. Perhaps it is time to use transclusion of citations from citation space to make such entries intelligible in the edit frame. DCDuring TALK 23:56, 8 October 2015 (UTC)
A few extra page-downs is not so bad compared to trying to read such a long template crammed into one line. I would agree that this should not be done when the parameters to the template are short enough. --WikiTiki89 19:25, 9 October 2015 (UTC)
FWIW, the Russian manual declension template {{ru-decl-noun}} is specifically intended to be used with linebreaks, e.g.:
|о пауке́-во́лке|о паука́х-волка́х}}
This way of formatting puts the singulars in the first column and the plurals in the second column, and reading down the rows you get nom, gen, dat, acc, ins, prep. Putting it all on one line sometimes happens but makes it much less readable. So we might want to amend things to allow linebreaks with the preceding vertical bar on the same line, while still excluding leading/trailing/embedded spaces. Benwing2 (talk) 09:29, 9 October 2015 (UTC)
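The whitespace behaviour discussed in this thread can be sketched in a few lines of Python. This is only a rough model, assuming a flat template call with no nested templates or links; the function names and the `example` template are hypothetical and are not part of MediaWiki. It shows named parameters being trimmed, positional parameters keeping their whitespace, and the "pass it through a named parameter" workaround.

```python
def parse_template_args(call: str) -> dict:
    """Split a flat template call into parameters (illustrative model only).

    Named parameters have surrounding whitespace trimmed; positional
    parameters keep theirs, including line breaks.
    """
    inner = call.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("not a template call")
    parts = inner[2:-2].split("|")
    params = {}
    position = 0
    for part in parts[1:]:  # parts[0] is the template name
        if "=" in part:
            name, value = part.split("=", 1)
            params[name.strip()] = value.strip()
        else:
            position += 1
            params[str(position)] = part  # whitespace preserved
    return params


def strip_via_named(value: str) -> str:
    """Model of the workaround: hand a positional value to a helper
    template as a named parameter, which trims it on the way through."""
    return parse_template_args("{{strip|=" + value + "}}")[""]


named = parse_template_args("{{example\n| page = 537\n| year = 1900\n}}")
positional = parse_template_args("{{example|en\n| apple }}")
```

In this model, `named` comes out clean, which is why `| page = 537` on its own line is harmless, while `positional` still carries the line break and spaces until something like `strip_via_named` is applied.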

Inflections identical to lemma form

I've noticed that in general, when an inflectional form of a word is identical to the lemma, there is no separate definition for it. For instance, pecūnia and pecūniā, the vocative and ablative singular forms of pecūnia, have no definitions, yet each inflected form would have its own definition if they weren't on the lemma page.

I've come across a fair number of entries (mostly for Latin) that have a definition for each inflection in the lemma entry. I can't find any at the moment, but they're like this:


1. A woolly ruminant of the genus Ovis.
2. plural of sheep

Is there some sort of policy on this? If not, I think there should be. I would vote for the second option, since it is most consistent with us having separate definitions for each inflected form under non-lemma entries, but I'd like to hear other people's thoughts on the matter. Andrew Sheedy (talk) 03:51, 9 October 2015 (UTC)

I'm opposed to listing inflected forms that are identical to the lemma form, unless (as with pecūniā) the diacriticized headword form is different from the lemma's own diacriticized headword form. Since [[sheep]] already says "(plural sheep)", there's no need to list it separately. —Aɴɢʀ (talk) 09:29, 9 October 2015 (UTC)
When I created Arabic non-lemma inflections I specifically added code to exclude adding non-lemma inflections to the same page as the lemma they're derived from. Because I was only creating verbal inflections, this principally applies to the 3sg masc past active (which is the same as the dictionary form), but also to 3sg masc past passives, which almost always have the same written form as the active, although the vowels are different. I did this because I felt it would create a lot of noise to include all those passives on the same page; among other things, almost every verb page would be formatted using "==Etymology 1==" and "==Etymology 2==" (I did it this way because the pronunciations are different, although maybe there's a better way). With nouns it would be a lot worse, since for every noun lemma there would be four or five non-lemma entries on the same page, each with different pronunciation but the same nonvocalized spelling. The assumption here is that if the user looks up an Arabic word by spelling, it's enough to get them to the right page for the lemma, and they can hopefully figure out by looking at the conjugation table for the verb that it has 3sg masc past active and passive that are spelled the same as the lemma form. Benwing2 (talk) 09:42, 9 October 2015 (UTC)
I do think it would be good to make this information more explicit, but I think it would be confusing like that. Maybe a usage note? (This is the base form of this term, and it is also identical to the genitive plural and the ablative singular) WurdSnatcher (talk) 16:04, 10 October 2015 (UTC)
I think it's made explicit enough in the inflection table itself. —CodeCat 17:24, 10 October 2015 (UTC)
"Genitive plural" and "ablative singular" are probably a lot more confusing to a user than what we have now. Equinox 21:38, 11 October 2015 (UTC)
So should definitions listing the other forms that are identical to the lemma be deleted, or left as is for now? The general consensus seems to be that they shouldn't be included. Does anyone else think that this should be made "official"? Andrew Sheedy (talk) 21:45, 11 October 2015 (UTC)
I think it would need at least a poll on this page showing near-consensus (80%+ support?) and possibly a vote if there were not near-consensus. DCDuring TALK 22:26, 11 October 2015 (UTC)
For Hungarian entries, I list identical lemma and non-lemma entries separated by their own Etymology header. See terem, a lemma noun, verb and non-lemma noun form. Because the non-lemma form has its own declension, pronunciation and hyphenation, I would like to continue to include them. --Panda10 (talk) 12:40, 12 October 2015 (UTC)
The possessive form is what's called a "sublemma": a form of another term that has some lemma-like properties, like having its own inflection. Participles also fit into this category. For fully non-lemma forms, that are not sublemmas either, there shouldn't be a separate entry if it's identical to the lemma form in all respects. If there is a difference (one that's not apparent from spelling), then it should have its own entry, like for Latin ablative singular forms. Non-lemmas, whether sublemmas or not, should not have etymologies unless the formation is irregular, and even then it's probably better to put the etymology on the lemma page. —CodeCat 13:21, 12 October 2015 (UTC)
Re "Non-lemmas, whether sublemmas or not, should not have etymologies": Etymologies may not be useful for English non-lemma entries, but they are for agglutinative languages. For Hungarian, the non-lemma etymology allows the user to click on the suffix for more information as in for example ésszel. --Panda10 (talk) 13:39, 12 October 2015 (UTC)
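The page-level filter Benwing2 describes for Arabic can be illustrated with a short Python sketch. It is not the actual bot code; the function names are hypothetical, and the assumption is that the page title is the vocalized form with short-vowel marks (and tatweel) removed.

```python
import re

# Arabic short-vowel marks, tanwin, shadda and sukun (U+064B..U+0652),
# superscript alef (U+0670) and tatweel (U+0640).
_ARABIC_MARKS = re.compile("[\u064B-\u0652\u0670\u0640]")


def page_title(vocalized: str) -> str:
    """Return the unvocalized spelling assumed to be used as the page title."""
    return _ARABIC_MARKS.sub("", vocalized)


def forms_needing_own_entry(lemma: str, forms: list) -> list:
    """Keep only the inflected forms whose page differs from the lemma's page."""
    lemma_page = page_title(lemma)
    return [form for form in forms if page_title(form) != lemma_page]


kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"          # 3sg masc past active (lemma)
kutiba = "\u0643\u064F\u062A\u0650\u0628\u064E"          # 3sg masc past passive
katabtu = "\u0643\u064E\u062A\u064E\u0628\u0652\u062A\u064F"  # 1sg past active

remaining = forms_needing_own_entry(kataba, [kataba, kutiba, katabtu])
```

Here the active and the passive share the lemma's unvocalized spelling, so only the third form would get a non-lemma entry on a different page.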

Please vote - Allowing matched-pair entries

Please vote on Wiktionary:Votes/2015-08/Allowing matched-pair entries — it ends in 3 days: 23:59, 12 October 2015 (UTC).

Current results:

  • Support: 6 (66.7%)
  • Oppose: 3 (33.3%)
  • Abstain: 0

--Daniel Carrero (talk) 20:05, 9 October 2015 (UTC)

Poll and discussion: table for seasons of the year

I created Template:table:seasons about 16 hours ago. It is a table for the 4 seasons of the year that can be used in any language. Disclaimer: This template does not have to be used for any languages which have seasons other than the 4-season system — see this article for some 6-season calendars. (Only if the template can be changed somehow to agree with the needs of such languages.)

I chose some icons to illustrate the table and started adding it in on entries of multiple languages. In diff, @Catsidhe reverted one of the entries with the edit summary: "Undo revision 34541571 by Daniel Carrero (talk) That is twee, insulting, and i would like you to stop that now. It's starting to feel like vandalism." I started the discussion Thread:User talk:Catsidhe/Table for seasons, where Catsidhe says more about what they don't like about the template. (After that discussion, I shrunk the images and changed the style to the current 1-row template, turning larger "cartoon"-style images into less conspicuous icons.)

First of all, I realized I did not discuss that season template beforehand, so I apologize if creating and using that table was not a good idea and I'm ready to undo all the changes if that's what people want. I opened this discussion to make sure what format the community prefers for this. I acknowledge Catsidhe's opposition, but I'd feel stupid if I re-edited the entries right now to return to the list format without any further discussions and other people wanted to discuss the issue or wanted the tables back.

(Other tables I created recently: Template:table:playing cards, Template:table:suits and also Template:table:poker hands. The first of all was Template:table:chess pieces, which I discussed in the BP in September. Please let me know if there's any problem with them. Template:table:colors was created by User:DTLHS and has been discussed a lot in the talk page and also revised multiple times.)

The previous state of the entries for seasons was:

  • Many season entries have been using the "list" system, as in Category:English list templates - "Template:list:blahblahblah" with text as opposed to tables and cross-linking many templates between languages. The list system was a previous project of mine, I created the initial design of lists a few years ago, though it has changed a lot since then. I started converting a few season lists into tables after creating the initial Template:table:seasons template. I also created a few new season tables that didn't exist before as lists, such as Template:table:seasons/ast.

My rationale for using tables in some cases, as opposed to lists:

  • Consistency in word order and illustrations if they are needed; in short, tables are supposed to help you know instantly the meaning of all the words even if you don't speak the language in question, without the need to check each entry.

Also I chose the icons because IMHO they represent well the ideas of each season in the table even in low resolutions. As alternative proposals, I suggest: 1) keeping the table with different images or 2) keeping the table with English -> Foreign-Language text translations without images. But, as I said, if people want the list format back, (or agree on some other idea) I'll undo my changes (I'd use AWB for that). So please let me know what the community wants. --Daniel Carrero (talk) 01:58, 11 October 2015 (UTC)

Proposal: Using the table for seasons

Proposal: Having a table for the 4 seasons of the year that can be used in all languages, (Template:table:seasons - check the template to see a list of which languages have the season table already, which includes Template:table:seasons/en for English, Template:table:seasons/fr for French, etc.) except those languages that don't agree with the 4-season system. This proposal is just about having the table, not about the exact images that are used or the exact format of the table.

See 3 examples of implementation:


  1. Support --Daniel Carrero (talk) 01:58, 11 October 2015 (UTC)
    Also I like the pictures, but it's okay if I'm in the minority. --Daniel Carrero (talk) 21:18, 11 October 2015 (UTC)
    Note: primavera is an interesting entry to look at. It currently has the seasons table in the Asturian, Catalan, Galician, Interlingua, Italian, Portuguese and Spanish sections. --Daniel Carrero (talk) 21:43, 11 October 2015 (UTC)
  2. Support -- I like the pictures, though I wouldn't be too upset to lose them. I do generally like the idea though. WurdSnatcher (talk) 17:02, 11 October 2015 (UTC)
  3. Support -- The pictures are unnecessary, but they can be removed if the majority agrees. I think it's a good idea. Aryamanarora (talk) 21:21, 11 October 2015 (UTC)
  4. Support The tables look nicer than plain lists of words, and can have clarifying translations and/or pictures too. Much more effective than a list of foreign words alone, where you have to click to know what they mean. —CodeCat 21:36, 11 October 2015 (UTC)


  1. Oppose Simply hideous. Wrong for English wiktionary. Maybe OK for Simple Wiktionary? DCDuring TALK 02:15, 11 October 2015 (UTC)
  2. Oppose addition of Template:table:seasons (which has pictures intended to represent seasons) to entries for now. I'll wait to see what supporters are going to say to see whether I should change my mind. --Dan Polansky (talk) 11:07, 11 October 2015 (UTC)
  3. Oppose the pictures. Table or list, I don't mind, just not the daft cartoons. Catsidhe (verba, facta) 11:21, 11 October 2015 (UTC)
  4. Oppose these, although I like the ones that can be symbolised unambiguously. —Μετάknowledgediscuss/deeds 17:00, 11 October 2015 (UTC)
  5. Oppose per Equinox (below) - this wastes a lot of space to provide four words. - -sche (discuss) 19:38, 12 October 2015 (UTC)



  • I'd be fine with it if we got rid of the little pictures. —Aɴɢʀ (talk) 10:34, 11 October 2015 (UTC)
  • I don't really like the images and am not sure they are necessary. Equinox 18:02, 11 October 2015 (UTC)

When I created this poll, I said "This proposal is just about having the table, not about the exact images that are used or the exact format of the table.", but really the images seem to be the most controversial aspect of the table, so I've tried a different design: Template:table:seasons (without images).

I used it for:

What do you think? --Daniel Carrero (talk) 21:15, 11 October 2015 (UTC)

Honestly all the table/border stuff seems like a waste of space for only four words. Equinox 21:18, 11 October 2015 (UTC)
I oppose use of the table, per Equinox. (And Equinox's view should carry more weight than others' in a discussion about seasons. :-) ) —msh210 (talk) 17:49, 12 October 2015 (UTC)
A simple list of the three seasons (with gloss for FLs) that are coordinate terms of the headword is what seems appropriate to me. The Coordinate terms header is intended for just this kind of thing. I would think there would be some technical challenge in efficiently suppressing the season that was the headword. DCDuring TALK 23:56, 12 October 2015 (UTC)

Update: I removed the pictures from the table. Past revision with pictures: this revision.

The official support/oppose count is 4-5, but it does not do justice to the table/picture relation. Maybe I designed this poll poorly.

Some people opposed the table in its entirety, but others explicitly opposed only the pictures, saying either that they would be fine with the table or that they don't care whether it's a table. This holds regardless of whether those people voted "Support/Oppose/Abstain/Comments". Even among people who supported the table, a number said that at best they don't mind the pictures. That confused the hell out of me, but it's clear that at the very least the pictures are unwanted by the current majority, and probably the whole table too. Actually, it looks more like a "no consensus" right now. --Daniel Carrero (talk) 02:41, 13 October 2015 (UTC)

No consensus, I guess. (as of now) --Daniel Carrero (talk) 20:03, 19 October 2015 (UTC)
No consensus needed to implement under existing WT:ELE using coordinate terms header, omitting from the display the headword. Can't that be done technically? DCDuring TALK 22:39, 19 October 2015 (UTC)
IMHO, I'm against using direct links in each wiki page because that would involve undoing the work using list or table templates. However, I can convert the current tables back to the list format. (Edit: I say "I can" since I promised that, but I was expecting some consensus and this discussion has become too confusing for the reasons that I gave. I'll probably let the status quo linger for a while, then create a new discussion later.) Interestingly, Template:list:seasons/lv seems to be the only "list:"-prefixed template that uses 2 lines for some reason.
The distinction between "Coordinate terms" and "See also" seems to be a moot point. It's true that "Coordinate terms" exists in ELE, but so does "See also". --Daniel Carrero (talk) 23:00, 19 October 2015 (UTC)
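DCDuring's question about suppressing the headword from a list of coordinate terms can be sketched as follows. This is a minimal Python illustration, not an existing template or module; the function name, the gloss format and the Portuguese sample data are assumptions made for the example.

```python
def coordinate_terms_line(headword: str, terms: list, glosses: dict = None) -> str:
    """Build a one-line list of coordinate terms, leaving out the headword.

    `terms` is the full ordered set (e.g. the four seasons); `glosses`
    optionally maps a term to an English gloss for foreign-language entries.
    """
    glosses = glosses or {}
    shown = []
    for term in terms:
        if term == headword:
            continue  # the entry's own headword is not repeated
        gloss = glosses.get(term)
        shown.append(f"{term} ({gloss})" if gloss else term)
    return ", ".join(shown)


seasons_pt = ["primavera", "verão", "outono", "inverno"]
glosses_pt = {"primavera": "spring", "verão": "summer",
              "outono": "autumn", "inverno": "winter"}

line = coordinate_terms_line("primavera", seasons_pt, glosses_pt)
```

On the entry for primavera, `line` lists only the other three seasons, each with its gloss, which is the display described above.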

Sub-national countries in Wiktionary:Wiktionarians

Wiktionary:Wiktionarians currently has sections for the Basque Country, Catalonia, and Spain; the former two are divisions of the latter. This is inconsistent with the principle of according sections only to countries qua sovereign states. Nevertheless, this is a dictionary, and as such political considerations are not as important as linguistic ones. The Basque Country and Catalonia have their own languages (Basque [eu] and Catalan [ca], respectively), so it is relevant to us whether a person is from one of those sub-national countries. On this principle, I see validity in including sections for other sub-national countries, for example Flanders and Wallonia (Belgium), Quebec (Canada), Guangzhou, Inner Mongolia, Manchuria, Tibet, and Xinjiang (China), Lapland (Finland / Norway / Russia / Sweden), Brittany (France), Gaeltacht (Ireland), Hokkaido and Ryukyu Islands (Japan), Eastern Cape, Free State, Gauteng, Kwazulu Natal, Limpopo, Mpumalanga, Northern Cape, North West, and Western Cape (South Africa), and Cornwall and Wales (United Kingdom). What is the opinion of the community? — I.S.M.E.T.A. 10:25, 11 October 2015 (UTC)

Speaking only to the case I'm somewhat familiar with, I would not be in favor of a section for the Gaeltacht, as being from the Gaeltacht is neither a necessary nor a sufficient condition for being a native speaker of Irish (though there is a greater than chance correlation). Also, unlike the Basque Country and Catalonia, the Gaeltacht does not correspond to any political entity and could not be considered a "sub-national country" by any stretch of the imagination. —Aɴɢʀ (talk) 10:33, 11 October 2015 (UTC)
@Aɴɢʀ: I can understand where you're coming from, and I therefore agree that the Gaeltacht was a bad example. However, surely being from anywhere is neither a necessary nor a sufficient condition for being or for not being a speaker (native or otherwise) of any language; in every case, the country listings are only suggestive of language ability. — I.S.M.E.T.A. 20:54, 11 October 2015 (UTC)
I agree with you - also, India and Pakistan could definitely use that category system:
  • Kashmir (Kashmiri)
  • Bengal (Bengali)
  • Sindh (Sindhi)
  • South India (Tamil, Telugu, etc.)
Aryamanarora (talk) 21:24, 11 October 2015 (UTC)
@Aryamanarora: Yes, that was the sort of thing I was thinking. Should the sub-national states be listed separately, as the Basque Country and Catalonia currently are, or should they be listed as subsections of the sovereign states' sections? — I.S.M.E.T.A. 23:10, 11 October 2015 (UTC)
@I'm so meta even this acronym: Subsections will be best for organization. Aryamanarora (talk) 01:44, 12 October 2015 (UTC)
@Aryamanarora: Subsections it is! — I.S.M.E.T.A. 12:12, 12 October 2015 (UTC)
Eh, let people categorize themselves however they want. If they identify as Basque and not Spanish, or as Californian and not American, let 'em do it. There's no need for policy to be that restrictive. Purplebackpack89 05:25, 12 October 2015 (UTC)
@Purplebackpack89: I'm in favour of listing sub-national countries in Wiktionary:Wiktionarians. I just wanted to make sure that doing so has community support or, at least, lacks community opposition. — I.S.M.E.T.A. 12:12, 12 October 2015 (UTC)

Any objections to having subsections for sub-national countries in Wiktionary:Wiktionarians? — I.S.M.E.T.A. 12:12, 12 October 2015 (UTC)

I just listed myself as being from Scotland. --Zo3rWer (talk) 09:32, 14 October 2015 (UTC)

I've just reorganised the page and posted notice of this on Wiktionary:News for editors. — I.S.M.E.T.A. 21:54, 16 October 2015 (UTC)

Appendix:Capital letter

I created Appendix:Capital letter as an (incomplete) list of uses of a capital letter. Wikipedia already has Capitalization, but I created this page using the entry layout, which I find easier to navigate. A number of those uses are basically senses that the entries A, B, C, etc. could have if we wanted (like "found in the beginning of proper nouns" and "found in the beginning of sentences"). But I think as a single page they look better. Feel free to expand the list with more languages or senses. --Daniel Carrero (talk) 09:31, 12 October 2015 (UTC)

Nice, it is certainly thorough. Aryamanarora (talk) 22:29, 12 October 2015 (UTC)
Thank you. --Daniel Carrero (talk) 08:55, 13 October 2015 (UTC)

I have a proposal:


  • Not only is the whole page formatted like an entry; if we assume that entries like A, B, C, etc. should have senses like "found in the beginning of proper nouns", "found in the beginning of sentences", "found in the beginning of taxonomic names", etc., then the page Appendix:Capital letter removes the need to create those definitions in every single letter entry. Think of it as a merger of all the entries for capital letters, because they would have repeated information otherwise. The idea of "capital letter" is something of lexical significance, and completely able to be checked for attestations just like a normal entry. Also IMHO it is more important than the entry ] [.

Then again, I know it's an unprecedented idea, so I don't mind if someone disagrees. (not that I would usually mind otherwise) I used the appendix namespace because it would seem uncontroversial, but I was really aiming for the main namespace. Thoughts? --Daniel Carrero (talk) 08:55, 13 October 2015 (UTC)

I forgot to mention: moving Appendix:Capital letter into the main namespace would also serve the purpose of making it searchable. --Daniel Carrero (talk) 10:53, 24 October 2015 (UTC)
Nobody seems to have weighed in, but this has my support, whatever that's worth. Andrew Sheedy (talk) 01:08, 18 October 2015 (UTC)
True, thanks for your opinion. I suppose that makes us 2 Support; 0 Oppose; 0 Abstain!! :) In any event, I'll ask more people to join in the conversation. --Daniel Carrero (talk) 10:57, 24 October 2015 (UTC)

Tolkien's languages' copyright

I've recently started expanding Quenya entries and User:Chuck Entz suggested coming over here and making sure that the Wikimedia Foundation won't be sued by the Tolkien Estate for any copyright infringement. The main reference I'm using is Eldamo.org, which licenses all of Tolkien's languages' definitions and meanings under Creative Commons 4.0. What does the community think? Aryamanarora (talk) 22:49, 12 October 2015 (UTC)

[IFYPY] IANAL, but @BD2412 is. —Μετάknowledgediscuss/deeds 03:01, 13 October 2015 (UTC)
This is an area where I would recommend treading very carefully. The fact that another website maintains a compendium of words from a fictional language under a lenient license has no bearing on the copyright status of the creative work with respect to the estate of the original author. bd2412 T 01:15, 14 October 2015 (UTC)
To clarify: The other site can't license something it has no right to in the first place.
In the US, looking at 17 U.S. Code § 102, it does not seem that languages are included under the list of things that are copyrightable; they are closer to a system (of communicating), which is explicitly excluded, than a literary work, which seems to be the closest thing that is included.--Prosfilaes (talk) 01:14, 16 October 2015 (UTC)
What makes this rather murky is that these languages are an integral part of a series of literary works, and play an important part in the effect of several passages on the reader. For instance, the words of the Dark Lord on the Ring might not sound that unusual to someone who speaks any of a number of languages (somewhere in Central Asia, perhaps?) with a similar phoneme inventory, but to English speakers their sound symbolism really gives an impression of something alien, barbaric and harsh.
Even though some of the Elvish languages were the source of the literary works, rather than the other way around, I doubt they would have come out the same in the end if the literary works had never been created. Besides, Tolkien was such a master of linguistic details that it's hard to escape the impression that the languages are as much original creations as a poem, a sculpture, or a painting.
As for the issues involved, it's not merely a matter of whether the WMF is at risk of being sued: as an enterprise that depends on the willing contributions of a great many people, we need to be very respectful of intellectual property rights. We should avoid violating anyone's legal rights, whether there's a likelihood of litigation or not. Chuck Entz (talk) 23:11, 16 October 2015 (UTC)
I would suggest some limiting principle, such as including only words that are at least discussed in some other work. bd2412 T 00:24, 17 October 2015 (UTC)

Matched-pair entries - follow-up proposal

Now that Wiktionary:Votes/2015-08/Allowing matched-pair entries passed, (thanks for the votes!) I have a follow-up proposal:

  • All unpaired entries can exist as separate entries -- ), (, [, {, etc. -- but they should not have any definitions along the lines of "begins X" and "ends X" and they should not repeat the information in the matched-pair entries. For example, if ( ) is defined as "encloses supplemental information", then ) should not be defined as "ends supplemental information". The entry ) should only have the minimum of information needed to point the reader to ( ), and should be devoid of examples, multiple senses (math sense, chemistry [?] sense, typography senses, etc.), regional variations, synonyms, etc. In other words, ( ) should be lemmatized, with (/) pointing to it. (But the individual character entry could have some information specific to it, such as the Unicode box, the name "this is called 'left parenthesis' in English", perhaps even a picture.)

More considerations:

Sometimes, a component of a matched-pair has also standalone definitions. The entry ) could have both:

  1. Used in ( ).
  2. Separates a number or letter from an item in a list.
    1) New York, 2) London, 3) Paris.

Sometimes, a single character is the component of different matched-pairs, so it should point to all of them. The entry » could have, maybe:

  1. Used in « », » » and » «.


Also, feel free to change/adapt the proposal or propose something else if you'd like. --Daniel Carrero (talk) 01:17, 13 October 2015 (UTC)
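The pointer definitions proposed above ("Used in ( )." and "Used in « », » » and » «.") could be generated mechanically from a list of matched-pair entries. The following Python sketch assumes such a list exists; the list contents and function names are hypothetical and only illustrate the lookup.

```python
# Hypothetical list of matched-pair entry titles, written as "open close".
MATCHED_PAIRS = ["( )", "[ ]", "{ }", "« »", "» »", "» «"]


def pairs_using(char: str, pairs: list = MATCHED_PAIRS) -> list:
    """Return every matched-pair entry that has `char` as one of its halves."""
    return [pair for pair in pairs if char in pair.split()]


def pointer_definition(char: str):
    """Build the single pointer sense for an unpaired character entry,
    or return None if the character belongs to no matched pair."""
    found = pairs_using(char)
    if not found:
        return None
    if len(found) == 1:
        return f"Used in {found[0]}."
    return "Used in " + ", ".join(found[:-1]) + " and " + found[-1] + "."


closing_paren = pointer_definition(")")
guillemet = pointer_definition("»")
```

Any standalone sense, such as the list-item use of ), would still be written by hand after the generated pointer line.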

Proposal: Only use lemma forms in etymologies

For Latin, some etymologies show multiple forms of a word. For verbs, both the infinitive and first-person singular present active indicative are sometimes shown. For nouns, some people seem to include the accusative singular or the genitive singular form.

This practice is pretty much unique to Latin etymologies; I haven't seen it for any other languages. It can be argued that mentioning this form makes the etymology more correct since Romance lemma forms derive from the infinitive and accusative singular. However, we don't seem to do this for any other languages where this might apply:

  • Bulgarian and Macedonian etymologies show verbs being derived from the Proto-Slavic infinitive, even though the modern languages have no infinitive and another form is used as lemma.
  • Irish and Scottish Gaelic verbs use the Old Irish third-person singular form, even though the lemma is another form in the modern languages.

Therefore, I want to propose that we use only the lemma forms in etymologies, regardless of whether the modern lemma form descends from the ancient lemma form. What is inherited is the entire paradigm, not the lemma form alone. The lemma form is merely a representative of that paradigm. So Latin cantō merely stands in for the entire paradigm, which includes cantāre. The choice of lemma form in any given language is completely irrelevant for the actual etymology. —CodeCat 12:38, 13 October 2015 (UTC)

  • Definitely support. This has long been a desideratum of mine. —Aɴɢʀ (talk) 12:52, 13 October 2015 (UTC)
It’s not always the entire paradigm that is inherited. Most Romance words are loaned or inherited from the accusative, but some are from the nominative, and some are from the accusative plural.
For most Portuguese verbs, it’s specifically the infinitive that was loaned or inherited, and other forms were formed from its stem. Look at ajo (not from agō), jogue (not from iacit).
I don’t support this as a general rule, as we risk losing important etymological information if it were followed to the letter. I think it would be better if each language had its own policy about how to link to etymons. — Ungoliant (falai) 14:40, 13 October 2015 (UTC)
That kind of analogical restructuring and reforming is normal in any language and doesn't need mentioning unless something unusual is going on. Paradigms may be inherited but that doesn't mean all forms must be inherited individually. As for borrowing, for example, English placate does come from plācō, even though the specific form was taken from the past participle plācātus. Many Germanic languages, meanwhile, borrow Latin and Romance verbs by using the original infinitive as the stem. —CodeCat 15:33, 13 October 2015 (UTC)
It is the entire paradigm that is inherited. It's just that all of the case forms other than the accusative are dropped, but the paradigm also contains all the definitions and connotations (even if these are also changed). --WikiTiki89 15:43, 13 October 2015 (UTC)
  • Why is this being discussed without any reference to ordinary users?
In English etymologies the practice of including, for example, the stem of or a form other than nominative singular of a Latin or Greek noun can help users see and accept the etymology we offer. Similarly for words derived from present or past participles.
Is this intended to make it simpler to impose some pan-lingual uniformity on entries? It seems to me to have little prima facie justification, so one naturally looks for other, unstated motives. DCDuring TALK 23:31, 13 October 2015 (UTC)
I oppose this. I would, however, support linking to the lemma, as is done in some cases already (what seems most common, though, is including both in the etymology, as in "from linking, gerund of link"). Andrew Sheedy (talk) 00:08, 14 October 2015 (UTC)
Does that mean you think the same should be done for Old Irish, Proto-Slavic and any other language where this applies? What to do when the lemma form didn't even exist in the ancestor language, like for PIE vs its descendants? —CodeCat 00:20, 14 October 2015 (UTC)
Assuming I am understanding correctly, then I would answer "yes" for the first question. One of the two practices that exist for Latin words in etymologies would be ideal, in my opinion. Part of it is, as DCDuring mentioned above, that an average user may not realize that they are not being given the actual form of the word from which the word they are looking at was derived. I know that was the case for me before I realized what was going on, and as a result, I thought that some of the etymologies were pretty far-fetched. I wouldn't throw a fit, however, if that standard wasn't adopted for languages like Old Irish, Proto-Slavic, etc.
I'm afraid I'm having difficulty parsing your second question in my sleepy state. Rather than me answering a question you didn't ask, could you please clarify what you mean first? Andrew Sheedy (talk) 01:02, 14 October 2015 (UTC)
Proto-Indo-European didn't have an infinitive, so for any language that uses the infinitive as lemma, there's a problem. There's no form to show it to be derived from. Latin ferō is inherited straight from PIE *bʰéroh₂, but Proto-Germanic *beraną didn't inherit from anything in PIE, it was formed after Germanic split off. The exact Germanic descendant of the PIE form is *berō, but that's not the lemma form. —CodeCat 01:09, 14 October 2015 (UTC)
I see. In that case, I would note the extra step, i.e. the formation of *beraną from whatever form, which in turn came from PIE *bʰéroh₂ (or whatever the exact inflection is). I realize that that information may not always be available, but the bottom line is that the intermediary step should be noted, or it should be made clear that the derivation was indirect (e.g. saying "ultimately derived from" rather than simply "from"). (Also, if PIE had no infinitive, why is *bʰer- defined as "to bear, carry"?) Andrew Sheedy (talk) 02:00, 14 October 2015 (UTC)
It's actually the root with all the inflectional/derivational bits removed, so it's impossible to translate directly into English. Using the infinitive is better than a long explanation to the effect that it depends on what inflectional/derivational state it's in as to how to translate it. Chuck Entz (talk) 02:25, 14 October 2015 (UTC)
Ah, OK. Proto-Indo-European is unfamiliar territory for me. Andrew Sheedy (talk) 02:33, 14 October 2015 (UTC)
  • Much as Andrew Sheedy, I oppose this, but I support linking to the lemma forms. I deal primarily with Japanese, and sometimes a term derives not from a lemma form, but from some inflection thereof. Listing only the lemma form in the etymology would be incomplete, and it invites confusion. ‑‑ Eiríkr Útlendi │Tala við mig 02:47, 14 October 2015 (UTC)
  • Oppose. Trying to standardize all cited forms as lemma forms would be overblown. Lemmatization is decided on other grounds entirely than etymological transparency. In particular, when lemmas consistently contain a particular suffix (e.g. an infinitive or nominative marker), I am in favor of quoting only the word stem, if this can be well-defined. (But I am, per Andrew S, in support of including at minimum a link to the lemma.) --Tropylium (talk) 09:19, 14 October 2015 (UTC)

Standardizing suffix entries

Looking at the suffix entries of different languages, I notice a great variety in how the definitions and examples are formed/formatted, and what terminology is used. This is especially visible on long multi-language pages such as -a, -t, -k. I think it would increase the quality of our dictionary if we could standardize certain aspects to make them appear more unified. The standards would be mainly recommendations and guidelines for formatting and terminology. Here are some simple examples:

  • Is there a preferred terminology in the definitions? E.g.:
    "verb suffix", "verbal suffix", "verb-forming suffix", or "verb-building suffix"?
    "plural suffix", "plural ending", or "plural marker"?
  • "Forms the..." or "Used to form the..."? Or {{non-gloss definition|Used to form the…}}
  • How to format the FL definition when the FL entry has an English equivalent?
  • How to format it when there is no English equivalent?
  • The order of terms within a definition, e.g. "third-person singular indicative past indefinite" or "third-person singular past indicative indefinite"?
  • For the examples listed below the definition, should we use the format recommended by {{suffixusex}}?

These are just initial thoughts and observations, I'm sure there will be others, I just wanted to find out whether other editors see a need for such recommendations. --Panda10 (talk) 14:24, 13 October 2015 (UTC)

@Panda10: I also struggle to find good ways to define suffixes. I’ll try to explain what I usually do:
  • If the suffix has an exact English equivalent, I use the typical FL format of translation (gloss). Scientific suffixes such as -metro and -algia are in this category.
  • Otherwise, I use the format {{n-g|explanation}}; possible_translations.
  • The explanation is “forms parts of speech, from parts of speech [qualifier], indicating/denoting/meaning [...]” (see -eiro for an example)
  • Sometimes the wording makes it clear what is the part of speech, e.g. “forms the names of lakes” doesn’t need to say it forms proper nouns
  • I think I’m the only one who has ever used {{suffixusex}}.
Ungoliant (falai) 13:54, 16 October 2015 (UTC)
@Ungoliant MMDCCLXIV: Thanks. I often read FL suffix entries for ideas, to see how others organized them. It is especially educational when I know nothing about that particular language. If I understand it easily and I like the layout, I will use the same or similar in Hungarian suffix entries.
  • The -eiro entry looks very thorough and well organized. I'm not sure about using {{n-g}}. Personally, it's a little hard for me to read italics. Then there is the problem with the similarity of "form" and "from", not to mention that "forms" is also a plural noun. I checked other online and paper dictionaries, some use "forming nouns from verbs" to make it clearer. Others use a label such as {in nouns} before the definition. I also use a label (saw it first in Finnish entries). See -ás.
  • I considered using {{suffixusex}} before because its output is very close to what I've been using, but in the end I didn't. Mainly because I didn't want the suffix to be repeated in the example, instead I bold the suffix in the derived word. This way the example appears more compact to me and the suffix is still clear. See -ás.
I understand that there will always be differences due to the nature of a specific FL. For example, Portuguese will not require noun suffix inflection tables, only the headline forms for feminine and plural. But I'm convinced that certain things could be standardized. --Panda10 (talk) 19:59, 16 October 2015 (UTC)
There's also {{usex-suffix}}, for some reason (one of them should probably be deleted) DTLHS (talk) 20:05, 16 October 2015 (UTC)
I've converted existing uses to {{suffixusex}}, and turned it into a redirect. —CodeCat 20:43, 16 October 2015 (UTC)

Last week for comments about IEG project

Hi beer chatters,

The call for Individual Engagement Grants is closed and there is only one week left for comments from the communities. I haven't seen any notice about this, which is quite sad given the nice list of projects and the number of people involved. So, I invite you to spend some time looking at the projects and to leave endorsement notes to encourage the participants. I particularly invite you to look at Wikiproject Siriono, a project I built for Wiktionaries. I'll be glad to receive advice and comments about it. Of course, please read the other projects as well; there is some really exciting stuff! Eölen/Noé (talk) 21:24, 13 October 2015 (UTC)

Cities of Norway vs Cities in Norway

I noticed not too long ago that there are two very similar categories: Category:en:Cities in Norway and Category:en:Cities of Norway.

So which one is better? "Cities of Norway" or "Cities in Norway"?

--KoreanQuoter (talk) 02:27, 14 October 2015 (UTC)

I'd choose "IN", that's what Wikipedia does for cities.
Wikipedia has: w:Category:Cities and towns in Norway. --Daniel Carrero (talk) 02:34, 14 October 2015 (UTC)
I like "of" better, but our existing entries (and a bunch I just added to the module) are for "in". Of course, we're not talking about a lot of categories, or even of entries- but it's better not to rearrange everything if we don't have to. Chuck Entz (talk) 03:24, 14 October 2015 (UTC)


Civilocity is a neologism which describes a form of government where the people can watch and listen to the leader of their country for the entire time that person is leading their country. In 2007 Nathaniel Wenger took it upon himself to coin, classify, and copyright this pragmatic philosophy. Nathaniel began talking about civilocity, which he often calls wengerocracy as it remains in its neologism phase, to emphasize the importance for countries to watch the leader of their country no matter where they live. Civilocity can be defined as a form of government where the people can watch the leader of their country 24/7, 365 days a year, including the extra day once every leap year broadcasted live on public television to the entire world. Civilocity allows you to know every single thing the leader of your country did and having it all online.

The exact definition of civilocity is literally, behaving in the dwelling. Civilocity is derived from the Latin term civilis and the Medieval Latin term civitat in the early of the 21st century AD to improve the political systems existing in some American city-states, notably Washington, DC.

Add to WT:LOP if you like. Equinox 15:28, 15 October 2015 (UTC)

Lingwa de Planeta

Lingwa de Planeta (Lidepla) is a constructed language made in 2010. They have a sizable lexicon and I think we should include the language in Wiktionary. My question is - should it go in the appendix or main namespace? It only has 15 fluent speakers, but so does Volapük. Edit: 25 fluent speakers as per Wikipedia. —This unsigned comment was added by Aryamanarora (talkcontribs) at 19:58, 15 October 2015 (UTC).

Does this language have a sufficient amount of published textual material from which we can cite words to meet our attestation criteria given at WT:CFI? --WikiTiki89 20:03, 15 October 2015 (UTC)
Since Wiktionary:Criteria for inclusion#Constructed languages shows that languages without ISO codes have no consensus, I'd think this would need to be decided by someone familiar with this. w:Lingwa de Planeta shows many literary works translated into Lidepla - [2] has a list, notably Alice in Wonderland is translated. There are around 3,000 words in the lexicon. There is a Swadesh list in the aforementioned Wikipedia article. There's also this [3], a translator from Esperanto to Lidepla. Aryamanarora (talk) 20:36, 15 October 2015 (UTC)
Is there any "durably archived" (i.e. used in "permanently recorded media"; see WT:CFI for clarification) written material produced originally in this language (i.e. not translations)? Note that even though Volapük currently has about 20 speakers (according to Wikipedia), it had more in the past and has existed for much longer. I don't know where we find Volapük citations, but I'm sure someone here knows. --WikiTiki89 20:54, 15 October 2015 (UTC)
As far as I know, it does not. I just learned about it a few days ago, however, so my knowledge may be limited. I think that, should we decide to add it, it should remain in the appendix namespace. Aryamanarora (talk) 21:06, 15 October 2015 (UTC)
Is there anything at WT:CFI saying that durably archived cites have to be produced originally in the language and not translated? I can't find anything to that effect, but maybe I missed it. The Alice translation certainly exists in a dead-tree edition, which is available from Amazon. —Aɴɢʀ (talk) 21:11, 15 October 2015 (UTC)
I'm just trying to find out more information about the language. Since there would need to be consensus to include this language (a requirement that CFI does mention), editors need information to base their votes on. --WikiTiki89 21:17, 15 October 2015 (UTC)
I do not believe languages should be added without an ISO code.
More directly, Volapük had close to a million speakers and some non-trivial publications. This language has 25 speakers, and a translation of Alice. (The list of translations seems to either consist of short works or translations of a few chapters.) Evertype has published lots of translations and versions of Alice, including several into tiny conlangs and rare scripts, including one in a script with one user. He currently offers three books in Volapük. His books, at least his Alice's, are print on demand.--Prosfilaes (talk) 01:37, 16 October 2015 (UTC)
  • It's a fairly obscure IAL, even within the obscure world of those who know and create conlangs. I certainly see no reason why it should be in mainspace, but I don't think we have any limitations on conlangs in appendix-space (although perhaps we should). —Μετάknowledgediscuss/deeds 02:44, 16 October 2015 (UTC)
  • My opinion is that Lingwa de Planeta has not matured enough to be included in mainspace. The translation of Alice really exists as an example of the language rather than a usage of the language, since I doubt that there has ever existed even one human being who would have felt more comfortable reading Alice in Lingwa de Planeta than in English. As time goes on, this language might take hold and its community may start producing legitimate uses of the language and then the language would be on track for inclusion in mainspace here. If this language takes off, it will take decades for it to be ready to be included here. --WikiTiki89 15:24, 16 October 2015 (UTC)
I agree with Wikitiki that this is not suitable for inclusion in the main namespace. As WT:CFI says, most constructed languages "do not meet the basic requirement that one might run across them and want to know the meaning of their words, since they are only used in a narrow context in which further material on the language is readily available." However, we do seem to let most conlangs have a minimal, not copyright-infringing appendix. (Is the language copyrighted? The website has a copyright notice.) Inclusion of an appendix doesn't seem to require that the language be given a code, e.g. the Sindarin appendix doesn't seem to use its code (its code seems to be included in the module only so that Category:Terms derived from Sindarin works). - -sche (discuss) 18:39, 25 October 2015 (UTC)


Just a note: Template:place was created, together with Module:place, for use in placenames in all languages. Thanks to Ungoliant MMDCCLXIV (talkcontribs), who developed the module completely. (Also I credit myself as a beta-tester.) See Module talk:User:Ungoliant MMDCCLXIV/archive1 for conversations during the development of the module. --Daniel Carrero (talk) 03:13, 16 October 2015 (UTC)

Appendix:okay sign

Okay, so, I’m hardly proficient at any sign language, so I simply put this in the appendix. I don’t think that we have any other entries that are extremely similar to this one, so I more‐or‐less made up my own format. If you people have any suggestions or comments, I’d like to read them. I think that this sign, if nothing else, merits inclusion somewhere. --Romanophile (contributions) 07:29, 17 October 2015 (UTC)

I made another one: appendix:V-sign. --Romanophile (contributions) 15:38, 17 October 2015 (UTC)

I created Appendix:finger gun, Appendix:thumbs up, Appendix:thumbs down, Appendix:shushing and Appendix:air quotes. --Daniel Carrero (talk) 16:10, 17 October 2015 (UTC)

No comments? Well, qui tacet consentire videtur, as the Romans would say. --Romanophile (contributions) 23:59, 17 October 2015 (UTC)

To avoid having random bits and pieces all over the place, I think they should be made subpages of a common parent, like Appendix:Gestures/thumbs up etc. Equinox 00:02, 18 October 2015 (UTC)
(edit conflict) I think all of those should be in the mainspace, (to be searchable as normal entries) but I have yet to learn that long notation that we use for them.
Note: Appendix:okay sign = Sign gloss:OK and O@Side-PalmForward K@Side-PalmForward.
But, as long as they are in the appendix namespace, I agree with Equinox's idea of Appendix:Gestures/thumbs up. --Daniel Carrero (talk)
Yeah, there might be some sort of technical code that describes these, but I’m not sure how to write it. Had I known it, I would have just put them in the main space. --Romanophile (contributions) 00:14, 18 October 2015 (UTC)
FWIW, I think descriptive names such as "V-sign" are better than the long technical names. --Daniel Carrero (talk) 09:54, 18 October 2015 (UTC)
This is a great idea, love it! As others have already pointed out, finding gestures might be tricky (visual index?) Jberkel (talk) 00:28, 20 October 2015 (UTC)

Wiktionary:Votes/2015-10/Matched-pair naming format: left, space, right

A note: I created Wiktionary:Votes/2015-10/Matched-pair naming format: left, space, right as a follow-up to Wiktionary:Votes/2015-08/Allowing matched-pair entries. --Daniel Carrero (talk) 08:38, 17 October 2015 (UTC)

Categories for places that are not cities?

People have been creating a variety of "cities in..." categories, which is nice. But the category is a bit misnamed, because there are also attestable place names that aren't cities. Furthermore, in many countries/languages, "city" is merely an unofficial name for any large place, and is not strictly defined, whereas in others, even small places can be cities if they have city rights. So these should really be renamed to something more neutral and less subject to uncertainty. —CodeCat 20:26, 17 October 2015 (UTC)

I'm not sure I understand why this is a problem. Every country has its own hierarchy of polities, so we shouldn't be trying to make everything conform to some one-size-fits-all scheme. Trying to coordinate placename types between countries can only lead to madness. Besides, what are you going to replace it with? A municipality can range from a part of a city to a regional jurisdiction containing multiple cities. A metropolitan area can stretch across large areas and include numerous cities. Even if you have very specifically-defined entities, deciding which to use entails a rather research-intensive, subjective process, since cities can vary so much. The only halfway-reliable fact is what something is called, so we should categorize using that and suppress the impulse to make sense of it all. Chuck Entz (talk) 22:15, 17 October 2015 (UTC)
I think we should use a single term that can apply to all of them regardless of size. We don't need to subdivide it further into whatever definitions apply. My gripe is that "city" doesn't cover all we might want to put in the category, so we will need Category:nl:Villages in the Netherlands, Category:nl:Villages in Belgium and so on. I'm trying to avoid that situation by suggesting we use a neutral term. —CodeCat 22:25, 17 October 2015 (UTC)
In my opinion, we really should not use "city" for everything if that's inaccurate. Using inaccurate qualifiers just makes us an inaccurate dictionary, thus a less trustworthy one. One idea that Wikipedia uses is "First-level subdivision" (state, province, county), "Second-level subdivision", etc. In fact, Wikipedia takes this one step further, since each of those contains specific categories by country such as "Provinces of Algeria", "Districts of Azerbaijan", etc.
I created WT:Place names, which I propose to be a list of types of place names that all countries use, to help our current categorization system, though most countries are to be filled with information yet. That page also has links to the "x-level" Wikipedia categories I mentioned. --Daniel Carrero (talk) 22:48, 17 October 2015 (UTC)
Countries may not treat cities, towns, villages, hamlets etc. as legal entities. In the Netherlands for example there are only municipalities, but they can have many different villages in them. What Dutch people usually go by is whether the place has its own road sign that gives the name of the place when you enter it. The sign has legal significance, but only for road users (it means a 50 km/h speed limit). Addresses also use places rather than municipalities. The Dutch term for a generic group of houses in one place, regardless of size, is woonplaats ‎(literally living-place). An English equivalent of that would be ideal for these category names. —CodeCat 22:52, 17 October 2015 (UTC)
Wikipedia calls it a "settlement": w:Human settlement. —CodeCat 23:05, 17 October 2015 (UTC)
That will run into the problem of different countries having different levels of complexity, organized in different hierarchies. To connect a term in use with one of your abstractions requires knowledge of how the country is organized. Your description of Brazil took up most of a screen; multiply that by dozens. In the US, you have states, except for the District of Columbia and various territories. States are divided into counties, except for the ones that have independent cities that are their own counties, or the ones that are instead divided into parishes, or into boroughs. In Alaska, the borough is the equivalent of a county, and can contain multiple cities. In New York, the city of New York is divided into boroughs. As you subdivide further, there's virtually no correlation between size/importance of a given polity in a metropolitan area and anything at the same hierarchical level in a rural area: w:Los Angeles County, where I live, is larger than some states (not to mention countries), while some rural counties are smaller than the smallest subdivision of the city of w:Los Angeles, where I live (which isn't in any of your "x-level" Wikipedia categories). No matter what criteria you use, consistency is a pipe dream without making things too complicated and too impractical for mere mortals. Chuck Entz (talk) 00:20, 18 October 2015 (UTC)
Hey, I didn't propose any "consistent" system or any specific change to the categories, and I don't know what exactly the categories would be for most countries. But I'd like to know more; that's why I created that WT page. You don't have to help if you don't want to. But your comments about the US are something that would fit well there. Heck, I'm not even saying that we are going to have a perfect, ideal system eventually, but I bet that information could help even the current system in some way nonetheless. --Daniel Carrero (talk) 00:33, 18 October 2015 (UTC)
Wikipedia has another name for certain categories that I don't suggest using here, but I'm going to mention anyway: "Category:Populated places in (place)". --Daniel Carrero (talk) 08:49, 19 October 2015 (UTC)

Nominalized Adjectives

Regarding nominalized adjectives (adjectives which are used as nouns, such as rich) are we to add a section "Noun"? For example, rich has only an "Adjective" section.SoSivr (talk) 09:07, 18 October 2015 (UTC)

This is a productive process: I'm having trouble thinking of any English adjectives referring to a class of people that cannot be used this way, aside from words like "Catholic" that are already used as countable nouns. "The deaf", "the living", "the hidden", "the tall", "the ill", "the healthy", "the infirm", etc. can all be used in the appropriate context. Given that seemingly all adjectives can be used this way as long as the meaning makes sense, I don't think we should add noun senses to their entries.
On the other hand, deaf, poor, wealthy, and ill do have a noun sense to cover this usage, though I think they probably shouldn't. —Mr. Granger (talkcontribs) 17:03, 18 October 2015 (UTC)
Some dictionaries have some "nominalized adjectives" as nouns, but most do not. Some of those that include it as a noun assert the rich to be an idiom or use the entry to say that rich takes a plural verb. To add and verify the corresponding information for every adjective for which such information would apply seems like a long run for a short slide. DCDuring TALK 17:59, 18 October 2015 (UTC)
Yes, this is just a feature of English grammar. We should only have noun entries if there exist translations that are different from those of the adjective. SemperBlotto (talk) 05:28, 19 October 2015 (UTC)
Some previous discussions: Talk:Irish, Talk:deaf, Talk:wicked. We do have "Used before an adjective, indicating all things (especially persons) described by that adjective." as a sense of the. - -sche (discuss) 16:41, 21 October 2015 (UTC)
So example sentences of these adjectives where they are used as nouns will probably be put inside the relevant Adjective section, together with a usage note perhaps.SoSivr (talk) 09:06, 23 October 2015 (UTC)

Wiktionary:Votes/2015-10/Internet ≠ Internet slang

Note: Created Wiktionary:Votes/2015-10/Internet ≠ Internet slang, based on the 2012 discussion Wiktionary:Beer parlour/2012/January#Internet =/= Internet slang. --Daniel Carrero (talk) 11:04, 18 October 2015 (UTC)

Does this need a vote? I feel it’s already the modern consensus and common practice (among people who know what they are doing). — Ungoliant (falai) 13:20, 18 October 2015 (UTC)
I doubt this needs a vote. It needs definition-by-definition review to correct existing entries. Possibly it could use a bit of discussion to clarify the distinction, especially in borderline cases and in cases where both might seem applicable. I'd assume that internet referred to the mostly technical jargon concerning the internet and internet slang referred to slang used on the internet. Slang used on the internet about the internet probably belongs in internet. DCDuring TALK 18:05, 18 October 2015 (UTC)
@Daniel Carrero: I agree with DCD, and I think this kind of response shows that a vote is unnecessary. —Μετάknowledgediscuss/deeds 18:44, 18 October 2015 (UTC)
OK, I retract the vote. --Daniel Carrero (talk) 18:52, 18 October 2015 (UTC)

Context label in the form "often medicine"

@CodeCat and I had a discussion about whether a context label at cacoethic in the following form was correct: "(obsolete, often medicine)" ({{context|obsolete|often|_|medicine|lang=en}}). I had used this label to indicate that cacoethic was often, but not always, used in a medical context. CodeCat said "often medicine" made no sense. I pointed out that this sort of label seemed fine for dictionary entries: see, for example, [4]. The matter was resolved by splitting up the medical and non-medical senses, but for future reference I'd appreciate some guidance on whether labels like "often medicine" are appropriate. Smuconlaw (talk) 14:40, 19 October 2015 (UTC)

I'd say in cases like this it would make more sense to say {{lb|en|chiefly|medicine}} to indicate a term is used chiefly but not exclusively in medicine. —Aɴɢʀ (talk) 15:12, 19 October 2015 (UTC)
So constructions like "chiefly medicine" and "often medicine" are acceptable as context labels, even though they are not, of course (as CodeCat pointed out), strictly grammatical? Smuconlaw (talk) 15:52, 19 October 2015 (UTC)
They are perfectly grammatical in the subgrammar of labels. --WikiTiki89 15:55, 19 October 2015 (UTC)
OK, great. Thanks. Smuconlaw (talk) 13:55, 21 October 2015 (UTC)
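
For future reference, the joining behaviour discussed above can be sketched in a few lines of Python. This is only an illustration, not the actual label module: the function name render_labels is hypothetical, and the sketch models just the explicit "_" separator seen in the {{context|obsolete|often|_|medicine|lang=en}} call quoted above, where "_" asks for a plain space instead of a comma. Any automatic handling of qualifiers such as "chiefly" is not modelled here.

```python
def render_labels(labels):
    """Join context labels with ', ' except where '_' requests a plain space.

    Hypothetical helper for illustration only; it does not reproduce the
    real template or module logic.
    """
    parts = []
    glue_next = False  # True when the previous item was the '_' separator
    for label in labels:
        if label == "_":
            glue_next = True
            continue
        if glue_next and parts:
            # Attach to the previous label with a space, no comma.
            parts[-1] = parts[-1] + " " + label
        else:
            parts.append(label)
        glue_next = False
    return "(" + ", ".join(parts) + ")"


print(render_labels(["obsolete", "often", "_", "medicine"]))
```

With the input from the cacoethic example this prints "(obsolete, often medicine)", which matches the label text quoted at the start of the thread.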

Boring cleanup work for money

I lost my job in July, that's how I've been able to be more active on Wiktionary in the last few months. Though I'm still looking for other jobs.

My money is almost gone. Can I do boring cleanup work on Wiktionary for money?

I got the idea from Wiktionary:Beer parlour/2012/July#Reward or bounty board, in which someone said "I see no reason why a person doing boring cleanup work should not be paid with money if someone offers that money." and "appropriately clean up Category:Translation table header lacks gloss" is mentioned as one possibility.

My plan, if no one objects:

  1. I set up a Patreon account. (I never did that before, I did my research but I apologize if I understood wrong how it works)
  2. Let's say the goal is specifically: appropriately clean up Category:Translation table header lacks gloss.
  3. I set up a goal of "receiving tips every 100 entries", with some minimum amount ($1?). If other people are willing to help, they use my Patreon page and every 100 entries I receive that amount of money, up to the maximum amount that people choose.

There are probably other types of boring cleanup work to do, I'm open to suggestions. --Daniel Carrero (talk) 12:09, 20 October 2015 (UTC)

I'll take you out for a meal and let you crash on my couch if you run a Spanish verb bot and empty all these categories. --Zo3rWer (talk) 13:49, 20 October 2015 (UTC)
Thanks! :p Running a bot? I could do it, but that seems a bit different from my original idea, I was thinking of working for money by doing some manual labor that no one wants to do, and this is one of those inherently repetitive tasks that a bot could do better, as you mentioned. (other tasks require individual consideration of each entry and thus would be better done by humans) I wonder who was the last bot used to create forms for Spanish verbs. User:TheDaveBot (2006-2013)? --Daniel Carrero (talk) 11:54, 21 October 2015 (UTC)
I could be a gazillionaire if I'd got paid for every edit I ever made. Even if you included the fines I'd have gotten for being blocked. --Zo3rWer (talk) 13:54, 20 October 2015 (UTC)
I'd pay you $1 for every 100 entries cleared out of Category:term cleanup and all its subcategories. —Aɴɢʀ (talk) 10:52, 21 October 2015 (UTC)
Thanks for the idea, sounds good! If no one minds, I think I'll create a Patreon account saying more broadly "$1 every 100 entries for boring cleanup work, to be decided by consensus." so that I could start working on Category:term cleanup now per your idea and the job could be changed if people voted/decided/discussed on something else (since there's Category:Translation table header lacks gloss mentioned above and probably other jobs). --Daniel Carrero (talk) 11:54, 21 October 2015 (UTC)
Does it violate any of Patreon's terms and conditions that what you're doing isn't really art? —Aɴɢʀ (talk) 12:38, 21 October 2015 (UTC)
No, I believe. I've read the Community Guidelines, the legalese Terms of Use and the Help Center. They talk repeatedly about "artists and creators."
Just to be sure, I've sent them an e-mail today.
My name is Daniel Carrero, I'm an editor/administrator at Wiktionary, a dictionary wiki which is a sister project to Wikipedia.
We are all content creators, but often there is content on the wikis that need maintenance and cleanup for quality and standards.
Can I use Patreon to crowdsource specific cleanup and quality work and, like "pledge $1 for every 100 pages cleaned up according to criteria X and Y"? I predict that only other Wiktionary members would be interested in paying for that project.
Thank you,
Daniel Carrero
Also, I've found some specific Patreon projects of creating a wiki:
--Daniel Carrero (talk) 15:36, 21 October 2015 (UTC)
They replied to my e-mail. They've suggested doing a monthly campaign (as opposed to "per creation"). What do other people think? I think "per creation" is still better as measurable progress, I don't mind listing all the entries when I finish the work.
Also: +1 point for Patreon because they did their homework, apparently. They said "Wiktionary entries" and I never said "entries" in my message, so they must know at least a bit about our work and correct terminology.
Hey Daniel,
Thanks for writing in and stoked to hear you're thinking about starting a Patreon campaign for Wiktionary!
I think this is a great idea and totally something that would work well on Patreon. To make it easier on you, I might even recommend doing a monthly campaign (as opposed to "per creation,") so that you're not having to keep a constant count of the pages cleaned up. I think that a lot of people would love to support such a great cause and it would be really cool to share fun new Wiktionary entries with your supporters on your Patreon page.
Happy to answer any other questions that come up as you familiarize yourself more with Patreon, so feel free to shoot me a note as they come up!
All the best,
--Daniel Carrero (talk) 09:30, 22 October 2015 (UTC)
Wikipedia says in Patreon: "In October 2015, the site was the target of a massive hacking attack with almost 15 gigabytes' worth of password data, donation records, and source code taken and published. The breach exposed more than 2.3M unique e-mail addresses and millions of private messages." Is there a safer method? --Panda10 (talk) 13:06, 21 October 2015 (UTC)
I don't know, should I use any other platform listed at Category:Crowdfunding platforms? I believe Kickstarter would work only for huge, expensive unstarted projects, while Patreon should be usable for small tips according to measurable progress done (or per month if the worker chooses that option), like I proposed above.
Here are the links to the two websites that Wikipedia uses as sources for that information: [5] and [6].
This [7] is the official statement from Patreon, which I also found quoted in a number of other sites. It says: "There was unauthorized access to registered names, email addresses, posts, and some shipping addresses. Additionally, some billing addresses that were added prior to 2014 were also accessed. We do not store full credit card numbers on our servers and no credit card numbers were compromised. Although accessed, all passwords, social security numbers and tax form information remain safely encrypted with a 2048-bit RSA key. No specific action is required of our users, but as a precaution I recommend that all users update their passwords on Patreon." --Daniel Carrero (talk) 15:36, 21 October 2015 (UTC)
Also: "The unauthorized access was confirmed to have taken place on September 28th via a debug version of our website that was visible to the public. Once we identified this, we shut down the server and moved all of our non-production servers behind our firewall." --Daniel Carrero (talk) 15:41, 21 October 2015 (UTC)

Plain links to non-English entries

Another big problem are plain wiki links to non-English entries, e.g. [[non-english lemma]] (or is this included in your proposal?), but I think this type of cleanup could be semi-automated. Jberkel (talk) 12:56, 3 November 2015 (UTC)
@Jberkel: That sounds like a good idea! It's not included in my current work (User:Daniel Carrero/term cleanup), but it's something that I could do as a separate project if people want, after I finish the current one.
Just to be clear -- surely we want every plain link to be converted to either {{m}} (mentions of words), {{l}} (in synonyms lists, etc.) or other templates, right, no matter if the link is to an English or non-English section? I'd guess that probably most of the 2.1 million "gloss definition" (according to WT:STATS) entries have plain links in one way or another.
If one or more people are willing to pay for that as a separate cleanup project later, I don't have any problem with editing as many entries as possible for that purpose. But bear in mind that it would be basically revising all the existing entries, (a process that I would try to speed up using CSS to spot plain links quickly or something) so of all the possible options for a future cleanup project, this might be one of the longest ones. --Daniel Carrero (talk) 13:31, 3 November 2015 (UTC)
WingerBot has gone through all Russian entries and wrapped plain links within {{l}} under all sections except parts of speech and etymology.--DixtosaBOT (talk) 13:46, 3 November 2015 (UTC)
A possible plan could be like this:
  1. Let bots wrap automatically all links in all entries where it can be done faithfully (synonyms, coordinate terms, derived terms, etc.) Supposedly, bots would be unable to fix etymology, POS sections, usage notes and other sections.
  2. Create some sort of dump listing all the pages that have plain links that bots are unable to fix, so that they could be done manually if people want.
--Daniel Carrero (talk) 13:52, 3 November 2015 (UTC)
Yes, I think every plain link should be converted to a template with an explicit link target, if we want the Wiktionary data to be useful and non-ambiguous. Whenever I edit entries I always try to get rid of any [[links]] I come across. Another problem I discovered is relative links ([[#English|foo]], if already on page foo). These are not so much usability problems right now, but they are important if we look at Wiktionary outside of a website / wiki context.
If it worked well for Russian, is there a reason not to use the same bot for other languages? A combination of bot + manual work sounds like a good plan of attack. Jberkel (talk) 16:06, 3 November 2015 (UTC)
I tend to do the opposite (convert to simple [[links]]) so we may be working against each other here. Equinox 16:44, 4 November 2015 (UTC)
Why would you do that? Please stop. —CodeCat 16:58, 4 November 2015 (UTC)
Because it makes it much harder to read and edit the source code. Equinox 15:43, 5 November 2015 (UTC)
I oppose templated links in definitions. --WikiTiki89 18:57, 4 November 2015 (UTC)
It's good to know that we can still tell people who use screen readers to fuck off. DTLHS (talk) 19:14, 4 November 2015 (UTC)
I'm sure the random English word in the middle of an English sentence would be very confusing to a screen reader if it is not marked as English. --WikiTiki89 21:14, 4 November 2015 (UTC)
Would that mean English links can remain in the normal wikilink style ([[links]])? Even so, should all non-English links be templated? Maybe we should create a poll for these questions? --Daniel Carrero (talk) 21:27, 4 November 2015 (UTC)
In definitions, etymology sections, usage notes, and various other places, we have running English text. If we want to link a word that we happen to use in running English text, then I think plain links are the best choice in order for the wikitext to remain easy to read. But if we were to talk about a word or present an example of text, then we should use a template even if it is in English. The former situation only occurs in English (since this is the English Wiktionary), but the latter situation can occur with any language; therefore, non-English links should always be templatized, since they are always either mentions or examples. --WikiTiki89 23:00, 4 November 2015 (UTC)
I don't agree – why should English get treated differently (and only sometimes)? As I've already said, it leaves room for ambiguity, especially when there are non-English entries using the same headword. Having two different ways of linking (plain/template) is also confusing editors, which means we have a lot of entries which link to non-English words using plain links (which is definitely the bigger problem and should get fixed first).
However, by using templates for all links, regardless of language, we minimize the possibility of mistakes and add valuable semantic information. It can be very useful for tools (like the screen readers mentioned) to know a) that you're linking to a headword (and not a user page, etc.) and b) what language the word you're linking to is in. If this information is not given, these tools need to "guess" it (maybe based on the context or some implied knowledge) and apply arbitrary defaults, which could be wrong; these assumptions could become invalid at any time. Sure, it could be an English word, but it could very well be a wrongly tagged Spanish word. If the link has already been marked as English (or any other language), there's no extra guesswork required. And as a nice side effect, all links will be generated by the same code/template, which means formatting is automatically consistent and can be tweaked "after the fact". A plain link is just a 'dumb' plain link; nothing can be changed about it.
About readability: no doubt, wikitext is a mess and will never win a prize for readability, but I think we should worry more about user readability, not editor readability. The advantages outweigh the few extra characters to type.
Jberkel (talk) 04:32, 5 November 2015 (UTC)
It's not English that should be treated differently, it's running text that should be treated differently, and all of our running text happens to be English because this is the English Wiktionary. --WikiTiki89 15:42, 5 November 2015 (UTC)
What exactly is the advantage of using templated links to people using screen readers? I never used a screen reader before, would it change the voice/accent according to each language when encountering sentences like "The word bread in Japanese is パン, from Portuguese pão."?
Apart from that, some advantages I know of using templated links are: proper formatting/scripting, both the standard formatting that we all see (as in MediaWiki:Common.css) and the user-side formatting (as in Special:MyPage/common.css).
Also, when you use plain links without language ([[example]]), it doesn't point to the correct section, plus the "orange links" gadget can't work for that reason. I remember sometimes seeing horrible plain links with languages back in the day ([[example#English|example]]), but I guess nobody does that anymore, it would be too much work to type. --Daniel Carrero (talk) 19:36, 4 November 2015 (UTC)
Most Anagrams sections, which are bot-generated, still use the [[example#English|example]] format. I always convert those to {{l|en|example}} when I see them, but normal links like [[example]] I leave alone for English words as I don't see what's wrong with them. (I do change them for other languages, though.) —Aɴɢʀ (talk) 19:57, 4 November 2015 (UTC)
Apart from Anagrams sections, I was thinking of older examples like this: here's a 2010 version of the entry pizza with some links in the format of # [[#English|pizza]]. --Daniel Carrero (talk) 05:20, 5 November 2015 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── What I did with WingerBot for Russian is:

  • Only convert links in Russian sections.
  • Only convert links in lines beginning with *.
  • Only convert links containing no Latin text (I should probably fix this further to check specifically for Cyrillic text or non-alphabetic text).
  • I skip Etymology and Pronunciation sections. In Usage Notes sections, the links become {{m|...}}, otherwise {{l|...}}. This is a sort of compromise; probably I should either skip Usage Notes or go ahead and process Etymology and Pronunciation.
  • I skip links nested inside of templates and tables.
  • When converting links, if the link is of the form [[A|B]], then it becomes {{l|ru|B}} if A is the non-accented equivalent of B, otherwise it becomes {{l|ru|A|B}}. Links of the form [[A#Russian|B]] have the #Russian part removed in the process.
  • Currently the set of pages processed is those in the categories Category:Russian lemmas and Category:Russian non-lemma forms.

This could probably be automated for other languages. It's not clear to me that we need to convert plain English-language links, but definitely it should be done for non-English links esp. those in foreign scripts. Benwing2 (talk) 00:33, 5 November 2015 (UTC)
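The conversion rules listed above can be sketched roughly as follows. This is only an illustration, not WingerBot's actual code: the names `convert_line`, `convert_link` and `remove_accents` are hypothetical, accent removal is assumed to mean stripping combining acute and grave marks, and the "skip links nested inside templates" rule is approximated by skipping any line that contains `{{`.

```python
import re

# Matches [[A]] and [[A|B]] (no nested brackets).
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]|]+))?\]\]")
LATIN_RE = re.compile(r"[A-Za-z]")

def remove_accents(text):
    # Hypothetical normalizer: strip combining grave and acute accents.
    return re.sub("[\u0300\u0301]", "", text)

def convert_link(match, lang="ru", langname="Russian"):
    target, display = match.group(1), match.group(2)
    # Remove a trailing "#Russian" section anchor from the link target.
    suffix = "#" + langname
    if target.endswith(suffix):
        target = target[:-len(suffix)]
    # Leave the link alone if nothing remains or if it contains Latin text.
    if not target or LATIN_RE.search(target):
        return match.group(0)
    if display is not None and LATIN_RE.search(display):
        return match.group(0)
    if display is None:
        return "{{l|%s|%s}}" % (lang, target)
    # [[A|B]] becomes {{l|ru|B}} when A is the unaccented form of B.
    if remove_accents(display) == target:
        return "{{l|%s|%s}}" % (lang, display)
    return "{{l|%s|%s|%s}}" % (lang, target, display)

def convert_line(line):
    # Only list lines are processed; lines with templates are skipped
    # entirely as a crude stand-in for "not nested inside a template".
    if not line.startswith("*") or "{{" in line:
        return line
    return LINK_RE.sub(convert_link, line)
```

Restricting the sketch to list lines and to links with no Latin letters mirrors the conservative choices described above: anything the rules cannot classify safely is left as a plain link for manual review.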

I believe a bot would definitely work for lists of links in synonyms/related terms, etc!
But I'd like to talk about: "probably I should either skip Usage Notes or go ahead and process Etymology and Pronunciation"
Correct me if I'm wrong, but I believe the bot would be unable to accurately edit all the links in etymology sections. What about sentences like this?
  • [[A]] is the [[gerund]] of [[B]]
If A and B are terms in, say, Old Spanish, then the final result would ideally be one of the 2 options below, with the correct language codes; we'd use Template:l for simple links and Template:m for mentions:
  • {{m|osp|A}} is the [[gerund]] of {{m|osp|B}} (plain link English word)
  • {{m|osp|A}} is the {{l|en|gerund}} of {{m|osp|B}} (templated English word) -- personally, I prefer this one!
If a bot can or could do the whole work, then it would defeat the point of me manually converting all the uses of {{term|ABC}} into {{m|xx|ABC}}, to be honest! :-) At Wiktionary:Beer parlour/2013/April#Template term and lang parameter the possibility of adding language codes to {{term}} through a bot run has been discussed. According to that discussion, User:CodeCat already did part of the job, which could be done reliably by bot: she used the bot to replace {{etyl|xx|yy}} {{term|word}} with {{etyl|xx|yy}} {{term|word|lang=xx}}.
Another idea:
After User:Daniel Carrero/term cleanup finishes, I can start editing Category:Entries with non-standard headers as a separate paid project -- i.e., manually converting "Initialism", "Abbreviation" sections into "Noun", "Proper noun", etc. in all entries. Would people want that?
I'd just ask to keep the rate I suggested initially of $1 every 100 entries. 8,457 entries = US$84.57. I don't mind if it's, say, 1 person paying the whole amount or 2 people paying $42.28 each. This helps me pay my bills. Thank you. :) --Daniel Carrero (talk) 06:16, 5 November 2015 (UTC)
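For reference, the bot replacement mentioned above (adding lang=xx to a {{term}} that directly follows {{etyl|xx|yy}}) could look something like the sketch below. It is an assumption about how such a run might be written, not the code that was actually used; `add_lang_to_term` is a hypothetical name, and the pattern only handles the simple case where the two templates are adjacent and unnested.

```python
import re

# {{etyl|xx|yy}} followed (after optional whitespace) by {{term|...}}.
ETYL_TERM_RE = re.compile(
    r"(\{\{etyl\|([^|{}]+)\|[^|{}]+\}\}\s*)\{\{term\|([^{}]*)\}\}"
)

def add_lang_to_term(wikitext):
    def repl(m):
        etyl, lang, args = m.group(1), m.group(2), m.group(3)
        # Do nothing if the term already carries a lang= parameter.
        if re.search(r"(^|\|)lang=", args):
            return m.group(0)
        return "%s{{term|%s|lang=%s}}" % (etyl, args, lang)
    return ETYL_TERM_RE.sub(repl, wikitext)
```

A {{term}} with no preceding {{etyl}} is left untouched, which is exactly the set of cases that would still need manual review.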
Well, it only works for Russian because Russian and English have different character sets. It's much harder for languages like Old Spanish, as you point out; we'd have to skip Etymology and Usage notes and Pronunciation, no way around it. For such languages I might also want to restrict further the links that get templated to be only those in a line beginning with * that look like they're part of a list (using appropriate regexes and such to determine this). Benwing2 (talk) 09:50, 5 November 2015 (UTC)
@Benwing2: I understand. Do you think that what you have done for Russian can be done for many languages with different character sets? Maybe Hebrew, Arabic, Greek, Chinese, Gothic, Armenian, Korean, Georgian, etc.? That would sound like a good plan, if that's possible. --Daniel Carrero (talk) 02:04, 6 November 2015 (UTC)
@Daniel Carrero: I can run it on other languages. The main thing I'd need is a regexp that specifies characters within the given character sets. It would look something like %AW-XY-Z where %A gets all non-letter characters and W, X, Y and Z represent the endpoints of the Unicode ranges that contain the appropriate character sets for each language. If you could help construct these ranges it would make it a lot easier to run the bot. You might be able to just snarf the character set ranges from Module:scripts/data, with a bit of checking to make sure they're reasonable. Benwing2 (talk) 00:05, 7 November 2015 (UTC)
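A rough Python equivalent of the "%AW-XY-Z" idea might look like this. It is a sketch under assumptions: `script_regex` is a hypothetical helper, "non-letter" is approximated by the regex class `[\W\d_]`, and the Armenian ranges are the ones that appear in the language table further down this thread.

```python
import re

def script_regex(ranges):
    # Build a pattern that accepts any non-letter character (\W, digits,
    # underscore) plus the letters inside the given Unicode ranges.
    cls = "".join("%s-%s" % (re.escape(lo), re.escape(hi)) for lo, hi in ranges)
    return re.compile(r"[\W\d_%s]*" % cls)

# Armenian block plus the Armenian ligatures.
ARMENIAN = [("\u0531", "\u058F"), ("\uFB13", "\uFB17")]

def is_in_script(text, ranges):
    # True if every letter in the text belongs to one of the ranges.
    return script_regex(ranges).fullmatch(text) is not None
```

With this, a term containing hyphens or combining marks still passes, while a single Latin letter anywhere in the link makes the check fail, so the link would be skipped.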
@Benwing2: Why do you need to get all non-letter characters separately? If you find a foreign equivalent for a hyphen, a comma or an apostrophe or something else, does it change how the bot must work?
I'm going to try getting codepoints for the first few scripts now. Is this reasonable? I added the "X script languages" because I figured this would help knowing which languages your bot could edit that use each script.
--Daniel Carrero (talk) 02:38, 9 November 2015 (UTC)
Thanks! I use a regexp that gets the correct script and also includes non-letter characters so it will still catch terms that have accents, macrons, hyphens, etc., but excludes letter characters from other scripts. I'll also have it print out warnings if it finds terms that it excluded but which have non-Latin characters in them, to make sure it's not excluding too much. I'm going to start on Armenian, we'll see how it works. Benwing2 (talk) 03:17, 9 November 2015 (UTC)
I have already done that for Georgian a long time ago. --DixtosaBOT (talk) 06:13, 9 November 2015 (UTC)
@Daniel Carrero I ran this for Armenian, Greek and Ancient Greek. It required both the list of characters in each script and the entry-conversion regexps in Module:languages/data2 and such. I also had it look for transliteration in parentheses after the link and try to eliminate it (or incorporate into the link, if the language isn't a translit-overriding language). To determine whether something in parens is a transliteration, it transliterates the link and then computes the edit distance (Levenshtein distance) between the auto-generated translit and the explicit translit, and if it's small enough (depending on the length of the words in question), it's accepted, although there are additional checks. When those various checks fail, there's a warning issued. If you want to do a good deed, check the warnings listed in User:Benwing2/fix-links-grc-warnings and fix up the entries needing fixing. There are about 200 of them, and many of them can be ignored. You especially want to check the warnings that mention "Levenshtein distance ... not treating X as transliteration of Y" or "Upper/lower mismatch between explicit X and auto Y", where X is what's found in parens and Y is the automatic transliteration of the link (the Levenshtein distance warning is slightly misworded). For example, the warning "WARNING: Levenshtein distance 15 too big for length 6, not treating Arktos, “Ursa Major” as transliteration of Árktos" means that it found something like [[Ἄρκτος]] (Arktos, “Ursa Major”) and determined that the stuff in parens couldn't be a translit of the link; in this case, the translit should be removed and the gloss incorporated into the link. There are also warnings of the sort "Link contains non-Latin characters not in proper charset", which are links in various non-Greek charsets that could be converted to templated links in the proper language, although a few appear to be Greek script and must contain some non-Greek character in them, which could be fixed. 
Benwing2 (talk) 10:18, 11 November 2015 (UTC)
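The transliteration check described above can be illustrated with a plain edit-distance function. This is a minimal sketch and not the bot's code: `looks_like_translit` and its `max_ratio` threshold are hypothetical, since the real length-dependent cutoff and the additional checks are not given here.

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def looks_like_translit(explicit, auto, max_ratio=0.5):
    # Hypothetical rule: accept the text in parentheses as a
    # transliteration if its distance from the automatic
    # transliteration is small relative to the length of the latter.
    dist = levenshtein(explicit.lower(), auto.lower())
    return dist <= max_ratio * max(len(auto), 1)
```

Under this sketch "Arktos" is accepted as a transliteration of "Árktos" (distance 1), while "Arktos, “Ursa Major”" is rejected, which matches the kind of case reported in the warnings.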
BTW for Ancient Greek it was a bit tricky, or at least I had to use modern Greek (code el) in the Descendants section of Ancient Greek entries, since they share the same charset. Benwing2 (talk) 10:46, 11 November 2015 (UTC)
I've seen a few of the recent contributions of User:WingerBot, they look good!
OK, you've been using information from Module:languages/data2, but I assume you still need the codepoints I'm looking up for you, right?
I'll look at User:Benwing2/fix-links-grc-warnings with more attention later. For the moment, I'll leave a few more codepoints here for you. Some of the starting or ending codepoints are combining forms, is that a problem? I can get codepoints ignoring combining forms if you want.
--Daniel Carrero (talk) 10:55, 11 November 2015 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Thanks. I did Tamil, Telugu, Oriya, Punjabi, Gujarati, and I'm currently doing Hindi and Hebrew. The combining codepoints are OK for Python but cause problems in vim syntax highlighting, so I rewrote them using \u escapes. The language-specific info comes from a combination of Module:scripts/data, Module:languages/data2 (or data3/...), and Module:links (override_translit). The actual language-specific info looks like this:


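import re

def ar_remove_accents(text):
  # Normalize alif wasla (U+0671) to plain alif (U+0627).
  text = re.sub(u"\u0671", u"\u0627", text)
  # Strip vowel diacritics (U+064B-U+0652), superscript alif (U+0670) and tatweel (U+0640).
  text = re.sub(u"[\u064B-\u0652\u0670\u0640]", "", text)
  return text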

# Each element is full language name, function to remove accents to normalize
# an entry, character set range(s), and whether to ignore translit (info
# from [[Module:links]], or "notranslit" if the language doesn't do
# auto-translit)
languages = {
    'ru':["Russian", ru.remove_accents, u"Ѐ-џҊ-ԧꚀ-ꚗ", False],
    'hy':["Armenian", hy_remove_accents, u"Ա-֏ﬓ-ﬗ", True],
    'el':["Greek", lambda x:x, u"Ͱ-Ͽ", True],
    'grc':["Ancient Greek", grc_remove_accents, u"ἀ-῾Ͱ-Ͽ", True],
    'hi':["Hindi", lambda x:x, u"\u0900-\u097F\uA8E0-\uA8FD", False],
    'ta':["Tamil", lambda x:x, u"\u0B82-\u0BFA", True],
    'te':["Telugu", lambda x:x, u"\u0C00-\u0C7F", True],
    'gu':["Gujarati", lambda x:x, u"\u0A81-\u0AF9", "notranslit"],
    'or':["Oriya", lambda x:x, u"\u0B01-\u0B77", "notranslit"],
    'pa':["Punjabi", lambda x:x, u"\u0A01-\u0A75", "notranslit"],
    'he':["Hebrew", he_remove_accents, u"\u0590-\u05FF\uFB1D-\uFB4F", "notranslit"],
    'ar':["Arabic", ar_remove_accents, u"؀-ۿݐ-ݿࢠ-ࣿﭐ-﷽ﹰ-ﻼ", False],
}

It doesn't take too much work to find this info. However, it takes more effort to go through the warnings, so if you have time that's definitely something that would help. Here are warnings for modern Greek (98 of them) and from five Indian languages, not including Hindi (20 of them in total):

Again, not all of these warnings need to be fixed but they could use a once-over. Benwing2 (talk) 04:00, 12 November 2015 (UTC)

Here are the Hindi warnings (20 of them):

Benwing2 (talk) 04:07, 12 November 2015 (UTC)

Here are the Hebrew warnings (49 of them):

Benwing2 (talk) 04:19, 12 November 2015 (UTC)

@Benwing2 Since you sent that message, I've been offline for a few days and also busy with other things, but I promise I'll look into these lists.
Also, I've inserted a subsection of the conversation since the subject changed kind of abruptly with the idea of converting plain links into templated links, which I consider unrelated to the discussions above. --Daniel Carrero (talk) 23:12, 1 December 2015 (UTC)
@Daniel Carrero Cool, thanks. Benwing2 (talk) 00:19, 2 December 2015 (UTC)

Code for Westrobothnian

Discussion moved from Wiktionary talk:Beer parlour#Code for Westrobothnian.

Apparently Westrobothnian is still considered a dialect of Swedish here, even though it's linguistically impossible. As I understand, the reason is that there is no language code for Westrobothnian here yet. On Swedish Wiktionary, we use gmq-bot. Languages like Jamtish and Scanian obviously also need their respective language codes, but at this time I'm only asking for Westrobothnian in order to stop people from adding it under Swedish (I kind of hurfes when I think of people doing this; it's like watching someone paint a norseman with a horned helmet, or hearing someone argue that we only use 10% of our brain). — Knyȝt (talk) 13:01, 20 October 2015 (UTC)

I think the word you're looking for is shudder.​—msh210 (talk) 17:13, 22 October 2015 (UTC)
I have no objection to a new code for Westrobothnian, I only object to the removal of Westrobothnian forms from ==Swedish== entries as long as there's no place to move them to. We may as well discuss the other "Swedish dialects" that could be considered separate languages. In addition to Westrobothnian, they seem to be:
  1. Dalecarlian (including Elfdalian), which had the ISO-639 code dlc until 2009 or so
  2. Jamtlandic, which had the ISO-639 code jmk until 2009 or so
  3. Scanian, which had the ISO-639 code scy until 2009 or so
  4. Gutnish, which (like Westrobothnian) has never had an ISO-639 code.
The ISO-639 codes were apparently removed at the request of the government of Sweden, which didn't like the idea of their not being dialects of Swedish; the decision was thus more political than linguistic. At the moment, Scanian seems to be the only one we accommodate at all: Category:Regional Swedish includes subcategories only for Finland Swedish, Scanian Swedish, and Swedish Swedish. —Aɴɢʀ (talk) 14:10, 20 October 2015 (UTC)
I think we should use the old codes. What the government of Sweden thinks isn't relevant to linguistic interests. —CodeCat 14:27, 20 October 2015 (UTC)
Of the five under discussion, only three have old codes. We can use those, but we should probably prefix them with gmq-. The other two will need codes of their own. Westrobothnian may as well use gmq-bot, as it does at sv-wikt. If Gutnish doesn't already have a code at sv-wikt, gmq-gut will work. —Aɴɢʀ (talk) 14:50, 20 October 2015 (UTC)
Well, what do you know, we already have dlc and gmq-gut, so it's just a matter of the other three. —Aɴɢʀ (talk) 18:11, 20 October 2015 (UTC)
Would an administrator please implement the codes gmq-jmk Jamtish, gmq-scy Scanian and gmq-bot Westrobothnian in Module:languages/datax? Thank you in advance -- 15:44, 4 November 2015 (UTC)
I oppose these codes, they should be jmk and scy, since those are the ISO codes for these languages. —CodeCat 16:03, 4 November 2015 (UTC)
That is fine - there is just a need for some codes so as to avoid inserting simple links [[term]] in the Descendants sections of the Proto-Germanic entries. -- 16:08, 4 November 2015 (UTC)
I oppose those codes; those aren't the ISO codes for these languages, those were the codes. They are no longer valid to use for new works. gmq-jmk, gmq-scy and gmq-bot are more consistent with the ISO standards.--Prosfilaes (talk) 23:29, 10 November 2015 (UTC)
We use sh, don't we? —CodeCat 00:13, 11 November 2015 (UTC)
And we should probably use hbs instead. I'm more interested in us doing the right thing going forward than fighting that battle, though. "scy" and friends are not found on the official list of ISO 639-3 codes.--Prosfilaes (talk) 01:28, 11 November 2015 (UTC)

Unified multilingual Wiktionary

As Wiktionary:Project – Unified Wiktionary outreach and Wiktionary:Project – OmegaWiki are no longer active, OmegaWiki makes a multilingual dictionary to describe all words of all languages with definitions in all languages. However, without further progress at m:OmegaWiki, I have thought of a possibly simpler way to make a unified dictionary right here, possibly moving to wiktionary.org in the future:

  1. Making many headings from Wiktionary:Entry_layout#Additional headings, like English, pronunciation, nouns, etc., automatically translated to other languages per users' preferences, like on Wikimedia Commons, would be valuable to encourage merging smaller Wiktionaries in other languages to this largest site.
  2. Repeating many materials like pronunciation, synonyms, antonyms, derived terms, related terms, etc., is the drawback of keeping too many Wiktionaries in many languages. Unified Wiktionary would eliminate the duplicated materials.
  3. Smaller Wiktionaries lack a sizable community to justify bureaucrats and possibly administrators. Merging them here would administer all contents much more efficiently.
  4. Considering Wiktionary:Entry_layout#Variations for languages other than English, only when Wiktionaries in certain other languages want to merge hereto, should we translate them to English then to third languages. For example, Chinese entries here should not yet be translated to third languages, until Chinese Wiktionary is going to merge hereto, which is what I dream of due to too few active users and too many entries with poor quality.
  5. As OmegaWiki already has definitions of any word in many languages, Unified Wiktionary would also give etymology, usage notes, references, etc. in many languages.

I propose inviting other Wiktionaries to an optional merger once we are ready here, based on my proposals above. Any useful comments are welcome.--Jusjih (talk) 00:29, 22 October 2015 (UTC)

Data unification can be very beneficial, but we must be careful when deciding what can and what can’t be merged. Definition- or translation-based mergers (the sort that the Wikidata folk want to force upon us) are an outright horrible idea because they ignore the existence of anisomorphism, and that language is not mathematics.
I think the least controversial content mergers would be pronunciations and inflection tables. — Ungoliant (falai) 01:06, 22 October 2015 (UTC)
Wikimedia Incubator has Translingual Wiktionary as a stub project with a few pages and unclear format/purpose, so if there's any effort directed to making a multilingual Wiktionary, I suggest using that. I added some Portuguese verbs there myself back in 2011.
Ungoliant is right about the problems he mentioned, though. --Daniel Carrero (talk) 09:50, 22 October 2015 (UTC)
A very obvious problem is the decision of what to treat as a language. Other Wiktionaries may not accommodate reconstructed languages for example. Lithuanian Wiktionarians may object to having Proto-Balto-Slavic. And I doubt the Croatian Wiktionary will want to merge "their" language into Serbo-Croatian for our sake. Wiktionaries may also differ in the classification of languages. Even pronunciation details may differ; consider how divided our own Wiktionary is on /a/ vs /æ/ for British English. —CodeCat 14:40, 22 October 2015 (UTC)
Thanks so much for all of your valuable comments. Wikimedia Commons already uses {{int:summary}}, {{int:support}}, {{int:oppose}}, etc. to automatically translate common phrases to many languages, and this is needed to go multilingual here. We should try a pilot program to phase in auto-translation as on Commons. For example, ks:ठूल is really Kashmiri-English, so once we auto-translate important layouts, we may be able to bring the contents of many smaller Wiktionaries here. As many minor languages lack their own Wiktionaries and some have been closed out, their users may want to come here to translate their words, compounds, and phrases to English.--Jusjih (talk) 16:54, 22 October 2015 (UTC)

Anyway, it's impossible: in a common project, common decisions must be taken, and there must be a common language for discussions. Discussions are in a different language for each wiktionary, and this principle cannot be changed: I'm not sure that you would be willing to close en.wikt in order to merge data with fr.wikt, and to change your discussion language to French. Wiktionaries are a success, Omegawiki is a failure (7 contributions a day: this is the average for the last 7 days), and there are good reasons for this situation. Trying to adopt Omegawiki principles would not be beneficial at all, it would kill projects adopting them. I am convinced that the best way to share data is through bots (bots importing data when possible, bots checking data consistency when possible, bots providing lists of words or list of translations, etc.) Lmaltier (talk) 20:49, 22 October 2015 (UTC)
Of course, anybody can contribute to any wiktionary, not only in one's native language. This is already the case. And it would be beneficial to provide translation tables in all entries (except inflected forms), not only in English word entries. But this does not change the principle: one project for each discussion language. Nobody would propose a single unified Wikipedia. It's the same for wiktionaries. Lmaltier (talk) 21:00, 22 October 2015 (UTC)

I do not mean closing out English Wiktionary to go multilingual. I just suggest trying limited auto-translation here, as on Wikimedia Commons, to test merging minor languages in, while keeping major languages separate, as many Wiktionaries in minor languages have been closed out. If there is no consensus to go very multilingual, we need more global bots to coordinate Wiktionaries in different languages, maybe in connection with Wikidata.--Jusjih (talk) 00:59, 23 October 2015 (UTC)
Well, your title above is Unified multilingual Wiktionary... If the main idea is automatic translation of Wiktionary pages, tools already exist for major languages. For other languages, anyway, human translation of definitions, etc. would be needed, and there is no reason why there would be more contributors on a site in a foreign language than on a site in one's own language... Lmaltier (talk)
As files from Wikimedia Commons may be transcluded on other wikis with local descriptions possible, I am thinking of keeping language subdomains for language-dependent things, like categories, so even should the unified multilingual Wiktionary be approved through Meta request, only main articles will likely be imported there for internationalization of entry layouts, etc. for future transclusion on language subdomains. I will open further discussion on Meta later. Thanks.--Jusjih (talk) 00:44, 24 October 2015 (UTC)

Verbs that introduce a subordinate clause

Verb senses like “I think [that] she is here”, “I saw everyone go away”. Is there a grammatical term for them? — Ungoliant (falai) 19:09, 22 October 2015 (UTC)

Your two examples are completely different. "Think" is simply a transitive verb and the subordinate clause functions as a noun phrase: What do you think? I think thoughts. What are your thoughts? My thoughts are that she is here. "Saw" is a sensory verb, and it seems that sensory verbs have a special case where their direct objects can be modified by a bare infinitive in addition to a participle: I saw everyone. Everyone was going away. I saw everyone going away. I saw everyone go away. Everyone was go away. It does seem odd. I would like to know the reason behind this. --WikiTiki89 19:42, 22 October 2015 (UTC)
I'm unaware of a special term for verbs that can take clauses as their complement, but the difference between your two examples is the kind of clause they take. "Think" is followed by a noun clause, "(that) she is here", which (when that is removed) is by itself a complete sentence. "Saw", however, is followed by a small clause, "everyone go away", which is not a complete sentence and cannot be introduced by that. (You can tell it's not a complete sentence because everyone takes a singular verb, "everyone goes away", but in this case go is in the bare stem form.) When think means "hold an opinion" rather than "believe something to be the case", it can take either a noun clause or a small clause: I think that she is pretty or I think her pretty. —Aɴɢʀ (talk) 20:21, 22 October 2015 (UTC)
Thank you both. I guess I’ll continue to label them transitive. — Ungoliant (falai) 20:29, 22 October 2015 (UTC)
There is a category of verbs called "reporting verbs": Category:English reporting verbs, [8]. - -sche (discuss) 03:32, 23 October 2015 (UTC)
Yes, though neither "think" nor "see" is a reporting verb. —Aɴɢʀ (talk) 12:20, 23 October 2015 (UTC)
But it doesn't seem that reporting verbs are all that special grammatically. It seems more of a category that writers use to avoid saying "he said". --WikiTiki89 15:31, 23 October 2015 (UTC)
"See" and "think" are labelled as reporting verbs by the site I linked to. "Think", at least, seems to be able to report things about as well as "order", which our own category gives as a reporting verb:
John shouted "leave!" — [report:] (I left because) he ordered me to leave.
John told me I should leave. — [report:] (I left because) he thought I should leave.
The category does seem to be rather amorphous, as Wikitiki notes. - -sche (discuss) 20:11, 24 October 2015 (UTC)

Vote on disallowing extending of votes - 7 days remaining

FYI, you can still vote at Wiktionary:Votes/pl-2015-07/Disallowing extending of votes.

Current results:

  • Support: 8 - 66%
  • Oppose: 4 - 33%
  • Abstain: 2 - N/A

--Dan Polansky (talk) 09:24, 24 October 2015 (UTC)

Is this vote going to be extended? :) —CodeCat 13:42, 24 October 2015 (UTC)
<sarcasm>With 66% support, this vote could use more time to build consensus. Definitely extend.</sarcasm> --Daniel Carrero (talk) 13:46, 24 October 2015 (UTC)
From the vote: "Duration note: The vote is set for three months, and is not expected to be extended, to prevent discussions about circularity or recursiveness of the vote." --Dan Polansky (talk) 14:21, 24 October 2015 (UTC)
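The percentages in the tally above appear to be computed over support and oppose votes only, with abstentions left out (hence "N/A"). A minimal sketch of that arithmetic, assuming this reading is right; the function name `tally` is hypothetical:

```python
def tally(support: int, oppose: int) -> tuple[float, float]:
    """Return (support %, oppose %) over decisive votes only; abstentions are ignored."""
    decisive = support + oppose
    if decisive == 0:
        raise ValueError("no decisive votes cast")
    return 100 * support / decisive, 100 * oppose / decisive


# Figures from the tally above: 8 support, 4 oppose (2 abstentions not counted).
support_pct, oppose_pct = tally(8, 4)
print(f"Support: {support_pct:.1f}%  Oppose: {oppose_pct:.1f}%")
```

The posted figures of 66% and 33% match these values with the fractional part dropped.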

Please see - capital letter discussion

Please give your opinion on this discussion above, in which I proposed moving an appendix (already formatted as an entry) to the main namespace, to make it searchable. Current "results":

  • 2 support (me and Andrew Sheedy)
  • 0 oppose
  • 0 abstain

--Daniel Carrero (talk) 11:00, 24 October 2015 (UTC)

The principle for which this case would be a precedent is the placement in mainspace of material that can be presented in the form of an entry even though it does not share the common characteristics of dictionary entries, appearing typically in a style manual, usage guide, or grammar.
One conceptual difficulty is that the headword, proposed to be "Unsupported titles/Capital letter, with the displayed title as [capital letter]", does not have the same relationship to the content as a normal headword does to a normal entry.
One suggested advantage is that the page would be included in searches. I can't quite imagine how a normal user, ie, one not an active participant in discussions such as this, would ever enter terms that would find the article and put it on the first search page, let alone near the top.
I propose the following alternative. As we already have an entry for capital letter (to which I have added a "See also" link to the Appendix), every L2 section of every one of our entries for capital letters should have such a link, preferably to the L2-like section of Appendix:Capital letter. DCDuring TALK 12:11, 24 October 2015 (UTC)
Just a note: I'm going to paste here my original rationale about moving the appendix to the entry namespace, assuming that the discussion should continue here:
Not only is the whole page formatted like an entry; if we assume that entries like A, B, C, etc. should have senses like "found in the beginning of proper nouns", "found in the beginning of sentences", "found in the beginning of taxonomic names", etc., then the page Appendix:Capital letter removes the need for creating those definitions in every single letter. Think of it as a merger of all the entries for capital letters, because they would have repeated information otherwise. The idea of "capital letter" is something of lexical significance, and completely able to be checked for attestations just like a normal entry. Also IMHO it is more important than the entry ] [.
Re DCDuring: I take your points about the headword and also about the difficulty of this page appearing at the top of search results. I like your idea of linking from every entry of every capital letter to the appendix. --Daniel Carrero (talk) 12:23, 24 October 2015 (UTC)
To save space, I think we should link the letter entries from the headword line, as opposed to linking them from the see also section. See all the letter sections in the entry B. I edited a few templates to make the headword lines of capital letters link to the appendix, at least when they use separate templates like {{en-letter}} and not {{head|en|letter}} (changing that would require me to edit {{head}}). Feel free to discuss. --Daniel Carrero (talk) 12:37, 24 October 2015 (UTC)

Incomplete etymologies

Quite often, there are words where I can easily tell where the word eventually came from, but it's much harder to determine how it made its way into the language. For example, Northern Sami anánas originates from the same source as similar words in most other European languages, but it could have been borrowed through Norwegian, Swedish or Finnish (the three languages that most Northern Sami loanwords come from) and I can't tell which one. návli is from Germanic, but did it come from Norwegian or Swedish, from Old Norse, or straight from Proto-Germanic?

Sometimes, etymologies are doomed to be incomplete because there just isn't enough information. I generally just put in what I can figure out myself. But I think it would be useful if there was a way to tag an entry with an "incomplete etymology" tag of some sort. Currently I've used {{rfe}} for this, but I don't think that's really correct when there is some etymology, just not enough to really explain the origin of the word in the necessary detail. Any thoughts on this? —CodeCat 19:54, 24 October 2015 (UTC)

{{etystub}} is supposed to be for exactly that purpose. The case of anánas is different, in my opinion; we may never know the answer, and it would be best simply to list all three likely possibilities and give the ultimate etymon. —Μετάknowledgediscuss/deeds 20:12, 24 October 2015 (UTC)
{{etystub}} is a little bulky for the job. Something like {{rfelite}} would be better, but an expression of incompleteness like ultimately from Proto-Germanic or a more recent North Germanic language and placement in a category such as Category:Incomplete Northern Sami etymologies would seem to do the job. DCDuring TALK 00:37, 25 October 2015 (UTC)
I rephrased {{etystub}} a bit. —CodeCat 00:48, 25 October 2015 (UTC)
You sell yourself short. You've completely changed the visual appearance of the template. DCDuring TALK 00:54, 25 October 2015 (UTC)
I was expecting a change from a table to bare text or vice versa when I read that, but it's just plain text (which I think is good). I agree with CodeCat's edit summary, the wording is now more in line with what I'd expect from a "stub" template. - -sche (discuss) 05:37, 25 October 2015 (UTC)

Zipser German

I have words to add from Zipser German, a Central German lect which developed as such in the 1300s in Slovakia (where it is still spoken in Hopgarten as Outzäpsersch i.e. Altzipserisch) before being carried to Franzenthal, Wassert(h)al, elsewhere in northern Romania, and Bukovina, where it was over time increasingly influenced by Upper Austrian. You can see a sample at User:-sche/Zipser. I would like to give it the code gmw-zps and treat all of the dialects under the one code in accordance with the literature on the subject, which speaks of it as one language. - -sche (discuss) 06:03, 25 October 2015 (UTC)

Done --Lo Ximiendo (talk) 18:46, 25 October 2015 (UTC)

Restoring WT:ELE

I request that WT:ELE (Wiktionary:Entry layout) is restored to the state of 13 October 2015. The subsequent changes seem rather substantial to me, and require a vote, IMHO. For editing with abandon, there is Wiktionary:Entry layout/Editable. Thank you. --Dan Polansky (talk) 13:59, 25 October 2015 (UTC)

Which parts of the new version do you disagree with? —CodeCat 14:01, 25 October 2015 (UTC)
I edited the WT:EL. See the history and the specific diff from the date that Dan Polansky mentioned.
I tried editing with the current consensus in mind, i. e., I believe the new version reflects best our current practices.
That said, I am aware that the policy box says "It should not be modified without discussion and consensus. Any substantial or contested changes require a VOTE." That was substantial indeed, and now contested by Dan Polansky. --Daniel Carrero (talk) 14:16, 25 October 2015 (UTC)
The point is that the change is substantial. Per "substantial or contested" in "Any substantial or contested changes require a VOTE" (on the top of the page), it suffices that it is substantial; it does not even need to be contested. I really don't see what there is to discuss, unless I have woken up in some Orwellian world, again. --Dan Polansky (talk) 14:22, 25 October 2015 (UTC)
That's OK. I am going to restore the 2 policies to the point you requested and create a vote for them. --Daniel Carrero (talk) 14:24, 25 October 2015 (UTC)


I'd like to know if other people disagree with any of the changes/proposals. Thank you. --Daniel Carrero (talk) 15:24, 25 October 2015 (UTC)

Restoring WT:CFI

I request that WT:CFI is restored to the state from 5 September 2015‎. For free editing, there is Wiktionary:Criteria for inclusion/Editable. Thank you. --Dan Polansky (talk) 14:04, 25 October 2015 (UTC)

I restored CFI too per your request and I am going to create a vote for it, too. Though, FWIW, I'd like to say that while the changes to EL were going to be very substantial, I don't consider the changes to CFI to be substantial. Diffs:
  1. Wiktionary:Criteria for inclusion: 35064944 (diff)
  2. Wiktionary:Entry layout: 35064941 (diff)
--Daniel Carrero (talk) 14:40, 25 October 2015 (UTC)
The CFI change may be less substantial but is bad, and I am going to oppose it. It introduces phrasing "It has been voted" and it introduces rationales in "One reason for having separate pages ...". That is bad for a policy page, IMHO, and AFAIK some people agree with me in this regard. A policy page should state its shoulds AKA regulations and that's it. It should not state "It has been voted on"; we have references to votes for that. And rationales should be in the votes that lead to the policy, not in the policy itself, IMHO. --Dan Polansky (talk) 15:29, 25 October 2015 (UTC)
@Dan Polansky: Point taken. In my proposed revisions, there is 1 explicit mention of a vote in the CFI, and 1 in the ELE. I intend to remove those from the proposal, per your criticism. Are there many other points you would disagree with? Or: Would you support the proposal? If not, what could change in the proposal before you could consider supporting it?
Sometimes, I see you posting long, detailed arguments about a given issue. If you have time/interest to review the proposal to be voted on, we could discuss any changes to be made before the vote starts. --Daniel Carrero (talk) 22:10, 26 October 2015 (UTC)

Suggestion: Edit to Template:policy

I propose editing Template:policy like this, to organize the different types of policies:

This is a Wiktionary policy, guideline or common practices page. It must not be modified without a VOTE.
Entries: CFI - EL - NORM - NPOV - QUOTE - DELETE. Languages: LT - AXX. Others: BLOCK - BOTS.

(I removed WT:REDIR recently as outdated and added WT:LT because I believe it's important, like WT:AXX.)

--Daniel Carrero (talk) 02:12, 27 October 2015 (UTC)

There are pages like WT:About English where parts of the pages are policy and parts aren't. I think this could be made clearer with an optional parameter in the template. Renard Migrant (talk) 14:27, 27 October 2015 (UTC)

Category:Regional Hebrew for diachronic varieties

Currently, the categories Category:Classical Hebrew, Category:Biblical Hebrew, Category:Mishnaic Hebrew, and Category:Israeli Hebrew are categorized under Category:Regional Hebrew. However, these are all diachronic varieties of Hebrew and all existed in essentially the same region. Is there a better way to categorize diachronic varieties of a language? What other languages have this problem? --WikiTiki89 22:24, 27 October 2015 (UTC)

Pretty much every old language that has varieties. Latin in particular. —CodeCat 22:46, 27 October 2015 (UTC)
Yes but we don't seem to have a Category:Regional Latin. --WikiTiki89 01:11, 28 October 2015 (UTC)

Documenting how to handle long s and ligatures

We exclude a number of graphical variants, such as long s (Talk:diſtinguiſh) and ligatures like f-i, s-t, f-f-l and so forth (Talk:fisherwoman, Talk:philerast), but these two practices are not explicitly documented on a Wiktionary-namespace page as far as I can tell. I'd like to know if these practices are still supported; if they are, I'll document them somewhere (perhaps in WT:CFI#Spellings in the vicinity of the line about combining characters?). By the way, can our JavaScript be made to redirect ﬁ etc. to fi etc., like it redirects ſ to s? (Go to [[ſiſter]] and you're sent to [[sister]] after a second, but go to [[ﬁsh]] and your browser sits on the blank page.) - -sche (discuss) 02:15, 28 October 2015 (UTC)

Wouldn't this be very language dependent? So maybe the specific language policy pages would be a better place. DTLHS (talk) 02:32, 28 October 2015 (UTC)
Not really. Is there any language where ﬁ is a different letter from fi? —CodeCat 03:11, 28 October 2015 (UTC)
Make it policy (vote anyone? as much as I hate the v-word) and then add exceptions as we find them. If we find them, I agree with CodeCat there probably aren't any. A lot of our policies aren't documented because of the difficulty of getting stuff through a vote (votes failing with 65% support and whatnot) but it's rarely a problem because there are few enough of us we can just discuss it. Renard Migrant (talk) 13:20, 28 October 2015 (UTC)
The rules actually were very language-dependent, especially with regard to sequences of two or more S's. --WikiTiki89 14:49, 28 October 2015 (UTC)
Go on. Renard Migrant (talk) 15:03, 28 October 2015 (UTC)
I swear I thought there were differences, but I can no longer find any evidence of them. Perhaps I was thinking of v vs. u, where some languages used u even at the beginnings of words. --WikiTiki89 15:09, 29 October 2015 (UTC)
@Wikitiki89: here you go. --Romanophile (contributions) 15:30, 29 October 2015 (UTC)
That really is a good source supporting the idea that the long s is a typographical variant of s as opposed to another letter. Renard Migrant (talk) 16:50, 29 October 2015 (UTC)
I think that we should make an appendix for ſ, similar to how we have an appendix for capital letters. For the stylistic ligatures, though, there’s not much to say about them. Even as potential redirects, they wouldn’t be very utile considering that very few people know how to type them. Plus, Unicode discourages them anyway. Automatic redirections would be acceptable with me, though.
A policy prohibiting stylistic ligatures is okay with me. It would have been nicer if somebody made that years ago, though. --Romanophile (contributions) 13:21, 29 October 2015 (UTC)
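The automatic redirection discussed in this thread could work roughly as follows. This is only a sketch in Python (the site's actual redirect script is JavaScript and is not shown in this discussion); the function name `redirect_target` and the choice to cover exactly the Latin ligature block U+FB00 to U+FB06 plus long s are assumptions made for illustration.

```python
import unicodedata

# Assumption: only the Latin stylistic ligatures (U+FB00..U+FB06) and the
# long s (U+017F) are treated as typographic variants.
TYPOGRAPHIC_VARIANTS = {chr(cp) for cp in range(0xFB00, 0xFB07)} | {"\u017F"}


def redirect_target(title: str) -> str:
    """Return the title with each listed variant replaced by its plain form.

    NFKC normalization is applied only to the listed characters, so other
    characters in the title are left untouched.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if ch in TYPOGRAPHIC_VARIANTS else ch
        for ch in title
    )


print(redirect_target("\u017Fi\u017Fter"))  # long-s spelling of "sister"
print(redirect_target("\uFB01sh"))          # "fish" written with the fi ligature
```

Restricting the normalization to an explicit set keeps the redirect from touching characters such as superscripts or fullwidth letters, which NFKC would also rewrite if applied to the whole title.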

A few proposed changes to WT:NORM

There are a few changes that are being proposed at Wiktionary talk:Normalization of entries. Since not many people have responded, I'm letting you know here. —CodeCat 20:13, 28 October 2015 (UTC)

Restoring WT:NORM

I ask that WT:NORM is restored to the state of 5 September 2015. Since then, substantive (meaning-changing) changes took place without a vote, and that cannot be per what it says at the top of WT:NORM: "Any substantial or contested changes require a VOTE." The addition of the template containing this text to the page is a consequence of Wiktionary:Votes/pl-2015-07/Normalization of entries 2.

Let me reiterate that my contesting the changes now is not necessary; the condition contains an or: "Any substantial or contested changes require a VOTE." --Dan Polansky (talk) 20:18, 28 October 2015 (UTC)

You haven't actually contested any changes. You just voiced a blanket "I don't like it"-style disagreement with no rationale. —CodeCat 20:20, 28 October 2015 (UTC)
I have made the restoration. I point the above editor and anyone else to the word "or" in the condition. I have my hopes. --Dan Polansky (talk) 20:20, 28 October 2015 (UTC)
I've restored the current version. The changes that were made didn't change the meaning, so this proposal is unconstructive and in bad faith. On that ground I reserve the right to reverse it. —CodeCat 20:23, 28 October 2015 (UTC)
I ask for restoration. I won't revert war on that page; someone else has to restore the page to a proper state. If the page will not get restored, it will ipso facto cease to be a policy. --Dan Polansky (talk) 20:27, 28 October 2015 (UTC)
What Dan Polansky is saying is that a voted-on policy shouldn't be altered without a vote. It is a bit of a pain when someone missed a comma out and nobody wanted to vote against a proposal merely on the basis of a missing comma, then you need another vote just to insert a bleeding comma where one is needed. Renard Migrant (talk) 16:43, 29 October 2015 (UTC)
Eats, shoots, and leaves. DCDuring TALK 17:52, 29 October 2015 (UTC)
That is not entirely accurate. You can correct a missing or misplaced comma without a vote as long as it does not change the meaning of the sentence. It follows from "Any substantial or contested changes require a VOTE", which the community voted to apply in Wiktionary:Votes/pl-2012-03/Vote requirements for policy changes. --Dan Polansky (talk) 09:32, 7 November 2015 (UTC)
As such pages are not assuredly on anyone's watchlist and certainly not on everyone's, shouldn't a contributor making such changes draw attention to them by giving notice here to expose them to being deemed "substantial or contested"? To me that seems like an efficient way of reducing acrimony. This search illustrates that commas have been involved in controversies of interpretation. DCDuring TALK 12:49, 7 November 2015 (UTC)
I think it is enough for such pages to be on the watchlist of the admins who care enough to enforce such policies. --WikiTiki89 14:51, 9 November 2015 (UTC)
What about those who would prefer that such policies were not altered and then imposed on them? What about new and occasional contributors, potential recruits to more substantial contribution?
We seem to have a more than sufficient number of folks whose principal contribution is finding rules to impose on content contributors. The rules can hardly be said to make it easier for new contributors to get involved in our efforts. DCDuring TALK 21:47, 9 November 2015 (UTC)
Which is why no one can make any changes to the rules (i.e. changes to the substance of a policy page) without a vote; and admins who care enough will watch the policy pages to make sure such changes are not made. --WikiTiki89 22:02, 9 November 2015 (UTC)

"Proper" codes for etymology-only languages part 2[edit]

In Wiktionary:Beer_parlour/2013/December#"Proper" codes for etymology-only languages, it was proposed to rename the etymology-only language codes into the "proper" format of aaa-aaa, like this: Late Latin/LL. > "la-lat". Can we do that now? I take it that would require a vote?

Rationale: Standardization. It is weird that the different codes work in different ways:


  • Late Latin: both {{etyl|LL.|en}} and {{etyl|Late Latin|en}} work (it does not matter if we use the code or name)
  • Latin: only {{etyl|la|en}} works, {{etyl|Latin|en}} does not work (we have to use the code, not the name)


  • Late Latin: only {{etyl|la-lat|en}} should work, {{etyl|Late Latin|en}} would not work anymore

It would require a bot changing the codes in all entries before full implementation.

That discussion also introduced the idea of leaving both old and new codes working together for a while (LL. and la-lat) as a transition period while people get used to them. What do you think? --Daniel Carrero (talk) 03:38, 29 October 2015 (UTC)

We've already been in the transitional period since then. The new codes already work. See Module:etymology languages/data. —CodeCat 13:35, 29 October 2015 (UTC)
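The transition period described above, in which old and new codes both resolve, can be pictured as an alias table sitting in front of the canonical codes. The following Python sketch is illustrative only: it does not reproduce Module:etymology languages/data (which is a Lua module), and the table contents and the `resolve` function are hypothetical apart from the codes `LL.` and `la-lat` named in this thread.

```python
# Canonical etymology-only codes (hypothetical, one entry for illustration).
ETYMOLOGY_LANGUAGES = {"la-lat": "Late Latin"}

# Legacy spellings accepted during the transition (hypothetical table).
ALIASES = {"LL.": "la-lat", "Late Latin": "la-lat"}


def resolve(code: str, allow_aliases: bool = True) -> str:
    """Return the canonical code, optionally accepting legacy aliases."""
    if code in ETYMOLOGY_LANGUAGES:
        return code
    if allow_aliases and code in ALIASES:
        return ALIASES[code]
    raise KeyError(f"unknown etymology language code: {code!r}")


print(resolve("LL."))     # legacy code still resolves during the transition
print(resolve("la-lat"))  # new code resolves directly
```

Ending the transition then amounts to calling `resolve` with `allow_aliases=False` once a bot has replaced the old codes in all entries.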


This category claims that Singlish is an English-based creole. If that is true, it needs to be given its own language code and this category needs to stop being used in English entries. — Ungoliant (falai) 13:41, 29 October 2015 (UTC)

Wouldn't that force us to have separate Singlish sections for the thousands of nouns, adjectives, etc. that are used unaltered from English in Singlish? (Actually, is the same true of e.g. Scots?) Equinox 11:11, 30 October 2015 (UTC)
Yes, but if it’s a different language it’s a different language. — Ungoliant (falai) 13:47, 30 October 2015 (UTC)
Singlish lists three sources asserting that it's a creole, and looking at the example sentences, especially those labeled basilectal, I'm inclined to agree. (I certainly wouldn't consider "Dis guy Singrish si beh zai sia" to be an utterance of a dialect of English.) According to Category:Creole or pidgin languages, the code we use for creoles and pidgins is crp, so maybe crp-sng? Incidentally, why do we have both Category:Creole or pidgin languages and Category:Pidgins and creole languages? —Aɴɢʀ (talk) 14:54, 30 October 2015 (UTC)
Line 172 of Module:category tree/langcatboiler is the culprit. — Ungoliant (falai) 15:07, 30 October 2015 (UTC)
More specifically, that module is not in agreement with Module:families/data, which specifies "creole or pidgin" as the name of the language family. Can someone more versed in editing modules than I am please fix it? —Aɴɢʀ (talk) 19:59, 30 October 2015 (UTC)
I have merged the categories at Category:Creole or pidgin languages. - -sche (discuss) 23:15, 11 November 2015 (UTC)

Rollback in LenovoTest01

Add flag rollback LenovoTest01. admin group. LenovoTest01 (talk) 09:11, 30 October 2015 (UTC)

Why? Who are you? Equinox 11:07, 30 October 2015 (UTC)
According to WP, a blocked sockpuppet of w:User:Никита-Родин-2002, who progressed from vandalism to good-faith-but-disruptive cluelessness and incompetence, all using a host of IPs and sockpuppets. Chuck Entz (talk) 21:03, 11 November 2015 (UTC)
Request denied. Unknown user with zero edits. SemperBlotto (talk) 11:17, 30 October 2015 (UTC)

Wiktionary:Votes/pl-2015-10/Entry name section

FYI: I created Wiktionary:Votes/pl-2015-10/Entry name section. It is a vote about the "entry name" section of WT:EL. --Daniel Carrero (talk) 18:20, 30 October 2015 (UTC)

November 2015

"Headword line" vote started[edit]

FYI: Wiktionary:Votes/pl-2015-10/Headword line started today. --Daniel Carrero (talk) 02:05, 1 November 2015 (UTC)

Renaming Wiktionary:Normalization of entries (WT:NORM) → Wiktionary:Entry source code (WT:ESC)

What do you think about renaming Wiktionary:Normalization of entries (WT:NORM) into Wiktionary:Entry source code (WT:ESC)? (while keeping the old name and shortcut as usable redirects, naturally)

The name would be more intuitive as to the actual purpose of the policy.

I apologize since I'm the one who had chosen the current name of the policy ("Normalization of entries") in the first place, based on that old discussion called "Normalization of articles". But I've been thinking a lot about this policy and I believe it would be an improvement for the reason above.

It would also match the name of the policy Wiktionary:Entry layout. (rather than "layout of entries")

Finally, the name "Normalization of entries" is unclear. Since Wiktionary:Entry layout, too, provides rules for "normalization" of "entries", the uninformed reader cannot tell at a glance why we have two different policies currently named as "Entry layout" and "Normalization of entries". If the names were "Entry layout" and "Entry source code", it would be easier to make that distinction. --Daniel Carrero (talk) 03:32, 1 November 2015 (UTC)

I think a better name would be Wiktionary:Wikicode normalization. —CodeCat 14:13, 1 November 2015 (UTC)
Between "Entry source code" and "Wikicode normalization", I prefer the former. As a secondary, not very important reason, the acronym WT:ESC looks nice. The main reason is this:
IMO, the word "Normalization" is superfluous in the same way that the word "explained" was superfluous in WT:ELE. You could have: Wiktionary:Normalization of entry layout, Wiktionary:Normalization of criteria for inclusion, Wiktionary:Normalization of blocking policy, Wiktionary:Normalization of page deletion guidelines and Normalization of bots. --Daniel Carrero (talk) 17:15, 1 November 2015 (UTC)
@CodeCat: That said, I'm going to add "Wikicode normalization" to the proposal of my vote per your suggestion. I am going to oppose "Wikicode normalization", but I don't know how other people will vote. I'm going to use "WCN" as the proposed abbreviation; please edit the page or propose another one if you disagree with "WCN". --Daniel Carrero (talk) 17:20, 1 November 2015 (UTC)
The presence of source code in the name may scare non-programmers from reading it. — Ungoliant (falai) 17:34, 1 November 2015 (UTC)
What about Wiktionary:Wikicode style guide? --WikiTiki89 17:36, 1 November 2015 (UTC)
Excellent. I support that. — Ungoliant (falai) 17:38, 1 November 2015 (UTC)
Ungoliant's comment about scaring non-programmers is true, maybe the name really should have "wikicode" and not "source code". A few possibilities:
--Daniel Carrero (talk) 17:47, 1 November 2015 (UTC)
the hell is wikicode?--Dixtosa (talk) 18:53, 2 November 2015 (UTC)
@Dixtosa: w:Help:Wiki markup. --WikiTiki89 19:01, 2 November 2015 (UTC)
wikicode (currently a redlink) = wikitext (currently a bluelink) --Daniel Carrero (talk) 19:05, 2 November 2015 (UTC)
The chance of "wikicode" being understood as related to language codes is high. Let's stick to plain, easy WT:NORM, which coincidentally has a good shortcut ?:/
Also, Wikipedia's page on wiki markup mentions wikicode only once, for a reason. --Dixtosa (talk) 19:30, 2 November 2015 (UTC)
If we're going to use "wiki-[something]", "wikitext" seems both more common and 'friendlier' to a non-programmer than "wikicode". "Normalization of entries" does sound like it should be about rules on e.g. using accents vs macrons vs nothing for Old English vowel length. - -sche (discuss) 19:42, 2 November 2015 (UTC)
What about using "wiki markup" or "markup" in the policy name?
--Daniel Carrero (talk) 19:56, 2 November 2015 (UTC)
I like "Entry markup". "Wiki markup" sounds a bit like a how-to guide, how to use wiki markup. - -sche (discuss) 20:20, 2 November 2015 (UTC)

Created Wiktionary:Votes/pl-2015-11/NORM: 10 proposals

I created Wiktionary:Votes/pl-2015-11/NORM: 10 proposals.

This vote is larger than average, so I've set it up to start in 14 days and last for 2 months.

Feel free to discuss. --Daniel Carrero (talk) 07:21, 1 November 2015 (UTC)

I got these 10 proposals from CodeCat's reverted edits, from the Beer parlour, the NORM talk page and the NORM votes. The "Discussions" part of the vote should be able to link all the sources. --Daniel Carrero (talk) 03:01, 2 November 2015 (UTC)

Proposal: exclude non-printing characters, encoding variants of characters, and combining characters

Combining diacritics are really just variants of their non-combining equivalents. They are not lexicographically different characters. They also pose many technical problems because of how they are handled. Non-printing characters like control codes are also difficult to work with, and also have no lexicographical value since they do not appear in text by definition. We currently have some control codes as redirects, but these redirects are themselves inaccessible and can't be edited (not even by a bot, as I found). As for "encoding variants", I'm talking about things like Ⅽ versus C. These are the same character, merely encoded with different code points in Unicode and displayed a bit differently by the font. I think Ⅽ should redirect to C. —CodeCat 14:12, 1 November 2015 (UTC)

Support Redirect Ⅽ ("Roman numeral" C), Ｃ ("Fullwidth" C) and probably others to C, then place a complete list of codepoints for varieties of "C" on the main entry. --Daniel Carrero (talk) 16:49, 1 November 2015 (UTC)
Are you talking about entries for the individual characters only? Some languages need combining characters because they use unencoded character + diacritic combinations. — Ungoliant (falai) 17:37, 1 November 2015 (UTC)
Yes, just for the single characters. Links to combining characters go wrong in all kinds of ways, just try clicking on this link: ̅. —CodeCat 17:39, 1 November 2015 (UTC)
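The three groups named in this proposal can be told apart mechanically from Unicode character properties. The sketch below is one possible reading, not an agreed rule: it assumes "combining" means any general category starting with M, "non-printing" means categories Cc and Cf, and "encoding variant" means any character that NFKC normalization changes. The function name `classify` is hypothetical.

```python
import unicodedata


def classify(ch: str) -> str:
    """Classify a single character under the assumed reading of the proposal."""
    category = unicodedata.category(ch)
    if category.startswith("M"):
        return "combining"
    if category in ("Cc", "Cf"):
        return "non-printing"
    if unicodedata.normalize("NFKC", ch) != ch:
        return "encoding variant"
    return "ordinary"


for ch in ("C", "\u216D", "\uFF23", "\u0305", "\u0007"):
    print(f"U+{ord(ch):04X}", classify(ch))
```

Under this reading, a character classified as an encoding variant would redirect to its NFKC form, which is C for both the Roman numeral (U+216D) and the fullwidth (U+FF23) letter.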

Context label "North America"

Currently, {{lb|en|North America}} produces the text (Canada, US). I think this is wrong. If someone puts "North America" as a context label, it should show up as (North America); if they wanted to say "Canada, US", they would have. --WikiTiki89 16:36, 1 November 2015 (UTC)

It would be quite funny if {{lb|en|UK}} produced (England, Northern Ireland, Scotland, Wales). Seriously though, Wikitiki89 is right. Renard Migrant (talk) 23:53, 1 November 2015 (UTC)
I agree. This generates wrong information when, e.g., the label is used for a sense that is used in Mexican and US Spanish. — Ungoliant (falai) 00:01, 2 November 2015 (UTC)
The behaviour was implemented because the template was being used as a shorthand for "Canada, US" in French and English entries, and it was agreed that it was bad to split and have only some entries in "Category:Canadian English"+"Category:American English" while others were in "Category:North American English". I oppose going back to "North American" and splitting the entries up again. Deprecating the label altogether and making bot or AWB runs periodically to clean up uses could work. - -sche (discuss) 01:15, 2 November 2015 (UTC)
I would be against a bot cleaning these things up. We could automatically put anything tagged with "North American" into all three categories. Either that or have "US" and "Canada" also put words into "North American English". If this is a categorization issue, it should be solved with categorization. --WikiTiki89 16:54, 2 November 2015 (UTC)
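The categorization-based fix suggested above separates what a label displays from which categories it adds. A minimal Python sketch of that idea, assuming the first variant (a "North America" label feeds all three categories); the table layout and the `render_label` function are hypothetical and do not reflect how the label module is actually written.

```python
# Hypothetical label data: displayed text is kept apart from categories.
LABELS = {
    "US": {"display": "US", "categories": ["American English"]},
    "Canada": {"display": "Canada", "categories": ["Canadian English"]},
    "North America": {
        "display": "North America",
        "categories": [
            "North American English",
            "American English",
            "Canadian English",
        ],
    },
}


def render_label(label: str) -> tuple[str, list[str]]:
    """Return the parenthesized display text and the category names for a label."""
    data = LABELS[label]
    return f"({data['display']})", [f"Category:{name}" for name in data["categories"]]


text, categories = render_label("North America")
print(text)
print(categories)
```

With this split, the label shows what the editor typed while the entries still land in the same categories as those tagged "Canada, US".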

Changes to Template:votes

Template:votes has been recently edited to cause the enddate in past votes to be formatted yellow, and some icons have been added to past votes and votes near the enddate.

IMO it would look better without the icons. If there's no support for the icons, I request that they be removed.

Also, I don't know if the yellow text should stay. Making a distinction for past votes could be useful, and I know yellow text in this case means "past vote", but this meaning is not intuitive. Sometimes, a vote past the enddate remains open for voting because nobody closed it yet, so red text would still be appropriate and yellow text would be misleading. --Daniel Carrero (talk) 17:05, 1 November 2015 (UTC)

I agree, the icons should be removed. Perhaps the past votes should be red and the "ending soon" votes should be yellow? We would also need to use a darker yellow so that the date is actually readable on a white background. --WikiTiki89 17:14, 1 November 2015 (UTC)
I prefer to keep the icons. —CodeCat 17:40, 1 November 2015 (UTC)
The icons are too distracting. I went ahead and removed them and made the color scheme more intuitive. Now, votes ending soon will be orange, votes ending today will be red, and votes that have already ended will be gray. --WikiTiki89 18:41, 2 November 2015 (UTC)
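The colour scheme described in the last comment reduces to a comparison between the end date and today's date. A small Python sketch of that rule; the seven-day threshold for "ending soon" and the function name `end_date_color` are assumptions, since the thread does not say how "soon" is defined.

```python
from datetime import date


def end_date_color(end: date, today: date, soon_days: int = 7) -> str:
    """Pick the display colour for a vote's end date.

    soon_days is an assumed threshold for "ending soon".
    """
    days_left = (end - today).days
    if days_left < 0:
        return "gray"      # vote has already ended
    if days_left == 0:
        return "red"       # vote ends today
    if days_left <= soon_days:
        return "orange"    # vote ends soon
    return "default"       # no special colour


today = date(2015, 11, 2)
for end in (date(2015, 10, 31), date(2015, 11, 2), date(2015, 11, 5), date(2016, 1, 15)):
    print(end.isoformat(), end_date_color(end, today))
```

Note that this rule colours a vote gray once its end date passes even if nobody has closed it yet, which is the case raised earlier in the thread.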

Separate, simplified pages for letters

I suggest moving all the definitions of letters into separate, simplified pages to try and fix the problem of cluttered letter pages.

For instance, Letter:D or Appendix:Letters/D could have the following contents. I didn't take the trouble to link all the letters of the alphabet to other pages, but in the end they should be linked.

Capital and lowercase versions of D, in normal and italic type.

D uppercase, d lowercase

English: 4th letter. Name: dee.
  • Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Oo, Pp, Qq, Rr, Ss, Tt, Uu, Vv, Ww, Xx, Yy, Zz
Esperanto: 5th letter. Name: do.
  • Aa, Bb, Cc, Ĉĉ, Dd, Ee, Ff, Gg, Ĝĝ, Hh, Ĥĥ, Ii, Jj, Ĵĵ, Kk, Ll, Mm, Nn, Oo, Pp, Rr, Ss, Ŝŝ, Tt, Uu, Ŭŭ, Vv, Zz
Finnish: 4th letter. Name: dee.
  • Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Oo, Pp, Qq, Rr, Ss, Šš, Tt, Uu, Vv, Ww, Xx, Yy, Zz, Žž, Åå, Ää, Öö
Latvian: 6th letter. Name:
  • Aa, Āā, Bb, Cc, Čč, Dd, Ee, Ēē, Ff, Gg, Ģģ, Hh, Ii, Īī, Jj, Kk, Ķķ, Ll, Ļļ, Mm, Nn, Ņņ, Oo, Pp, Rr, Ss, Šš, Tt, Uu, Ūū, Vv, Zz, Žž
Portuguese: 4th letter. Name: .
  • Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Oo, Pp, Qq, Rr, Ss, Tt, Uu, Vv, Ww, Xx, Yy, Zz
Spanish: 4th letter. Name: de.
  • Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Ññ, Oo, Pp, Qq, Rr, Ss, Tt, Uu, Vv, Ww, Xx, Yy, Zz
Turkish: 5th letter. Name: de.
  • Aa, Bb, Cc, Çç, Dd, Ee, Ff, Gg, Ğğ, Hh, Iı, İi, Jj, Kk, Ll, Mm, Nn, Oo, Öö, Pp, Rr, Ss, Şş, Tt, Uu, Üü, Vv, Yy, Zz

Thoughts? I've added an image for neatness, it would be optional. --Daniel Carrero (talk) 02:26, 2 November 2015 (UTC)

I’d rather see appendices for every letter in a given language rather than the opposite (otherwise a page like Appendix:Letters/D will eventually end up having thousands of entries), but I support anything that stops them from cluttering the entries of real words. — Ungoliant (falai) 02:31, 2 November 2015 (UTC)
PS: with the exception of translingual. — Ungoliant (falai) 02:31, 2 November 2015 (UTC)
Perhaps we could keep the Translingual letter definition in all entries, and link the Translingual definitions to the separate pages like Letter:D or Appendix:Letters/D. --Daniel Carrero (talk) 02:35, 2 November 2015 (UTC)
I think it's a great idea! It would certainly clean up the letter entries. Aryamanarora (talk) 12:42, 2 November 2015 (UTC)
I prefer Ungoliant's approach, with per-language pages rather than per-letter pages. We should include links to all of these pages, or a category containing them, in the Translingual section. —CodeCat 17:23, 2 November 2015 (UTC)
Ungoliant is right that per-letter appendices won't reduce clutter (they will themselves become crowded), so per-language appendices seem better. I would definitely keep translingual sections on all of the letters, from which to link to the per-language appendices. (I would also include a ===See also=== link to a language's appendix any time we happened to have an entry for a single-letter word in that language, e.g. a#English, n#Aromanian, e#Hungarian.) - -sche (discuss) 17:58, 2 November 2015 (UTC)
Will, for instance, the entry Ğ link only to alphabet appendices that use that letter, or will it link to the full list of alphabet appendices?
Is there going to be some way to know what are the alphabets that use Ğ? Or Ñ? --Daniel Carrero (talk) 18:06, 2 November 2015 (UTC)
If we decide to link to individual appendices, then presumably Ğ#Translingual will link only to the appendices of languages that use Ğ. But then whatever section we put the links in (say, ===See also===) could easily grow to contain a hundred lines, and have to be updated any time a new appendix was created... we're probably better categorizing the appendices and then linking from entries to the category, an idea CodeCat mentioned above. Per-language appendices would allow us to give pronunciations / notes on orthography, letter names, etc... perhaps, especially for the many small languages where the info is just "the Foobarese alphabet is ABCD...YZ", the appendices could just be the WT:About X appendices. - -sche (discuss) 19:46, 2 November 2015 (UTC)
We could also put each language's appendix in a category for each character it uses; then, on the entry for a, we could put an {{also}} link to Category:Languages that use the character "a". --WikiTiki89 20:42, 2 November 2015 (UTC)
That sounds like the most workable solution. —CodeCat 21:03, 2 November 2015 (UTC)
But the English alphabet has 26 letters, from A to Z, so the categories would look like this:
Categories: Languages that use the character "a" | Languages that use the character "b" | Languages that use the character "c" | Languages that use the character "d" | Languages that use the character "e" | Languages that use the character "f" | Languages that use the character "g" | Languages that use the character "h" | Languages that use the character "i" | Languages that use the character "j" | Languages that use the character "k" | Languages that use the character "l" | Languages that use the character "m" | Languages that use the character "n" | Languages that use the character "o" | Languages that use the character "p" | Languages that use the character "q" | Languages that use the character "r" | Languages that use the character "s" | Languages that use the character "t" | Languages that use the character "u" | Languages that use the character "v" | Languages that use the character "w" | Languages that use the character "x" | Languages that use the character "y" | Languages that use the character "z"
That is too long, and also a bit hard to find the right letter inside that wall of text. Couldn't we use just Category:A, Category:B, etc. as the category names? That way, the categories would be:
Categories: A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z
--Daniel Carrero (talk) 21:39, 2 November 2015 (UTC)
Well the point is more that at each letter in the mainspace, there will be a link to just one category. They aren't intended to all be listed on each language's page. Of course this long list will appear on the bottom of the page, but that's not really a big deal. Maybe we should make them hidden categories? --WikiTiki89 03:27, 3 November 2015 (UTC)
But if we move all the letter definitions into individual language appendices like Appendix:Letters/French, then the categorization above would apply to the appendix, right? --Daniel Carrero (talk) 05:07, 3 November 2015 (UTC)
But also I don't see anything wrong with the single-letter category names either. --WikiTiki89 03:29, 3 November 2015 (UTC)
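To compare the two naming schemes discussed above, here is a small sketch that generates the category names an appendix would carry. The function name and the `short_names` flag are hypothetical, and this is not how any existing template works.

```python
def character_categories(alphabet, short_names=False):
    """Return one category name per letter of the given alphabet."""
    names = []
    for letter in alphabet:
        if short_names:
            # Note: str.upper() is not locale-aware, so Turkish dotted
            # and dotless i would need special handling.
            names.append(f"Category:{letter.upper()}")
        else:
            names.append(
                f'Category:Languages that use the character "{letter}"'
            )
    return names


english = "abcdefghijklmnopqrstuvwxyz"
long_form = " | ".join(character_categories(english))
short_form = " | ".join(character_categories(english, short_names=True))
```

Comparing `len(long_form)` with `len(short_form)` gives a rough idea of how much the category footer shrinks with the single-letter names.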

Suppose we create a simple appendix for all the letters in a single language.

Appendix:Letters/English could have:


Aa, Bb, Cc, Dd, Ee, Ff, Gg, Hh, Ii, Jj, Kk, Ll, Mm, Nn, Oo, Pp, Qq, Rr, Ss, Tt, Uu, Vv, Ww, Xx, Yy, Zz

Letter names

a, bee, cee, dee, e, ef, gee, aitch, i, jay, kay, el, em, en, o, pee, cue, ar, ess, tee, u, vee, double-u, ex, wye, zee/zed

A few questions and comments:

  1. Or should we use a table format?
  2. Should the appendix include IPA pronunciations? Sound files too?
  3. We already have a number of randomly-formatted alphabet appendices like Appendix:Mapudungun alphabet, Appendix:Polish alphabet and Appendix:Polish alphabet. Can all the appendices follow the same format?
  4. Appendix:Latvian alphabet has long texts with information and Appendix:English alphabet has a list of letters using certain ligatures, do we need all that in the appendices? I'd rather each alphabet appendix was basically just a simple, standardized list of letters. I mean, Wikipedia has w:English alphabet to fill with lots of historical information and we already have categories for words in English spelled with the ligatures.
  5. Suppose the appendix could be called any of those: Appendix:English letters, Appendix:English alphabet, Appendix:Letters/English, Appendix:Alphabet/English... But the words "Letters" and "Alphabet" don't apply to all languages, do they? What about Persian, Korean, Chinese, etc.? Is there a more comprehensive name for all writing systems? Appendix:Writing/English or Appendix:Characters/English, maybe?

--Daniel Carrero (talk) 05:48, 3 November 2015 (UTC)

Several issues with the categorisation of placenames

I’ve recently finished going through Portuguese-language toponyms, and in the process I came across several inconsistencies, issues and uncertainties in our category scheme:

Ungoliant (falai) 19:41, 2 November 2015 (UTC)

All of these are good points and I support fixing it up. —CodeCat 19:52, 2 November 2015 (UTC)
Yes, merge same-level subdivisions (so, Category:en:Provinces and territories of Canada; and merge Category:en:Autonomous oblasts of Russia into Category:en:Oblasts of Russia).
Yes, towns should also be categorized by country if cities are, and given that the distinction may be nebulous for many countries, or that inconsistent metrics may apply between countries, it may be best to lump them all into one category.
"State capitals of the United States", maybe?
Cities in the Crimea could go into both Ukrainian and Russian categories. (How do we handle cities in Palestine/Israel? In the Golan Heights?)
Modern usage is "in Ukraine" and we lemmatize [[Ukraine]], so let's be modern in the categories, too.
- -sche (discuss) 20:18, 2 November 2015 (UTC)
Russia also has krais, okrugs, autonomous okrugs and federal cities. I was hoping we could pick a name that encompasses all of them (Federal units of Russia?) — Ungoliant (falai) 20:21, 2 November 2015 (UTC)
In any case, I’ll have to start RFM discussions. I’m just checking if there is support for the general idea that there should be a single category for all subdivisions of a given country. — Ungoliant (falai) 20:27, 2 November 2015 (UTC)
(Two edit conflicts later) Be careful about cleaning them up: "States" and "Territories" in Australia are two different things in both law and common speech. Victoria, New South Wales and South Australia are states, Northern Territory, Australian Capital Territory are not states. (Although, to be extra confusing, there has been talk of promoting Northern Territory to statehood, but possibly keeping the name "Northern Territory".) --Catsidhe (verba, facta) 20:22, 2 November 2015 (UTC)
@Ungoliant MMDCCLXIV Just a note, you should check Special:WantedCategories, as there are a bunch of Portuguese-related place name categories that need to be created. Benwing2 (talk) 22:02, 2 November 2015 (UTC)
Perhaps we ought to make a project or appendix page of possibly difficult-to-categorize place names or types of place names. DCDuring TALK 17:00, 3 November 2015 (UTC)
  • Comment on a few of these:
  1. The oblast category you mention is in a gots-to-go situation; I'll even nominate it for deletion myself
  2. I don't really see the need to combine Canadian provinces and territories (or Australian states and territories). If we do, the resulting category, likely titled Category:en:Provinces and territories of Canada, would a) contain two different types of political entities, and b) be a longer title than the existing categories.
  3. Nothing "hideous" about Category:US State Capitals. a) There should be a category containing the 50 items currently in this category, and b) that's the shortest way to unambiguously title said category.

I would remind people that moving a category or template is not an action to be taken lightly; we've been a little too cavalier about it in the past. Purplebackpack89 21:16, 4 November 2015 (UTC)

  • "Category:States and territories of Australia" would be a better name. Although states and territories are legally distinct, they're almost always grouped together and serve the same function for addresses and whatnot. Just checked Wikipedia and they use the exact same category name. Combining them would also resolve any issues from the NT becoming a state, and the question of what you would make the parent category of these entities, or the user overlooking territories when viewing the category for states, etc. Pengo (talk) 07:43, 6 November 2015 (UTC)
    Support combining them into "States and territories of Australia". --Daniel Carrero (talk) 07:49, 6 November 2015 (UTC)

Script appendices

FYI: I populated Category:Script appendices with the help of a new template {{script appendix}}, added at the top of the appendix to generate the introduction and standardize the categories. I renamed some pages from "Appendix: X alphabet" into "Appendix:X script" for consistency with category names. Feel free to discuss. --Daniel Carrero (talk) 04:54, 3 November 2015 (UTC)

Written-out fractions

Pursuant to the discussion at Wiktionary:Requests for deletion#three quarters, I would like to propose the inclusion of a specific, limited set of eighteen written-out fractions (counting fourths and quarters as the same fraction). The reasons that I propose this particular set are as follows:

  1. These are probably the most common fractions in use (particularly given the tendency to use fifths in parliamentary procedure, and to divide inches into eighths for measurement).
  2. All of these are fractions for which the numerical form is available in Unicode, particularly ½, ⅓, ⅔, ¼, ¾, ⅕, ⅖, ⅗, ⅘, ⅙, ⅚, ⅐, ⅛, ⅜, ⅝, ⅞, ⅑, and ⅒ (see Appendix:Unicode/Number Forms). If we have the Unicode form, we should have the written-out form as an alternate spelling.
  3. I do not know if these exist as single-word forms in other languages, but I would not be at all surprised if at least some of them do, given their commonality.

The fractions I propose to include are in two groups:

Singular fractions from one half through one tenth:

Multiple fractions for thirds, fourths, fifths, sixths, and eighths:

I note that "one tenth" is often used in an idiomatic sense of asserting that person a is "not one tenth the man" (or other property) as person b. I also propose adding the hyphenated form of each as an adjective form (one-half, one-third, etc.). If we can agree on this specific set of eighteen fractions, that will set both the floor and a ceiling on the written-out fractions that can be included in Wiktionary. Cheers! bd2412 T 14:20, 3 November 2015 (UTC)
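As a rough check of the set, the written-out forms can be derived from the Unicode character names. The sketch below uses Python's standard `unicodedata` module; the function name is made up for illustration, and the code-point ranges (U+00BC to U+00BE and U+2150 to U+215E) are my reading of which precomposed fractions are meant.

```python
import unicodedata

# Code points of the 18 precomposed fractions: U+00BC..U+00BE (Latin-1
# Supplement) and U+2150..U+215E (Number Forms).
FRACTION_CODE_POINTS = list(range(0x00BC, 0x00BF)) + list(range(0x2150, 0x215F))


def written_out_fractions():
    """Map each fraction character to a written-out form from its Unicode name."""
    prefix = "VULGAR FRACTION "
    result = {}
    for code_point in FRACTION_CODE_POINTS:
        char = chr(code_point)
        name = unicodedata.name(char)
        if name.startswith(prefix):
            result[char] = name[len(prefix):].lower()
    return result


fractions = written_out_fractions()
```

Note that the Unicode names use "quarter" and "quarters" rather than "fourth" and "fourths", so the "fourths" spellings would still need to be added by hand.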

  • Go for it! By the way, is "three fourths" the US usage? It seems very strange to UK ears. SemperBlotto (talk) 14:34, 3 November 2015 (UTC)
    • Is "three quarters" the UK usage? I'm sure "fourths" is much more common in the U.S. bd2412 T 14:49, 3 November 2015 (UTC)
      • Ngrams suggests "three quarters" is more common than "three fourths" on both sides of the Herring Pond, but that "three fourths" represents a much larger minority in en-US than in en-GB. —Aɴɢʀ (talk) 15:41, 3 November 2015 (UTC)
        • So, {{context|chiefly American}} for the "fourths"? bd2412 T 16:10, 3 November 2015 (UTC)
          • I guess. Another thing to consider is that, although I personally consider it punctuation abuse, the hyphenated forms are also very common as nouns: one-sixth, three-quarters, five-eighths, etc. They should probably be listed at least as alternative spellings. —Aɴɢʀ (talk) 16:17, 3 November 2015 (UTC)
            • I would agree with calling that punctuation abuse. I'd be tempted to list hyphenated noun forms as a misspelling. bd2412 T 16:38, 3 November 2015 (UTC)
          • Based on Ngrams, the peak popularity of fourths in Britain was 1750-1850, but it took off much more in the US, where it became pretty popular throughout the 1800s, but never quite overtook quarters and has been in a steady decline ever since. Maybe it's got something to do with avoiding confusion with a certain circular piece of metal that is not found too frequently in Britain. --WikiTiki89 16:24, 3 November 2015 (UTC)
  • Done with the basic entries. Hyphenated forms and translations will follow. Cheers! bd2412 T 21:03, 3 November 2015 (UTC)
  • Re: "the inclusion of a specific, limited set of eighteen written-out fractions" There is, of course, no reason to keep all of these if they are challenged in RfD and no reason to exclude any that are attestable if we decide that such fractions are, in general, to be excluded. It would be bad precedent to make some kind of exception to our policies without a vote.
An alternative would be for each language's (or group of languages') fraction-naming system to be documented in an appendix with redirects from attestable forms that someone finds necessary to add. DCDuring TALK 13:32, 4 November 2015 (UTC)
  • As noted above, these happen to be the 18 fractions for which Unicode versions of the numerical forms exist. I think, actually, that this speaks volumes about the set. They are, at least, alternative spellings to entries that we already have due to the inclusion of the Unicode forms. Arguably, we could have a few more fractions if they were particularly significant, like twenty-two sevenths (which is approximate pi), but I think that either the addition of any more or the exclusion of any of the above would be difficult to justify. I note that "one tenth" gets twenty times as many Google Books hits as "one eleventh", and (with the exception of the unusually popular "one twelfth") would presume the trend continues in that way. Of course, I have no objection to an appendix documenting fraction naming systems. However, we have many appendices that collect information that is also reflected in individual entries, a luxury that we have the storage space to afford. bd2412 T 16:02, 6 November 2015 (UTC)

Wiktionary-l mailing list

Hi everyone, I'm unsure if this is the correct board for this (if not, please do move it). The wiktionary-l mailing list[9] seems not to have had active list administrators, and during a migration we unfortunately had to clear out a lot of emails from the list (possibly legitimate ones). As such, we're asking for someone to take over the list so that emails sent to the mailing list are moderated and handled within an appropriate time-frame. A phabricator bug[10] is open for this. Interest can be shown either here or there - both places will be monitored. If there is no interest shown by Friday, I will close the list as inactive and it can be re-opened in future if the community want it.

Thanks, John F. Lewis (talk) 16:47, 3 November 2015 (UTC)

I don't think any regular Wiktionarians still use that mailing list. We have a small community compared to en.Wikipedia, and we tend to communicate on-wiki or by direct e-mails. - -sche (discuss) 19:36, 6 November 2015 (UTC)
I didn't even know we had a mailing list. --WikiTiki89 19:54, 6 November 2015 (UTC)
Hypothetically, I believe it's supposed to encourage communication and cooperation between the various Wiktionaries, but that doesn't happen much in any case. I see no problem in letting it die. —Μετάknowledgediscuss/deeds 17:35, 7 November 2015 (UTC)

About the "Entry name section" vote

Wiktionary:Votes/pl-2015-10/Entry name section is scheduled to start in 7 days. (it was going to start in 2 days but I delayed the start a bit)

As some people know, the vote introduces an expanded "Entry name" section in the WT:EL.

Some changes to the proposed text have already been made based on suggestions in the talk page. Please review and see if you disagree with anything in the proposed text. I suggest making any changes before the vote starts, especially if it can avoid possible opposing votes in the future.

If there's any piece of text that's controversial or unwanted by some people, I'd rather cut that out and vote about the rest, so that it can be further discussed/voted later. --Daniel Carrero (talk) 23:15, 3 November 2015 (UTC)

Proposal: Add a text box to give a search term for "terms starting with" in categories

Currently, we have table-of-contents templates for a few languages, which let you click on letters to skip to that letter in the category. But this system is wholly inadequate for a dictionary. A paper dictionary lets you skip quickly to a particular term, often using indexing terms that appear at the top of the page. For example, a paper dictionary might show that a pair of pages includes terms ordered alphabetically between "lazy" and "lecture". Something like this is sorely needed for our categories, which are currently pretty much unsearchable. The only thing you can do is manually edit the URL to skip to a particular term; there's nothing on the page itself allowing for this, other than the aforementioned TOC templates. So I think that there should be a text box which lets you say "show me terms starting alphabetically from here".

A difficulty is that I have no idea how this might be implemented. The InputBox extension only has a fixed set of things it can do, and setting the start of a category listing isn't one of them. —CodeCat 15:45, 4 November 2015 (UTC)

Support It seems like we could do it fairly easily with JavaScript, but I really hope we can do it without JavaScript. Maybe we should ask the developers? --WikiTiki89 15:58, 4 November 2015 (UTC)
I wouldn't count on them too much. We've been waiting for years to have custom category collation orders implemented, but it's still not done. —CodeCat 15:59, 4 November 2015 (UTC)
German Wiktionary has been there, done that. See de:Kategorie:Aragonesisch for an example. We could take their script, examine it and take the good parts. It is also a good idea to show the range for a category... --Dixtosa (talk) 16:17, 4 November 2015 (UTC)
Perfect example of why I would prefer doing this without JavaScript. The text box only loads sometimes and I have to refresh the page. --WikiTiki89 16:28, 4 November 2015 (UTC)
Only loads sometimes? How's that? Your JavaScript engine is stochastic? You must be using IE. --Dixtosa (talk) 16:53, 4 November 2015 (UTC)
I often have issues with scripts not working. Sometimes the buttons to expand inflection tables and translation tables go missing. —CodeCat 16:57, 4 November 2015 (UTC)
Nope, Chrome. Sometimes JS fails to load due to faulty internet connections, sometimes it fails to run because of random JS errors. JS always has lots of problems on all browsers. --WikiTiki89 18:56, 4 November 2015 (UTC)
Yes, something along these lines would be good. Equinox 16:45, 4 November 2015 (UTC)
  • Support, and consider similar measures for ending with and containing. Purplebackpack89 21:10, 4 November 2015 (UTC)
    I think "starting with" is the best we can do for now, it's the only thing that the software supports in the URL. I'd love it if categories could be dynamically sorted, either ascending or descending, and from beginning to end or end to beginning. Sorting by the end of the word is very useful especially for suffixes. —CodeCat 21:40, 4 November 2015 (UTC)
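The manual URL edit mentioned above amounts to something like the following sketch, which builds a category URL that starts the listing at a given term. It assumes the category view accepts a `from` query parameter, as the thread describes; the function name and the default base URL are illustrative only.

```python
from urllib.parse import urlencode


def category_from_url(category, start,
                      base="https://en.wiktionary.org/w/index.php"):
    """Build a URL for a category listing that starts at the term `start`."""
    # Page titles use underscores in place of spaces.
    title = f"Category:{category}".replace(" ", "_")
    # Assumption: the category page honours the "from" parameter.
    query = urlencode({"title": title, "from": start})
    return f"{base}?{query}"


url = category_from_url("English nouns", "lazy")
```

A text box on the category page would only need to collect the term and send the reader to a URL of this shape.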

Name of the namespace for reconstructed terms

As the vote on whether to create a dedicated namespace for reconstructed terms is nearly over and will in all probability pass, I would like to draw attention to the poll on the vote's discussion page regarding the name of the namespace. If you have not voted and have an opinion, please do so at Wiktionary talk:Votes/2015-09/Creating a namespace for reconstructed terms#What should we name the namespace?. --WikiTiki89 21:17, 4 November 2015 (UTC)

Deleting trademark categories?

These categories were emptied and deleted in 2014 with the text "empty category; should stay empty per WT:TM":

Some terms which pertained to these categories in the past: Durex, Air France, Skidoo, Spezi, Pampers, Pepsi and Coca-Cola.

But these categories still exist:

Should all trademark categories be emptied and deleted because of WT:TM? --Daniel Carrero (talk) 09:33, 5 November 2015 (UTC)

  • I could see us having categories for words originating as trademarks - but we should not be in the business of maintaining the current trademark status of words. If this is all these categories do, they should be dispensed with. bd2412 T 14:05, 5 November 2015 (UTC)
Consider moving the trademark info to the entries' etymology sections, rather than just wiping the category out entirely. Equinox 15:39, 5 November 2015 (UTC)
I agree with BD, delete these categories: I have been deleting them, per WT:TM and the discussions which preceded it, in which a representative of the WMF legal team participated and it was concluded that current legal/trademark status should not be noted. For entries which originated as trademarks, that lexical information should be noted in the etymology, as Equinox says. - -sche (discuss) 21:34, 6 November 2015 (UTC)

All entries ending in -sman

Would it be possible to find all English entries ending in -sman (-swoman, -speople, -sfolk...)? I think that most of these are really -s- + man, and I'd like to correct the etymologies, although of course not all of them are (talisman, chessman and hillsman are not), so an automatic change would be inappropriate. Smurrayinchester (talk) 16:17, 6 November 2015 (UTC)

Also, -shead, -sfoot and -stail. Smurrayinchester (talk) 16:20, 6 November 2015 (UTC)
I've always wished there was a Special:SuffixIndex like the existing Special:PrefixIndex. --WikiTiki89 16:24, 6 November 2015 (UTC)
I wish Special:PrefixIndex had a different name, since you can put any string there regardless of whether it's a prefix or not. And I also really wish Wiktionary had some functionality as a language-specific reverse dictionary. —Aɴɢʀ (talk) 18:34, 6 November 2015 (UTC)
They could call it Special:YouWillNeverGuessWhatThisDoes for all I care as long as we have the functionality. --WikiTiki89 20:04, 6 November 2015 (UTC)
Today I've requested the permission to use WM databases on wmflabs. If they grant me a membership, that functionality is going to be a piece of cake. --Dixtosa (talk) 21:00, 6 November 2015 (UTC)
It would be preferable to add that functionality to individual categories, rather than to all pages on a Wiki. Then you could apply it to subsets of words, like English nouns alone. This was mentioned on the Grease Pit a few days ago too. —CodeCat 21:02, 6 November 2015 (UTC)
Which option is preferable depends on what you're using it for. --WikiTiki89 21:22, 6 November 2015 (UTC)
@Smurrayinchester User:DTLHS/sman DTLHS (talk) 19:52, 6 November 2015 (UTC)
Wow, thank you! Smurrayinchester (talk) 19:58, 6 November 2015 (UTC)
Cannot this be done using the DynamicPageList extension, which creates lists of articles based on their category membership? - Amgine/ t·e 22:49, 18 November 2015 (UTC)
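For anyone working from a plain list of entry titles (for example, one extracted from a database dump), the suffix search requested above can be sketched in a few lines. The function name and the sample titles are illustrative; this is not the method used to produce the list linked above.

```python
# Endings mentioned in this thread.
SUFFIXES = ("sman", "swoman", "speople", "sfolk", "shead", "sfoot", "stail")


def titles_with_suffix(titles, suffixes=SUFFIXES):
    """Return the titles ending in any of the given suffixes, sorted."""
    return sorted(t for t in titles if t.endswith(tuple(suffixes)))


sample = ["talisman", "salesman", "coal mine", "spokeswoman", "hillsman", "cat"]
matches = titles_with_suffix(sample)
```

As noted above, the matches still need manual review: "talisman" and "hillsman" are returned even though they are not -s- + man.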

Formatting of ellipses to indicate elisions in attestations

Apologies if I've asked this in the wrong place, I'm still learning my way around Wiktionary.

I noticed that in egg on, where part of the passage in the attestation from In the South Seas was omitted, the periods in the ellipsis are spaced out . . . like so. I know that the English Wikipedia's Manual of Style prefers unspaced ... ellipses (and gives several reasons why), but I couldn't find the equivalent style guide for Wiktionary. Is there a preference here, and if so, where would I find it? —GrammarFascist (talk) 19:36, 6 November 2015 (UTC)

We have the template {{...}}, which seems to be a good way of achieving uniformity. —Aɴɢʀ (talk) 19:40, 6 November 2015 (UTC)
Thanks, Aɴɢʀ! Template applied, looks great. —GrammarFascist (talk) 22:25, 8 November 2015 (UTC)

Expanding on WT:COALMINE

Right now, WT:CFI has the following text: "Unidiomatic terms made up of multiple words are included if they are significantly more common than single-word spellings that meet criteria for inclusion". The canonical example is that coal mine is allowed because coalmine is. But I think the phrasing here misses the point of what it really should convey, which I think is:

If there is a lemma, of which at least one of the alternative forms is includable under the idiomaticity criterium, then all of them should be considered includable under that criterium.

The idea here is that idiomaticity shouldn't depend on orthographic representation, so if a form has different spellings, it makes sense to treat the collection of them as idiomatic if at least one of them is. Should CFI be modified to this effect? It would have the consequence of removing the "significantly more common" part, meaning that if coalmine were the most common form, then coal mine, being an alternative form of it, should also be includable. —CodeCat 00:48, 7 November 2015 (UTC)

Wouldn't that mean that coalmine would not be includable, since "coal mine" isn't idiomatic? DTLHS (talk) 01:08, 7 November 2015 (UTC)
No, it would work in both directions. coalmine is idiomatic, therefore all alternative forms of it are also includable, coal mine too. So it only expands on what's includable, it doesn't remove anything that's includable now. —CodeCat 01:10, 7 November 2015 (UTC)
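Read as a rule, the proposal says that a whole set of alternative spellings stands or falls together. A minimal sketch, in which the `is_idiomatic` predicate and the toy single-word test are purely hypothetical stand-ins for the real idiomaticity judgment:

```python
def forms_meeting_idiomaticity(forms, is_idiomatic):
    """Under the proposed rule, if any alternative form is idiomatic, all are."""
    forms = list(forms)
    return forms if any(is_idiomatic(form) for form in forms) else []


def single_word(form):
    """Toy stand-in: treat single-word spellings as idiomatic."""
    return " " not in form


lemma_forms = ["coal mine", "coalmine"]
accepted = forms_meeting_idiomaticity(lemma_forms, single_word)
```

Nothing is removed by the rule: a set of forms with no idiomatic member simply gains nothing from it.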
  • Support We need to expand CFI. CFI as it's written is too restrictive and is costing us readers. Purplebackpack89 02:11, 7 November 2015 (UTC)
    @Purplebackpack89 Any evidence that CFI "is costing us readers"? DCDuring TALK 03:58, 7 November 2015 (UTC)
    @DCDuring The way I figure it, people come to an online dictionary to learn the meaning of a particular word. If that particular word isn't on Wiktionary, they go someplace else for that word...and there's a strong chance they never come back to Wiktionary. If we don't have entries people are looking for, particularly if other dictionaries have them, we will lose readers. And we know that we're not the most-read online dictionary. Purplebackpack89 13:57, 7 November 2015 (UTC)
    @Purplebackpack89 I get the argument, but I'd like some evidence to support your bold, unqualified presentation of a theory as a fact. I wonder if it isn't this and other violations of Gricean maxims that makes so many of your contributions to talk pages so irritating to me. DCDuring TALK 02:06, 8 November 2015 (UTC)
    1. Did you read what I wrote below about readership?
    2. Do you have any evidence to contradict my theory? (Because I seriously doubt you do) Purplebackpack89 02:55, 8 November 2015 (UTC)
    He also doesn't have evidence to contradict any theory that the readers are Presbyterians who like to take long walks. And you don't have any evidence for the assumptions that you're using to derive your theory. It may make perfect sense to you, but it has no more to back it up than an out-and-out assertion. Chuck Entz (talk) 03:51, 8 November 2015 (UTC)
    Same with DCDuring's theory, if he has one. The bottom line, @DCDuring @Chuck Entz, is:
    1. What words we have and what words we don't should be based on how our readers think, but
    2. We don't know how our readers think

Purplebackpack89 04:08, 8 November 2015 (UTC)

  • I like the idea of clarifying the wording (as it stands, you could argue that COALMINE sanctions red car, since we have Redcar), and in general I think this is a pretty good way of explaining the logic behind the policy, and I wouldn't be opposed to loosening it up a bit (I'd be fine with bird song as an alternative form of birdsong, for instance). That said, this would allow lots of useless entries from contractions and acronyms (we don't need I am not a lawyer, even if we do have IANAL, and I don't think "it's" automatically implies we need "it is"). Smurrayinchester (talk) 09:19, 7 November 2015 (UTC)
    • If we don't want to allow those possibilities, we can change the rule. We probably want to limit it only to alternative spellings, since coal mine and coalmine are really the exact same thing, but it is and it's are further apart. —CodeCat 13:34, 7 November 2015 (UTC)
    Apparently we already have bird song. Huh. Smurrayinchester (talk) 09:20, 7 November 2015 (UTC)
  • COALMINE is clear and unambiguous whereas the proposal depends on what "alternative form" means. I see no example of what would be newly included under the proposed change. A minor point, "criterium" is a rare spelling; the usual spelling is "criterion". --Dan Polansky (talk) 09:37, 7 November 2015 (UTC)
To recap history: The COALMINE vote was based on one of the set of Pawley criteria that in his opinion supported finding an MWE idiomatic. We made it a sufficient criterion to eliminate some of the more pointless debates about inclusion, without regard to occasional instances of what might be considered errors of inclusion. It has succeeded in this regard, IMO.
I'd be a bit concerned that such very rare alternative forms like cole mine might be deemed includable should they be found attested. Making them automatically includable seems like another invitation to obsessive compulsive would-be contributors to add valueless bulk to Wiktionary. DCDuring TALK 13:06, 7 November 2015 (UTC)
If cole mine meets CFI (at least three attestations from independent sources over the space of more than a year) I see no reason not to include it. We're not paper, and our usefulness is not determined by dividing the number of valuable entries by the number of total entries. —Aɴɢʀ (talk) 13:50, 7 November 2015 (UTC)
Can you give an example of something that would be included under this rule but not under the current WT:COALMINE? I lean towards oppose for the same reasons as Dan. On WT:RFD Dixtosa suggested that free throw percentage met COALMINE because of FT%, which Wikitiki noted (and I agree) is not the case. (A lot of unidiomatic phrases have abbreviations.) Btw, if something like this is added, I suggest rephrasing to "...at least one of the alternative forms meets the idiomaticity criterion, then all of them should be considered to meet that criterion" (the term may still not be "includable", if it fails to meet other criteria). - -sche (discuss) 20:41, 7 November 2015 (UTC)
FWIW, @-sche, I think we need to explore having a CFI that allows us to include phrases, such as free throw percentage, that are commonly abbreviated. It seems odd that there are many instances where an abbreviation can be included, but the thing it's abbreviating can't. IMO both should be included. Would what you are suggesting achieve that? Purplebackpack89 20:56, 7 November 2015 (UTC)
As I said above, that would give us entries like I am not a lawyer and deformities, contusions, abrasions, punctures / penetrations, burns, lacerations, swelling, tenderness, instability, and crepitus. Would the extra effort it would take to verify these terms, clean and maintain them, and find some way to define them in a useful but non-encyclopedic way be the best way to spend our (often busy and rather tetchy) volunteers' time? Smurrayinchester (talk) 22:02, 7 November 2015 (UTC)
If people want to spend their time that way, they should be allowed to. It's not like expanding CFI automatically means a lot of work for all concerned. It doesn't mean people automatically have to go out and create entries; rather, they are allowed to do it at whatever pace they want. And if your question is "does it improve the project to have those entries", IMO it does. As I said above, we lose editors who are looking for things like "I am not a lawyer" and can't find it. Purplebackpack89 23:31, 7 November 2015 (UTC)
How do you know we lose them, or that people want to look up full sentences? DCDuring asked for evidence and you seemed to dodge the question. Equinox 23:32, 7 November 2015 (UTC)
How do you know we don't, Equinox? Follow my logic here: UrbanDictionary has more readers than we do. However, UrbanDictionary has (on average) lower-quality entries than we do. If UrbanDictionary has lower-quality entries than we do, how come they have more readers? The answer seems to me to be because UrbanDictionary has entries for words and phrases we don't, in particular whole-phrase and whole-sentence entries. And the conclusion from this seems to be that a lot of editorsreaders value quantity over quality; they would rather have a bad entry than no entry at all. Purplebackpack89 23:40, 7 November 2015 (UTC)
You have no logic. Because we value quantity over quality we aren't including more phrases? And the entire reason that urban dictionary has more users than us is because they have a CFI, never mind the host of other reasons that that could be the case? It couldn't be that urban dictionary was designed from the ground up to actually be usable, as opposed to trying to shoehorn a dictionary into software that was designed for creating an encyclopedia. That would be crazy. DTLHS (talk) 23:58, 7 November 2015 (UTC)
'Cuse me, did I say editors? I meant readers. Also, I think you have us and UD mixed up, we have a CFI and they don't. Also, I'm not sure I blame the lack of success of this project as being too much like Wikipedia; if anything, I think we're not enough like Wikipedia. After all, Wikipedia is like Wikipedia and they're viewed more than UD. I'd also reiterate that CFI is something that makes us less usable. @DTLHS, it's never made sense to me that somehow we'd have more readers with fewer entries, at least not at the point in the project we are now. Worrying only about quality at this point in our project will not get us increased readership, or increased editorship. Purplebackpack89 00:06, 8 November 2015 (UTC)
Statistics just tell you how many people visit either site, not why they go there. UD gets a lot of its readers by being outrageous, provocative, and entertaining. People make up words and definitions there for the fun of it, or for the attention, and that's more entertaining than information about plain old ordinary usage.
We may be losing readers of the same sort as the TV viewers that watch car chases and reality shows, but that's unavoidable if you want to be a reference work that focuses on accuracy and reliability- and I, for one, don't miss them. Chuck Entz (talk) 01:45, 8 November 2015 (UTC)
I fully agree with Chuck. I like us to be sensitive to serious learners at virtually any level, but not to those whom we often end up blocking as vandals because they insert UD-quality entries and definitions. DCDuring TALK 01:59, 8 November 2015 (UTC)
OK, where's the evidence that most of the people who choose to read (yes, read, not edit) UD over Wiktionary are sensationalist vandals? That seems like a quite extraordinary claim, much more so than the claims I've advanced in this thread. Do you really believe that there's no middle ground between sensationalist vandals and you yourself, i.e. are there not people who don't create or read the most questionable of UD entries, but would like to see more two-word entries at Wiktionary, because they don't know what those words mean and would like to? Also, your attitude toward UD-but-not-Wiktionary users is quite elitist...it seems you want a well-written dictionary, and you care not if anybody reads it. Isn't having a well-written dictionary pointless if nobody reads it? Purplebackpack89 04:08, 8 November 2015 (UTC)
It looks to me like he was referring to a certain group of undesirable editors as an example of the sort of user he wouldn't be concerned with losing, not making blanket assertions about everybody. At any rate, the answer is, as usual, somewhere in between: if we try to be just like Urban Dictionary, we'll lose out to them, anyway, since they're better at being Urban Dictionary than we are. Even if we are losing some readers, we're also retaining some who would be turned off by lowered standards and/or pandering to Urban Dictionary constituencies. I also question how many people would be looking up "I am not a lawyer" that would be satisfied with a wordier paraphrase as a definition- after all, the best definition of "I am not a lawyer" is "literally: 'I am not a lawyer'". A lot of our proverb entries have this problem: the proverb is usually the most concise and efficient way to express the idea, so the definition tends to be much more wordy and technical-sounding, often to the point of being unreadable sludge. Chuck Entz (talk) 04:47, 8 November 2015 (UTC)
I think you and DC overestimate the amount of people who prefer no definition to a poor definition, and underestimate those who prefer a poor definition to no definition. Again, I say, "If there are so many people who care about quality of definitions out there, then why do more people use UD than us?" I don't think that UD's success can be explained away by its different format (which, IMO, is MORE confusing than ours) or by the fact that some of its entries are jocular. A great deal of it has to be due to its lack of an arbitrary, overly-strict CFI. Purplebackpack89 05:16, 8 November 2015 (UTC)
Also, on your I am not a lawyer argument, I'm looking at CFI/RFD from a different angle than you are. You look at the things you think we don't need and say CFI is fine. I look at the things we've deleted or come way too close to deleting (television show was deleted for over a year, field goal percentage will probably get deleted next week) and am appalled. And if you think I'm pushing for lots and lots of bad definitions that have to be kept, remember that we'll still have RFV. Purplebackpack89 12:04, 8 November 2015 (UTC)
COALMINE aside, I agree that “idiomaticity shouldn't depend on orthographic representation”. Our non-following of this idea leads to some problems:
  • words written without a space are considered idiomatic by default. While this works well enough for English (with some caveats), it falls short for languages with more rigid orthographic standards, such as German (compounds written together, no matter how unidiomatic), Spanish (clitic pronouns written together), Latin (-que) as well as for ancient languages that used scriptio continua.
  • words that would otherwise be citable fail RFV. I specifically remember two cases: years ago I was trying to find citations for Copenhagenisation/-ization, I found that one spelling had one cite and the other had two; the other case was a slang two-word compound for clitoris that was being RFVed (I don’t remember the exact word), I found two citations using XY, two using X-Y and two using X Y. The word failed RFV because each spelling had fewer than three cites, even though the word itself had 6.
About Urban Dictionary: even if Wiktionary had a readership of zero, that would be a situation preferable to becoming more like UD. They can market themselves however they like, but UD is as much a dictionary as a porn website is an educational website about anatomy. — Ungoliant (falai) 18:16, 8 November 2015 (UTC)
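To illustrate the second bullet above, here is a minimal Python sketch of how counting citations per spelling versus pooled across spellings changes the RFV outcome. The three-citation threshold is the one mentioned earlier in this thread; the function names and the placeholder spellings are hypothetical, and the sketch deliberately ignores the independence and time-span requirements.

```python
# Hypothetical sketch: per-spelling vs. pooled attestation counting.
# Assumption: a term needs at least three citations to pass RFV
# (independence and the "more than a year" span are not modelled here).
REQUIRED_CITES = 3


def passes_per_spelling(cites):
    """Check each spelling on its own, as in the cases described above."""
    return {spelling: count >= REQUIRED_CITES for spelling, count in cites.items()}


def passes_pooled(cites):
    """Check the word as a whole, pooling citations from all spellings."""
    return sum(cites.values()) >= REQUIRED_CITES


# The two-word compound from the example: two citations for each spelling.
cites = {"XY": 2, "X-Y": 2, "X Y": 2}

per_spelling = passes_per_spelling(cites)  # every spelling fails on its own
pooled = passes_pooled(cites)              # the word as a whole has six cites
```

Under per-spelling counting every form fails, while pooled counting lets the word pass; which of the two is appropriate is exactly what is being debated here.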
If Wiktionary's readership is zero, we've all wasted our time. And there are definitions on UD that not only could easily pass RfD and RfV, they are better-worded than some of the definitions we have. People around here paint UD with far too broad a brush. Purplebackpack89 23:33, 8 November 2015 (UTC)
Talk:gaplapper is similar to what you're talking about. - -sche (discuss) 07:04, 9 November 2015 (UTC)
Can't support the proposal in its current form. There have been too many counter examples given to the proposed rule, such as "free throw percentage", "cole mine", DCAPBLSTIC, and the lawyer one. Perhaps CFI could be expanded in a similar but more restrictive way to the proposal, such as by explicitly excluding acronym expansions and very rare forms (or making them case-by-case), but there clearly is far from unanimous support for their inclusion. Also, I would not be sad if no one ever engaged in a back-and-forth about what is or is not "costing us readers" ever again. Please just stop. —Pengo (talk) 07:19, 9 November 2015 (UTC)
I didn't read this proposal as including acronym forms. bd2412 T 15:00, 9 November 2015 (UTC)
Pengo, it's important to have a strong theoretical basis when voting in proposals like these. There's nothing wrong with saying you want policy changed to increase readership; there's nothing really wrong with saying you want policy changed to increase quality either. Purplebackpack89 15:34, 9 November 2015 (UTC)
Oppose since we do not have a clear definition of "alternative form" and this would then lead to the inclusion of all kinds of things that should not be included. --WikiTiki89 15:03, 9 November 2015 (UTC)
@Wikitiki89, your comments beg the questions "what things shouldn't be included", and "why shouldn't they?" Purplebackpack89 15:34, 9 November 2015 (UTC)

Terms with audio links and IPA pronunciation -- where should they go in the category tree?

There are now two category types, e.g. Category:Russian terms with audio links and Category:Russian terms with IPA pronunciation, that are currently placed under "entry maintenance", but this doesn't seem right, since that mostly concerns mistakes that ought to be corrected. Where should they go? Should we create a new category tree section for pronunciation? Benwing2 (talk) 07:36, 8 November 2015 (UTC)

Are phrases not listed in lemmas?

Most categories are listed under lemmas, but I can't find one for phrases. Are they hidden somewhere? [11] Donnanz (talk) 13:16, 8 November 2015 (UTC)

Added. — Ungoliant (falai) 17:58, 8 November 2015 (UTC)
Thanks a lot. It only appears in the "subcategories" list though, and not in the list at the top. I checked English lemmas as well. Donnanz (talk) 21:08, 8 November 2015 (UTC)
  • Obviously done without further comment. Donnanz (talk) 12:47, 10 November 2015 (UTC)

Community Wishlist Survey

Hi everyone!

The Community Tech team at the Wikimedia Foundation is focused on building improved curation and moderation tools for experienced Wikimedia contributors. We're now starting a Community Wishlist Survey to find the most useful projects that we can work on.

For phase 1 of the survey, we're inviting all active contributors to submit brief proposals, explaining the project that you'd like us to work on, and why it's important. Phase 1 will last for 2 weeks. In phase 2, we'll ask you to vote on the proposals. Afterwards, we'll analyze the top 10 proposals and create a prioritized wishlist.

While most of this process will be conducted in English, we're inviting people from any Wikimedia wiki to submit proposals. We'll also invite volunteer translators to help translate proposals into English.

Your proposal should include: the problem that you want to solve, who would benefit, and a proposed solution, if you have one. You can submit your proposal on the Community Wishlist Survey page, using the entry field and the big blue button. We will be accepting proposals for 2 weeks, ending on November 23.

We're looking forward to hearing your ideas!

MediaWiki message delivery (talk) 21:30, 9 November 2015 (UTC)

Proposal: Disallow votes to be created by someone who intends to vote oppose

Wiktionary:Votes/pl-2015-07/Nesting inflected form definition lines was created by User:Dan Polansky who immediately voted oppose. Although he claimed that "the supporters were given enough time to tweak the vote", the vote was not crafted in a way that enough editors would have supported it. Several of the voters who voted oppose or abstain expressed that they would have supported a similar but less restrictive vote. Thus, the measure was not given a fair chance at succeeding. Therefore, I would like to propose that votes must be written by an editor or editors who intend to support the vote and thus should be willing to put enough effort into its phrasing to make it passable. --WikiTiki89 22:35, 9 November 2015 (UTC)

I wonder whether there is any precedent in real-world political systems for banning this kind of voting strategy. Equinox 22:38, 9 November 2015 (UTC)
Long ago when I was in Middle School, we were voting for class representatives and one student asked if they were allowed to vote for themselves. The teacher replied that on the contrary if they do not vote for themselves then they should not be running. I'm sure someone more into politics than I am would be able to give more concrete examples from real politics. --WikiTiki89 22:51, 9 November 2015 (UTC)
  • This is votes only? It doesn't apply to RfDs and RfVs? Purplebackpack89 23:05, 9 November 2015 (UTC)
    Votes only. We already have a similar unofficial policy at RFD that you should not nominate something you intend to vote keep for (of course you're allowed to change your mind after the fact); and there is no voting involved at RFV. --WikiTiki89 23:11, 9 November 2015 (UTC)
    And we don't enforce it lol. I've seen a number of "courtesy" or "administrative" RfDs and RfVs that were filled out by editors after somebody tagged an entry for RfD or RfV but never filled out the RfD or RfV. Purplebackpack89 23:14, 9 November 2015 (UTC)
    That's different, because whoever tagged the entry is effectively the one nominating it. But this discussion is not meant to concern RFD, so let's please stay on topic. --WikiTiki89 23:16, 9 November 2015 (UTC)
    • People raise Rf[DV]s all the time because they honestly don't think a term is attest(ed|able), but are giving people a chance to prove them wrong. Indeed, that's kind of the whole point of them. This proposal, on the other hand... I can see the point being made, that people shouldn't raise a vote in order to make it fail in order to create a precedent they like, or to make a political point, or to grief their opponents. (The Australian Republican Referendum comes to mind, where exactly that happened: it was set up by people who wanted it to fail, in such a way to force it to fail, when most people actually supported the idea.) But then, making the legality of a proposal dependent on the intent of the proposer is a really, really bad idea. Because then we'll get counter-griefers claiming that "the proposer wasn't doing this in good faith: this vote is void" if it looks like going against them. Will you nullify a vote if the proposer didn't write it well enough? If they don't defend it well enough? (How do you decide "enough"?) If they change their mind? --Catsidhe (verba, facta) 23:19, 9 November 2015 (UTC)
      • This rule does not have to affect the outcome of votes. We can start with having this be an unenforceable, but official, request to anyone creating a vote. If it remains a problem we can discuss how we can penalize creators of such votes. Keep in mind though, I am not assuming any ill-will on the part of creators of such votes. Even if they have good intentions, they do not have the same incentives to make reasonable proposals or compromises. --WikiTiki89 00:11, 10 November 2015 (UTC)
I agree it is problematic, but I'm opposed to making a hard rule against it. What if the proposer changes his or her mind? What if the proposal is simply written in the negative saying we "shouldn't" do something, either because it makes sense to do so or to avoid this rule? What if the question is very simple and already discussed elsewhere? I think it might make a good guideline to suggest proponents of a change be the ones to put it forward, as should have been done in this case, but I don't think it would make sense to make a hard rule about it without some very careful consideration. Pengo (talk) 01:40, 10 November 2015 (UTC)
Where did I say above that the proposer can't change his mind? --WikiTiki89 01:57, 10 November 2015 (UTC)
@Wikitiki89: If the proposer can change their mind, then they can get around this rule by proposing something, saying they're supporting it, and then changing their mind shortly afterwards. It might seem silly, but it demonstrates a problem with this rule. Pengo (talk) 09:32, 11 November 2015 (UTC)
That's assuming bad faith on the part of the proposer. I don't see it as a huge problem, since most editors involved in votes generally act in good faith. --WikiTiki89 15:24, 11 November 2015 (UTC)
I think it's worrisome, but I don't know if it's really a big deal. The current vote in question makes the criteria rather narrow, yes, but that means the opposition votes are equally narrow in scope. As it stands now, we could just change the separator in the template from ; to something else, to comply with the vote. The vote outcome only indicates that people don't want the particular combination Dan proposed, but it says nothing about any others. —CodeCat 01:53, 10 November 2015 (UTC)
It wastes people's time and it misleads people who look back at it in the future. It's better to have the supporters craft a better proposal to begin with. --WikiTiki89 01:57, 10 November 2015 (UTC)
  • This is a generally good idea, but it could be abused if made a rule. I think at present we really only have one editor who does this, and making legislation against him individually is absurdly heavy-handed. But as an informal expectation, I agree that I'd prefer if all editors were to avoid doing this. —Μετάknowledgediscuss/deeds 04:19, 10 November 2015 (UTC)
  • I naturally oppose that "votes must be written by an editor or editors who intend to support the vote". I have never intentionally made a bad wording for a vote. Actually, specific defects in the wordings that I have created have not been stated, only that, in principle, as an opposer, I have an objectively existing interest in creating a bad wording. People can still create other votes, with their preferred wording. Even now. They can make wording improvements. By creating votes for CodeCat's proposals and for CodeCat's undiscussed (not even proposed) changes visible in the mainspace, I make sure non-consensual volume changes by CodeCat are limited at least to an extent. Wiktionary:Votes/pl-2015-07/Nesting inflected form definition lines is a great example of such a vote; I did not really expect so many opposes, and we now have clear objective evidence of the scope of support for that proposal. CodeCat proceeded to make changes anyway, e.g. in diff, evidence of the need to determine the scope of support in a transparent manner enabled by the vote. By the way, I support using my continuing extension algorithm on Wiktionary:Votes/pl-2015-07/Nesting inflected form definition lines to give it a chance rather than closing it as failed. --Dan Polansky (talk) 09:11, 14 November 2015 (UTC)
    @Dan Polansky: Of course you wouldn't intentionally provide bad wording for a vote. My whole point is that if you don't support the vote, then you are not in a good position to provide the best wording. Even unintentionally, your wording will most likely be worse. --WikiTiki89 16:08, 14 November 2015 (UTC)
    @Wikitiki89: "your wording will most likely be worse": I don't think that to be true. My wording can turn out to be worse, but not "most likely". Whether my wording will be better depends not only on my potential bias but also on my drafting skills. For the discussed vote, no specific proposal of alternative wording has been made. We really have no tangible material or evidence to support the hypothesis that I create poorly drafted votes that fail not because of lack of support but rather because of poor drafting. In fact, Wiktionary:Votes/pl-2015-08/Templatizing usage examples looks like passing despite my opposition. --Dan Polansky (talk) 16:26, 14 November 2015 (UTC)
    @Dan Polansky: Good point about Wiktionary:Votes/pl-2015-08/Templatizing usage examples, it does look passable.
    That said, I am worried about the "Nested inflection lines" vote. I am inclined to agree with @Wikitiki89 on the statement: "My whole point is that if you don't support the vote, then you are not in a good position to provide the best wording." At diff, Dan Polansky argued that the supporters had enough time to improve the vote but they failed to do so. I think this further proves Wikitiki89's point: If someone is planning to create a vote with the intention of opposing it, then help from supporters of the vote is a requirement, otherwise it is likely that the vote is not going to have the best wording.
    Suggestion: @Dan Polansky, if you want to create a vote with the intention of preventing something from happening, IMO it would be okay being straightforward and proposing just that. See example below.
    "Voting on:
    • Disallowing the mass creation of nested inflected forms, until clear consensus is reached on the acceptance of nested inflected forms, and also what exactly should be the wikicode for that, and what exactly should appear on the entries as a result."
    I think this way you could express your points of view more clearly and openly. The fact that you oppose edits being done en masse without consensus would be the real rationale for the creation of your vote, which you could further elaborate in your words, as you have done to justify the votes you created with the intention of voting oppose. --Daniel Carrero (talk) 16:50, 14 November 2015 (UTC)
    • That is not an acceptable proposal. The whole point is that I do not need consensus to prevent non-consensual changes. The person making a volume deviation from status quo ante needs to gain consensus. When that person does not create the required vote, and does not provide wording input to a vote created by someone else, I end up with little option left other than creating a vote that I oppose. The proposal plays into CodeCat's non-consensual-volume-change cards in ways that really make it not workable, and that contradict the consensus principle. --Dan Polansky (talk) 16:57, 14 November 2015 (UTC)
    • Another thing is that if none of the supporters are willing to draft a vote themselves, then the vote is unlikely to have enough supporters to pass. If a vote is unlikely to pass, then it is a waste of time. I also agree with Daniel Carrero that a potential solution is to reword the vote in the negative. A negative vote that passes is entirely different from a positive vote that fails (and vice versa). --WikiTiki89 16:58, 14 November 2015 (UTC)
      • I really think this discussion is trying to solve a non-existent problem. Wiktionary:Votes/pl-2015-07/Nesting inflected form definition lines failed. Now its supporters should pull up their sleeves and draft a proposal that can pass. --Dan Polansky (talk) 17:01, 14 November 2015 (UTC)
        • That's exactly the problem, it wasted everyone's time because it failed and accomplished nothing. --WikiTiki89 17:11, 14 November 2015 (UTC)
          • It accomplished expansion of our knowledge, which was its main purpose. We now know that people do not support the proposal. People also left specific comments on why they do not support it, from which opposition from other similar proposals can be inferred. --Dan Polansky (talk) 17:26, 14 November 2015 (UTC)
    Let me register my disappointment with the lack of disciplinary action against CodeCat, for the likes of diff. Under the regular circumstances of the rule by consensus, there really would be no need for me to create such votes. The votes are a slightly unusual measure to deal with the problem of non-consensual volume changes that for some reason that I do not know have not been dealt with by the bureaucrats and admins. I believe CodeCat's actions exemplified by diff generally constitute a blockable offence. --Dan Polansky (talk) 09:15, 14 November 2015 (UTC)
  • Let me note that the wording is not so narrow as CodeCat above suggested. The specific wikicode format is not part of the voted proposal, since the vote text says "possibly like" before it gives the example, emphasis on "possibly". What was part of the proposal was "creating the new formatting by means of a single template invocation instead of multiple ones as before". --Dan Polansky (talk) 10:06, 14 November 2015 (UTC)
  • I don't think we need to ban this. Discourage it using dialogue, sure, outright ban, no. Renard Migrant (talk) 16:55, 14 November 2015 (UTC)

Closing RfDs...we need floors

At present, we have no floors for when an RfD is closed. I'd like to propose the following:

  • If there are three votes of unanimous opinion, or five votes of unanimous opinion with only one dissent, the RfD can be closed immediately.
  • If an RfD has gone on at least a week and has at least five votes, it may be closed if 65% or more of the votes are of the same opinion.
  • Any RfD can be closed after a month, regardless of the number of votes or how the votes are distributed.

Of course, we could go longer if we wanted, but we wouldn't have to. This would have the added benefit of de-cluttering RfD of discussions that haven't been commented on in months, or have clear outcomes. Purplebackpack89 23:41, 9 November 2015 (UTC)
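For readers who want to see how the three floors above would interact, here is a rough Python sketch. It is only one possible reading of the proposal, not an agreed procedure: the function name may_close is hypothetical, "a month" is assumed to mean 30 days, "three votes of unanimous opinion" is read as at least three agreeing votes with no dissent, and "five votes ... with only one dissent" as at least five agreeing votes with exactly one dissent.

```python
from collections import Counter

# Assumptions (not stated in the proposal): a week is 7 days, a month is 30 days.
WEEK_DAYS = 7
MONTH_DAYS = 30


def may_close(votes, days_open):
    """Return True if an RfD could be closed under the proposed floors.

    votes     -- list of opinions, e.g. ["delete", "delete", "keep"]
    days_open -- how many days the discussion has been open
    """
    # Floor 3: any RfD can be closed after a month, whatever the votes are.
    if days_open >= MONTH_DAYS:
        return True
    total = len(votes)
    if total == 0:
        return False
    # Size of the largest group of agreeing votes, and everyone else.
    majority = Counter(votes).most_common(1)[0][1]
    dissent = total - majority
    # Floor 1: immediate close on 3+ unanimous votes, or 5+ with one dissent.
    if (majority >= 3 and dissent == 0) or (majority >= 5 and dissent == 1):
        return True
    # Floor 2: after a week, with at least five votes and a 65% majority.
    # Integer arithmetic avoids floating-point rounding at the threshold.
    if days_open >= WEEK_DAYS and total >= 5 and majority * 100 >= 65 * total:
        return True
    return False
```

For example, may_close(["delete"] * 3, 0) is True under this reading, which is the same-day closing that the first floor permits; a 4-1 discussion would instead have to wait out the week.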

Oppose. People commenting on RFDs should be encouraged to take their time to analyse the situation and form an opinion. I mean, it’s one thing to try to speed up the bureaucracy, but your suggestions (especially the first one) will mean that a discussion can be closed within minutes of being opened, or after a new point is posted. The practical result of this is that people will tend to rush to a conclusion and skip careful reading and research. — Ungoliant (falai) 00:02, 10 November 2015 (UTC)
Ungoliant, would you like me to up the number of votes on the first one, or the percentage on the second? Also, I doubt that the result of this will be lots of deletion discussions being closed very quickly...after all, half of deletion discussions don't even get five votes in the first week. If a discussion is 5-0 or 5-1 one way, is it likely to go the other way? No! For one, you'd need about 13-14 total votes to get them to go the other way, and 80-90% of the RfDs on this project don't even have ten votes. There's no point in prolonging the inevitable. Purplebackpack89 00:08, 10 November 2015 (UTC)
Furthermore, the "careful reading and research" should take place before a person votes, so if something is closed quickly, it's because the people who voted came to a decision to vote quickly. And in many RfD cases, people can and do vote quickly, because questions of "research" generally end up at RfV rather than RfD. Purplebackpack89 00:14, 10 November 2015 (UTC)
It’s no wonder you think that, since rushing to a conclusion and skipping careful reading and research is what you already do anyway. — Ungoliant (falai) 00:16, 10 November 2015 (UTC)
Oppose. I would favor a rule stating that an RFD discussion cannot be closed if there has been a new vote or a new contribution to the discussion (barring filibustering) in the past seven days. --WikiTiki89 00:25, 10 November 2015 (UTC)
But, Wikitiki, that means that if seven people vote delete on an article in the first 72 hours, you still can't close it until a week after the last vote. Why is it so necessary that a discussion be dead for a week? Isn't how long the discussion is open more relevant than how long it's dead? Purplebackpack89 00:29, 10 November 2015 (UTC)
Partly to make sure you don't get a team of editors to quickly put in seven delete votes, close the discussion, and delete the entry before anyone else notices. If it's been dead for a week, there is a good chance no one else has anything to say. --WikiTiki89 00:32, 10 November 2015 (UTC)
Wikitiki, that particular example ignores the practicalities of the situation: a) Seven votes will carry the vast, vast majority of RfD discussions, regardless of length; b) considering the make-up of RfD participants, a team of seven editors is bound to include a sysop who could delete the entry whenever anyway. Purplebackpack89 00:38, 10 November 2015 (UTC)
It's better to be on the safe side. It's not harmful to let a 7-0 lead sit there for a week. It is, however, harmful if the discussion is closed before someone who wanted to vote got a chance to. Keep in mind also that the ratio of votes also matters, there is a significant difference between a word deleted 7-3 and one deleted 10-0. --WikiTiki89 00:44, 10 November 2015 (UTC)
That's why you let them run a week unless there's a huge pile-on consensus. The way to make sure people have time to vote is to let it run a week, not let it be dead a week. Purplebackpack89 01:03, 10 November 2015 (UTC)
But you also have to give people who already voted a chance to read any new information or arguments and rethink their position. --WikiTiki89 01:35, 10 November 2015 (UTC)
  • Oppose. There is no major problem that this proposal solves. DCDuring TALK 04:12, 10 November 2015 (UTC)
  • Oppose per Ungoliant and especially DCDuring. —Μετάknowledgediscuss/deeds 04:15, 10 November 2015 (UTC)
  • Oppose. This proposal assumes that editors who might care about RFDs are always online. Some of us have to step away for a while due to circumstances in real life. Closing anything immediately risks shutting out and disenfranchising any editor who cannot be on Wiktionary all the time.
Also, per DCDuring, this is a solution in search of a problem. ‑‑ Eiríkr Útlendi │Tala við mig 18:17, 10 November 2015 (UTC)
  • Oppose. There is no great harm done by letting the discussion remain open for a few extra days, even after a unanimous consensus has been reached. bd2412 T 19:04, 11 November 2015 (UTC)

WT:RFM#Continuation of #Category:en:Names into Category:English names

I'm notifying users of this discussion, since it's not so trivial and needs input from many people. —CodeCat 00:48, 10 November 2015 (UTC)

German "du contractions" (haste, fährste...)

What should we do about colloquial German contractions like hast du > haste, where the sound of the "du" gets lost in the "-st" and the whole thing is reduced to a schwa? It's a fairly universal process, but of course it can only occur with verbs that are used in a colloquial context so not all -ste forms will be attested. I've added a sense at haste, and created a page at fährste, but I'd like to get some opinions from other German editors about how far to go and how to format the entries before rolling this out any further. Smurrayinchester (talk) 10:42, 10 November 2015 (UTC)

Yiddish has a similar problem האָסט דו ‎(host du) > האָסטו ‎(hostu). We have an entry for the suffix ־ו ‎(-u) itself, but I don't really think it is worth creating an entry for each verb with this suffix. --WikiTiki89 16:05, 10 November 2015 (UTC)
Brabantian Dutch also has this contraction, though it's etymologically different (it's the 2nd person plural in origin). I think there should be entries for them if they are attested. —CodeCat 18:42, 10 November 2015 (UTC)
Maybe also to consider: es contractions, like geht es -> geht's or gehts. - 19:13, 10 November 2015 (UTC)

Wikimania 2016 scholarships ambassadors needed[edit]

Hello! Wikimania 2016 scholarships will soon be open; by the end of the week we'll form the committee and we need your help, see Scholarship committee for details.

If you want to carefully review nearly a thousand applications in January, you might be a perfect committee member. Otherwise, you can volunteer as "ambassador": you will observe all the committee activities, ensure that people from your language or project manage to apply for a scholarship, translate scholarship applications written in your language to English and so on. Ambassadors are allowed to ask for a scholarship, unlike committee members.

Wikimania 2016 scholarships subteam 10:47, 10 November 2015 (UTC)

Wiktionary:Votes/pl-2015-10/Entry name section - started[edit]

FYI: Wiktionary:Votes/pl-2015-10/Entry name section started today. --Daniel Carrero (talk) 03:42, 11 November 2015 (UTC)

I looked through it. It looks mostly like formalizing existing practice. Is there anything controversial or anything being proposed that isn't existing practice? Benwing2 (talk) 04:58, 11 November 2015 (UTC)
You're right, that was what I had in mind when creating the vote: it's about formalizing existing practice. I believe there's nothing controversial there; also, I'm not proposing anything new. --Daniel Carrero (talk) 05:41, 11 November 2015 (UTC)

rfi categories[edit]

I edited {{rfi}} to allow categorization into language-specific categories.

But there are 6,597 entries using {{rfi}} without a language code. Those entries are currently categorized in Category:Entries needing images by language. Can a bot add the language code to all entries, please? Thank you. --Daniel Carrero (talk) 00:13, 12 November 2015 (UTC)

P.S.: This template has lots of redirects: {{reqphoto}}, {{Reqphoto}}, {{rfimage}}, {{rfdrawing}} and {{rfphoto}}. If a bot can add the language code to all these entries as I requested, I also suggest changing all these to {{rfi}} for consistency. --Daniel Carrero (talk) 00:55, 12 November 2015 (UTC)
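For whoever picks up the bot request, here is a minimal sketch in Python of the rewrite being asked for: track the current language section, rename the redirects to {{rfi}}, and add the section's language code. Several things here are assumptions, not anything settled above: the parameter name `lang=`, the tiny `LANG_CODES` table, and the function name `add_language_codes` are all hypothetical, and a real bot would need the full language list and would have to cope with nested templates.

```python
import re

# Redirect names listed in the request above; all are rewritten to "rfi".
REDIRECTS = ("rfi", "reqphoto", "Reqphoto", "rfimage", "rfdrawing", "rfphoto")

# Hypothetical subset of language names and codes, for illustration only.
LANG_CODES = {"English": "en", "Finnish": "fi", "Portuguese": "pt"}

# Matches level-2 headers such as "==English==" but not "===Noun===".
HEADER_RE = re.compile(r"^==\s*([^=]+?)\s*==\s*$")

# Matches a request template with optional simple parameters (no nested templates).
TEMPLATE_RE = re.compile(r"\{\{\s*(" + "|".join(REDIRECTS) + r")\s*(\|[^{}]*)?\}\}")


def add_language_codes(wikitext):
    """Rename redirects to {{rfi}} and add the language code of the enclosing section."""
    out = []
    code = None  # language code of the section currently being read
    for line in wikitext.split("\n"):
        header = HEADER_RE.match(line)
        if header:
            code = LANG_CODES.get(header.group(1))

        def fix(match, code=code):
            params = match.group(2) or ""
            # Leave the parameters alone if the language is unknown or already given.
            if code is None or "lang=" in params:
                return "{{rfi" + params + "}}"
            # Assumption: the code is passed as a named "lang=" parameter.
            return "{{rfi|lang=" + code + params + "}}"

        out.append(TEMPLATE_RE.sub(fix, line))
    return "\n".join(out)
```

Requests in a section whose language is not in the table are only renamed, not given a code, so they would remain in the uncategorized list for manual review.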
Why is it important that these are categorized by language? An image doesn't have any intrinsic language (unless it contains text). DTLHS (talk) 01:01, 12 November 2015 (UTC)
I'm with DTLHS. Sometimes, adding language codes to templates is more trouble than it's worth, particularly in this case, since pictures can be used in any language. Purplebackpack89 01:07, 12 November 2015 (UTC)
If we had some Commons-image-mining tools AND Commons had some data that indicated the language a diagram was worded in or the language of text appearing in a photo, then the language code might be useful. Until both conditions are met the language code seems useless. By the time those conditions are met the language code may seem like a quaint relic. DCDuring TALK 02:40, 12 November 2015 (UTC)
An image doesn't have an intrinsic language, but the sections they appear in do. These requests aren't independent of the language sections they appear in. Renard Migrant (talk) 17:59, 12 November 2015 (UTC)
One image may contain text in multiple languages, or other complicated scenarios (e.g. a sign that happens to say the same thing in two local languages, so isn't clearly either of them). This feels like categorisation for the sake of it. Equinox 18:00, 12 November 2015 (UTC)
This isn't about what languages are in the image, but whether an editor has the knowledge to judge whether an image is fitting for a given word. —CodeCat 18:10, 12 November 2015 (UTC)
  • I'm having some difficulty understanding how the language code would help in any specific current situation. Are there some specific examples of how this would help? DCDuring TALK 19:27, 12 November 2015 (UTC)
    • Would you trust yourself in adding images to random words in, say, Finnish? What about a rare language like Ainu? If you didn't speak the language, how would you know the image was appropriate? —CodeCat 22:02, 12 November 2015 (UTC)
      • Also, if the Finnish entries needing images were grouped together, they would draw the attention of Finnish editors. I could try improving our Portuguese entries by adding images to Portuguese entries needing them, but our current list of entries needing images is mostly a mess of Translingual taxonomic names. --Daniel Carrero (talk) 22:21, 12 November 2015 (UTC)
      • If the English gloss were accurate, I would. If the gloss were a polysemic English word or an obsolete, archaic, or rare word, I would not make the assumption that the gloss is accurate. Similarly if there were grammar mistakes or poor diction in the English gloss. If our English glosses are of such unreliable quality that these screens are not sufficient, then obviously I should not trust myself. Are these screens insufficient?
      • I look forward to the increased efforts to add images to FL L2 sections that will be unleashed by this categorization. I would have looked forward even more to improvement of the reliable quality of English glosses of FL entries. DCDuring TALK 23:42, 12 November 2015 (UTC)
        • If you don't speak the language, it's hard to get an idea of the "real" meaning of a word, even with an accurate definition. Therefore, someone who is familiar with the language would be able to choose a more fitting image. --WikiTiki89 23:51, 12 November 2015 (UTC)
          A bad image could even give the wrong impression about the meaning. —CodeCat 23:54, 12 November 2015 (UTC)
          Also, a native speaker of Finnish navigating Category:Finnish entries needing images would understand most if not all the contents of the category just by looking at the entries listed. She could say: "I don't believe they don't have an image for (thing) yet!"
          With all the 6,500+ requests lumped together, one has to ignore the words in languages they don't know. (The alternative of looking for an English gloss is being discussed above and I don't have anything to add.) I, for one, don't speak Finnish. --Daniel Carrero (talk) 13:24, 13 November 2015 (UTC)

Wiktionary:Votes/2015-11/Language-specific rfi categories --Daniel Carrero (talk) 14:23, 26 November 2015 (UTC)

Appendix:Article 1 of the Universal Declaration of Human Rights[edit]

I just made this - what do you all think? It would certainly be interesting to see how many translations we can get. Aryamanarora (talk) 02:24, 12 November 2015 (UTC)

Sources: I have ported over versions from Wikimedia Foundation projects. Many more can be found here: http://omniglot.com/udhr/. Can these be imported? —Justin (koavf)TCM 03:04, 12 November 2015 (UTC)
Why can we not just leave these on Wikisource? Do they need to be here? —CodeCat 03:12, 12 November 2015 (UTC)
Yes, leave them where they are. I don't see that they have any function here. SemperBlotto (talk) 08:57, 12 November 2015 (UTC)
I agree with CodeCat and SemperBlotto. Wikisource already has the declaration in many languages. English: Universal Declaration of Human Rights, French: Déclaration universelle des Droits de l’Homme. I see no reason why anyone would look for that information here to begin with. --Daniel Carrero (talk) 09:03, 12 November 2015 (UTC)
RFD it is then! Renard Migrant (talk) 17:57, 12 November 2015 (UTC)
I'm also not quite sure why we need this. But I did fix some errors in it. --WikiTiki89 18:19, 12 November 2015 (UTC)
I agree. Doesn't make sense as an appendix to a dictionary. Equinox 18:26, 12 November 2015 (UTC)
Incidentally, the Even-Shoshan Dictionary of Hebrew does have the Israeli Declaration of Independence in the back. Although, since it is given in two side-by-side versions one with vowel points and the other in plene spelling, it is probably just there to illustrate the differences between vocalized and plene writing. --WikiTiki89 18:34, 12 November 2015 (UTC)
  • I also think we should delete it. Is this enough of a snowball to make it happen? —Μετάknowledgediscuss/deeds 18:30, 12 November 2015 (UTC)
Deleted. - -sche (discuss) 21:01, 12 November 2015 (UTC)
  • I support this deletion after the fact. —Aɴɢʀ (talk) 12:09, 13 November 2015 (UTC)

Unnormalized Old French forms[edit]

I'm starting to get interested in facsimiles of the original manuscripts. It's occurred to me dictionaries (that I know of) don't cover manuscript forms and it could be something we do as a USP. If you don't know the basics of French spelling you might not want to read any further. Anyway. A good starting point would be Talk:aprés where the Old French section was unilaterally removed in 2010 because Old French doesn't have an acute accent.

It does and it doesn't; scholars seem to universally use é to represent /e/ at the end of the word or when it is the penultimate letter and the final letter is s. I'd argue that these forms are both real and Old French. I have at least four books from my university study that I could use as evidence and take pictures of if anyone wants me to. Also on Google Books

There seem to be no other accents that are consistently used. For example you can find seürté here (a third of the way down, use Ctrl + F) but Godefroy lists it as seurté as does the Anglo-Norman On-Line Hub. Other than that, you do get the odd transcription using à and the odd one using grave accents, but these are rare.

So, does anyone have an opinion on the following? I've been updating WT:About Old French but it feels odd doing it on my own. The options I see are: allow; disallow; list as alternative forms but don't create; or list in alternative forms without links (just plain text, no square brackets).

  • No acute accents: seurte for seurté (which we list as seürté)
  • No capital letters: france for France
  • No cedillas: francois for françois (you can see francoise here, second word of the second line in black)
  • u/v distinction: they're visually the same but I think they're worth keeping as separate letters like we do in Latin. For example trouve appears visually as trouue (modern trouve, verb form) but I think they're really separate letters that share a glyph. This comment added 21:31, 12 November 2015 (UTC) Renard Migrant (talk)

How about alternative normalizations:

  • Allow diaereses: traïson for traison (the diaeresis here is to show it's three syllables and not two.)
  • à for a (occasionally used for the same reasons modern French has à and a. But I can't remember where I saw it. I think it was in a modern print Wace)

Renard Migrant (talk) 18:29, 12 November 2015 (UTC)

Perhaps we can use a Latin-like system for Old French. Meaning, the entry names will not have diacritics that were not actually found in manuscripts, but diacritics will be added on the headword lines and text and stripped from links. --WikiTiki89 18:41, 12 November 2015 (UTC)
I'd considered that but we'd be going against what apparently every other dictionary does. If you're copying and pasting from another website, it could matter quite a lot. Renard Migrant (talk) 21:55, 12 November 2015 (UTC)
Other dictionaries don't actually have an entry name vs. headword distinction. --WikiTiki89 22:42, 12 November 2015 (UTC)
That is the root of the issue really. The fact that all spellings have an individual dedicated page, like color and colour are separate issues. Anyway, a specific example would be like trové being merged into trove (no entry right now) and when you look at the etymology of treasure trove, it says from tresor trové, you go there and it says only Spanish. Sure you can change the Wiktionary entry, but what about other websites that use the standardized spelling trové? And how many people get their hands on actual manuscripts? I've seen a book in Old French (French in the left-hand column, Old French on the right) in a general book store in France by way of comparison. Renard Migrant (talk) 23:47, 12 November 2015 (UTC)
And if we automatically strip links, that solves all the problems. The entry will be at [[trove]], and both {{l|fro|trové}} and {{l|fro|trove}} will point to the same page. --WikiTiki89 23:54, 12 November 2015 (UTC)
Would {{l|fro|trové}} actually go to trove if there was a page with a different language at trové? We certainly have to think about the case where people are entering words from books.--Prosfilaes (talk) 08:10, 13 November 2015 (UTC)
Yes, it would. {{l|la|jūra}} takes you to [[jura]] since we strip macrons for Latin, but {{l|lv|jūra}} takes you to [[jūra]] since we don't strip macrons for Latvian. —Aɴɢʀ (talk) 11:34, 13 November 2015 (UTC)
We can get away with the Latin practice because there's very little overlap between Latin words with macrons and entries in other languages that use macrons because the languages are unrelated and have different morphology (see, however Latin and Japanese romaji ). In this case, there are a number of closely related languages with similar morphology, so there's bound to be a lot of overlap, and whenever there's overlap, searches go to the entry with diacritics and not to the plain form. Chuck Entz (talk) 18:52, 13 November 2015 (UTC)
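To make the stripping behaviour described above concrete, here is a rough sketch in Python of language-specific link normalization. It is only an illustration under stated assumptions: the table `STRIP_MARKS` and the function `entry_name` are hypothetical names, the Latin row reflects the macron stripping mentioned above (the breve is an added assumption), and the Old French row is just what the proposal in this thread would amount to, not current practice. It does not reproduce how the site's own modules work.

```python
import unicodedata

# Hypothetical table: combining marks removed from link targets, per language code.
STRIP_MARKS = {
    "la": {"\u0304", "\u0306"},   # combining macron, combining breve
    "fro": {"\u0301", "\u0308"},  # combining acute, combining diaeresis (proposal only)
}


def entry_name(lang, term):
    """Return the page title a link to `term` in language `lang` would point to."""
    marks = STRIP_MARKS.get(lang, set())
    # Decompose so that each diacritic becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", term)
    kept = "".join(ch for ch in decomposed if ch not in marks)
    # Recompose whatever diacritics were not stripped (e.g. the cedilla in françois).
    return unicodedata.normalize("NFC", kept)
```

Under these assumptions, `entry_name("la", "jūra")` gives `jura` while `entry_name("lv", "jūra")` leaves the macron in place, and `entry_name("fro", "trové")` gives `trove`. The overlap problem raised above is untouched by this: the function only decides where a link points, not what a reader's search for the accented form finds.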
I believe that we should have the forms of words that are found in printed text, because that's what people are going to be looking up. Those are the forms of the word most commonly found in durably archived forms of the language.--Prosfilaes (talk) 08:10, 13 November 2015 (UTC)
I agree. What Wikitiki89 says is possible, no doubt, but is anyone in favour of it? Copying and pasting trové from another website won't get you to trove. Renard Migrant (talk) 12:03, 13 November 2015 (UTC)
And this is what an unlinked alternative form looks like. Renard Migrant (talk) 12:34, 13 November 2015 (UTC)

Copyright of images of manuscripts[edit]

Looking at the link (http://www.kb.dk/permalink/2006/manus/225/eng/4/?var=1) above, can we use this image in any way? I assume while the text is uncopyrightable, the photo of it is copyrightable. Renard Migrant (talk) 18:31, 12 November 2015 (UTC)

See commons:Commons:When_to_use_the_PD-Art_tag and commons:Commons:Reuse_of_PD-Art_photographs#Nordic_countries. So we can use it.--Prosfilaes (talk) 08:01, 13 November 2015 (UTC)
Does this come under art? Renard Migrant (talk) 15:10, 13 November 2015 (UTC)
PD-Art is for photos taken from a distance. I think these images were made on a scanner, so commons:Commons:When to use the PD-scan tag applies. At any rate, we certainly have other scans of old manuscripts at Commons, see e.g. commons:Category:9th-century manuscripts. —Aɴɢʀ (talk) 15:51, 13 November 2015 (UTC)

WT:EL, should Etymology come after Pronunciation?[edit]

Currently the most common order of headings is to have Etymology before Pronunciation. But when the Etymology headings are numbered, then we put Pronunciation above if it applies to all of the etymologies. I think this is a bit backwards, to be honest, because it means that whenever you add a second etymology, you have to swap the headings around. It's much more common for a word to have only one pronunciation irrespective of etymology, than to have etymology-specific pronunciations. Therefore I think that this should be changed so that the default order is to have Pronunciation above Etymology. It goes without saying that if the pronunciations differ, then they should be placed within their respective etymology sections, as now. —CodeCat 21:56, 12 November 2015 (UTC)

I bet more casual users want a pron than an ety, too. Equinox 21:58, 12 November 2015 (UTC)
If we considered what users want relevant, we'd put the definition first. —CodeCat 22:01, 12 November 2015 (UTC)
We should. Equinox 22:02, 12 November 2015 (UTC)
Ok, but that's a separate debate. I definitely encourage you to start a new discussion on that, and I'd support it too, but I'd rather not get this proposal muddled up by making it bigger than it is. —CodeCat 22:04, 12 November 2015 (UTC)
I support CodeCat's proposal. --Daniel Carrero (talk) 22:55, 12 November 2015 (UTC)
Oppose for aesthetic reasons. It doesn't look good when the etymology section is under the pronunciation section. If you really want to rethink things, it might make sense for the etymology to be at the bottom, but as has been said in previous discussions, this would drastically affect the hierarchy of the entry contents. As far as what users find relevant, it's actually really easy to ignore the etymology section; I never understood why people find it annoying. Also, I'm not sure I buy CodeCat's claim that "It's much more common for a word to have only one pronunciation irrespective of etymology, than to have etymology-specific pronunciations." This just doesn't seem to be the case in my own experience. --WikiTiki89 23:12, 12 November 2015 (UTC)
Don't you work on Hebrew a lot, which has a nonphonetic writing system? That's why. —CodeCat 23:14, 12 November 2015 (UTC)
Yes, but even my experience outside of Semitic languages, such as with Russian and English, leads me to doubt your claim. --WikiTiki89 23:20, 12 November 2015 (UTC)
I agree in part. We have to keep foreign languages in mind too. If a number of languages have different pronunciations depending on etymology, then it may be better to keep the pronunciation first. However, I think it makes more sense for the pronunciation to be first in English entries, except when the pronunciation is distinct for the different etymologies (as for words like wind). Andrew Sheedy (talk) 23:19, 12 November 2015 (UTC)
wind is a bad example because the current entry fails to show that the verb wind /wɪnd/ derives from the noun, and that the noun wind /waɪnd/ derives from the verb. There should really be four etymology sections, though two of them share a pronunciation with the other two. There is no way to solve this kind of overlap and nesting in the general case. —CodeCat 01:02, 13 November 2015 (UTC)
Under the proposal how would one present homographs having the same etymology but different pronunciations? DCDuring TALK 23:49, 12 November 2015 (UTC)
Then they're the same word, just with different pronunciations, aren't they? —CodeCat 00:28, 13 November 2015 (UTC)
I think the question is if the pronunciation section should be split into two sections (one for the definitions with a certain pronunciation and one for those with another pronunciation), since the etymology would be out of the way, being higher up in the hierarchy than the pronunciations. For an example of an entry this would apply to, see right. Andrew Sheedy (talk) 00:52, 13 November 2015 (UTC)
But any other number of properties could be shared or distinct, too. I've seen English entries where a single headword line contains two different inflectional patterns, where the choice depends on the meaning. And of course there are cases where synonyms and such also differ by meaning. We solve this with the {{sense}} template. Why can't that be done for pronunciations? —CodeCat 00:59, 13 November 2015 (UTC)
I'd forgotten about the {{sense}} template. I'll replace the header in the pronunciation section of right with it (though I really should have just added a gloss in the first place). Andrew Sheedy (talk) 01:09, 13 November 2015 (UTC)
I don't like that the proposal does not link to previous discussions on the subject; there are multiple ones. The discussions contain specific objections. --Dan Polansky (talk) 09:57, 14 November 2015 (UTC)

Wiktionary:Votes/pl-2015-11/NORM: 10 proposals - starting soon[edit]

FYI: Wiktionary:Votes/pl-2015-11/NORM: 10 proposals (which was created 2 weeks ago) is going to start in 2 days.

Duration of the vote: 3 months. --Daniel Carrero (talk) 04:58, 13 November 2015 (UTC)

The vote started. --Daniel Carrero (talk) 03:41, 15 November 2015 (UTC)

Suggestion: Remove userspace pages from Category:Requested entries[edit]


--Daniel Carrero (talk) 03:00, 14 November 2015 (UTC)

What if we make a subcategory? Category:Requested Entries (Userspace) or the like? Aryamanarora (talk) 03:34, 14 November 2015 (UTC)
Today, I created Wiktionary:Redlink dumps which looks better than a category IMO, since it is an organized list with a few comments here and there. Personally, I don't want to take the trouble to place every page listed at Wiktionary:Redlink dumps (let alone the hundreds of subpages) in a different category. If anyone wants to do it, I don't have anything to say about that. I only want to remove those from Category:Requested entries. --Daniel Carrero (talk) 14:39, 14 November 2015 (UTC)
Notably, I listed User:-sche#German at Wiktionary:Redlink dumps#Other languages, because @-sche has plenty of German redlinks in their user page, but I would not place a user page in a category called Category:Requested Entries (Userspace) (at least not without asking); I feel it would be rude. --Daniel Carrero (talk) 14:46, 14 November 2015 (UTC)
Keep them categorized somewhere like Aryamanarora says. Maybe Category:Requested Entries (user generated). Renard Migrant (talk) 17:21, 14 November 2015 (UTC)
Sure, but I'd understand "user-generated" to mean manual work. Most of these are the contrary of "user-generated": they are script-generated / computer-generated. Category:Redlink dumps should be a good enough name for that, like WT:Redlink dumps. --Daniel Carrero (talk) 17:26, 14 November 2015 (UTC)
Yeah, having userspace pages directly in Category:Requested entries is odd. I don't mind if someone categorizes User:-sche/wanted (the subpage of my userpage where the "wanted" terms are) into something like Category:Requested Entries (userspace). I've thought about merging it into the official list of requested entries, but I don't want to swamp that list, especially since many of the entries on my page are relatively less-important terms I just happened to notice we were missing. Using a list like Wiktionary:Redlink dumps rather than a category is also fine, IMO, and that list could be linked-to from a "See also" type section at the bottom of WT:RE:en, WT:RE:de, etc. - -sche (discuss) 23:03, 14 November 2015 (UTC)
As postscripts: I agree that "user-generated" is liable to be misunderstood and should probably be avoided, since clearer alternatives exist. Also, I don't mind splitting the many German and English words on my "wanted entries" subpage onto separate monolingual subpages, if that would be helpful. - -sche (discuss) 22:18, 15 November 2015 (UTC)
@-sche Thanks for splitting your lists into English and German pages, they do look better now.
Unfortunately, the name Category:Requested Entries (userspace), too, has the problem of not being the most accurate name possible. At WT:Redlink dumps, there are some pages in the Wiktionary: namespace:
--Daniel Carrero (talk) 13:36, 17 November 2015 (UTC)

According to my calculations:

I'm not interested in doing the work of placing those 968 entries in a separate category. We already have WT:Redlink dumps listing them, so I'd rather use my time on Wiktionary doing something else. If someone else wants to do it, you have my blessing. --Daniel Carrero (talk) 11:27, 18 November 2015 (UTC)

Done --Daniel Carrero (talk) 02:58, 24 November 2015 (UTC)
I mean, I removed all userspace pages from that category. I didn't place them in any other category, for the reasons I said above. --Daniel Carrero (talk) 14:01, 25 November 2015 (UTC)

Blocking policy clarification[edit]

In Wiktionary:Votes/pl-2010-01/New blocking policy, editors seemed to want to simplify blocking policy (WT:BLOCK). To actually achieve the simplification for hasty readers, I think we need to reduce the policy page to the actual policy and nothing else, which would be to reduce it to the following:

:''See also '''[[Help:Interacting with humans]]'''''
{{policy|draft=The portion of it which is policy may not be modified without a [[Wiktionary:Votes|VOTE]].}}

# The block tool should only be used to prevent edits that will, directly or indirectly, hinder or harm the progress of the English Wiktionary.
# The block tool should not be used unless less drastic means of stopping these edits are, by the assessment of the blocking administrator, highly unlikely to succeed.

Interwiki links can follow the above text.

In the past, I have seen multiple cases where editors cited parts of the page that are not the actual policy, so the above proposed change seems really required and useful.

Comments? --Dan Polansky (talk) 11:10, 14 November 2015 (UTC)

Maybe the non-policy portion of the page should be moved elsewhere. Possible name: Help:Blocking. --Daniel Carrero (talk) 14:57, 14 November 2015 (UTC)
Split into Policy and another header, such as Rationale or Explanation. Wouldn't even require a vote to do that as you could leave the policy bit unchanged. Renard Migrant (talk) 17:20, 14 November 2015 (UTC)
Splitting to headers while keeping non-policy on the wiki page does not really work for me. I want the wiki page to contain the policy and nothing else. I have no issue with there being Help:Blocking but it should probably be written in a way that does not suggest that it contains regulations (rules). Tables with specific blocking lengths look like rules. --Dan Polansky (talk) 17:34, 14 November 2015 (UTC)
Do you think we need a vote for moving the non-policy contents to Help:Blocking? If the answer is yes, then would the contents of Help:Blocking be locked for editing, thus requiring further votes if we want to change something? --Daniel Carrero (talk) 03:53, 15 November 2015 (UTC)
@Daniel Carrero: I think we need a vote to edit WT:BLOCK. I don't think we need a vote to create a non-policy page Help:Blocking. Admittedly, one could argue that it should be possible to edit the non-policy parts of WT:BLOCK without a vote. But I think it much better to use a vote to turn the page into a policy-only page via a vote that removes all non-policy parts. That, of course, presupposes support for changing the page as proposed by me. --Dan Polansky (talk) 08:14, 22 November 2015 (UTC)
Wait, it is already split to headers: WT:BLOCK#Policy and WT:BLOCK#Explanation. And this splitting seems to do little to prevent confusion. There is even red background in the policy text. In fact, the page contains multiple kludges to make it clear that only part of it is a policy. It looks really clumsy and does not really work, from what I can see. --Dan Polansky (talk) 17:44, 14 November 2015 (UTC)
It's not up to you, though, is it. Renard Migrant (talk) 14:55, 22 November 2015 (UTC)

I created Wiktionary:Votes/pl-2015-11/Short blocking policy. --Daniel Carrero (talk) 22:24, 25 November 2015 (UTC)

What distinguishes a synonym from an alternative form?[edit]

I've come across several Finnish entries, where one word is defined as an alternative form of another, but the two words have different morphological structure and hence etymologies. An example is kolkkaa vs kolkata, but I've seen examples which differ more substantially. It would be somewhat like having two verbs in English, one with -en and another with -ify, defined as alternatives of the same term. I think these shouldn't be considered alternative forms, but we don't really have clear definitions for what is an alternative form and what is a synonym. Alternative forms, at the very least, are a subset of synonyms, but I would like to set some stricter criteria for distinguishing them. What comes to mind is having the same morphological structure and/or etymology, or forms that differ by dialect or something like that. —CodeCat 18:13, 14 November 2015 (UTC)

I also find myself in this situation every now and then. The general guideline I apply to myself is that words are alternative forms if they are synonyms and have roughly the same etymology.
I know this is not very helpful, because of the “roughly”. But consider the pair piranha and piraña: both have the same Old Tupi etymon, the only difference is that one entered English via Portuguese and the other via Spanish. — Ungoliant (falai) 22:35, 14 November 2015 (UTC)
This is one of those things that's easier to decide on a case-by-case basis than with a general guideline. --WikiTiki89 02:57, 15 November 2015 (UTC)
Kolkkaa only has the third sense listed under kolkata ("to clatter") which seems like sufficient grounds to not list this as an "alternate form". --Tropylium (talk) 19:11, 15 November 2015 (UTC)

Rhymes navigation[edit]

I would like to found a couple of rhymes pages so I would like to ask about the preferred format of navigation at the top. I noticed at Rhymes:Czech/ofka that Lo Ximiendo changed it for {{rhymes nav}}, which Dan Polansky has reverted, unfortunately without explanation. The same has happened at about 20 more rhymes pages. I have also noticed that the template itself declares that "this template should be placed at the top of all rhymes pages". So what is the preferred way? Thank you. Jan Kameníček (talk) 21:54, 14 November 2015 (UTC)

{{rhymes nav}} was installed to rhyme pages by user CodeCat by a bot and without consensus. I oppose its use. It sets up rhyme pages for a hierarchy of categories and index pages, which I find undesirable and confusing. In my view, all Czech rhyme pages should be in Category:Czech rhymes, and subcategories like Category:Czech rhymes/a- should not exist. The subcategories are clumsy, do not add any real value and do not provide a truly useful navigation tool, unlike e.g. the tables at Rhymes:Czech. The markup at Rhymes:Czech/ofka that creates "Rhymes > Czech" is perfectly simple, straightforward, does what needs to be done, and is in no need to be replaced with a template that presupposes a page structure that the creators of the rhyme pages do not support such as Rhymes:Czech/o- linked from a templated version of Rhymes:Czech/ofka.
I did a lot of substantive work on Czech rhyme pages, and consider myself to know what I am talking about.
I also object to use of templates where they add close to no value but give power to people who lock templates to be only editable by admins and the like, or even move their content to modules (Module:rhymes) to further raise the barrier to editing. That said, for some purposes, modules are extremely useful. --Dan Polansky (talk) 22:14, 14 November 2015 (UTC)
Is the template generating incorrect content? If not, it should be used. — Ungoliant (falai) 22:15, 14 November 2015 (UTC)
I prefer the non-templated page. There is no need to use a template instead, unless tangible benefits of template can be shown. I still hope that editors at large do not support this type of over-templatization.
The template places the page into a category that IMHO should not exist, and links to an index page that should not exist either. Correct or incorrect is not at stake; editor preferences are at stake. --Dan Polansky (talk) 22:20, 14 November 2015 (UTC)
An aside: as a remnant of CodeCat's non-consensual changes, we still do not have "-" back in the names of rhyme pages; see also Wiktionary:Votes/2014-09/Renaming rhyme pages. --Dan Polansky (talk) 22:20, 14 November 2015 (UTC)
As for what the template itself declares, what else do you expect from CodeCat's templates? See also Wiktionary talk:Votes/2014-08/Debotting MewBot. --Dan Polansky (talk) 22:21, 14 November 2015 (UTC)
I'd say, if you want to change the practice of simple formatting at the top of Czech rhyme entries, and the flat category structure of Czech rhyme entries, please someone create a vote, and do it now. Do it yourself, so that you do not need to accuse me later of poor drafting. --Dan Polansky (talk) 22:23, 14 November 2015 (UTC)
I do not want to change any practice, I am just asking, because I have not learned anything either from the summaries of your reverts of Lo Ximiendo's edits or from the summary of your revert of my edit. I do not care very much which format is chosen, I just want to know so that next time I am not reverted again. You wrote that the template started to be introduced without having been discussed, so this might be an opportunity to find out the general opinion of it. (By the way, your work on rhymes pages was enormous; that is beyond any doubt). Jan Kameníček (talk) 23:00, 14 November 2015 (UTC)
Nobody seems to oppose Dan Polansky's arguments, so I go on in the way that he promotes. Jan Kameníček (talk) 21:18, 24 November 2015 (UTC)
You can also go on the way you have done before. There's nothing wrong with your previous edits, and they were reverted without grounds. Dan is just trying to bully you into doing things his way. —CodeCat 22:01, 24 November 2015 (UTC)

Wiktionary:Votes/2015-11/Language-specific rfi categories - created vote

FYI: I created a new vote about {{rfi}}, the vote is linked in the header.

Also, I think there's no harm in repeating what I said in a previous post: There's an unrelated vote that started today. You can cast your votes on it already:

--Daniel Carrero (talk) 05:19, 15 November 2015 (UTC)

Removing rfi from talk pages

FYI: I intend to remove {{rfi}} from talk pages.


  • It is used in 19 talk pages.
  • It is used in 6,500+ entries.


  • When the entry still needs an image, I am going to move the {{rfi}} to the entry itself, in the correct language section.
  • When the entry already has an image, I am going to remove the {{rfi}} altogether as a request fulfilled.


  1. Numbers suggest that the overwhelming practice is placing the request in the entry, not the talk page.
  2. About half of those entries have images already. The people who added the images did not remove the box; I suspect this happened because the box was hidden in the talk page.
  3. Probably I just got bored enough to make this list rather than just doing the change in the 19 entries, but I'm erring on the side of announcing one's intent at the BP for openness. It would also be nice if this prevented new rfis being added to the talk pages in the future, for better consistency.

(✔ = request fulfilled, has at least one image)

--Daniel Carrero (talk) 05:44, 15 November 2015 (UTC)

Note: User:DCDuring has been adding images to some entries listed and editing my message above to update the list. And, for that, he has my thanks. --Daniel Carrero (talk) 14:27, 15 November 2015 (UTC)
Done. I finished moving all the rfis to the main entry, or deleting the ones where the request was fulfilled. --Daniel Carrero (talk) 09:49, 17 November 2015 (UTC)

Wiktionaries linked at Norwegian language, Norwegian Bokmål language and Norwegian Nynorsk language

According to meta:Wiktionary, there are two Norwegian Wiktionaries:

  • Norwegian (Bokmål) = no.wiktionary.org
  • Norwegian (Nynorsk) = nn.wiktionary.org

I don't speak Norwegian, so I'll assume that's accurate.

Our language categories link to Wiktionary editions when they exist, and they are perfectly capable of showing links to multiple Wiktionaries at once. (compare Category:English language and Category:Serbo-Croatian language) But I think our three Norwegian language categories are not linking to Norwegian Wiktionaries accurately.

I believe perhaps they should display:

--Daniel Carrero (talk) 13:46, 16 November 2015 (UTC)

  • Yes indeed, there are Wiktionaries and Wikipedias for both Norwegian languages. Are you referring to those written in Bokmål and Nynorsk? Donnanz (talk) 13:55, 16 November 2015 (UTC)
w:no:Portal:Forside (Bokmål Wikipedia), w:nn:Hovudside (Nynorsk Wikipedia)
no:Wiktionary:Forside (Bokmål Wiktionary), nn:Hovudside (Nynorsk Wiktionary) —Stephen (Talk) 00:59, 17 November 2015 (UTC)
With my ability, I am unable to do the proposed change to the modules. Maybe it has something to do with Module:wikimedia languages.
Currently, Category:Norwegian language links only to nowikt, not to nnwikt. I think it should link to both. --Daniel Carrero (talk) 18:19, 24 November 2015 (UTC)

Namespace abbreviations

There has been some recent discussion in the BP and in the recent vote adding the “Reconstructed:” namespace about adding more snazzy namespace search bar abbreviations like the preëxisting “WT:” for “Wiktionary:”. It would make things far more convenient to have a few more of these like “TP:” for “Template:”, “AP:” for “Appendix:”, and “MW:” for “MediaWiki:”. I wanted to check to see whether there was enough interest to merit a vote and also what abbreviations people would like to see. My initial list would look something like:

  • AP → Appendix
  • C → Citation
  • CT/CA/CAT → Category
  • MD → Module
  • MW → MediaWiki
  • T → Talk/Discussion
  • TP → Template
  • RC → Reconstruction

And potentially many more. These things will make life a lot easier for frequent contributors to this project and delay the impending carpal tunnel syndrome that awaits us all in a few years. It would also be cool if there were a way to jump immediately to a talk page by appending “T:” to a namespace (e.g. “AP:T:” → “Appendix talk:” or “TP:T:” → “Template talk:”). —JohnC5 01:28, 17 November 2015 (UTC)

This would get my vote, although I find myself going to category pages a lot more than citation pages, so I might make C = category and CI = citation. I also might find it easier to remember if you use the first two letters whenever the word isn't clearly two parts, e.g. TE=template, MO=module. Benwing2 (talk) 02:09, 17 November 2015 (UTC)
I could get behind all these options.
I know it's not that long to begin with but I could imagine “U:” for “User:” to be useful, especially since “U:T:” would be great. —JohnC5 02:17, 17 November 2015 (UTC)
You could also abbreviate UT=User talk, TT/TET=Template talk, AT/APT=Appendix talk, etc. Benwing2 (talk) 02:51, 17 November 2015 (UTC)
Also a possibility; though the :T: suffix does avoid ambiguity. —JohnC5 02:59, 17 November 2015 (UTC)
I support the proposal, but "MW → MediaWiki" and "C: → Citation or Category" are unavailable, because they link to sister projects. Full list: w:Help:Interwiki linking.
Wikipedia uses "WT → Wikipedia talk" (their equivalent of Wiktionary talk), "T → Template", "CAT → Category" and "H → Help". Full list: w:Wikipedia:Shortcut#Pseudo-namespaces. IMO we should use, too, "CAT → Category" and "H → Help". "Cat" is our standard abbreviation of "category" anyway -- many category templates and at least 1 categorization gadget have "cat" in the name. --Daniel Carrero (talk) 09:38, 17 November 2015 (UTC)
I also prefer CAT over anything else for categories. Equinox 15:09, 17 November 2015 (UTC)
So for the sake of listing things, we would prefer something like:
  • AP → Appendix
  • CIT → Citation
  • CAT → Category
  • H → Help
  • MD/MO/MOD? → Module
  • MWK/MED? → MediaWiki
  • RC → Reconstruction
  • U → User
Then the question remains of what we would like for “Talk” and “Template”. I'd prefer:
  • T → Talk/Discussion
  • TP/TEM/TEMP? → Template
The above option allows for the unambiguous “u:t:” for “User talk:”. Alternatively, if we think that we link/navigate to templates more often than talk pages:
  • TK? → Talk/Discussion
  • T → Template
JohnC5 15:32, 17 November 2015 (UTC)
  • DOC:en-noun → Template:en-noun/documentation
  • MDOC:parameters → Module:parameters/documentation
--Daniel Carrero (talk) 15:54, 17 November 2015 (UTC)
I was curious about that. Is it possible to have a prefix translated into a circumfix (i.e. “DOC:” → “Template: … /documentation”)? —JohnC5 16:03, 17 November 2015 (UTC)
I support new abbreviations for our unique namespaces, like citations, appendix and reconstructions. But I don't think we should be making ones for modules or templates, as this will only add confusion and incompatibility when copying content between wiktionaries / wikis. If it's only for the search bar it's ok, but we shouldn't be creating ways to reference a template in wiki markup that only work on this Wiki. Pengo (talk) 20:36, 17 November 2015 (UTC)
@Pengo: to be honest, the “Template” and “Category” namespaces are probably the ones for which I most would like abbreviations, but that is a very valid comment for which I thank you. —JohnC5 20:43, 17 November 2015 (UTC)
Re: "we shouldn't be creating ways to reference a template in wiki markup that only work on this Wiki."
But aren't some templates kind of like this? When we want to reference a template in a discussion, we often use {{temp}}, so it's like "temp" were a very particular kind of abbreviation for a certain namespace. --Daniel Carrero (talk) 20:49, 17 November 2015 (UTC)
Maybe if we do it first, all the other wikis will follow. When typing fast, I can never spell Category right on the first try (it usually ends up as Cateogry or something), so it would be very useful to have an abbreviation. --WikiTiki89 21:03, 17 November 2015 (UTC)
But {{temp}} itself can be copied verbatim to other Wikis still and will still work the same way and you need to know nothing special about en.wikt to do so. Changing Template: to Temp: might seem minor, but it means exporting wikitext requires actually editing the wikitext, and suddenly requires specialized knowledge of English Wiktionary's configuration. I don't think it's an exaggeration to say it becomes an order of magnitude more difficult and error prone. If those abbreviations work their way into module code then it becomes more difficult again and requires more expertise. I'd rather not break compatibility with the hundreds of other WMF Wikis for the sake of a minor convenience. If we could introduce temp: or mod: across all wikis then that'd be fine, or if it was just for searching it'd be great, or if they were auto-expanded when you preview/save that'd be okay too, but otherwise I'd rather be cautious and not introduce the possibility of breaking things unnecessarily. Pengo (talk) 21:39, 17 November 2015 (UTC)
Is there somewhere where we can suggest the addition of "cat: -> category:" in all wikis simultaneously? meta.wikimedia.org? --Daniel Carrero (talk) 21:47, 17 November 2015 (UTC)
But you can't type {{temp}} into the searchbar. I don't care so much about actual wikilinks. --WikiTiki89 21:53, 17 November 2015 (UTC)
No, templates can be difficult to copy between wikiprojects. For example {{Navbox}} from Wikipedia is impossible to copy without hard work.--Dixtosa (talk) 10:30, 18 November 2015 (UTC)
@Daniel Carrero:, forget that. Just forget it. Just do it... xD--Dixtosa (talk) 10:30, 18 November 2015 (UTC)
Regarding "U:T:", is a colon inside a namespace name possible? We currently have links like WT:T:ADE, but that was a workaround for not having a true namespace alias — they are actual redirects (i.e. they exist as pages, unlike WT:About German which only exists as Wiktionary:About German) in the Wiktionary namespace which start with "T:". We could, of course, continue that convention. How many namespaces are long enough and sufficiently often linked to that they need aliases? Wikipedia, a much bigger project, gets by with only a few. I don't think MediaWiki pages are linked to often enough that they need an alias. Increasing the number of aliases does have a few (slight) drawbacks, e.g. pre-existing or possible future clashes with interwiki links. "Cat", although a language code, is fortunately not an interwiki; the interwiki to Catalan projects is "ca". "Mod", on the other hand, is the only code for Mobilian, so a "mod:" space would conflict with interwiki links if a Mobilian wiki were created; likewise "Cit" is the code for Chittagonian. Do we need shorthand for citations pages, anyway? I would add "AP → Appendix" and "RC → Reconstruction" and "CAT → Category". If we needed a shortcut to modules, what if we just used "M:" and didn't add a shortcut to MediaWiki pages? Likewise I would use "T:" for template; "Talk:" is not hard to type in full. - -sche (discuss) 02:18, 18 November 2015 (UTC)
As I've already pointed out, many of these shortcuts would be much more useful for navigation through the search bar than strictly for linking. Also, if it is at all possible that CAT: would not have to be preceded by a colon in links, that would be very, very convenient. --WikiTiki89 02:28, 18 November 2015 (UTC)

  • I am sure U:T: can't be a namespace.
  • User:Ungoliant_MMDCCLXIV has a user script that adds two or so search inputs beside the main search input that search in specific namespaces. A good idea for those who only need namespace abbreviations to search easily. --Dixtosa (talk) 10:30, 18 November 2015 (UTC)
    Can you link to it? --WikiTiki89 15:54, 18 November 2015 (UTC)
    @Dixtosa how did you find that out?? — Ungoliant (falai) 18:44, 18 November 2015 (UTC)
    I just tried it out.
    Wikitiki, the snippet that does that is included in his monobook.js. --Dixtosa (talk) 15:30, 21 November 2015 (UTC)

Given all that I've seen above, I would still like to have abbreviations for at least Category, Template, Module, Reconstruction, Appendix since those are the most typed and least convenient. —JohnC5 16:02, 18 November 2015 (UTC)

I am planning on maybe creating a vote for at least the most wanted abbreviations in this discussion. The ones mentioned by JohnC5 (Category, Template, Module, Reconstruction, Appendix) have my support. Aside from that, I would really like to know if it's also possible to redirect "DOC:" and "MDOC:" as I suggested above. --Daniel Carrero (talk) 19:25, 18 November 2015 (UTC)
But the documentation pages are really meant to be viewed from the template/module's page itself, so I'm not sure why we would need to link directly to the documentation. --WikiTiki89 19:30, 18 November 2015 (UTC)
Please do start a vote! I feel that abbreviations for Reconstruction and Appendix are not contentious. It might make sense to have a separate section of the vote for Category, Template, and Module, since Pengo at least seems to have some reservations about those. —JohnC5 18:21, 21 November 2015 (UTC)
I would even say we should have a separate section for each namespace. --WikiTiki89 21:57, 21 November 2015 (UTC)
Sounds good to me. —JohnC5 17:29, 22 November 2015 (UTC)

Wiktionary:Votes/2015-11/Namespace abbreviations --Daniel Carrero (talk) 05:59, 24 November 2015 (UTC)
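
For readers who want to picture how the search-bar abbreviations discussed in this thread might behave, here is a minimal sketch in Python. It is not MediaWiki configuration and not anything that was adopted: the alias table simply picks some of the letters floated above ("T" for Template follows -sche's suggestion rather than JohnC5's), and the function name `expand_title` is hypothetical. The "DOC:" and "MDOC:" entries illustrate the prefix-to-circumfix idea JohnC5 asked about.

```python
# Hypothetical alias table; the thread had not settled on any of these
# except "WT", which already existed.
ALIASES = {
    "WT": "Wiktionary",
    "AP": "Appendix",
    "CAT": "Category",
    "RC": "Reconstruction",
    "T": "Template",
}

# Pseudo-prefixes that wrap the title: text in front and text behind.
CIRCUMFIXES = {
    "DOC": ("Template:", "/documentation"),
    "MDOC": ("Module:", "/documentation"),
}


def expand_title(title: str) -> str:
    """Expand a leading abbreviation in a typed title, if there is one."""
    prefix, sep, rest = title.partition(":")
    if not sep:
        return title  # no colon, so no namespace prefix at all
    key = prefix.upper()  # treat prefixes case-insensitively
    if key in CIRCUMFIXES:
        head, tail = CIRCUMFIXES[key]
        return f"{head}{rest}{tail}"
    if key in ALIASES:
        return f"{ALIASES[key]}:{rest}"
    return title  # unknown prefix: leave the title untouched


if __name__ == "__main__":
    for typed in ("CAT:English language", "DOC:en-noun", "Talk:wind"):
        print(typed, "->", expand_title(typed))
```

Because the expansion happens only on what is typed, a sketch like this would not affect stored wikitext, which is the compatibility concern Pengo raises above.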

Remove the numbers from Etymology sections

Why do we have numbers on Etymology sections? The numbers themselves don't mean anything, and are subject to reordering anytime anyway. We also don't number other sections; you don't see "Noun 1" or "Conjugation 2" sections anywhere. The numbers are not necessary to understand the entry at all, since the header structure already gives this information. It's also annoying to have to number and renumber them all the time, when we don't require this treatment for other headers. Therefore I propose dropping the numbering from the headers, so that they are treated like we treat other headers already. —CodeCat 18:42, 18 November 2015 (UTC)

Previously, they were used for links, such as foo#Etymology 1. They are also useful as an indication to the reader that there are more etymologies to follow. --WikiTiki89 18:45, 18 November 2015 (UTC)
Most readers don't know or care about etymologies, so it's not very interesting information. The definitions should really come first, and entry information should be grouped by term (like other dictionaries do), not etymology. But since people keep blocking that, we have to find other means to slap some sense into our entry structure. —CodeCat 18:48, 18 November 2015 (UTC)
Re: "But since people keep blocking that, we have to find other means to slap some sense into our entry structure" -- I don't know about other people, but if someone proposes a solid layout with definition first and makes a vote for it, I would consider supporting it. --Daniel Carrero (talk) 19:32, 18 November 2015 (UTC)
Sorry, I meant "more etymology sections to follow". What does term mean in "grouped by term"? And aren't you basically saying "this might not be the right thing to do, but no one wants to do the right thing, so let's do this instead"? --WikiTiki89 18:52, 18 November 2015 (UTC)
I see it as an improvement, which is the point. Wiktionary should be improved. And by "term" I mean part-of-speech headers. I can't say "part of speech" because then people will think I want to group different nouns together. I think that entries should be formatted with etymology and pronunciation nested under the part of speech header. The other way around doesn't make much sense, since every word has its own etymology anyway, there should be as many etymology sections as there are part of speech sections. —CodeCat 19:00, 18 November 2015 (UTC)
Case in point: diff is way too much administrative work to add a simple etymology. I should just be able to add the etymology under the appropriate POS section, and not have to worry about adding additional numbered sections and adding extra equals signs to all the headers. —CodeCat 19:35, 18 November 2015 (UTC)
Other dictionaries number the headwords, e.g. ¹wind and ²wind or wind 1 and wind 2 or the like. I think it would be confusing to avoid numbers altogether (and we certainly have used "Noun 1" and "Pronunciation 2" headers in the past, though bots tend to remove them). Ultimately I think it would be least confusing (though not 100% nonconfusing) to have each entry on a page of its own, i.e. with its own URL, e.g. "/wiki/en/wind_(movement_of_air)", "/wiki/en/wind_(twist)", "/wiki/nl/wind_(wind)", "/wiki/nl/wind_(form_of_winden)", "/wiki/ang/wind", and so on. Then "/wiki/wind" would just be a disambig page, as would "/wiki/en/wind" and "/wiki/nl/wind". —Aɴɢʀ (talk) 19:36, 18 November 2015 (UTC)
It might be tricky for words with tons of meanings - e.g., is train1 "/wiki/en/train_(rail_vehicle)", "/wiki/en/train_(long_skirt)", "/wiki/en/train_(procession)", etc.? Effectively, the disambiguation pages would just become the entries themselves. Smurrayinchester (talk) 20:42, 21 November 2015 (UTC)
This is certainly true, and it's a problem I'm already having with categorising suffixes. But the issue with numbers is that they don't allow reordering without breaking links. In most dictionaries, the content is created in advance so the editors know the numbering and can refer back to the numbers. But Wiktionary is always in development, so senses are split and rearranged as we go. We invented {{senseid}} to solve this, but we haven't yet used it at a level higher than a single sense. It certainly needs a solution though. —CodeCat 20:50, 21 November 2015 (UTC)

Suggestion: make "Vote started" and "Vote ends" agree on the tense

Most votes have the dates like this:

  • Vote started:
  • Vote ends:

I suggest editing the vote-generator templates to agree on the tense. I propose:

  • Starting date (struck) Start date:
  • End date:

For reference, the "vote-generator templates" I mentioned are:

--Daniel Carrero (talk) 21:26, 18 November 2015 (UTC)

Ironically, your suggestion has the same problem, mixing a participle with a noun. It should be either "start date" and "end date" or "starting date" and "ending date" (and I prefer the former pair). --WikiTiki89 21:50, 18 November 2015 (UTC)
Sure. Struck the "Starting". --Daniel Carrero (talk) 21:59, 18 November 2015 (UTC)
I find "vote started/ended" (in whatever tense) clearer than "start/end date". The language is more active somehow. Equinox 14:30, 19 November 2015 (UTC)
  • I propose to undo the diff from 2014, resulting in "Vote starts" rather than "Vote started". The past tense is inappropriate, IMHO. --Dan Polansky (talk) 08:06, 22 November 2015 (UTC)

I also suggest retroactively editing all votes to make the "Vote started" and "Vote ends" agree on the tense. I wouldn't mind creating a vote for this. --Daniel Carrero (talk) 04:17, 24 November 2015 (UTC)

Created a vote: Wiktionary:Votes/2015-11/Fix tense: start/end in votes. --Daniel Carrero (talk) 07:37, 29 November 2015 (UTC)

redirect pages needed

I don't think this is currently allowed, but Wiktionary desperately needs to start making redirect pages for protowords, transliterations, and spelling variants not currently permitted in mainspace. A simple "#redirect [[correct location of term]]" would be a huge help in finding these items on Wiktionary. Words in protolanguages (e.g. Proto-Indo-European, Proto-Germanic, etc.) are often very difficult to find, especially for Proto-Indo-European, for which roots can often be spelled different ways. What is even worse is if someone has a protoword they want to look up in Wiktionary but is not sure for which protolanguage; since protowords are in appendices, it can be very difficult to find. Placing redirect pages to the appendix entries for protowords would be a huge help. Another example is for Russian, Latin, Classical Greek, etc. words, which can often be spelled with or without diacritics (for stress and tone), but are only indexed in mainspace with minimal diacritics. Looking for words by copying and pasting the diacriticked versions into search can make using Wiktionary difficult as well. Redirect links for the diacriticked versions to the main non-diacriticked pages would be a huge help there as well. Trying to search a copied diacriticked term (e.g. νῑ́κη) by removing the diacritics from the copied non-Roman Unicode characters on an ASCII keyboard is often impossible, and requires replacing the characters using a character map instead (νίκη, νικη). Nicole Sharp (talk) 04:46, 19 November 2015 (UTC)

I agree. The software already redirects to pages that vary only in diacritics, but it needs to be expanded to work with more scripts. — Ungoliant (falai) 17:03, 21 November 2015 (UTC)
[I]f someone has a protoword they want to look up in Wiktionary but is not sure for which protolanguage — Probably the best approach to this is to start with one of the descendants. If a proto-word has been added on Wiktionary, it's likely linked from its descendants, too. I guess in principle someone could have neither — where you only have something like *fō "hand" and no idea what part of the world this is from — but this seems too unlikely to be useful to account for.
Proto-word redirects for transcription variants would be good to have around, and certainly preferable to treating them as "spelling variants" with separate entries. --Tropylium (talk) 21:37, 21 November 2015 (UTC)
We already do that last part. —CodeCat 00:19, 22 November 2015 (UTC)
I know, I'm just echoing the recommendation to add more. --Tropylium (talk) 02:17, 22 November 2015 (UTC)
Yes, that is what I usually do now for protowords: I have to find a derivative word, then find the protoword from an etymology. That is very inefficient though. Redirects for protowords, transliterations, and spelling/diacritic variants are still needed. Nicole Sharp (talk) 10:49, 28 November 2015 (UTC)
Expanding a bit more on this, I'd like to also suggest at least as a general practice (possibly policy):
  • Any reconstruction entry that lists alternate reconstructions should
  1. not link to these
  2. set up all listed alternate reconstructions as redirects (provided that this does not clash with other reconstructed entries).
--Tropylium (talk) 16:17, 23 November 2015 (UTC)
I think links can be helpful just to ensure that the redirects exist. —CodeCat 16:27, 23 November 2015 (UTC)
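
The diacritics problem Nicole Sharp describes (pasting νῑ́κη when the entry lives at νίκη) can be illustrated with a short Python sketch built on the standard `unicodedata` module. It is only an illustration of the idea, not how the site software resolves titles; the function names and the two-step fallback order are assumptions made for the example.

```python
import unicodedata
from typing import Iterable, Optional

MACRON = "\u0304"  # combining macron
BREVE = "\u0306"   # combining breve


def strip_length_marks(term: str) -> str:
    """Remove macrons and breves but keep accents and breathings."""
    decomposed = unicodedata.normalize("NFD", term)
    kept = "".join(ch for ch in decomposed if ch not in (MACRON, BREVE))
    return unicodedata.normalize("NFC", kept)


def strip_all_marks(term: str) -> str:
    """Remove every combining mark, leaving only the base letters."""
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def resolve(term: str, existing_titles: Iterable[str]) -> Optional[str]:
    """Return the first existing title reachable by stripping diacritics."""
    titles = {unicodedata.normalize("NFC", t) for t in existing_titles}
    candidates = (
        unicodedata.normalize("NFC", term),  # exact match first
        strip_length_marks(term),            # then without macrons/breves
        strip_all_marks(term),               # then fully bare
    )
    for candidate in candidates:
        if candidate in titles:
            return candidate
    return None


if __name__ == "__main__":
    pasted = "\u03bd\u1fd1\u0301\u03ba\u03b7"  # nike with macron and acute
    entry = "\u03bd\u03af\u03ba\u03b7"         # nike with acute only
    print(resolve(pasted, [entry]))
```

Trying the least aggressive normalization first keeps distinct entries apart where an accent really does distinguish two words.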

A cool Pleco update for Cantonese

To Chinese editors: Pleco dictionary now has a new downloadable dictionary specifically for Cantonese terms - 20,000 entries. Such entries are marked as CCY. Previously, only terms that shared their forms with Mandarin were there. Funnily enough, these terms still have pinyin readings:

PY qú dì
ZY ㄑㄩˊ ㄉㄧˋ
JP keoi5 dei6

Please spread the news for Chinese editors. No need to go online to Sheik's dictionary any more, which should be larger, actually. --Anatoli T. (обсудить/вклад) 07:13, 23 November 2015 (UTC)

A def line for phrasal verbs

I would like to add a definition line to the main verb for each phrasal verb. I've made Template:phrasal verb for that purpose. You can see it in use at abide right now. I think this will make it easier for people to see that it's there -- if someone is looking for help understanding a sentence that uses "abide by", they won't know they should be looking up the phrasal verb or going down to "Related terms" to find what they're looking for. It might also help reduce the frequency with which people add phrasal verb definitions to the main verb form, since they don't realize that it is part of a different entry. What do you think? WurdSnatcher (talk) 21:43, 23 November 2015 (UTC)

Ave WurdSnatcher, nos correcturi te salutamus.​—msh210 (talk) 19:59, 25 November 2015 (UTC)

Your input requested on the proposed #FreeBassel banner campaign

This is a message regarding the proposed 2015 Free Bassel banner. Translations are available.

Hi everyone,

This is to inform all Wikimedia contributors that a straw poll seeking your involvement has just been started on Meta-Wiki.

As some of you might be aware, a small group of Wikimedia volunteers have proposed a banner campaign informing Wikipedia readers about the urgent situation of our fellow Wikipedian, open source software developer and Creative Commons activist, Bassel Khartabil. An exemplary banner and an explanatory page have now been prepared, and translated into about half a dozen languages by volunteer translators.

We are seeking your involvement to decide if the global Wikimedia community approves starting a banner campaign asking Wikipedia readers to call on the Syrian government to release Bassel from prison. We understand that a campaign like this would be unprecedented in Wikipedia's history, which is why we're seeking the widest possible consensus among the community.

Given Bassel's urgent situation and the resulting tight schedule, we ask everyone to get involved with the poll and the discussion to the widest possible extent, and to promote it among your communities as soon as possible.

(Apologies for writing in English; please kindly translate this message into your own language.)

Thank you for your participation!

Posted by the MediaWiki message delivery 21:47, 25 November 2015 (UTC)

About the active votes

These are the most recent votes that I created. All 3 of them were based on previous discussions and were announced before; I'm just repeating the announcement here for convenience/visibility/whatever.

  1. Wiktionary:Votes/2015-11/Namespace abbreviations -- I created it yesterday. Scheduled start date: Dec 1.
  2. Wiktionary:Votes/pl-2015-11/Short blocking policy -- I created it today. Scheduled start date: Dec 2.
  3. Wiktionary:Votes/2015-11/Language-specific rfi categories -- It started on Nov 22, you can vote on it now.

Also I didn't create this, but it is another recently-created vote which may be of interest:

  1. Wiktionary:Votes/bc-2015-11/User:Chuck Entz for bureaucrat -- You can vote here now, too.

On the opposite end, here are the votes that are closest to end, both on Nov 30:

  1. Wiktionary:Votes/pl-2015-09/Using macrons and breves for Ancient Greek in various places
  2. Wiktionary:Votes/pl-2015-10/Headword line

Also, I extended one vote by 1 month:

  1. Wiktionary:Votes/2015-10/Matched-pair naming format: left, space, right -- It currently has 100% support (5-0-0). It was going to end on Nov 22, but I felt uncomfortable closing the vote because only 5 people voted. That said, it's arguably a minor proposal. I guess I could have closed it but I'd rather wait a little more.

I haven't mentioned any active votes other than those that are starting now, those that are ending now, and the one that I extended. I'm leaving the vote box here, which should contain all the active votes. --Daniel Carrero (talk) 23:05, 25 November 2015 (UTC)

I don't think there was any reason to extend the matched-pair vote. --WikiTiki89 23:30, 25 November 2015 (UTC)
  • We don't have any precedent on enforcing a quorum that I know of. It might not be a bad idea; I might previously have thought of it as pointless bureaucracy given our low turnout, but the recent advances in vote visibility (and thus turnout) have made this feasible in protecting the democratic structure. —Μετάknowledgediscuss/deeds 00:18, 26 November 2015 (UTC)
    In some cases the issues involved in a vote presuppose technical or linguistic knowledge that may not be had by many. Would an abstention on the grounds of ignorance count for quorum purposes? If not, how else would the quorum be adjusted for such cases? DCDuring TALK 00:26, 26 November 2015 (UTC)
I support counting abstention votes for quorum purposes. --Daniel Carrero (talk) 00:32, 26 November 2015 (UTC)
Agreed. Benwing2 (talk) 05:24, 26 November 2015 (UTC)
I take it the quorum would be used as a rationale for extending votes? For example, we could have the rule that we should not close any votes with fewer than 10 voters; and that such votes should be extended by one month. To prevent an endless loop of extending particularly unpopular votes, however unlikely, we could also have the rule that a number of X months or Y extensions is the maximum allowed, after which the vote is closed as failed. --Daniel Carrero (talk) 06:04, 26 November 2015 (UTC)
I think the extension of the 5-participant vote was a very acceptable one. It is not clear that it was really necessary, but if the creator of the vote felt 5 was not good enough, I don't see a good reason to oppose extension. There certainly cannot be any claim of result fishing (or whatever the term is). --Dan Polansky (talk) 19:02, 29 November 2015 (UTC)

Quorum proposal:

  • If, by the end date, a vote has fewer than 10 participants, it should be extended by 1 month.
  • A vote can be extended 2 times using the rule above. After that, if the vote still has fewer than 10 participants by the end date, it should be closed as failed.

--Daniel Carrero (talk) 09:45, 29 November 2015 (UTC)

10 might be too many for some sorts of votes. For example, votes enabling bots are often 5-0 or 6-0. It seems like the size of the quorum might depend on how much opposition there is, and whether it's of the type that's potentially controversial (e.g. a policy vote) or usually non-controversial (e.g. a bot vote). Benwing2 (talk) 13:15, 29 November 2015 (UTC)
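
To make the quorum proposal above concrete, here is a small Python sketch of the decision it describes. The threshold of 10 and the limit of 2 extensions come from Daniel Carrero's proposal; the function name and the returned strings are made up for the illustration, and the sketch ignores Benwing2's point that the quorum might need to vary by vote type.

```python
QUORUM = 10         # minimum number of participants, per the proposal
MAX_EXTENSIONS = 2  # a vote may be extended at most this many times


def action_at_end_date(participants: int, extensions_used: int) -> str:
    """Decide what happens to a vote when its end date arrives."""
    if participants >= QUORUM:
        return "close normally"      # quorum met: count the votes as usual
    if extensions_used < MAX_EXTENSIONS:
        return "extend by 1 month"   # below quorum, extensions remain
    return "close as failed"         # below quorum, extensions exhausted


if __name__ == "__main__":
    # The matched-pair vote mentioned above had 5 participants at its
    # original end date and had not yet been extended.
    print(action_at_end_date(participants=5, extensions_used=0))
```

Under this rule a vote below quorum lasts at most two extra months before it fails, which bounds the "endless loop" of extensions mentioned earlier in the thread.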

Module errors on many pages (particularly Portuguese) regarding incorrect genders

I added a check to Module:gender and number to make sure someone didn't add some nonsensical gender like "masculine feminine" or "singular plural". These are two separate gender specifications, so they should be specified as such. Apparently, some editors have been making this mistake a lot, and as a result there are a lot of module errors. Can these be fixed? —CodeCat 00:56, 26 November 2015 (UTC)

It appears that all of these are the work of User:Ungoliant MMDCCLXIV, who has invented his own interpretation of gender codes, against years of consensus and common practice. "m-f" has never been a valid gender, so I don't know why this user started adding it to things without any discussion. The proper way to add multiple genders is, and has always been, to specify the second gender with another parameter, such as g=m|g2=f in {{head}} or {{l}}. —CodeCat 01:53, 26 November 2015 (UTC)

For the record, the hundreds of module errors are being caused by CodeCat’s edits to Module:gender and number, not by anything I did. — Ungoliant (falai) 01:57, 26 November 2015 (UTC)
All I did was add a check on incorrect genders. You provided the incorrect genders, and have apparently done so on a huge scale without any discussion. Or can you point me to the discussion where you got consensus for using "m-f" as a gender? —CodeCat 02:02, 26 November 2015 (UTC)
Can you point me to the discussion where you got consensus for filling Wiktionary with module errors where before we had something working properly that hadn’t gotten a single complaint despite years of use? — Ungoliant (falai) 02:07, 26 November 2015 (UTC)
There's nothing nonsensical about m-f or mf; it's just shorthand for both genders (see 父母, for a real-life example). Ungoliant isn't the only one who does this: I've cleaned up a number of instances over the past year or so (muttering under my breath the whole time). Since this isn't ambiguous, it should be allowed for in the code. Chuck Entz (talk) 04:22, 26 November 2015 (UTC)
I agree with CodeCat that you should use g=m|g2=f or similar in place of m-f, but I'd suggest having a bot clean these up. Benwing2 (talk) 05:23, 26 November 2015 (UTC)
g=m|g2=f means something different from m-f in Portuguese templates. This is the second time CodeCat has tried to unilaterally remove this distinction, even though it was discussed in WT:APT. — Ungoliant (falai) 12:17, 26 November 2015 (UTC)
What is the difference? I see WT:APT mentioning mf and morf but there's no mention of specifying multiple genders using different params. BTW one way to deal with this is to have the relevant template convert mf to g=m|g2=f underlyingly. Benwing2 (talk) 01:52, 27 November 2015 (UTC)
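The conversion suggested here could look roughly like the following Python sketch. The table `SHORTHAND` and the function name `expand_genders` are hypothetical, chosen for illustration; the real templates are written in wikitext and Lua.

```python
# Hypothetical mapping from shorthand codes to separate gender specifications.
SHORTHAND = {
    "mf": ["m", "f"],
    "m-f": ["m", "f"],
}


def expand_genders(genders):
    """Expand shorthand codes so that each list item names a single gender."""
    expanded = []
    for g in genders:
        # Codes not in the table (e.g. "m-p") are passed through unchanged.
        expanded.extend(SHORTHAND.get(g, [g]))
    return expanded
```

For example, `expand_genders(["m-f"])` gives `["m", "f"]`, the equivalent of g=m|g2=f. Note that a plain expansion like this cannot preserve any difference in meaning between the shorthand and separate parameters.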
m-f and mf are used for words that have multiple genders, with the gender used corresponding to the referent’s sex when necessary; multiple gender parameters (and morf, which became unnecessary after {{pt-noun}} was converted to Lua) are for words that have a single gender, but a different gender is used depending on formality, dialectal, chronolectal or idiosyncratic factors (i.e. gangue is masculine in Portugal and feminine in Brazil). — Ungoliant (falai) 02:06, 27 November 2015 (UTC)
That's not at all obvious, it's just a convention you invented. There are better and more intuitive ways to denote that the gender of the noun matches the gender of the referent. —CodeCat 02:16, 27 November 2015 (UTC)
That’s not an excuse to break the template. — Ungoliant (falai) 02:17, 27 November 2015 (UTC)
You're right, it isn't. So why did you? —CodeCat 02:19, 27 November 2015 (UTC)
What you've done is like deleting a widely-transcluded template without orphaning it. I don't care how hideous you think the status quo was, it wasn't as bad as the thousands of module errors we're faced with at the moment. This is bad for the function and the reputation of our site, and you have yet to say anything that justifies leaving it this way. Please explain why your edits shouldn't be reverted until the problem with the parameters is resolved. Chuck Entz (talk) 07:16, 27 November 2015 (UTC)
(edit conflict) It seems to me it's useful to have a difference between X and Y vs. X or Y for multiple genders, which the gender module doesn't support, but AFAIK the intended meaning of g=m|g2=f is X and Y. Maybe the gender module should have a way of supporting this distinction? Benwing2 (talk) 02:22, 27 November 2015 (UTC)
Maybe the way forward is to create individual gender modules for languages with exceptional considerations. I know of at least two Spanish words that have non-semantically varying gender: puente and samba. — Ungoliant (falai) 02:42, 27 November 2015 (UTC)
I think it would be better to have consistent usage of gender across languages whenever possible. The distinction of nouns that are multiple genders corresponding to different meanings and nouns that have varying gender by usage without a meaning distinction exists in many languages, probably in all languages with grammatical gender, and it would be good if the gender module supports this. BTW I take back what I said earlier about the intended meaning of g=m|g2=f, I think it's actually intended to cover both situations, or at least I can think of words that use it for both, e.g. in дереве́нщина ‎(derevénščina, yokel) it is used to indicate an epicene noun (m or f according to the semantic referent), whereas in طَرِيق ‎(ṭarīq, road) it indicates a noun whose gender can vary without meaning distinction. Maybe the gender module can be modified to support something like g=m-or-f and g=m-and-f (for Russian this could potentially be e.g. m-an-and-f-an-and-f-in, with more than two genders possible). Benwing2 (talk) 03:26, 27 November 2015 (UTC)
I'm just gonna butt in here and say that Hindi has several nouns that can be masculine and feminine based on context, and some that can be both depending on the speaker's choice. This is not just restricted to Portuguese. See Category:Hindi masculine and feminine nouns. Aryamanarora (talk) 17:57, 27 November 2015 (UTC)
It's now been 3.5 days and we still have 1,757 module errors. Benwing2 (talk) 13:17, 29 November 2015 (UTC)
The one thing we know is that none of those errors is the fault of any contributor to the module. DCDuring TALK 14:11, 29 November 2015 (UTC)
  • Why does the OP not undo the changes to the checking code, given there is no consensus for them? --Dan Polansky (talk) 18:57, 29 November 2015 (UTC)
  • I request that Module:gender and number is unprotected so that autoconfirmed editors can edit the page. --Dan Polansky (talk) 19:07, 29 November 2015 (UTC)
    • Done. I unprotected the module, feel free to fix things. Probably this will get protected again in the near future as a widely used module, but for now unprotecting is for the best, I believe. --Daniel Carrero (talk) 22:41, 29 November 2015 (UTC)
      • I reverted CodeCat's edits, so Category:Pages with module errors should be emptied once the server finishes catching up with that. I plan to protect the module again later, if that's okay. --Daniel Carrero (talk) 20:25, 1 December 2015 (UTC)
        • But what about the problems? —CodeCat 21:01, 1 December 2015 (UTC)
          • If you refer to the widespread usage of m-f (and mf) in Portuguese, there's no consensus for changing that to something else in the current discussion, and a number of people here expected the changes to Module:gender and number to be reverted, so I believe I did the right thing. --Daniel Carrero (talk) 21:22, 1 December 2015 (UTC)
            • I think a vote should be made to decide whether a separate notation for epicene nouns is desirable, and what notation is preferred for this. I don't mind having a special notation, but "m-f" most certainly isn't it; it's too easily confused with "m|f" (since for every language but Portuguese, "m|f" is used for epicene nouns). It's also not obvious that "m-f" or the resulting output indicates epicene nouns in particular. I also don't like the idea of having two different gender specifiers in one gender; it's logically contradictory. For all other combinations, we never had any problem specifying two genders as, well, two genders. —CodeCat 21:32, 1 December 2015 (UTC)
              • Above, I suggested improvements to the gender/number module that would support the distinction that m-f vs. m|f for Portuguese is trying to express, approximately between epicene and varying-gender nouns, where in the former the different genders correspond to semantic differences and in the latter they don't. (There is also the case of non-epicene nouns where different genders correspond to semantic differences, e.g. le tour "the turn" vs. la tour "the tower", but these probably should always be given separate headers.) What do you think of these? Benwing2 (talk) 00:33, 2 December 2015 (UTC)
                • The hyphen separates different parts of a gender specification, so your idea won't work. I think a new gender tag like epi would be much better, as it's clearly distinct from m|f. —CodeCat 00:53, 2 December 2015 (UTC)
                • Also, something to take note of is that languages may have nouns that are only epicene in some of their inflected forms. To say it another way, the two genders may share a lemma but have different inflections. Italian is an example I could name. This suggests that epicene nouns are really two nouns with different meanings, but (perhaps only partially) overlapping inflections, and we just happen to lump them together for convenience. —CodeCat 00:56, 2 December 2015 (UTC)

About deleting l/en, l/la, l/de and others

In some RFDO discussions dating back to April 2015 (which are still open), there seems to be some consensus towards deleting templates of the form Template:l/de, Template:l/la, Template:l/en.

See discussions:

I am bringing this up on the BP because that is quite a big project: there are a few dozen templates like this. Most seem to be superfluous to Template:l. You could type {{l|en|buzzard}} and not {{l/en|buzzard}}. These are the templates I'd like to delete.

Some of these templates have actual language-specific purposes: this includes {{l/he}}, {{ja-l}} and {{ko-l}}, which I propose to be kept even if all others are deleted (but I'd rather move Template:l/he to Template:he-l and leave a redirect, for naming consistency). --Daniel Carrero (talk) 02:57, 26 November 2015 (UTC)

My impression from the old RFDOs and RFMs is that there is already consensus for all of what you propose. —Μετάknowledgediscuss/deeds 03:31, 26 November 2015 (UTC)
I think I'm going to create a vote: "Allow bots to change l/la into l|la" and list the specific templates that would be deleted. Even if we have consensus to do that, this involves editing probably thousands of pages in a repeated fashion, so it is clearly bot work. --Daniel Carrero (talk) 06:30, 26 November 2015 (UTC)
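A bot pass of this kind might be sketched in Python as below. This is only an illustration under stated assumptions: the function name `replace_l_slash` and the `KEEP` set are hypothetical, the pattern assumes language codes made of lowercase letters and hyphens, and it assumes every parameter after the language code can be carried over to {{l}} unchanged.

```python
import re

# Templates with language-specific behaviour that are proposed to be kept.
KEEP = {"he"}

# Matches the opening of a call such as "{{l/en|" and captures the language code.
L_SLASH = re.compile(r"\{\{l/([a-z-]+)\|")


def replace_l_slash(wikitext: str) -> str:
    """Rewrite {{l/xx|...}} as {{l|xx|...}}, skipping the codes listed in KEEP."""
    def repl(match):
        code = match.group(1)
        if code in KEEP:
            return match.group(0)  # leave language-specific templates alone
        return "{{l|" + code + "|"
    return L_SLASH.sub(repl, wikitext)
```

With this sketch, `{{l/en|buzzard}}` becomes `{{l|en|buzzard}}`, while `{{l/he|...}}` is left untouched.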
Aren't {{l/en}} etc. meant to reduce page size? That's what I've seen people saying. Aryamanarora (talk) 17:53, 27 November 2015 (UTC)
{{l/en}} does not add anything over {{l|en}}, it just adds more indirection and confusion. The only argument I can think of in favour of {{l/XX}} is readability. In most fonts l (L) and | (pipe) look very similar, but you can just switch on a monospaced font in the editor. Don't know why this is not on by default. Jberkel (talk) 01:18, 28 November 2015 (UTC)
Also, the system of l/XX requires the additional work of creating the language-specific templates. Sometimes, the template l/XX exists for a language; for most languages, it doesn't.
Case in point: Appendix:Proto-Germanic/hirdijaz has descendants in 15 languages: 9 are linked using {{l}}, 6 are linked using l/XX templates. This is inconsistent. I assume that the process of editing that page involved the avoidable hassle of having to check whether each specific template exists. --Daniel Carrero (talk) 14:58, 29 November 2015 (UTC)
Another argument: at WT:RFDO#Template:l/la, it has been pointed out that Template:l/la lacks some functionality of {{l|la}}, so rather than being an equivalent template, it is inferior to it. So, we can't take for granted that all l/XX templates have all the desired functionality. --Daniel Carrero (talk) 15:03, 29 November 2015 (UTC)

I created Wiktionary:Votes/2015-12/Deleting: l/en, l/la, l/de, etc.. --Daniel Carrero (talk) 04:41, 1 December 2015 (UTC)

Is it right to vote on this? I mean we already have procedure for deleting templates, why bypass it? Renard Migrant (talk) 16:19, 1 December 2015 (UTC)
@Renard Migrant All these templates passed RFDO already (to be exact, their discussions weren't closed yet, but they are months old and the consensus is clear that they should be deleted).
I assume we need a vote at least for the purpose of letting a bot fix all the entries, don't you think so? Also "Template:l/xx" are still widely used, and last time I checked, there were some people still using it in new entries (like Stääd today and subcontraoctave), so I think not everyone got the memo yet.
Also, maybe it's not a bad idea sometimes creating votes when deleting widely used templates: if someone had a freaking great reason to delete {{en-noun}}, I'd expect it to be voted, rather than RFDO'd. But that's just my opinion, these are probably less used than {{en-noun}}. --Daniel Carrero (talk) 17:24, 1 December 2015 (UTC)
We've never needed a vote before for a bot to orphan and delete RFDO'd templates. --WikiTiki89 18:08, 1 December 2015 (UTC)
I don't know about "needing", but I already gave my arguments concerning why I think creating that vote is okay. @Wikitiki89, I found some votes in which it was proposed orphaning and deleting some specific templates.
--Daniel Carrero (talk) 21:44, 1 December 2015 (UTC)
Ok, I'll go through them one by one:
I don't think that in this case any of the potential precedents above turn out to be applicable. --WikiTiki89 22:19, 1 December 2015 (UTC)
What do you think of Wiktionary:Votes/2010-10/Deleting Wikisaurus slash-more pages?
Also I'd like to have the ability to create a new vote even in the case of having no precedent for it. If you disagree with the vote, you can vote oppose on principle.
Note: at Wiktionary:Beer parlour/2015/June#The Index namespace, I promised I would create a vote for deleting most of the index namespace, but I'm late for that. I still plan to create such a vote. (as opposed to a RFDO for that purpose) The way I planned it, I am in the process of seeing the indexes of all languages to see where they would fit on the vote. This part has been taking some time; I never got around to finishing it. --Daniel Carrero (talk) 00:30, 2 December 2015 (UTC)
But templates are different from content pages. Deleting a whole namespace of pages or a whole set of "Wikisaurus slash-more pages" is deleting content, which is a much bigger issue than changing markup, despite the fact that it involves deleting nearly 100 separate templates, since content-wise this change is invisible. See also Dan Polansky's comment at Wikisaurus talk:penis/more#Deletion debate, particularly that "There has been a long-standing precedent to keep "/more" pages in Wikisaurus." The {{l/en}} templates used to serve a useful purpose, but they do not anymore. --WikiTiki89 01:08, 2 December 2015 (UTC)
Ok, fine. I suppose there's no problem if I delete the vote and start orphaning some of the l/ templates myself? --Daniel Carrero (talk) 03:49, 2 December 2015 (UTC)
Go ahead. Benwing2 (talk) 03:56, 2 December 2015 (UTC)

term cleanup in user pages and discussion pages

@Angr, CodeCat, Metaknowledge, DTLHS, Jberkel, Equinox, Benwing2:

As per User:Daniel Carrero/term cleanup, I was hired by @Angr to add lang= to all instances of {{term}} as a paid job.

There's also a newly-created vote (Wiktionary:Votes/2015-11/term → m; context → label) which proposes to convert automatically all instances of {{term}} with langcode into {{m}}.
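For illustration, the automatic conversion proposed in that vote could be sketched in Python roughly as follows. The function name `term_to_m` is hypothetical, and the sketch assumes that the positional parameters of {{term}} line up with those of {{m}} after the language code. It deliberately skips any call that contains nested templates or wikilinks, and any call without lang=, so those would still need manual cleanup.

```python
import re

# Only matches {{term|...}} calls with no nested braces or square brackets.
TERM = re.compile(r"\{\{term\|([^{}\[\]]*)\}\}")


def term_to_m(wikitext: str) -> str:
    """Convert {{term|word|lang=xx}} to {{m|xx|word}} where a language code is given."""
    def repl(match):
        lang = None
        rest = []
        for param in match.group(1).split("|"):
            if param.startswith("lang="):
                lang = param[len("lang="):].strip()
            else:
                rest.append(param)
        if not lang:
            return match.group(0)  # no language code: leave for manual cleanup
        return "{{m|" + "|".join([lang] + rest) + "}}"
    return TERM.sub(repl, wikitext)
```

Calls without a language code are returned unchanged, which is why the language codes have to be added first.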

I've been doing it on the entry namespace only. (also Help:Misspellings) But we have cleanup categories for other namespaces, too:

Is it okay if I edit others' user pages and discussions to add the language code?

Editing these pages would be a step in the direction of orphaning and probably deleting {{term}}. If these pages still use it, then the template can't be deleted. If the template is deleted, then it's going to break all pages that use it.

(P.S.: I edited this page and readded my signature a few times to add more people to the ping list. Sorry if the notification appeared multiple times to you, I didn't mean to spam.) --Daniel Carrero (talk) 15:32, 26 November 2015 (UTC)

The pinging didn't work. Anyway, wouldn't there be cases where people were intentionally showing the difference between two templates? We might not want to make the historical archives harder to read. —Μετάknowledgediscuss/deeds 17:22, 26 November 2015 (UTC)
History, shmistory. That argument has never stopped such an effort before, even though in many ways it trashes one of the key ideas of a wiki. DCDuring TALK 20:04, 26 November 2015 (UTC)
As I recall, I hired you (Daniel) to clean up Category:term cleanup and all its subcategories, i.e. including the ones for other namespaces. I don't really care whether {{term}} is deleted or not; if it's kept, then once Category:term cleanup and all subcategories are cleared, I'd prefer {{term}} to output an error message if lang= isn't present. That way Category:term cleanup won't fill up again. —Aɴɢʀ (talk) 21:15, 26 November 2015 (UTC)
I didn't get your ping either, but it's fine with me if you edit other people's user pages to fix this up. I've done this before when making incompatible changes to templates and I think it's more polite to do it than just leave the pages broken, even if many of these pages are outdated. Benwing2 (talk) 01:44, 27 November 2015 (UTC)
Honestly seems like a waste of time. How is this really helping the dictionary in any significant way? Can't you add missing Portuguese words instead? Equinox 11:00, 27 November 2015 (UTC)
I am deeply concerned that someone is getting paid to do the kind of work everyone here does on a volunteer basis. No one's contributions are more valuable than anyone else's. This is completely unacceptable. -Cloudcuckoolander (talk) 11:45, 27 November 2015 (UTC)
User:Dan Polansky questioned this project before, at Wiktionary:Grease pit/2015/November#Suggestion: "term -> m" by bot. There, some people supported the proposal of migrating {{term}} to {{m}}; the latter requires a language code so I'm adding the codes. In the earlier discussion Wiktionary:Beer parlour/2013/April#Template term and lang parameter, some people supported the proposal of making langcodes mandatory for {{term}}.
Advantages of having the language code that were already mentioned before include: all text without langcode is assumed English apparently, and uses the script "None", so using the right langcode would result in proper formatting from both MediaWiki:Common.css and Special:MyPage/common.css. The orange links gadget only works with a language code; also, if we can convert all instances of {{term}} into {{m}}, then the code would look more consistent, because both are often used side-by-side but they are basically the same template under different names. --Daniel Carrero (talk) 11:49, 27 November 2015 (UTC)
I have little interest in contributing to a project that does not value quality contributions from all editors equally. Either we all get paid, or our only "reward" for editing is knowing we've contributed to the body of free knowledge. I know the former is infeasible, so the only solution is to ban paid editing on this project. I will not contribute any more until it is. -Cloudcuckoolander (talk) 12:05, 27 November 2015 (UTC)
I suggested doing cleanup work as a paid job at Wiktionary:Beer parlour/2015/October#Boring cleanup work for money. This was based on the earlier discussion Wiktionary:Beer parlour/2012/July#Reward or bounty board, in which someone said "I see no reason why a person doing boring cleanup work should not be paid with money if someone offers that money." and "appropriately clean up Category:Translation table header lacks gloss" is mentioned as one possibility.
According to meta:Terms of use/Paid contributions amendment, Wikimedia allows paid contributions on the condition that they are publicly disclosed, which I did: it's no secret to anyone that I'm doing term cleanup for money.
Anyone can do the same job. Other people are converting {{term}} to {{m}} basically every day, but it takes a ton of time to do the whole job of manually emptying a cleanup category with 20,000+ entries, so I offered to make it my personal project. It's not that the work of person A is more valuable than the work of person B, I think it's more about the possibility of offering money for choosing what exactly person B is going to do with their time, as long as the community is okay with it. When I first opened the discussion, I even said, basically: "I need money. What do you want me to do?" The current project was not originally my idea, it was open for the community to choose something if they wanted. --Daniel Carrero (talk) 12:16, 27 November 2015 (UTC)
Paid editing is completely antithetical to a volunteer project aimed at creating a free repository of knowledge. If we were going to hire an expert to create entries for an endangered language, that would be one thing. There probably aren't a lot of, say, native Ainu speakers hanging around on Wikimedia projects, so entries that do get created might be unreliable. Hiring an expert to create reliable entries would be a reasonable solution in that case. But paying regular editors to do regular work is not reasonable. It's creating a two-tier system where some contributions are valued more than others. Why is the work you're doing considered "boring" enough to warrant monetary compensation? Finding citations, formatting citations, creating requested entries -- these are all time-consuming and sometimes tedious tasks I regularly do. I don't get paid for any of it, and I don't expect to. The point is that most of the work necessary to build and maintain a wiki is time-consuming and tedious. If you want to motivate people to take on tasks no one seems willing to do, find another way. This is counterproductive and wrong. -Cloudcuckoolander (talk) 12:51, 27 November 2015 (UTC)
Not that I'm completely happy with it, but what we're talking about here is an arrangement between two editors, not any kind of action by Wiktionary or the community. "The project" isn't assigning a different value, Angr is. As for a "two-tiered" system: it's not cold hard cash, but the system has a provision for thanking other contributors for their edits, and we certainly don't restrict people from offering verbal support for some things, but not for others. Daniel does need to be careful about avoiding conflict of interest in his actions as an admin in relation to this, and we need to keep it in mind in judging his votes and other actions as a community member on related matters, though. I probably would be a lot more concerned if he wasn't already active and contributing without this. Chuck Entz (talk) 22:22, 27 November 2015 (UTC)
Wikis are founded upon the principle of voluntary contribution. The idea is that people freely volunteer their time, energy, and expertise for the common cause of creating a source of free knowledge. Building and maintaining wikis is often a time-consuming, tedious process, but we're all supposed to roll up our sleeves and do our part for the common good. Editors are allowed to decide the quantity and type of work they do. No one is obligated to do things they do not want to do. But they're supposed to do this work freely, knowing their only "reward" is contributing to the body of free knowledge, and the satisfaction of a job well done. When a wiki allows editors to accept payment in exchange for doing regular wiki-work, it's a wholesale abandonment of the most central value – voluntary contribution – upon which a wiki operates, which inevitably alters the terms of engagement for all other editors. And it does create a two-tier system in which it's okay for editors to accept bribes to perform certain tasks deemed so dull that no one could possibly be willing to do them otherwise, but everyone else is expected to continue performing the equally time-consuming, equally vital work they do for the wiki out of the kindness of their hearts. It's undervaluing the work of the wiki's volunteers and taking them for granted. And do you know what being undervalued and taken for granted usually does to a volunteer? It quickly extinguishes whatever motivation they have to volunteer, as it has done in me. I think people are being myopic in how they're looking at this issue. They're seeing it as a simple, clean solution to the very real problem of this wiki's backlog, but is making a dent in the backlog worth the long-term cost of throwing out the central value of wiki culture? It's a Pandora's box that never should have been opened, and needs to be closed before any more damage is done. -Cloudcuckoolander (talk) 18:34, 28 November 2015 (UTC)
@Daniel Carrero If you want another project, I have a list of over 200 sets of entries that need to be merged (that is, the entries are alternative forms of one another but have full entries). But you’ll have to convince someone else to provide the dough as I’m also unemployed lol! — Ungoliant (falai) 12:30, 27 November 2015 (UTC)
I'll do it for $10 (dollars) or R$40 (reais) through this PayPal link. :) Also I don't mind if I'm opening a "market" for paid jobs on Wiktionary -- someone else might see my message and offer R$30 to get the job. --Daniel Carrero (talk) 12:53, 27 November 2015 (UTC)
That said, (like anyone, I suppose) I've been known to do stuff for free as a favor, when people ask me to. The term cleanup is the only project I am doing, or have done in the past, that is an exception to that.--Daniel Carrero (talk) 22:45, 27 November 2015 (UTC)
I don't see a problem with editing user pages, if an explanatory comment is left in the summary. Maybe it should be done as a last step, just before deleting {{term}}. Which I think should be done, if we keep both {{m}} and {{term}} around there's more confusion on which template to use.
Regarding the payment discussion – I've also contributed money to Daniel's project because I think it is useful, although maybe not in an immediately obvious way. It's sort of long term data hygiene. I really don't see how anyone could object to this, you can't compare this situation to the paid-edit discussion on Wikipedia. We're talking about boring cleanup work, which is almost automatable work but not quite. There's nothing controversial about it, and I'm glad somebody is focussing on it, because it just wouldn't get done otherwise. Jberkel (talk) 01:00, 28 November 2015 (UTC)
I don't personally have any problem with Daniel getting paid (not a whole lot, it seems), esp. for doing cleanup work of the sort that's useful but people often don't want to do. Benwing2 (talk) 07:52, 28 November 2015 (UTC)

When is a plural not a plural?

<Sorry for the chemistry context - probably more general than that.> The word tripalmitins has recently been added, with the definition of "plural of tripalmitin". Now tripalmitin is a specific organic compound that doesn't really have a plural. But the plural form is easily attestable, in sentences such as "Thin layer chromatography of saturated lipids in the fat indicated the presence of mono, di and tripalmitins." The author here means "the presence of monopalmitin, dipalmitin and tripalmitin" and has chosen to put the "s" on the last member of a list to signify all members of the list. So, is the term (as used in this example) a plural? If not, what is it? SemperBlotto (talk) 08:52, 27 November 2015 (UTC)

I'd say it's really "mono-, di- and tri-" + "palmitins", like how "hydrochloric and sulphuric acids" isn't really "hydrochloric and" + "sulphuric acids" but "hydrochloric and sulphuric" + "acids". "tripalmitins" itself is not a word in that sentence, any more than "mono" or "di" are. I'd say this is something that we should exclude on a commonsense basis, just as we'd exclude "thes" as the plural of the word "the" ("There were seven thes on the page"). Smurrayinchester (talk) 10:53, 27 November 2015 (UTC)
I saw examples other than that kind. There are some that are just "tripalmitins" alone. Equinox 10:58, 27 November 2015 (UTC)
The only other examples I can see are variants on "labeled tripalmitins" – these would be different tripalmitin molecules which have been synthesized with radioactive isotopes (carbon 14) in order to allow the otherwise indistinguishable tripalmitin samples to be told apart. It's a bit like orange juice – uncountable, but you could say "South American orange juices contained more vitamin C than North American ones" if you did a scientific study that artificially categorized the juices. Smurrayinchester (talk) 12:03, 27 November 2015 (UTC)
Yes, I'm not questioning its existence as a real plural (in specialised usage); I just thought there might be an aspect of grammar that I wasn't familiar with. SemperBlotto (talk) 12:07, 27 November 2015 (UTC)
To me it seems like a result of the deepening of analytical knowledge. Someone defines something that is unique in a domain or context. Then someone looks into it further and discovers or invents variations, rendering the plural necessary to encompass the variation, at least in some contexts. For our purposes we should just have a plural without a lot of explanation. DCDuring TALK 16:25, 27 November 2015 (UTC)
I agree that we have to weed out things like "mono-, di- and tri-palmitins" and e.g. "myristic and palmitic acids", per Smurray; compare Talk:Asperger's syndromes. (But, on the subject of palmitic acids, I can find 2012, Osamu Hayaishi, Molecular Mechanisms Of Oxygen Activation, ISBN 0323143261, page 45: "The mechanism in the formation of a-hydroxy acids in the leaf system has been studied with palmitic acids stereospecifically labeled with tritium at C-2 and C-3 (80, 81).")
On the other hand, if uses like "labeled tripalmitins" are attested, then I think it would be reasonable to have tripalmitins as a plural of tripalmitin. If we wanted to, we could expand the definition of "tripalmitin" by adding to the end something like "...; an instance of this triglyceride", but I don't think it would be necessary; happiness lists happinesses as its plural without bothering to expand its definition-line, and likewise other words affected by the routine phenomenon of pluralization of things that might be argued to be conceptually unpluralizable, such as (all attested on Google Books) "oxygens", "uraniums", "carbon monoxides", "angers", "Jewishnesses", etc. - -sche (discuss) 18:12, 27 November 2015 (UTC)

Category:English proper noun plural forms

Given that before we just had Category:English plurals, shouldn't we now start discriminating between noun plurals and proper noun plurals? Plenty of proper nouns have them, like given names and surnames: Steves, Stephens, Matthews, Dianes and so on. Renard Migrant (talk) 13:05, 27 November 2015 (UTC)

They're not really any different in nature from regular noun plural forms, so I don't think a distinction is really useful. —CodeCat 16:27, 27 November 2015 (UTC)
Wiktionary:Votes/pl-2011-12/Merging proper nouns into nouns, a vote to merge proper nouns into nouns, failed. Wiktionary:Requests for moves, mergers and splits#Merge_Category:.28language.29_proper_noun_forms_into_Category:.28language.29_noun_forms showed (!vote-level, by which I mean 2/3rds) consensus not to merge "proper noun forms" into "noun forms". So, yes, "proper noun plural forms" should be separate from "noun plural forms", unless someone can demonstrate enough consensus to merge them to overrule the consensuses I just linked to. - -sche (discuss) 17:39, 27 November 2015 (UTC)

New Latin vs. Translingual

At the entry lycaenid, there is "From New Latin Lycaenidae." in the etymology. I added the code mul to the term, under the assumption that Lycaenidae won't ever have a Latin section, just the Translingual section. I didn't change the "New Latin" part. I've seen some entries saying "From Translingual blablabla.", but "New Latin" is more specific than "Translingual", so I wouldn't want to remove that information from the entry. What I did here is probably what I'd do for other entries, so feel free to suggest if I should do anything different. --Daniel Carrero (talk) 13:16, 27 November 2015 (UTC)

No, I think that's what I'd do too. Aryamanarora (talk) 17:51, 27 November 2015 (UTC)
If the etymon is clearly a taxonomic name, as Lycaenidae is, we should call it Translingual, even were it pre-Linnaean. DCDuring TALK 20:29, 27 November 2015 (UTC)
From New Latin is fine; just use the language code mul with it. Renard Migrant (talk) 20:42, 27 November 2015 (UTC)

Alternative forms: chains and cycles

User:DTLHS/cleanup/alt form chains contains cycles or chains (of length >= 2) of alternative forms. Some of these might be legitimate. DTLHS (talk) 05:32, 29 November 2015 (UTC)
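How such a list might be produced is not stated; one possible approach is sketched below in Python. It assumes a hypothetical dictionary `alt_of` that maps each entry to the single entry it is given as an alternative form of, and it reports maximal chains of two or more links separately from cycles (including an entry that links to itself).

```python
def find_chains_and_cycles(alt_of):
    """Return (chains, cycles) found by following alternative-form links."""
    cycles = set()
    chains = []
    targets = set(alt_of.values())
    for start in alt_of:
        path = [start]
        current = start
        while current in alt_of:
            current = alt_of[current]
            if current in path:
                # We came back to an entry already visited: record the loop,
                # rotated to start at its smallest member so duplicates merge.
                cycle = path[path.index(current):]
                pivot = cycle.index(min(cycle))
                cycles.add(tuple(cycle[pivot:] + cycle[:pivot]))
                break
            path.append(current)
        else:
            # The walk ended at an entry that is not itself an alternative form.
            # Keep only maximal chains (nothing points at the start) with 2+ links.
            if start not in targets and len(path) > 2:
                chains.append(path)
    return chains, sorted(cycles)
```

A typo such as an entry pointing at itself shows up as a one-member cycle. Whether a reported chain or cycle is legitimate still has to be judged by hand.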

Thanks for doing this. Reminds me of an old copy of Encyclopedia Americana we had when I was a kid ... I looked up "rabies" and it said "see hydrophobia". So I looked up "hydrophobia" and guess what it said?
The cycles seem clearly wrong; the chains less obviously but I suspect a bot should clean them up, akin to removing double redirects in Wikipedia. Benwing2 (talk) 07:20, 29 November 2015 (UTC)
Thanks, this is useful. Some chains and even some cycles are legitimate (see e.g. sea-purse vs sea puss, two etymologically distinct terms, each most common with one spelling but also spellable the other way). Other cycles are the result of typos, e.g. sope, which had linked to itself. - -sche (discuss) 08:47, 29 November 2015 (UTC)
Are you sure about sea-purse vs. sea puss? It looks like 'sea purse' has two meanings, one of which is a rare alternative of 'sea puss', but it's less obvious to me that the meaning "egg case" has 'sea puss' as an alternative; if so, it should indicate this using an "Alternative forms" header. Benwing2 (talk) 09:02, 29 November 2015 (UTC)
Google is constantly digitizing more old books, and new books, so it's possible the situation now is different than when I created the entries in 2012 — but at that time I found that skate egg cases were usually called "sea-purses", but sometimes by confusion with the other term "sea pusses", whereas rip currents were usually called "sea pusses" (and channels through sandbars were often "sepooses"), but sometimes by confusion with the other term "sea-purses". Looking at it now, I'll add glosses to make that (hopefully) easier to follow. - -sche (discuss) 09:25, 29 November 2015 (UTC)
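
For readers who want to reproduce this kind of report, here is a minimal sketch of how chains and cycles could be detected. It is not the script behind User:DTLHS/cleanup/alt form chains. It assumes each entry points to at most one "alternative form of" target, and the function name `find_chains_and_cycles` and the `form-a`/`form-b`/`form-c` entries are made up for illustration.

```python
from typing import Dict, List, Tuple


def find_chains_and_cycles(alt_of: Dict[str, str]) -> Tuple[List[List[str]], List[List[str]]]:
    """Follow 'alternative form of' links starting from every entry.

    A chain is a path of two or more links that ends at an entry which is
    not itself an alternative form. A cycle is a path that returns to an
    entry already on the path (a self-link counts as a cycle of length 1).
    """
    chains: List[List[str]] = []
    cycles: List[List[str]] = []
    seen_cycles = set()  # avoid reporting the same cycle once per member
    for start in alt_of:
        path = [start]
        current = start
        while current in alt_of:
            current = alt_of[current]
            if current in path:
                cycle = path[path.index(current):]
                key = frozenset(cycle)
                if key not in seen_cycles:
                    seen_cycles.add(key)
                    cycles.append(cycle)
                break
            path.append(current)
        else:
            # Loop ended without hitting a cycle: path ends at a lemma.
            if len(path) >= 3:  # three entries means two or more links
                chains.append(path)
    return chains, cycles


# Toy data only; these are not the actual contents of any entry.
links = {
    "sope": "sope",            # self-link, like the typo case mentioned above
    "rabies": "hydrophobia",   # the encyclopedia anecdote as a two-entry cycle
    "hydrophobia": "rabies",
    "form-c": "form-b",        # made-up chain: form-c -> form-b -> form-a
    "form-b": "form-a",
}
chains, cycles = find_chains_and_cycles(links)
```

A report like this only flags candidates. As noted above, some chains and cycles are legitimate, so the output would still need human review.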

Wiktionary:Votes/2015-11/Fix tense: start/end in votes

FYI: Created Wiktionary:Votes/2015-11/Fix tense: start/end in votes. --Daniel Carrero (talk) 07:42, 29 November 2015 (UTC)

Wiktionary:Votes/2015-11/Namespace abbreviations

I delayed the start of this vote a little; it's going to start in 4 days (on Dec 4, 2015).

Please check to see if you agree with the list of namespace abbreviations before the vote starts. You can suggest abbreviations for other namespaces, you can change the proposed abbreviations, you can edit the vote, etc. --Daniel Carrero (talk) 04:17, 30 November 2015 (UTC)

Regarding sections such as "Derived terms" and "Hyponyms"

There are sections such as "Derived terms" and "Hyponyms".
There should also be the possibility to have a section like "Derived hyponyms". See for example the German word Lehrer: Many words are both derived terms and hyponyms, and listing them twice is annoying and redundant.
"Derived synonyms" and "Derived antonyms" could be possible too, but as there should only be a few derived synonyms, if there are any, this shouldn't be needed. E.g. Versfuß is a derived synonym for one meaning of Fuß, but there aren't dozens of derived synonyms.

If they are both hyponyms and derived terms, then list them under both, since they are both. —CodeCat 22:45, 30 November 2015 (UTC)
Having numerous header combinations like "Derived hyponyms", "Derived synonyms", "Related antonyms", etc. would be equally (or more) annoying and redundant. Equinox 23:02, 30 November 2015 (UTC)
Hyponyms (and other semantic relations), which appear above Derived terms, thereby have some kind of priority IMO. Thus, I usually have any terms that could be inserted under both headings only under the Hyponyms heading. In any event I normally view Derived terms as less important than the semantic relations. They could probably be added by an automated process, whereas the semantic relations cannot (yet) be so added. DCDuring TALK 23:49, 30 November 2015 (UTC)
My feeling is that, here, derived terms are more important than semantic relations such as hyponyms, which are more encyclopedic than linguistic, and better added to thesaurus pages. A few may be useful in the main page, but long lists (tens or hundreds of words) should be in thesaurus pages: they relate to the meaning, and are shared by all synonyms. On the other hand, a long list of derived words is a good thing in the lemma page, because it's linguistic information about the word. Lmaltier (talk) 21:10, 1 December 2015 (UTC)
I agree, we don't have to analyse all the semantic relations in intricate detail. Synonyms and antonyms are enough. —CodeCat 21:12, 1 December 2015 (UTC)

German declension table templates

  1. See for example Wikiwörterbuch and Konfix. The dative forms ending in -e should not be attestable, so the declension template creates wrong forms.
    Maybe it would be better to have a parameter with three options: omit the dative -e, add the dative -e, or add the dative -e together with a note.
  2. Maybe it would be better to shorten "see notes". E.g. it could simply be "?" with a link, similar to en.wikipedia.org/wiki/Template:Nihongo . Maybe "i" (for information) or an icon with "i" instead of "?" could be better though.
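
To make the suggestion in point 1 concrete, here is a rough sketch of a three-way parameter, written in Python rather than template code. Everything in it is hypothetical: the function name `dative_singular`, the parameter name `dative_e`, its values, and the wording of the note are illustrations only, not the behaviour of any existing declension template.

```python
from typing import List, Optional, Tuple

# Hypothetical note text; the real wording would be up to the template editors.
DATIVE_E_NOTE = "The dative singular in -e may be rare or unattested for this word."


def dative_singular(nominative: str, dative_e: str = "omit") -> Tuple[List[str], Optional[str]]:
    """Return (dative singular forms, optional note) for one noun.

    dative_e="omit": only the form without -e
    dative_e="add":  both forms, no note
    dative_e="note": both forms, plus a note attached to the -e form
    """
    if dative_e == "omit":
        return [nominative], None
    if dative_e == "add":
        return [nominative, nominative + "e"], None
    if dative_e == "note":
        return [nominative, nominative + "e"], DATIVE_E_NOTE
    raise ValueError(f"unknown dative_e value: {dative_e!r}")


# With "omit", the questionable form is simply not generated.
forms, note = dative_singular("Konfix", dative_e="omit")
```

Whether "omit" should be the default is a separate question; it is chosen here only so that unattestable forms are not generated unless an editor asks for them.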

December 2015

Wiktionary:Votes/2015-12/Deleting: l/en, l/la, l/de, etc.

FYI: I created a new vote, linked directly above in the title. --Daniel Carrero (talk) 04:42, 1 December 2015 (UTC)

Mathematical entries with semi-equivalent senses

This sort of came up at the RFD of vector. We have quite a few entries for mathematical objects (I'm going to pick on tensor as an example, but algebra probably has it far worse, especially the cop-out at sense 7!) with big piles of senses that are sort-of-equivalent. These generally have a high-school-maths-layperson definition at one end of the difficulty spectrum (for tensor, "A mathematical object consisting of a set of components with n indices each of which range from 1 to m where n is the rank and m is the dimension of the tensor.", which isn't too hard to understand if you think of a tensor as looking like an n×m spreadsheet, but doesn't describe all the crazy sorts of tensors that pure mathematicians have dreamed up) and an abstract but precisely defined definition at the other ("An image of a tuple under a tensor product map." which covers all(?) eventualities but tells you absolutely nothing about what a tensor is like, like defining gold as "What comes out of a gold mine"). Currently, I think they're a confusing mess, but my suggestion at the RFD of vector to arrange them as subsenses of a general sense wasn't too popular either. How should these sorts of entries be treated? Smurrayinchester (talk) 11:11, 1 December 2015 (UTC)

I have a related issue with many terms that have a "true" definition that is scientifically accurate but also has little cognitive connection to the word as used in everyday life. [[iron]] comes to mind. The chemical-context definition does not have much connection even to the metal-merchants' definition, let alone that of the man on the street.
For mathematical terms, differentiation by level of abstraction seems appropriate. The high-level-of-abstraction definitions require a more specialized vocabulary, and their readership can grasp it.
I do think that the sense-subsense approach is useful to remind readers of the underlying unity of the mathematical senses. The sense is a good place for the topical heading "mathematics"; the subsenses would merit usage context labels like "elementary", "formal", or whatever actually reflects the differences in the user population. DCDuring TALK 17:19, 1 December 2015 (UTC)
Support arranging the discussed senses as subsenses. --Daniel Carrero (talk) 23:24, 1 December 2015 (UTC)

Community Wishlist Survey

Hi everyone!

We're beginning the second part of the Community Tech team's Community Wishlist Survey, and we're inviting all active contributors to vote on the proposals that have been submitted.

Thanks to you and other Wikimedia contributors, 111 proposals were submitted to the team. We've split the proposals into categories, and now it's time to vote! You can vote for any proposal listed on the pages, using the {{Support}} tag. Feel free to add comments pro or con, but only support votes will be counted. The voting period will be 2 weeks, ending on December 14.

The proposals with the most support votes will be the team's top priority backlog to investigate and address. Thank you for participating, and we're looking forward to hearing what you think!

/Johan (WMF) using MediaWiki message delivery (talk) 14:41, 1 December 2015 (UTC)

Converted templated/sectioned links to plain links

Today, MewBot edited exactly 919 entries with the summary "Converted templated/sectioned links to plain links".

I don't know if this bot run was specifically discussed before, but I think doing it was a good idea.

It seems the bot basically did 2 things:

  1. Converting [[example#English|example]] into [[example]].
  2. Converting {{l|en|example}} into [[example]] within templates.
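
The two conversions listed above can be sketched with regular expressions. This is only an illustration, not MewBot's actual code. It assumes the text passed in is already a template argument (it does not check nesting itself), and the names `SECTION_LINK`, `L_TEMPLATE` and `to_plain_links` are made up.

```python
import re

# [[target#Section|label]]: target, section and label may not contain brackets or pipes.
SECTION_LINK = re.compile(r"\[\[([^\[\]|#]+)#[^\[\]|]+\|([^\[\]|]+)\]\]")
# {{l|lang|term}} with exactly two positional arguments and nothing else.
L_TEMPLATE = re.compile(r"\{\{l\|[^{}|]+\|([^{}|]+)\}\}")


def to_plain_links(wikitext: str) -> str:
    """Rewrite sectioned links and simple {{l}} calls as plain [[links]]."""

    def drop_section(match: "re.Match[str]") -> str:
        target, label = match.group(1), match.group(2)
        # Only simplify when the displayed text equals the target;
        # otherwise dropping the label would change what readers see.
        return f"[[{target}]]" if target == label else match.group(0)

    wikitext = SECTION_LINK.sub(drop_section, wikitext)
    return L_TEMPLATE.sub(r"[[\1]]", wikitext)


converted = to_plain_links("{{head|en|noun|head=[[example#English|example]] {{l|en|text}}}}")
```

Calls to {{l}} that carry extra parameters do not match the second pattern and are left untouched, which errs on the side of not losing information.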

I support it within {{head}}, {{m}} and probably other templates. It makes some entries more standardized and less ugly. Example edits:

I also support these edits specifically, they removed langcodes within lists of Derived terms, but templates such as {{der3}} and {{der4}} already add the section links:

Here, I don't much like the fact that English sections were linked through {{l}} and now they aren't, but I guess I can live with that until the langcode is added to the template:

--Daniel Carrero (talk) 23:21, 1 December 2015 (UTC)

If both of these things were done within templates, then they are the right thing to do. Section links are automatically added within templates, while using {{l|en|example}} within a template adds extra unnecessary formatting, which could even be bad for scripts for which we enlarge the font, causing the font to be enlarged twice. Anyway, I know why CodeCat did this. It was to accommodate changes to the linking system discussed at WT:Grease pit/2015/December#Sense-ids for multiple linked words in a template. By the way, what you were complaining about in your last link can be fixed like this. --WikiTiki89 23:30, 1 December 2015 (UTC)
I know, thank you. --Daniel Carrero (talk) 23:45, 1 December 2015 (UTC)
These entries are being tracked through Special:WhatLinksHere/Template:tracking/links/fragment. There's quite a few the bot hasn't fixed or isn't able to. I'm not sure what to do with some cases, including any uses of {{zh-l}} that are nested inside an already-linking template. —CodeCat 01:04, 2 December 2015 (UTC)
What you can do is take my advice and not worry about it and require sense-ids to be preceded by a hyphen. --WikiTiki89 01:13, 2 December 2015 (UTC)