Wiktionary:Beer parlour




Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!



April 2016

Category:Classical Latin

Aren't Latin entries presumed to be classical unless otherwise noted? Should we have this category? DTLHS (talk) 03:18, 2 April 2016 (UTC)

Indeed, that's the standard we've used. The use of this as a context label is fine when indicating a spelling, but can we prevent it from categorising? —Μετάknowledge discuss/deeds 03:27, 2 April 2016 (UTC)
For just Latin or everywhere? (AFAIK no other languages use the Classical label) KarikaSlayer (talk) 18:30, 5 April 2016 (UTC)
@KarikaSlayer Classical vs. Vedic Sanskrit comes to mind. —JohnC5 18:38, 5 April 2016 (UTC)
@JohnC5 I was thinking more about the code under "classical" in Module:labels/data/regional. To my knowledge Latin is the only language that uses that specific label (Classical Hebrew has its own tag, and it doesn't look like there's a Category:Classical Sanskrit). KarikaSlayer (talk) 18:48, 5 April 2016 (UTC)
Note that I just changed "Classical Hebrew" to be called "Biblical Hebrew", which is the much more common name. --WikiTiki89 18:53, 5 April 2016 (UTC)
Classical Arabic, Classical Chinese, etc. See w:Classical language for a long list. Although for Arabic we say "Koranic" or "Modern Standard". Not sure about the others. Benwing2 (talk) 23:32, 5 April 2016 (UTC)
There is a lot of Classical Arabic that is not Koranic. In fact, I would say Koranic Arabic is pre-Classical. --WikiTiki89 23:49, 5 April 2016 (UTC)

Catalan/Old Provencal and Walloon/Old French

I'm not sure if this has been discussed already, but there are a few issues with the way we handle Old Provencal in relation to Catalan. For one, it isn't universally accepted that the former was truly or technically an ancestor/parent language of the latter, even though I personally basically think so. Or at least I can agree that an early form of what we may call Old Provencal, or its immediate ancestor, was parent to both Catalan and Occitan. The problem is that many of the Old Provencal entries we have happen to be from later stages of the language, after certain sound shifts had already distanced and differentiated it (and the later Occitan) from what would become Catalan (or Old Catalan). So it may not be accurate to list a Catalan term as a descendant on the page of an Old Provencal term that was attested after Old Catalan split off, as Old Provencal had by then undergone several sound changes that made it look more like Occitan and less like Catalan.

For example, compare Catalan dolç vs. Old Provencal dous and Occitan doç, or Catalan ocell vs. the 'au' diphthong arising in Old Provencal auzel and Occitan aucèl (a later unique development not to be confused with a direct inheritance from the original Latin 'au' in some cases). Or occir and aucir. In some cases, Catalan may have also received influence from neighboring Castilian Spanish, further differentiating it. I do agree that inherited Catalan terms can be listed as deriving from (some form of) Old Provencal, but not necessarily all the forms we have listed (the same even goes for some Occitan words, unless we specify which variant they came from, and not just the main lemma term; for example, how do we relate Cat. caure and Oc. caire, also càder and càser to Old Prov. chazer? Or the descendants on eu?). For the majority of terms, it isn't a problem, as they are very similar or identical, but there are some notable exceptions.

I've been making some of these descendant entries on Old Provencal or etymology entries on Catalan terms, based on what others have already done, but now I'm not so sure about the way we deal with it.

It does seem at least partly a matter of semantics, and what we call or how we define Old Provencal and Old Catalan and Old Occitan.

Also, is it really accurate to call Old French a parent language of Walloon? We're defining Old French that broadly, so as to basically include any Oïl language? Word dewd544 (talk) 20:12, 2 April 2016 (UTC)

  • First point: what else could be the parent language of Walloon? I can't think of anything. I think the answer to your question is yes: Old French by our definition includes all the Oïl dialects, covering what are now the British Isles (including Ireland), the northern half of France, and what is now Belgium. I don't see how we can say it came directly from Vulgar Latin, as that leaves a gap of several hundred years. And there's no language or hint of a language called 'Old Walloon'. Renard Migrant (talk) 11:17, 20 April 2016 (UTC)
  • Not sure who the other user is creating Old Provençal entries. There's an argument for renaming the whole language Old Occitan because we merged Provençal and Occitan some years ago when ISO 639 retired prv. As for chazer, it's the spelling used by Bernard de Ventadour, as available on Wikisource in s:fr:Catégorie:Œuvres de troubadors. FEW lists it as cazer, caire, caer (top of the second column, code is apr. for German Altprovenzalisch). Essentially the existence of chazer does not imply that cazer does not exist. I wouldn't worry too much about having the exact form (or the most likely of the forms) that the word descended from. Note that Old French soloil originally listed French soleil as a descendant, but this got moved to Old French soleil. These things aren't set in stone.
  • Date-wise I have noted in Wiktionary:About Old Provençal that while we have a code for Old Catalan, roa-oca we don't have a cutoff either geographical or in terms of dates for Old Catalan. Of course if Catalan caure comes from Old Catalan, that also does not imply that it doesn't come from Old Provençal before that. Renard Migrant (talk) 11:24, 20 April 2016 (UTC)
A related issue, see the etymology of vriþa. The Old Norse form is Old Icelandic, which has a change of word-initial vr- > r-. But Swedish doesn't have this change, so it makes it look like the v disappeared and then reappeared. —CodeCat 17:53, 23 April 2016 (UTC)

PIE verb lemmas

Please see Wiktionary talk:About Proto-Indo-European#Lemmatising PIE verbs. —CodeCat 15:06, 3 April 2016 (UTC)

Singapore English entries

It looks like someone (maybe a teacher) in Singapore has noticed Wiktionary and has persuaded a group of people to start adding Singlish terms. Most of them are OK, but etymology sections can be over-complicated and formatting can be a little strange. I think it's better to clean the bad ones up rather than deleting them. On your guard! SemperBlotto (talk) 15:50, 3 April 2016 (UTC)

Maybe we should build a list so that they can be re-examined after the wave dies down. —suzukaze (tc) 06:29, 4 April 2016 (UTC)
Yeah, it seems they've been tasked with creating entries in a specific format (with "usage notes" that are not lexical and thus not appropriate here) and keep reverting if we change them. Not a great idea on a public, shared wiki project really. Equinox 16:12, 4 April 2016 (UTC)
The reversion isn't good, but the entries are interesting. DCDuring TALK 17:49, 4 April 2016 (UTC)
Can someone provide some links for those of us who have no idea where to look? --WikiTiki89 17:52, 4 April 2016 (UTC)
See contributions from Coffeeandbiscuits, Potatogarcia, Whatyoumean2016, Razif07, Kellytongjy1990, Syy03, Chingwennnn, Afbak, Nahte79, Maeaeae, Heroaldchern, Wani.lee, E-van-316 (and others?) (not in any particular order). SemperBlotto (talk) 20:22, 4 April 2016 (UTC)
And Category:Singapore English. DCDuring TALK 21:03, 4 April 2016 (UTC)
For Singapore English cat members with usage notes: [1] DCDuring TALK 21:07, 4 April 2016 (UTC)
Well, that exercise seems to have ended, and teacher is now marking their work. I don't suppose we will ever hear of the results. SemperBlotto (talk) 18:08, 6 April 2016 (UTC)

Filter watchlists and recent changes by language

I added an option to WT:PREFS: "Filter watchlist and recent changes to only show changes for certain languages." (Third from the bottom of the "Experiments" section.) Suggestions, bug reports, feature requests, etc would be most welcome. --Yair rand (talk) 18:49, 3 April 2016 (UTC)

It seems it only filters out mainspace entries? --Giorgi Eufshi (talk) 06:27, 4 April 2016 (UTC)
Yes. Should it filter out other namespaces? --Yair rand (talk) 14:32, 4 April 2016 (UTC)
I would think that when you use it, it should show only changes in the mainspace and reconstruction namespace. Another flaw is that this is all done client-side, which means that the number of results you see is very small and inconsistent. If only this could be done server-side... But still, a useful gadget. Thanks, Yair! --WikiTiki89 14:58, 4 April 2016 (UTC)
Since we separate reconstructions by language anyway, it can be assumed that watching such a page means you're interested in that language. Or am I understanding this wrong? —CodeCat 15:03, 4 April 2016 (UTC)
But you might not be watching every page in the language you're interested in and want to see all recent changes related to that language. I guess that makes more sense in recent changes than it does in the watchlist. --WikiTiki89 15:10, 4 April 2016 (UTC)
That's true. —CodeCat 15:22, 4 April 2016 (UTC)
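
The point above about client-side filtering giving few and inconsistent results can be sketched as follows. This is only an illustration under stated assumptions, not the gadget's actual code: the record shape (a hypothetical `lang` field on each change), the function names, and the synthetic data are all invented here.

```python
def server_page(changes, limit):
    """Stand-in for the server: returns the newest `limit` changes, unfiltered."""
    return changes[:limit]


def client_side_filter(page, wanted_langs):
    """Filter a page that has already been fetched.

    Rows the server never sent cannot be recovered, so the result shrinks.
    """
    return [c for c in page if c["lang"] in wanted_langs]


def server_side_filter(changes, wanted_langs, limit):
    """What a server-side filter could do: filter first, then cut to `limit`."""
    matching = [c for c in changes if c["lang"] in wanted_langs]
    return matching[:limit]


# Synthetic data (hypothetical): every fifth change is tagged "ka", the rest "en".
changes = [
    {"title": f"entry{i}", "lang": "ka" if i % 5 == 0 else "en"}
    for i in range(500)
]

page = server_page(changes, limit=50)
client_result = client_side_filter(page, {"ka"})
server_result = server_side_filter(changes, {"ka"}, limit=50)

print(len(client_result), len(server_result))  # prints "10 50"
```

With the same limit of 50, filtering after the fetch leaves only the matching rows that happened to be in that page, while filtering before the cut fills the whole page.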

Russian Church Slavonic

I asked over at the Grease Pit about creating an etymology-only language for Russian Church Slavonic. I wrote:

I'm pretty sure it should be treated as a dialect of Old Church Slavonic (code cu). Not sure what code to use, maybe cu-ru? (Although most such codes seem to have 3 letters after the hyphen.) While we're at it, we might want similar entries for other Church Slavonic dialects; Wikipedia mentions three: Old Moscow, Croatian and Czech, but all of them in rather limited usage. Benwing2 (talk) 03:53, 3 April 2016 (UTC)

Wikitiki responded:

This is more of a Beer Parlour issue, since the real question is whether we want to have these. User:Ivan Štambuk probably has some opinions about it. --WikiTiki89 04:02, 3 April 2016 (UTC)

Any comments? I'm not sure why it wouldn't be a good idea to have these. They are clearly different dialects from Old Church Slavonic, with words spelled differently, etc. Benwing2 (talk) 04:07, 4 April 2016 (UTC)

(You mentioned there a comparison with Medieval Latin, but we do not have etymology-only languages for "French Medieval Latin" or "English Medieval Latin".) But anyway, if you give me an example of where you intended to use this code, then I can more easily consider whether or not it's a good idea. --WikiTiki89 14:46, 4 April 2016 (UTC)
@Benwing2: In case you missed my previous post, I would like an example of where you intended to use an etymology-only code for Russian Church Slavonic. --WikiTiki89 14:26, 5 April 2016 (UTC)
@Wikitiki89 Thanks, I did miss your post. The comparison with Medieval Latin was meant to be Classical vs. Medieval, not French Medieval vs. English Medieval; but if the issue of Russian vs. Croatian vs. Czech bothers you, then I'd be OK with just Russian. An example of where it would be used is осени́ть ‎(osenítʹ). Vasmer specifically says it's borrowed from Russian Church Slavonic rather than from Old Church Slavonic. Benwing2 (talk) 21:42, 5 April 2016 (UTC)
@Benwing2: I can't find an entry for that in Vasmer at all. Can you link me to where you found it? --WikiTiki89 23:56, 5 April 2016 (UTC)
Hmmm. I actually got it from ru:осенить, which says it comes from Vasmer and another source, but you're right that it's not in the online version of Vasmer. Perhaps it's the other source, or the online version of Vasmer doesn't include everything that was published? Benwing2 (talk) 00:09, 6 April 2016 (UTC)
What you may have thought was another source is actually just a link to a "list of literature". Anyway, assuming the information is correct, if a Russian word is derived from Russian Church Slavonic, that can mean one of two things: either the word was inherited from Old Church Slavonic, but not actually attested in the OCS period, or the word was borrowed into Russian Church Slavonic from Russian itself (with a possible spelling change). In the former case, we can just call it OCS; in the latter case, we can just say that its spelling was influenced by OCS. I don't think late varieties of Church Slavonic were ever used as a written lingua franca, nor do I think much innovation occurred within them, which is why they are much less useful as etymology languages than Medieval Latin. But I would like User:Ivan Štambuk to confirm this, since he knows a lot more than I do about it. --WikiTiki89 00:36, 6 April 2016 (UTC)

Old Hindi

I've found a grammatical analysis of Old Hindi, which shows considerable differences from modern Hindi (more cases, less schwa dropping, etc.). Could we make hi-old into a full language, not just etymology-only? The code inc-ohi also works. —Aryamanarora (मुझसे बात करो) 20:26, 4 April 2016 (UTC)

Is it different from Sauraseni Prakrit psu? —Aɴɢʀ (talk) 19:22, 5 April 2016 (UTC)
Sauraseni Prakrit is an earlier stage, I think maybe 1000 years before Old Hindi (although the time period of Prakrit is quite long). Between the two was Sauraseni Apabhramsa. If I'm not mistaken, there was a time around maybe 1100 AD when some people still wrote in Apabhramsa and others in Old Hindi, with significant differences in the case system. Apabhramsa still has the old inherited case system to a large extent while Old Hindi is closer to the modern agglutinative system. For this reason, Old Hindi is put in the "modern" stage while Apabhramsa is in the "middle" stage. (This would suggest that the present system of having Old Hindi be an etymological variant of modern Hindi is consistent with the linguistic consensus.) However, take everything I just said with a grain of salt as I'm going by memory and might be wrong in some particulars. Benwing2 (talk) 21:52, 5 April 2016 (UTC)
You got the gist - Old Hindi is a much more simplified Sauraseni Prakrit. It is different enough to be a different language, with (I think) only five cases (modern Hindi has three). Literature is rare, but the book I've found shows some very old poetry in the language, from the region of Rajasthan. Overall, at least a few words can be added in Old Hindi with adequate citations. —Aryamanarora (मुझसे बात करो) 23:11, 5 April 2016 (UTC)
(Hang on, did you just say Hindi is agglutinative? While it does have plenty of postpositions, it is not agglutinative, at least not like Sanskrit) —Aryamanarora (मुझसे बात करो) 23:13, 5 April 2016 (UTC)
I guess I'm thinking more of other Modern Indo-Aryan languages which have more clearly agglutinative-like case systems, i.e. the plural forms are essentially the same as the singular ones. These are in origin postpositions, and in Hindi you can still analyze them this way and say there are only 3 cases. IMO it's pretty clear that these case systems are heavily influenced by the Dravidian ones. I wouldn't say Sanskrit was exactly agglutinative, though; rather, it let you form long compounds and had extensive derivational morphology of the inflected type. Benwing2 (talk) 23:25, 5 April 2016 (UTC)
BTW having more cases doesn't necessarily mean it has to be a separate language -- Early Middle English, for example, had 4 cases and 3 genders, and Late Middle English had no cases and no genders, but they're clearly dialectal variants of the same language. Benwing2 (talk) 23:27, 5 April 2016 (UTC)
@Aryamanarora You say Rajasthan right? Are you sure those aren't in fact Old Marwadi (if such a term exists), or Old Western Rajasthani (Old Gujarati)? DerekWinters (talk) 05:02, 6 April 2016 (UTC)
@DerekWinters No, it's definitely old Hindi. The verb "to be" has the 3p.s.ind. form hai, 1p.s.ind. hū̃, unlike Gujarati che. —Aryamanarora (मुझसे बात करो) 15:13, 10 April 2016 (UTC)
We shouldn't make up our own codes; if a new code for old Hindi is needed, it should be requested from SIL/ISO 639-3 (if a full language code is believed justified) or from IETF-languages (if a subcode will do.)--Prosfilaes (talk) 00:30, 15 April 2016 (UTC)
Good luck with that. —CodeCat 00:40, 15 April 2016 (UTC)
I know for a fact that you've never tried to get a code from IETF-languages, since I've been on that list for a long time. It's not that hard, provided someone is willing to come up with cites to published descriptions and justify why this is a distinct lect.--Prosfilaes (talk) 07:19, 16 April 2016 (UTC)
That would take at least a few months – I want to add entries now. —Aryamanarora (मुझसे बात करो) 11:41, 15 April 2016 (UTC)
So you want to do a lot of work, but don't care if it's useful or will be preserved? It could be done under a private use tag and switched over once we have a real one. If you're going to do something, do it right, instead of doing it right now.--Prosfilaes (talk) 07:19, 16 April 2016 (UTC)
Why would the work be less useful or not be preserved if he uses the inc-ohi code now instead of waiting for a new code? If and when the new code is created, the forms can be moved. We had a lot of entries in Norman varieties using our own local codes before the code nrf was created, and once it was, we moved them to the new code. Nothing was lost. —Aɴɢʀ (talk) 07:44, 16 April 2016 (UTC)
Or we could use a correct code, like inc-x-ohi, that never will get confused with anything else. New codes don't pop into existence; they get created when someone proposes them. I can deal with the bureaucracy of IETF-languages, but I know nothing of Old Hindi, and couldn't possibly come up with a list of published descriptions, and am not the best person to make the argument for it. If we just go on and create Old Hindi entries, the odds this will ever get any sort of standard code will be minimal.--Prosfilaes (talk) 23:35, 19 April 2016 (UTC)
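
To illustrate the difference between inc-ohi and inc-x-ohi mentioned above: in BCP 47 (RFC 5646), the singleton subtag `x` introduces private-use subtags, which is why a tag of the second shape cannot collide with a later official code. The sketch below only splits a tag at that singleton; the function names are hypothetical, and it does not validate tags against the IANA registry or say anything about how Wiktionary's own modules treat these codes.

```python
def split_private_use(tag):
    """Split a tag into (public part, private-use part) at the first 'x' singleton.

    A simplified sketch: it only looks for the 'x' subtag and does no
    registry validation.
    """
    subtags = tag.lower().split("-")
    if "x" in subtags:
        i = subtags.index("x")
        return "-".join(subtags[:i]), "-".join(subtags[i + 1:])
    return "-".join(subtags), ""


def has_private_use(tag):
    """True if the tag carries any private-use subtags after an 'x' singleton."""
    return split_private_use(tag)[1] != ""


for code in ("inc-ohi", "inc-x-ohi", "roa-oca"):
    print(code, split_private_use(code), has_private_use(code))
```

Only inc-x-ohi is reported as carrying a private-use part; the other two codes from the discussion would have to be interpreted as ordinary (registered or unregistered) subtags.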

Entry layout for reconstructed terms

Please see some thoughts at Wiktionary talk:Reconstructed terms#Layout proposal. I am considering creating a new draft version of the policy to include considerations such as these at some point. --Tropylium (talk) 09:00, 6 April 2016 (UTC) 

Old East Slavic adjectives

Right now we have some of these lemmatized at their long forms (e.g. бѣсовьскꙑи ‎(běsovĭskyi) rather than бѣсовьскъ ‎(běsovĭskŭ)), and some of them lemmatized at their short forms (e.g. четвьртъ ‎(četvĭrtŭ) rather than четвьртꙑи ‎(četvĭrtyi)). Is this intentional? I’d think they should be consistently at one or the other, but am I missing something? Vorziblix (talk) 16:04, 7 April 2016 (UTC)

I think they should all be lemmatized at their short forms, unless the short forms are unattested. --WikiTiki89 17:09, 7 April 2016 (UTC)
Why does attestation matter? We lemmatise words in other languages consistently even if the lemma form happens to be not attested. And the lemma form of Slavic adjectives is entirely predictable from any other form. —CodeCat 17:33, 7 April 2016 (UTC)
Because the short forms may have not existed for some adjectives. Compare how the short forms of adjectives with *-ьskъ do not exist in any modern Slavic language. I wouldn't want to lemmatize a short form when it did not exist. --WikiTiki89 17:48, 7 April 2016 (UTC)
We can easily determine from OES grammars which classes of adjectives had only the longer forms. Also consider that OCS did still have short inflections for almost all adjectives. —CodeCat 18:54, 7 April 2016 (UTC)
If that can be determined, then sure. OES ≠ OCS. --WikiTiki89 19:01, 7 April 2016 (UTC)
I would assume so, unless OES attestation is so sparse that nobody has been able to determine it yet. —CodeCat 19:11, 7 April 2016 (UTC)
Short forms of *-ьskъ adjectives do seem to be rare, but even a cursory search turns up at least one instance: »Азъ… хотѣвъ вкоуситі оучительскаго съказаниꙗ, готова прѣложити въ словѣньскь ꙗзꙑкъ.« In any case, I’ll move добрꙑи, тѧжькꙑи, and бородатꙑи, which are each attested in their short forms on the very pages they link to. Vorziblix (talk) 22:03, 7 April 2016 (UTC)


When and why did this stop capitalizing the first letter of the word? Was this discussed? I feel like the link should reflect the spelling in the Wikipedia entry, and thus automatically capitalize the first letter. --WikiTiki89 18:53, 7 April 2016 (UTC)

It was done because not all Wikipedias capitalise all entries. I think it was the Lojban Wikipedia that was having trouble, all the links to it were broken. —CodeCat 18:54, 7 April 2016 (UTC)
So why can't we do that only for the Lojban Wikipedia and whatever few other ones there may be? --WikiTiki89 19:00, 7 April 2016 (UTC)
Ask the person who made the edit and the people who participated in the discussion. —CodeCat 19:12, 7 April 2016 (UTC)
What a review of the Wiktionary:Grease_pit/2015/June "discussion" shows is that, of the three technical adepts pinged on 5 August 2015, based on their having contributed to the modules and templates, none responded. DCDuring TALK 20:59, 7 April 2016 (UTC)

Italian pronunciation

In this page you read:

'See Italian phonology at Wikipedia for a thorough look at the sounds of Italian. In addition, the Wikipedia help page for IPA for Italian has some useful English approximations.'

Neither Italian phonology nor Help:IPA for Italian contains any reference to asterisks (*) used in phonetic transcriptions of Italian to indicate so-called "syntactic gemination", but Appendix:Italian pronunciation specifies that the asterisk is the symbol used for that. The reason you can find it in this dictionary and not in the main encyclopedia is that it was inserted without any consensus by an Italian user, and nobody has noticed it in the months since. This user, IvanScrooge98, did the same on en.wikipedia, where it was contested: user Macrakis opened a discussion at Help talk:IPA for Italian#Syntactic gemination asking to remove the asterisks, and users Peter238 and Aeusoes1 joined the discussion and agreed with him. These users are experts in phonetics and IPA, and they decided to delete this arbitrarily inserted symbol. You can read the full talk at the link; I am reporting the main passages here:

  • We should not show the syntactic gemination (SG) symbol (*) in our transcriptions of Italian, except in articles that are specifically about Italian phonology, for three reasons: 1) cases are predictable except for a small closed set of function words; 2) the actual set of words varies by variety; 3) it is not a universal feature of standard spoken Italian.
  • Syntactic gemination is fully predictable — they occur after stressed final syllables, including stressed monosyllables (phonological SG) and after a small, closed set of unstressed monosyllables and some penultimate-stressed polysyllables. This closed set includes only function words and not nouns, which are the typical words for which we give pronunciations.
  • The actual set of exceptional words varies by variety of Italian, even among those varieties which show SG. But in any case, it does not include nouns, if I'm not mistaken.
  • Many varieties of standard Italian spoken outside central Italy do not show SG at all.
  • Also, the only time the "*" annotation is useful is if the word is being composed with another word, e.g., "il Po superiore", which is presumably [il'possuperi'ore] rather than [il'po.superi'ore]. But if you know enough Italian to compose words like this, you don't need the "*".
  • I agree that there's no need to transcribe SG when the last syllable is stressed, but I'm not sure about the rest.
  • Why couldn't we just show the actual gemination if it occurs in our transcription? It would be like somehow marking French transcriptions with the final consonant that is normally elided except in cases of liaison.
  • In looking back over the discussion, I see that I wasn't specific enough about the case in question. I didn't intend to discuss the transcription of syntactic gemination in the interior of phrases. That is, I see no problem with [kafˈfɛ lˈlatte]. I was concerned with cases like the Po (river) article, where the name of the river is transcribed as [pɔ*]. This is parallel to transcribing Alaska (and every other noun ending in 'a') as [əˈlæskə(r)] because in sentences like "Alaska is big", RP speakers say [əˈlæskər ɪz 'bɪg], which I hope no one is proposing.
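
The predictability argument in the points above can be sketched roughly as a lookup. This is a simplified illustration, not a description of any variety of Italian: it treats a written final accent as a proxy for final stress, asks the caller to flag stressed monosyllables that carry no written accent, and uses a small, non-exhaustive sample of function words chosen here as an assumption (the thread itself notes that the actual set varies by variety).

```python
import unicodedata

# Final vowels written with an accent, used here as a proxy for final stress.
ACCENTED_FINALS = "àèéìòù"

# Illustrative, non-exhaustive sample (an assumption of this sketch).
TRIGGER_FUNCTION_WORDS = {
    "a", "e", "o", "che", "se", "ma", "tra", "fra",
    "come", "dove", "qualche", "sopra",
}


def may_trigger_sg(word, stressed_monosyllable=False):
    """Rough guess at whether `word` can trigger syntactic gemination.

    `stressed_monosyllable` must be supplied by the caller, since spelling
    alone does not show it (e.g. "Po").
    """
    w = unicodedata.normalize("NFC", word).lower()
    if w[-1:] in ACCENTED_FINALS:
        return True  # stressed final syllable (phonological SG)
    if stressed_monosyllable:
        return True
    return w in TRIGGER_FUNCTION_WORDS  # small closed set of function words


for w, mono in (("caffè", False), ("Po", True), ("latte", False), ("come", False)):
    print(w, may_trigger_sg(w, stressed_monosyllable=mono))
```

Under these assumptions, nothing beyond stress and a short word list is needed to make the prediction, which is the sense in which the asterisk adds no lexical information for ordinary nouns.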

Please consider above all the last two points. It is useless, not to say counterproductive, to add an asterisk to show something that does not happen at all in the very form where the asterisk is used. When gemination does happen, in a sequence of words, the consonant is doubled or the ː symbol is used, as for vowels. Moreover, the International Phonetic Alphabet (which is supposed to be the conventional alphabet used in this dictionary to transcribe the pronunciation of words and names in any language) has never employed asterisks for syntactic gemination in any language. They are sometimes used for fortis consonants in Korean and maybe in other cases, so using them for Italian words (whether stressed on the final vowel, monosyllabic, or one of half a dozen polysyllabic words) would just confuse readers, as happened to Macrakis on en.wikipedia, and even to other users here. Why should we keep a totally subjective convention introduced at one user's will without asking anyone? I frankly find it absurd to have to open a new discussion to ask for permission to remove something introduced without consulting anyone and already removed from en.wikipedia, but I am following the rules, and the three users who have already discussed it on the talk page linked above may join too. I trust your common sense in the final decision. 21:34, 7 April 2016 (UTC)

I consider this argument reasonable. Phfnhyn (talk) 23:55, 8 April 2016 (UTC)

The IP editor above has correctly quoted my arguments against including the * symbol for "syntactic gemination". This is certainly an interesting phonological phenomenon, but is not tied to individual words (except for a small number of function words). I see no point in including * in the phonetic transcription of Italian words on Wikipedia or in Wiktionary. Even for the small closed class of function words, it is not universal in Italian, and really belongs in a grammar or phonological description, not in a dictionary. --Macrakis (talk) 18:04, 9 April 2016 (UTC)

I don't know much about it, but I agree using the asterisk to mark places where syntactic gemination may occur is kind of silly. If its triggers are not predictable in all cases, then maybe it would be better to mark certain words as triggering it, the way certain Irish words (e.g. go, i, bhur, etc.) are marked as triggering certain initial mutations. —Aɴɢʀ (talk) 19:56, 9 April 2016 (UTC)
Silly is the word! As quoted above, it's the same as marking French words ending in "d p s t x z", because such consonants are pronounced when followed by a word starting with a vowel, unlike when the words are pronounced alone or followed by a consonant: the logical solution is just to write the consonant when it's actually pronounced, as in the first case. E.g.: "ils" (they) is pronounced [il], while "ils ont" (they have) is pronounced [ilz ɔ̃] and "ils ont un" (they have a) [ilz ɔ̃t œ̃]... 20:39, 9 April 2016 (UTC)
Actually, writing French liaison consonants has a lot more merit, because it is unpredictable and has to be memorised for each word. —CodeCat 20:45, 9 April 2016 (UTC)
Yes, maybe I shouldn't have switched language... Italian words with syntactic gemination are fully predictable, and in the page we're talking about there are links to specific articles about Italian phonology where this phenomenon is explained. It'd make a lot more sense to mark French words subject to liaison than Italian words subject to syntactic gemination, and here French words are not marked at all when they're subject to liaison! 20:59, 9 April 2016 (UTC)

I didn’t notice this practice until quite recently. I’ve never seen any dictionary or encyclopaedia inserting asterisks into their pronunciative instructions, for Italian or otherwise. I’d be quite fine with them being subtracted. --Romanophile (contributions) 08:28, 11 April 2016 (UTC)

@Angr I would like this discussion to have aroused more interest, but after 4 days only a few people have replied, none of them liking the use of the asterisk notation in question, though; I cannot say whether this is just not an important matter for the community or the community prefers truly to take the asterisks away, but in both cases, and since the only ones who wrote down their opinion expressed themselves against this convention, I wonder if this is not enough to start considering their removal. 17:30, 11 April 2016 (UTC)

I'd give it a week, but I agree it looks unlikely that the asterisks will be kept. —Aɴɢʀ (talk) 17:44, 11 April 2016 (UTC)
Disclaimer: I am not for removing information just because it is predictable. Many languages allow pronunciation to be predicted from spelling or inflection from verb class, yet get both a pronunciation section and a declension table. But since this case is apparently an idiosyncratic practice marking a non-general prosodic side effect, I'm for removing it. Korn [kʰũːɘ̃n] (talk) 08:05, 12 April 2016 (UTC)

@Angr I see that another user has commented negatively on the asterisk notation. Tomorrow it will be a week, and we should probably be able to make a decision, if you agree. 18:48, 13 April 2016 (UTC)

Now that a discussion has been held, I have no objection. I was never in favor of the asterisk, I was only opposed to a large-scale change in our representation of Italian pronunciation without discussion first. —Aɴɢʀ (talk) 09:25, 14 April 2016 (UTC)
O.K. then, thank you! Later I'll fix the asterisks. 09:33, 14 April 2016 (UTC)

New policy: Reconstructions must have descendants or derived terms

Reconstructions are always based on descendant forms which point to their existence. I have already been following this rule as an unofficial policy myself, making sure I can find descendants before creating reconstructed entries. I wonder if we can make this an official policy, to be added to WT:RECONS? —CodeCat 19:12, 8 April 2016 (UTC)

Is there a problem that reconstruction entries are actually being created without descendants, or is this purely a hypothetical issue? --WikiTiki89 19:34, 8 April 2016 (UTC)
There have certainly been a fair few yes. I just deleted them all in the past, but we had no policy stating that it was grounds for deletion. I'd feel better if I could point to a policy when deleting. —CodeCat 19:47, 8 April 2016 (UTC)
I see. The bug is in the {{policy}} template, which seems to imply that a "guideline or common practices page" "must not be modified without a VOTE". We're allowed to enforce guidelines and common practice, so if we were able to modify the page without a BP discussion rather than a vote, that would be the ideal solution. --WikiTiki89 20:09, 8 April 2016 (UTC)
I'm actually asking if people agree though, I don't know if this is a common practice. It's just a practice I've applied, myself. —CodeCat 20:11, 8 April 2016 (UTC)
Ok. I agree. I also think this has been discussed before. --WikiTiki89 20:20, 8 April 2016 (UTC)
The loophole is that you can add {{policy}} without a vote but then any subsequent changes, such as reverting that edit, do need a vote. That's why I think sometimes {{policy}} shouldn't be taken as gospel. Just make the edit. Renard Migrant (talk) 22:47, 8 April 2016 (UTC)
  • I strongly agree. I think it's worth adding to the page. —Μετάknowledgediscuss/deeds 20:16, 8 April 2016 (UTC)
  • If it isn't a problem, it doesn't need a solution. —Aɴɢʀ (talk) 22:45, 8 April 2016 (UTC)
    • Do you think reconstructions without descendants or derived terms are fine, then? —CodeCat 22:48, 8 April 2016 (UTC)
      • No, I think common sense doesn't need to be regulated. —Aɴɢʀ (talk) 09:44, 9 April 2016 (UTC)
  • Not sure what I think. I feel like I need some context. What are some examples of entries that someone created but you deleted? And in what situation and for what purpose would a scholar (assuming the person used scholarly sources and didn't just make things up) reconstruct forms that have no descendants? — Eru·tuon 00:07, 9 April 2016 (UTC)
  • Definitely. —JohnC5 05:24, 9 April 2016 (UTC)
Whether we officialise it with a policy or not I am in support of the proposal. I don't think we have an issue now with it, but creating a policy now will certainly make it easier to deal with should a problem ever arise in future. Leasnam (talk) 21:02, 9 April 2016 (UTC)
  • Hmm. Corner case: suppose we know that in subfamily X of family Y, a base vocabulary item — let's say 'water' — has been replaced by loanwords. However, the loaning has taken place later than proto-X (which might be inferrable e.g. due to external history, due to the loanwords being from numerous distinct sources, or from internal chronology of sound changes). Would it be then legitimate to reconstruct the inherited Proto-X form for 'water', despite its later extinction, e.g. for the purposes of rounding out a Swadesh list for Proto-X? --Tropylium (talk) 16:00, 22 April 2016 (UTC)
    Another corner case that comes to mind: it's occasionally possible to find a reconstruction referenced in a particular source, but without any actual descendants listed (e.g. due to the reconstruction being compared to data from a related branch). I suppose that in such cases, it should still be OK to create the reconstruction on the basis of the source, and worry about looking up the actual descendants later. --Tropylium (talk) 16:06, 22 April 2016 (UTC)
    In such cases, I refrain from creating the entry. —CodeCat 16:12, 22 April 2016 (UTC)
    For your first corner case, no. You don't know that the language inherited the native word. It could have fallen out of use before the borrowing. Perhaps there was another intermediate borrowing, or perhaps people stopped talking about water, who knows. For your second corner case, I would say that you can probably create the entry with the intent to add descendants later, but once it is clear that no such descendants can be found, then you would have to delete it. --WikiTiki89 16:15, 22 April 2016 (UTC)

Black's Law Dictionary 10th Edition (2014)

If anyone has access to this book, could they check that the edits of (talk) are not copyvios please. They look like they might be word-for-word copies. SemperBlotto (talk) 11:45, 9 April 2016 (UTC)

@BD2412, do you have a copy? - -sche (discuss) 22:02, 9 April 2016 (UTC)
Not of the 10th edition, I'm afraid. I'm stuck with the one I picked up in law school - the eighth. By the way, if anyone is interested, Wikisource has the second edition in progress. bd2412 T 18:39, 11 April 2016 (UTC)
The user is not doing a great job either, e.g. default en-noun adding an English -s to Latin terms, even ones that end with -s (and the others rarely take an English plural anyhow). Equinox 12:50, 12 April 2016 (UTC)

Russian -стрелить, -скочить, -прячь, etc. -- suffixes, verbs, roots, or what?

I've been creating entries for verbal roots like -стрелить, -скочить, -прячь, for convenience in creating etymologies and listing derived verbs. These are cases where there are prefixed verbs, e.g. застрелить, напрячь, and logically the base verb should be attested but it isn't. What part of speech should they be labeled as? I've been using "suffix" but don't feel totally comfortable with this, maybe it should be "verb" instead? Benwing2 (talk) 23:08, 9 April 2016 (UTC)

I've listed morphemes like -fico, -φρων ‎(-phrōn), -γραφία ‎(-graphía) as combining forms. That might be appropriate. I kind of wish categories and {{affix}} were able to recognize the category, though. It currently treats them as suffixes, so σώφρων ‎(sṓphrōn) is placed in Category:Ancient Greek words suffixed with -φρων. — Eru·tuon 23:36, 9 April 2016 (UTC)

Appearance of templates in temp

Right now, templates referenced using {{temp}}, like {{en-noun}}, have a border around the reference and gray background. This was not so recently, AFAIR. I think it's pretty ugly. Is this a change made via MediaWiki software update or did someone edit some Wiktionary global css? I see that {{temp}} uses the <code> element, so I can achieve the same appearance by using code directly, like this. By contrast, tt element does not do this, like this, and has the reasonably plain appearance that I prefer. --Dan Polansky (talk) 07:39, 10 April 2016 (UTC)

<tt> got 'deprecated', whatever that means. I mean if it's deprecated why does it still work? Anyway have you considered just getting used to it? Humans are pretty adaptable after all. Renard Migrant (talk) 15:59, 11 April 2016 (UTC)
Deprecated means that even though it may still work, it may be removed in the future. Now I have no idea why it's deprecated or who deprecated it. As for {{temp}}, I have come to like the surrounding box thing. --WikiTiki89 16:47, 11 April 2016 (UTC)
It's deprecated by the HTML standard. —CodeCat 16:57, 11 April 2016 (UTC)
I am indifferent regarding the box, but we can easily create a css class which emulates the tt tag so the deprecation is moot (other than that we should not use it). - TheDaveRoss 17:55, 11 April 2016 (UTC)
  • I like the appearance of {{temp}}. --Dixtosa (talk) 17:10, 11 April 2016 (UTC)
  • Like Wikitiki, I've come to like the surrounding box. - -sche (discuss) 05:45, 22 April 2016 (UTC)

self‐proposal for adminship

I think that we need more administrators here, so I’d like to suggest adminship for myself. This has been a long time coming, but I’m seeing a lot of crap popping up and I just don’t have the utilities to deal with it by myself. Considering my trustworthiness, I will confess that I have inserted jocular content into the mainspace a few times, but I did have the decency to alert other users about it later (or just fix it myself), so I don’t think that any jokes that I made are still rotting in a lexical entry somewhere. I am a bureaucrat over at Wikcionario, and you simply won’t find any silly jokes there, no matter how arduously you try. The users there seem quite satisfied with my position, and my (few) errors there aren’t particularly outrageous. @Peter Bowman could give his opinion if he’d like. --Romanophile (contributions) 06:16, 11 April 2016 (UTC)

Coming from a returning user who’s infamous for goofing around, that doesn’t surprise me. --Romanophile (contributions) 08:19, 11 April 2016 (UTC)
I wouldn't support you probably because you swear too much, and seem to get stroppy when things don't go your way. --F909fef0j (talk) 08:36, 11 April 2016 (UTC)
In recent times, I don’t think that I really have a tendency towards obstinance; I hate being called stubborn. Also Dick Laurent (talkcontribs) is probably a worse coprolaliac than I am. --Romanophile (contributions) 09:39, 11 April 2016 (UTC)
The real reason you won't be getting support from him is because he is ineligible to vote. And he's already gotten himself blocked, but chances are he'll be back. --WikiTiki89 18:59, 12 April 2016 (UTC)
  1. Support I think that we need more administrators here, too. JackPotte (talk) 13:52, 11 April 2016 (UTC)
  2. Support I’m surprised this isn’t already the case. Vorziblix (talk) 18:44, 12 April 2016 (UTC)
  3. Support I actually disagree; we have too many admins. However, Romanophile would be a good admin, so I can't vote against him. --WikiTiki89 18:59, 12 April 2016 (UTC)
  4. Support I was also surprised that you weren't already. --Robbie SWE (talk) 19:05, 12 April 2016 (UTC)
  5. Abstain You'll be relieved to know that I won't be seeking adminship. Donnanz (talk) 19:22, 12 April 2016 (UTC)
  6. (comment) That's correct, Romanophile is a serious and solid colleague on eswiktionary. Regards, Peter Bowman (talk) 19:56, 12 April 2016 (UTC)
  7. Support. I too thought you were already an Admin. Leasnam (talk) 20:35, 12 April 2016 (UTC)
  8. (comment) I still think you are an admin. --Dixtosa (talk) 21:01, 12 April 2016 (UTC)
All right, the chances look reasonable. Now I just have to figure out how to set up an official election, or wait for somebody else to do it for me. --Romanophile (contributions) 00:52, 13 April 2016 (UTC)
Here: Wiktionary:Votes/sy-2016-04/User:Romanophile for admin. --Romanophile (contributions) 03:36, 13 April 2016 (UTC)


Should we put our plateau entries under the category of "Mountains" or "Plateaus"? E.g. Mexican Plateau, Armenian Highland, Tibetan Plateau, Loess Plateau, Mongolian Plateau, etc. ---> Tooironic (talk) 10:25, 13 April 2016 (UTC)

@Tooironic: Plateaux. — I.S.M.E.T.A. 15:28, 14 April 2016 (UTC)
No, the more common English plural is preferable; there's no need for us to be pretentioux. - -sche (discuss) 05:42, 22 April 2016 (UTC)
@-sche: Plateau is also a verb, so that simple Ngram doesn't prove anything. And plateaux is hardly pretentious. — I.S.M.E.T.A. 15:56, 5 May 2016 (UTC)
Even the OED shows plateaux as a secondary (British) spelling, the primary being plateaus. DCDuring TALK 16:02, 5 May 2016 (UTC)

Proto language translations

It was my understanding that proto languages were not to be included in translations. See my edit on gold that was reverted by @I'm so meta even this acronym. Is this an official policy or not? DTLHS (talk) 15:06, 13 April 2016 (UTC)

FYI, here are: my request, DTLHS’s reversion, and my reversion. — I.S.M.E.T.A. 15:13, 13 April 2016 (UTC)

Yes. The official policy is that words in proto-languages are not to be included in the main namespace other than in etymologies. --WikiTiki89 15:10, 13 April 2016 (UTC)
@DTLHS, Wikitiki89: It seems legitimate that proto-languages' translations be included in those tables. It's information people might well want, and how else would they find it? In this case, I was trying to find out what the indigenous Celtic word for "gold" was, since all the Celtic languages' terms I could find on here derive from the Latin aurum. — I.S.M.E.T.A. 15:16, 13 April 2016 (UTC)
No, we have consistently kept the standard that only languages which are in mainspace can be added as translations, so no Proto-Celtic or Klingon or what have you. There's a lot of information some people might hypothetically want, but that doesn't mean it's appropriate for us to include. —Μετάknowledgediscuss/deeds 15:22, 13 April 2016 (UTC)
(e/c) We don't consider reconstructed words to be real words. That's one reason. As for your specific example, if no Celtic language has an indigenous word for gold, then there would be no basis on which to reconstruct one. --WikiTiki89 15:25, 13 April 2016 (UTC)
What Μετάknowledge said. - -sche (discuss) 15:56, 13 April 2016 (UTC)

*Humph* Fine. Does anyone know of any word in any of the Celtic languages that means "gold" and which doesn't derive from the Latin aurum? There's a fair bit of gold in Wales, so I'd assumed that there would exist such a word. — I.S.M.E.T.A. 15:30, 14 April 2016 (UTC)

I'm unaware of any Celtic word for "gold" besides loanwords from aurum. Maybe in Gaulish or Celtiberian, but I don't think there's anything in Insular Celtic. The gap isn't especially surprising, though; there are all sorts of semantic gaps in proto-languages where we might not expect them. For example, there's no reconstructable PIE word for "rain", despite the fact that PIE speakers must have been aware of and had a word for it. —Aɴɢʀ (talk) 21:30, 19 April 2016 (UTC)
@Angr: How frustrating. Thanks anyway. — I.S.M.E.T.A. 15:50, 5 May 2016 (UTC)
Hmm. There seems to have been a PIE verb, at least. Gamkrelidze and Ivanov's Indo-European and the Indo-Europeans says that although taboo replacement led to a "low number of attested Indo-European languages preserving *seu-/*su- in the sense 'rain' [...] the ancient root is preserved in its original sense" of "to rain" in Greek, Albanian, Old Prussian (suge in the Elbing vocabulary), and Tocharian (su- per Adams' Dictionary of Tocharian B), and Reconstruction:Proto-Germanic/sūpaną mentions it.
What would be the expected Celtic reflex of PIE aus- / h₂é-h₂us-o-?
- -sche (discuss) 05:33, 22 April 2016 (UTC)
@-sche Probably Old Irish áu (attested in the meaning 'ear' from *h₂ṓws) and Middle Welsh eu/Modern Welsh au. —Aɴɢʀ (talk) 18:33, 22 April 2016 (UTC)

This is a bit insensitive to visitors - if we have so many reconstructed terms, they ought to be linked in translation tables. —Aryamanarora (मुझसे बात करो) 20:46, 19 April 2016 (UTC)

@Aryamanarora: Yes, that's my thinking. — I.S.M.E.T.A. 15:50, 5 May 2016 (UTC)
No, they oughtn't. They ought to be linked in etymology sections of real words. —Aɴɢʀ (talk) 21:16, 19 April 2016 (UTC)
Why not both? Semantic shifts occur too, you know. Theoretically, the page "fire" would link to *péh₂wr̥ in its etymology section, but have both *péh₂wr̥ and *h₁n̥gʷnis in its translation table. —Aryamanarora (मुझसे बात करो) 20:17, 5 May 2016 (UTC)

How to render "someone" in lemma phrases in Russian?

@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter I wanted to create an entry for the expression ободра́ть кого́-нибудь как ли́пку meaning "to rob someone blind" (literally "to strip someone like a young linden tree"). We have the corresponding English expression under "rob someone blind" with the word "someone" embedded into it. The only issue is that there are three more-or-less synonymous words for "someone" in Russian: кто-нибудь, кто-либо and кто-то. I've been using кто-нибудь in usage examples; Wanjuscha prefers кто-либо. My primary dictionary uses кто-нибудь but Zaliznyak and some others use кто-либо (often abbreviated кто-л or кто-л.). I just put the expression under ободрать как липку (to "rob blind", leaving out the "someone"), but the English analogy suggests that we should include "someone" (there's no rob blind). What should be done? Benwing2 (talk) 18:48, 13 April 2016 (UTC)

One thing to be considered is that in English we add these words because word order is important and there must be an object in between "rob" and "blind", otherwise it doesn't make sense. In Russian, since word order is flexible, we can just create the entry at ободра́ть как ли́пку ‎(obodrátʹ kak lípku). The reason many dictionaries include the word for "someone" is in order to indicate which case it should be in, but we can do that just as easily with tags and/or usage examples in the entry. --WikiTiki89 18:54, 13 April 2016 (UTC)
I think "rob blind" makes perfect sense as a lemma. It can even be used in real sentences that way. —CodeCat 18:57, 13 April 2016 (UTC)
Actually, in this particular case, I would also prefer "rob blind", although this is a dictionary-only form, since in the real world there is always an object in between. But there are many cases where the "someone" is required in the lemma. --WikiTiki89 19:02, 13 April 2016 (UTC)
Not always. "They robbed blind the very people who were trying to help them." —CodeCat 19:04, 13 April 2016 (UTC)
You're right. I definitely support moving rob someone blind to rob blind. --WikiTiki89 19:07, 13 April 2016 (UTC)
To expand on what Wikitiki said, you don't need to represent or add the word "someone" in the Russian entry titles. It's fine for a Russian entry with no word that corresponds to "someone" to link to an English entry containing "someone"; you can pipe the link if you think it would be misleading otherwise: [[rob someone blind|rob blind]]. - -sche (discuss) 01:57, 21 April 2016 (UTC)

ist der Ruf erst ruiniert, lebt es sich ganz ungeniert

What is our policy on (excessive?) referencing like in this one and also in some other edits by the user:Caligari, who is an admin from the German Wiktionary? I don't want to denounce anyone; I'm just asking because the entry doesn't look like our entries usually look. And I'm relatively sure that we don't want our lemmas to look like the German entry for Haus, which lists 57 sources and an additional 15 references. Kolmiel (talk) 23:53, 14 April 2016 (UTC)

@Kolmiel: The entry for ist der Ruf erst ruiniert, lebt es sich ganz ungeniert looks fine to me. — I.S.M.E.T.A. 12:10, 15 April 2016 (UTC)
I also think that the Wikiwörterbuch's entry looks fine, and that the amount of sourcing that appears therein is proportionate to the amount of substantive content in the entry. — I.S.M.E.T.A. 12:13, 15 April 2016 (UTC)
I agree, but I also agree that the reference material for de:Haus is excessive and I don't want our entries to look like that either. —Aɴɢʀ (talk) 12:19, 15 April 2016 (UTC)
@Angr: I'm not a fan of the Wikiwörterbuch entry layout. I've yet to see how all that reference would look if converted to our format. I expect we'd probably stick it in a collapsible table if there was that much of it. — I.S.M.E.T.A. 12:36, 15 April 2016 (UTC)
If we had many entries with that amount of reference material, someone would doubtless propose a new References tab next to the Citations tab. —Aɴɢʀ (talk) 12:42, 15 April 2016 (UTC)
@Angr: Since Wikipedia hasn't done that already (and I've seen some articles with hundreds of citations), I think it's very unlikely that we'd do so. One of the main justifications of the Citations: was that it would give a place to house citations for non–CFI-satisfying terms; I see no analogical use for a References: tab. — I.S.M.E.T.A. 15:40, 15 April 2016 (UTC)
I didn't say it was likely we would do it, merely that it was likely someone would propose it. It would probably also be really difficult to get <ref> tags to generate text in a different namespace on a different tab. —Aɴɢʀ (talk) 15:43, 15 April 2016 (UTC)
@Angr: I'm afraid I wouldn't know about that. — I.S.M.E.T.A. 15:47, 15 April 2016 (UTC)

This entry seems fine to me. My main issue―and I hate to be a schoolmarm about this―is that Caligari's user page seems to be in violation of WT:USER#User pages. Furthermore, I can barely read the talkpage. This may just be silly of me and irrelevant, but I thought I'd mention it. —JohnC5 14:47, 15 April 2016 (UTC)

@JohnC5: The user page isn't too offensive, even if it does violate WT:USER#User pages, but the talk page's background colour is nauseating. — I.S.M.E.T.A. 15:40, 15 April 2016 (UTC)


I have created a vote with the intent that we can finally decide on a new logo for the English Wiktionary. I welcome all discussion on how to make this vote as likely as possible to be successful at representing the consensus of Wiktionary editors, so please feel free to edit the vote if you can improve it. —Μετάknowledgediscuss/deeds 03:25, 16 April 2016 (UTC)

Thank you! I've always been bothered by the current logo. Benwing2 (talk) 05:21, 16 April 2016 (UTC)
I've always been bothered by the RP pronunciation. It doesn't seem right to endorse British, American, or any other English. We could change the pronunciation to enPR or just omit it altogether in favo(u)r of something more universally accepted and comprehensible like hyphenation (Wik‧tion‧ar‧y); though on second thought, there might be some debate about that too. It's also strange that the headword format in the logo is not one we use (viz. we don't use PoS abbreviations). —JohnC5 05:43, 16 April 2016 (UTC)
We already did this. Why are we doing it again? --Yair rand (talk) 19:31, 21 April 2016 (UTC)
We haven't done "this", whatever you mean by that, for quite a while, and there is great dissatisfaction with the current logo (just look at the comments above yours). —Μετάknowledgediscuss/deeds 14:18, 26 April 2016 (UTC)
  • The vote has begun! Please vote — I would really like to get maximal turnout so we can find a logo that's broadly acceptable to the community. —Μετάknowledgediscuss/deeds 14:18, 26 April 2016 (UTC)
    Why not give the link in the News for editors top part of the site? Or, *gasp*, at the same place as the News for editors link? — Dakdada 10:46, 27 April 2016 (UTC)

Server switch 2016

The Wikimedia Foundation will be testing its newest data center in Dallas. This will make sure Wikipedia and the other Wikimedia wikis can stay online even after a disaster. To make sure everything is working, the Wikimedia Technology department needs to conduct a planned test. This test will show whether they can reliably switch from one data center to the other. It requires many teams to prepare for the test and to be available to fix any unexpected problems.

They will switch all traffic to the new data center on Tuesday, 19 April.
On Thursday, 21 April, they will switch back to the primary data center.

Unfortunately, because of some limitations in MediaWiki, all editing must stop during those two switches. We apologize for this disruption, and we are working to minimize it in the future.

You will be able to read, but not edit, all wikis for a short period of time.

  • You will not be able to edit for approximately 15 to 30 minutes on Tuesday, 19 April and Thursday, 21 April, starting at 14:00 UTC (15:00 BST, 16:00 CEST, 10:00 EDT, 07:00 PDT).

If you try to edit or save during these times, you will see an error message. We hope that no edits will be lost during these minutes, but we can't guarantee it. If you see the error message, then please wait until everything is back to normal. Then you should be able to save your edit. But, we recommend that you make a copy of your changes first, just in case.
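The local start times listed above for 14:00 UTC can be double-checked, or extended to another time zone, with a short script. This is only a minimal sketch: the fixed offsets are an assumption that holds for the April 2016 dates (daylight-saving time in effect), and the names `ZONES` and `local_start_times` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed UTC offsets for the zones named in the announcement.
# They are valid for 19 and 21 April 2016 only (summer time in effect).
ZONES = {
    "BST": timedelta(hours=1),
    "CEST": timedelta(hours=2),
    "EDT": timedelta(hours=-4),
    "PDT": timedelta(hours=-7),
}


def local_start_times(start_utc):
    """Convert a UTC start time to HH:MM strings for each zone in ZONES."""
    return {
        name: start_utc.astimezone(timezone(offset, name)).strftime("%H:%M")
        for name, offset in ZONES.items()
    }


# Start of the read-only window on Tuesday, 19 April 2016.
start = datetime(2016, 4, 19, 14, 0, tzinfo=timezone.utc)
times = local_start_times(start)
for name, hhmm in times.items():
    print(f"{hhmm} {name}")
```

To check another zone, add its abbreviation and offset to `ZONES`.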

Other effects:

  • Background jobs will be slower and some may be dropped.

Red links might not be updated as quickly as normal. If you create an article that is already linked somewhere else, the link will stay red longer than usual. Some long-running scripts will have to be stopped.

  • There will be a code freeze for the week of 18 April.

No non-essential code deployments will take place.

This test was originally planned to take place on March 22. April 19th and 21st are the new dates. You can read the schedule at wikitech.wikimedia.org. They will post any changes on that schedule. There will be more notifications about this. Please share this information with your community. /User:Whatamidoing (WMF) (talk) 21:07, 17 April 2016 (UTC)

Proposal to globally ban WayneRay from Wikimedia

Per Wikimedia's Global bans policy, I'm alerting all communities in which WayneRay participated that there's a proposal to globally ban his account from all of Wikimedia. Members of the Wiktionary community are welcome to participate in the discussion. --Michaeldsuarez (talk) 14:53, 18 April 2016 (UTC)

That user has only two edits to English Wiktionary, both of which have been deleted. —Aɴɢʀ (talk) 14:55, 18 April 2016 (UTC)
And one of those was to a transwiki that was later deleted, while it was still at Wikipedia. Their only real edit here was creation of a user page that was deleted an hour and a half later as "promotional material". Definitely not a participant here. Chuck Entz (talk) 01:27, 19 April 2016 (UTC)

Cascading protection of the main page

The main page has cascading protection. This prevents non-admins from editing Words of the Day on the day the word is featured, while allowing them to usually edit and set Words of the Day, which is good and is AFAICT the intention behind the cascading protection. However, it has some negative effects, too: it blocks people from editing certain modules, such as Module:labels/data, when labels appear on the main page. (See discussion.) Should we remove cascading protection from the main page? Alternatively, I think labels used to be "converted" by hand to (''text'') before being plugged into the WOTD templates, whereas that doesn't happen anymore... we could make an effort to convert the labels on all the WsOTD that have been set. - -sche (discuss) 00:48, 20 April 2016 (UTC)
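To illustrate why a label on the main page ends up locking Module:labels/data, here is a toy model of cascading protection as a walk over transclusions. It is a sketch under stated assumptions, not MediaWiki's actual implementation: the function name `cascade_protected`, the page titles "Wiktionary:Main Page", "Template:WOTD today" and "Template:WOTD tomorrow", and the whole transclusion graph are hypothetical.

```python
from collections import deque


def cascade_protected(transclusions, protected_roots):
    """Return every page locked indirectly by cascading protection.

    transclusions maps a page title to the titles it transcludes or invokes.
    protected_roots are the pages that carry cascading protection themselves.
    """
    locked = set()
    queue = deque(protected_roots)
    while queue:
        page = queue.popleft()
        for used in transclusions.get(page, ()):
            if used not in locked:
                locked.add(used)
                queue.append(used)
    return locked


# Hypothetical transclusion graph, for illustration only.
transclusions = {
    "Wiktionary:Main Page": ["Template:WOTD today"],
    "Template:WOTD today": ["Template:lb"],
    "Template:lb": ["Module:labels/templates"],
    "Module:labels/templates": ["Module:labels/data"],
    "Template:WOTD tomorrow": ["Template:lb"],
}

locked = cascade_protected(transclusions, ["Wiktionary:Main Page"])
print(sorted(locked))
```

In this model, tomorrow's word stays editable because the main page does not transclude it yet, while everything reachable from today's word is locked. Replacing the label template with plain text in the featured word cuts the chain before it reaches the modules.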

I think labels should be converted to {{qualifier}} for WOTDs. --WikiTiki89 16:04, 20 April 2016 (UTC)
Or even a qualifier template used only on the Main Page! Renard Migrant (talk) 21:58, 20 April 2016 (UTC)
Better yet, they should be substed. Chuck Entz (talk) 01:32, 21 April 2016 (UTC)
I'm not sure that would work. Substing {{lb|en|chemistry}}, for example, yields {{#invoke:labels/templates|show}}. —Aɴɢʀ (talk) 13:51, 21 April 2016 (UTC)
The template code can be tweaked to make substing possible, but I don't think that's a good solution, since it would add too much unnecessary mark-up. --WikiTiki89 15:01, 21 April 2016 (UTC)
Modules can be substituted, and there is also the Lua function mw.isSubsting() which a module can use to return different results depending on whether it's substed or not. —CodeCat 15:45, 21 April 2016 (UTC)
Yes they can be, but they won't be unless the template's code is changed to something like {{<includeonly>safesubst:</includeonly>#invoke:labels/templates|show}}. --WikiTiki89 15:57, 21 April 2016 (UTC)

Let's kill nds-de/nds-nl.

I've always been hesitant to bring it up because it's not really a high priority issue and since I don't know how to bot, I would be either heaping the work on somebody else or have to work through 500 pages manually. Also I was partially at fault for the adoption of these tags in my early days on Wiktionary and learning about Low German. But being more acquainted with both, I don't see any justification for this split, especially considering that "Low German" covers the last 400 years and not only post WWII. The only actual distinction that runs sharply on the border is orthographic tradition, though you can find some Dutch traditions in Western Germany as well, e.g. "zy" for [zɛɪ] in Eastern Frisia around 1890, which is when modern Low German had its greatest international public attention.
For reference: The distinction between nds-nl and nds-de arose on nds.Wikipedia because they couldn't settle on a way to spell things, but that is an issue we as a descriptive dictionary don't face. We currently have 99 nds-NL lemmas, 405 nds-DE lemmas and 548 nds-only lemmas. Opinions, commentary? Korn [kʰũːɘ̃n] (talk) 10:23, 21 April 2016 (UTC)

I'm fine with merging the codes as long as we have regional labels for them allowing the categories Category:Dutch Low German and Category:German Low German to be retained as regional varieties of Low German. We could even have separate categories for all of the subdialects of Dutch Low German that have their own ISO codes. —Aɴɢʀ (talk) 13:49, 21 April 2016 (UTC)
I know nothing or almost nothing about the language; however, based on what I've seen on Wiktionary this seems like a good idea. The Norman Wikipedia I know has similar issues with spelling because Jersey and Guernsey spellings are so different. Renard Migrant (talk) 20:08, 21 April 2016 (UTC)
I've had problems deciding which template to use, and now I just use the nds template. With Norwegian words it's hard to tell whether they're of Low German or Middle Low German origin, so I refer to Den Danske Ordbog as well, where they usually differentiate between nedertysk and middelnedertysk. What language came before Middle Low German, by the way? Donnanz (talk) 20:22, 21 April 2016 (UTC)
Old Saxon (osx). —Aɴɢʀ (talk) 21:36, 21 April 2016 (UTC)
I think we should merge these languages only if Norwegian is also merged, it's more or less the same issue. —CodeCat 21:41, 21 April 2016 (UTC)
Ooh, don't start that one up again. What about merging Scots with English?
@Angr: I thought so, cheers. Donnanz (talk) 21:53, 21 April 2016 (UTC)
I'm definitely opposed to making this merger dependent on some unrelated merger. —Aɴɢʀ (talk) 22:08, 21 April 2016 (UTC)
If there are only around 1000 nds entries in all, we have only scratched the surface. I have come across many more without entries when entering etymology, and the same goes for gml. Donnanz (talk) 22:31, 21 April 2016 (UTC)
While the situation within Low German might be similar to that of Norwegian, this is not bound to national borders, which is what we wrongly imply. Westphalian dialects exist in both Germany and the Netherlands, they are very similar to each other and markedly different to the other dialects within the same nation. East Frisian and Twents on the other hand are two dialects which have the political border running right through them without too great an impact. Sometimes not even on orthography, as the zy-example shows. The dialects within Germany might agree roughly on which glyphs not to use. But since they have vastly different phonetic inventories, and are often influenced by different regional traditions, they do not employ identical spellings either, even if they agree on the pronunciation. Further they do not agree on declension patterns, number and form of articles and pronouns, and number of grammatical cases. So while the situation within Low German might be like that of Nynorsk/Bokmål/Rigsdansk, that is simply not the issue nds-de and nds-nl are dealing with. What they are dealing with is spelling. They would separate entries with the same pronunciation, meaning and grammar into separate L2 headers purely based on spelling. And this is what I ask to do away with. Korn [kʰũːɘ̃n] (talk) 10:48, 22 April 2016 (UTC)
I know that we are descriptive, but we must always be forward facing. What does the future hold for these two varieties? Will they trend towards greater divergence in future or no? I know, we can cross that bridge when we get to it, but why wait? Planning today for the future is never foolish. Leasnam (talk) 19:11, 22 April 2016 (UTC)
I don't know about Dutch Low German, but I suspect German Low German will become extinct before it has a chance to either converge with or diverge from Dutch Low German. Trying to find a fluent native speaker of German Low German below the age of 50 is like trying to find a hay-colored needle in a hay stack. —Aɴɢʀ (talk) 19:19, 22 April 2016 (UTC)
Hmm, good point. Then what's left will be just the one variety. But (hopefully) like Bavarian and Swiss German, there is always hope :) Leasnam (talk) 19:25, 22 April 2016 (UTC)
I don't see why being forward facing would include making a random guess at the outcome of the next few decades of development. Whatever their result, they won't undo the last centuries either. And if they change that much, they probably deserve their own L2 as a new phase of the language. My point was that it seems only sensible to me that either we separate the variants based on how different they are, or group them all as one. If we split them up, German Low German has to be parted into the actual dialects - which range from something between 4 and 40, depending on what you split by. If we're not going to split up Low German into its actual dialects, I don't see why we would create the singular case that groups of spellings would get a separate language tag here. (My understanding is that Nynorsk also differs in grammar from Bokmål, but, again, "both" Low Germans already differ in grammar and spelling from themselves plenty.) Korn [kʰũːɘ̃n] (talk) 20:07, 22 April 2016 (UTC)
No one in their right mind would suggest a random guess. But we can gauge based on the past 100 years or so whether they are becoming more alike, remaining static, or diverging. Leasnam (talk) 21:16, 22 April 2016 (UTC)
Maybe there's a parallel with Flemish and Swiss German which aren't regarded as separate languages here, despite the fact they are spoken in countries other than the home country of the original language. Donnanz (talk) 13:45, 25 April 2016 (UTC)
This comparison seems correct to me. I do not see how a divergence of dialects in the future justifies the split as we have it (for less than half of the entries), though. The future is neither the present nor the past, which is what we record. And I don't see why one difference should get special treatment over all other differences which are not tied to a political border. Be that as it may, since there doesn't seem to be a hard opposition, I'll just use the plain NDS-tag in etymology and descendant sections, especially since it's the major one anyway. Korn [kʰũːɘ̃n] (talk) 15:06, 25 April 2016 (UTC)
We are a dictionary, not a crystal ball. Predicting the future is not our business and should not drive any of our decisions. --WikiTiki89 15:10, 25 April 2016 (UTC)
When Wiktionary merged the various dialect codes which the Ethnologue / ISO encoded alongside nds (Sallands, Westphalian, etc) into just two codes mirroring the two Wikipedia codes, I hoped it would make the situation less messy, avoiding some of the debates over spelling and capitalization because some trends are much clearer (even if not entirely uniform) inside nds-de and nds-nl than in nds as a whole, partly as a result of the separate pull that Dutch has exerted on nds-nl and that German has exerted on nds-de.
nds-de and nds-nl are far more dissimilar than e.g. sr, hr and bs (which are functionally identical, all based on one specific subdialect). There are dialects on the edges of each which bleed into the other, as is also true of e.g. Scots vs English (where sentences can sometimes be hard to classify as Scots vs Scottish English). There is also fine internal variation, especially historically: for example, some trends and specific words distinguished similar varieties of Low Prussian from each other, not just from Western Pomeranian vs Mecklenburgish, etc — this is true of most Germanic lects, e.g. Central Franconian, Swedish, even English (contrast da yooge boid ate da olykoek /də judʒ bɜjd eɪt də ˈ(oʊ~oə).lɪ.kʊk/ and the huge bird ate the doughnut /ðə hjudʒ bɝd eɪt ðə ˈdoʊ.nʌt/).
One user previously proposed merging not only nds-de and nds-nl but also pdt into nds as nds.Wiktionary does; I oppose merging pdt into nds. I neither support nor oppose merging nds-de and nds-nl.
- -sche (discuss) 16:31, 25 April 2016 (UTC)
What's your reasoning for Plautdietsch? And what is your definition of it? Many nds-de entries have a tag including Low Prussian, which is what I understand as Plautdietsch. Korn [kʰũːɘ̃n] (talk) 22:40, 26 April 2016 (UTC)
Plautdietsch is a Low-Prussian-derived lect that was historically (and still is) spoken outside Prussia, in America and Ukraine and Russia and elsewhere. It has been kept separate on account of its separate geographic development, just like Luxembourgish and Transylvanian Saxon have been kept separate from each other and from Central Franconian, and Pennsylvania German and Volga German have been kept separate from Rhine Franconian, etc. Low Prussian is the lect that was historically spoken inside Prussia. We have entries in it because it is well- and accessibly-documented.
I have periodically questioned if it was sensible to do things that way — merge all the "inland" varieties of Rhine Franconian under one code, but keep "outland" varieties like Pennsylvania German and Volga German all separate, and likewise Hunsrik, etc, etc. It is convenient in some ways, and the weight of precedent is firmly behind it when it comes to German lects. But I periodically muse that we seem to be doing just the opposite of Ethnologue: they split everything except languages they speak, e.g. they split Fula varieties but not New York vs London English; whereas, we merge Fula but split up (some of) the languages we speak.
- -sche (discuss) 01:53, 27 April 2016 (UTC)
I'll state for the record that I think our language codes for languages which are not officially codified by some nation should be divided solely by difference in grammar and phonology. Minor variations thereof, as well as political and lexical differences, should be the criteria for tags and sub-categories. Exempli gratia I would class Berlinerisch as de. Korn [kʰũːɘ̃n] (talk) 21:15, 27 April 2016 (UTC)
Why even make an exception for languages codified by a nation? --WikiTiki89 21:20, 27 April 2016 (UTC)
Because if we were to start defining and splitting Scandinavian lects based on actual differences, we would become unusable to an average person who wants to look up a word in Swedish and not Svea-Geatlandish or whatever we'd arrive at. My hunch is that anything with an official set of rules and spellings should have its own header, since it's what people would most likely look up. Don't pin me down on it, though, this statement feels like thin ice. Korn [kʰũːɘ̃n] (talk) 11:07, 28 April 2016 (UTC)
There's no need to use funny names; we'd just call it Southern Swedish or something. Wouldn't people wanting to look up a Norwegian word already be confused that there is no such thing as Norwegian, but only "Norwegian Bokmål" and "Norwegian Nynorsk"? And that Serbian, Croatian, and Bosnian don't exist, but instead Serbo-Croatian does? We shouldn't worry too much about whether people will be confused by which languages we choose to merge and which to split. --WikiTiki89 15:13, 28 April 2016 (UTC)
The southern Swedish dialect is known as Scanian. Donnanz (talk) 15:34, 28 April 2016 (UTC)
We got one of the problems right there: Is Scanian really southern Swedish? I've seen it classified as Eastern Danish. And there are things like Tronderish/Jamtlandic, whose area is in both Norway and Sweden for like 50%. Is that Central Norwegian or Central Swedish? Nor do I think that people would understand "South Swedish" to be a supra-term to what they understand to be just "Swedish", rather than a subcategory of it. You got a point about Bokmål and Nynorsk nomenclature. I guess it comes down to whether one can expect users to intuitively understand that the term covers what is looked for; I don't know if that is or is not the case for Norwegian and any other language in that vein. Korn [kʰũːɘ̃n] (talk) 16:42, 28 April 2016 (UTC)


Discussion moved from User talk:Stephen G. Brown#Palochka.

I see that you were discussing the use of uppercase and lowercase Palochka back in 2012 and 2013. Currently, ru.wiktionary (XML dump) contains some 5000 uppercase and 1000 lowercase Palochka (mixed in with a few hundred uppercase Latin I and Cyrillic dotted I), while en.wiktionary and fr.wiktionary today almost entirely use the lowercase version and tr.wiktionary and ce.wikipedia (Chechen Wikipedia) only use the uppercase version. Would it be possible to find a consensus for all WMF projects on which version of Palochka to use? Should ru.wiktionary, which currently has a mix, change all to uppercase or to lowercase? --LA2 (talk) 00:56, 17 April 2016 (UTC)
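A tally like the one described above could be sketched roughly as follows. This is only an illustration: the function name and labels are made up here, and a real count over an XML dump would need to restrict itself to words in the relevant languages, since Latin I and Cyrillic dotted I are ordinary letters elsewhere.

```python
from collections import Counter

# Code points that turn up where a palochka is intended.
# The labels are illustrative, not taken from any existing tool.
PALOCHKA_VARIANTS = {
    "\u04c0": "uppercase palochka (U+04C0)",
    "\u04cf": "lowercase palochka (U+04CF)",
    "\u0049": "Latin capital I (U+0049)",
    "\u0406": "Cyrillic dotted I (U+0406)",
}


def count_palochka_variants(text):
    """Count every occurrence of each palochka-like code point in text."""
    counts = Counter(ch for ch in text if ch in PALOCHKA_VARIANTS)
    return {PALOCHKA_VARIANTS[ch]: n for ch, n in counts.items()}
```

Running it over a string containing "Къатӏар" and "КъатӀар" would report one lowercase and one uppercase palochka.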

I would much prefer that all uses of palochka be of one case. Originally there was only one palochka, Ӏ (u04c0), and I think it should have stayed that way. Today, the original palochka has become the uppercase palochka, so I support the use of the uppercase for all cases. Since uppercase and lowercase palochkas look identical, having it in two cases would lead to a lot of misspellings and trouble.
It would be great if a consensus among all WMF projects could be reached, but I do not know how to go about doing it. —Stephen (Talk) 07:28, 17 April 2016 (UTC)
I agree that lower-case palochka should never be used. --WikiTiki89 14:26, 18 April 2016 (UTC)
I'd prefer lower-case palochka to be used where appropriate. Many wiki-projects go through normalisation. Wikipedias in languages that use palochka will catch up eventually. It's only correct to spell "Qatar" as Къатӏар ‎(Q̇aṭar) (lower case) in Chechen but the spelling "КъатӀар" (upper case palochka) is common. --Anatoli T. (обсудить/вклад) 09:02, 19 April 2016 (UTC)
I noticed that on my iPhone, the lower-case palochka is actually shorter than the upper-case palochka. In other fonts (on my computer), the lower-case one looks like a lower-case "L/l", and the upper-case one looks like an upper-case "I/i"; and so they range from distinguishable by serifs only, to nearly indistinguishable. @Atitarev: Can you link to professionally type-set printed text in a language that uses palochka? --WikiTiki89 19:37, 19 April 2016 (UTC)
No, I don't think I will be able to link to properly type-set texts in North Caucasian or other Russian minority languages, even for common words like лугӏат ‎(luġat). Cf. Chuvash сăмахсар (Roman ă) vs сӑмахсар ‎(sămahsar), Ossetian цæхх (roman æ) vs цӕхх ‎(cæxx). The Internet penetration and digitization of these minority languages is very poor and hard to verify. You'll find that misspellings are often more common than standard or standardized forms. Vahagn might know more about North Caucasian spellings, fonts and the mix-up with palochka and other special symbols, which are missing in other Cyrillic-based languages. We should try and normalize these spellings. I think we can compare the use of Hebrew "׳" vs a simple apostrophe ' or Arabic hamza, which is used inconsistently out there but we use it here. As a dictionary, we can afford using the latest standard forms. --Anatoli T. (обсудить/вклад) 01:42, 20 April 2016 (UTC)
The professionally typeset books of Soviet times do not distinguish the case of palochka. I am in favour of distinguishing lowercase palochka from the uppercase one, because proper nouns can start with a palochka. For example, Chechen Ӏаьрбийн Цхьанатоьхна Эмираташ ‎(ʿärbīn Cḥanatöχna Emirataš, United Arab Emirates). --Vahag (talk) 06:00, 20 April 2016 (UTC)
@Vahagn Petrosyan, Wikitiki89 Our transliteration modules can't handle all possible misspellings either and the translation adding tool User:Conrad.Irwin/editor.js needs some attention for North Caucasian languages (see lines starting after "var diacriticStrippers"). Compare Avar Къатӏар ‎(Q̇̄aṭar) (lower case palochka, correct) and КъатӀар ‎(Q̇̄atӀar) (upper case palochka). The latter is transliterated incorrectly "Q̇̄atӀar". If we use I, l, 1, | instead of proper palochkas, the results are even worse. --Anatoli T. (обсудить/вклад) 08:00, 20 April 2016 (UTC)
@Vahagn Petrosyan: What do current professionally type-set texts do? How would professionally type-set text have rendered United Arab Emirates then and now? --WikiTiki89 15:45, 21 April 2016 (UTC)
According to the few modern sources I have, they too do not distinguish the case of palochka. Compare this extract from {{R:ce:Aliroev}}. --Vahag (talk) 16:06, 21 April 2016 (UTC)
Do your sources say anything about whether the next letter is capitalized (i.e. something like ӀАьрбийн Цхьанатоьхна Эмираташ ‎(ʿÄrbīn Cḥanatöχna Emirataš))? --WikiTiki89 17:26, 21 April 2016 (UTC)
They don't capitalize the next letter. --Vahag (talk) 06:28, 22 April 2016 (UTC)

Yes, we should aim for normalization. But is it settled that the lowercase Palochka is the future? Maybe adding it to Unicode 5.0 was a mistake that we should better ignore? (In the 1970s, the Swedish Academy tried to change the spelling of zebra to sebra and juice to jos, but these reforms didn't catch on, nobody used them, and they are now a laughing matter of a bygone era when overly optimistic planners thought they had an influence that indeed they lacked.) If there is a scientific/societal consensus to use the lowercase Palochka, then we should do so. But is there? It seems to me that the Chechen Wikipedia is already fully normalized and standardized on using the uppercase Palochka. What exactly is wrong with that? --LA2 (talk) 22:09, 20 April 2016 (UTC)

I really favor using only the uppercase as the Chechens do. The palochka is nothing more than a version of the apostrophe, and we would not want to have upper- and lowercase apostrophes, would we? And there is no reason why the unicase palochka couldn’t be used as the initial letter of a proper noun.
It should not be a difficult choice. At the moment, it seems that the speakers and writers of the affected languages themselves prefer to use the unicameral palochka, and we should follow suit. If at a future time the writers of these languages decide that they want two cases for the palochka, it will be a simple matter for us here to make this change on Wiktionary. Just as the readers and writers of a language should have the right to decide the spellings of their words, and how to punctuate their language, those same people should have the say in whether the palochka is unicase or dual case, and we should bow to their usage. It can always be changed later on if the users of the languages want to do it. —Stephen (Talk) 01:02, 21 April 2016 (UTC)
I think this topic should be moved to Beer parlour, it concerns language policies, transliterations of a few languages. Whatever decision is made, could be accompanied by some bot work, changes to translit modules and policy pages. --Anatoli T. (обсудить/вклад) 21:53, 21 April 2016 (UTC)
Topic moved from User talk:Stephen G. Brown#Palochka. —Stephen (Talk) 08:41, 22 April 2016 (UTC)
I think the original reason for adding a lowercase character is that Unicode rules were changed such that case-pairings are fixed and can never be changed (after the glottal stop debacle). As such, a lowercase-only character can get an uppercase counterpart encoded later, but the opposite isn't possible anymore. As such, lowercase counterparts were added for all uppercase-only characters in Unicode (there were only about 5 or 7 or so) so they wouldn't run into problems in the case that a lowercase counterpart would later be needed. -- Liliana 08:55, 22 April 2016 (UTC)
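The case pairing mentioned above can be checked with Python's standard library, which follows the Unicode Character Database. A small sketch (the helper name `describe` is just for illustration):

```python
import unicodedata

UPPER = "\u04c0"  # the original palochka, now the uppercase member of the pair
LOWER = "\u04cf"  # the lowercase counterpart added in Unicode 5.0


def describe(ch):
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(UPPER))
print(describe(LOWER))
# The two code points are case partners of each other.
print(UPPER.lower() == LOWER)
print(LOWER.upper() == UPPER)
```

One practical consequence is that generic lowercasing of text (for sorting or searching) will turn an uppercase palochka into the lowercase one, whichever form a project settles on for display.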
It may be the background story but we need to decide, if we're going to use upper case palochka or use upper/lower case palochkas in the concerned languages. I no longer insist on using both, like with other, usual upper/lower case letters. It seems there is more evidence that we need to stick to upper case palochka. --Anatoli T. (обсудить/вклад) 09:23, 22 April 2016 (UTC)
I too don't care that much about which option we choose, as long as we follow it consistently. --Vahag (talk) 09:47, 22 April 2016 (UTC)
In previous discussions, it seemed intuitive to me that we should make the case distinction, for the reasons Anatoli gives above. However, there is a big difference between using what amounts to CamelCase, and mixing scripts as in the цæхх vs цӕхх example. If the speakers themselves apparently always or mostly use uppercase, then I guess we should go with that. There is at least one other language with a standard orthography that uses only one member of a pair of cased letters, although it's a constructed language (Klingon uses only I and no i). - -sche (discuss) 15:18, 22 April 2016 (UTC)
The other difference is that with the Latin æ vs. Cyrillic ӕ, we can say that even though we are going against widespread internet practice, we are still staying true to the printed orthography, since these codepoints are not typographically different; however, the lower-case palochka goes against the printed orthography as well, and therefore I oppose using it. --WikiTiki89 15:34, 22 April 2016 (UTC)
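If the outcome is to use only the uppercase palochka, the bot work mentioned earlier in the thread might look something like this sketch. It is an assumption-laden illustration, not an existing module: the names are hypothetical, Python 3.7+ is assumed, and it must only be applied to languages that actually use palochka, because Cyrillic dotted І is a regular letter in e.g. Ukrainian.

```python
import re

PALOCHKA = "\u04c0"  # uppercase palochka

# Characters often typed where a palochka is meant: lowercase palochka,
# Latin I and l, digit 1, vertical bar, Cyrillic dotted I.
LOOKALIKES = "\u04cfIl1|\u0406"

_CYR = "\u0400-\u04ff"          # basic Cyrillic block, used as a regex range
_LOOK = re.escape(LOOKALIKES)   # escape "|" for use inside a character class

# Replace a look-alike only when it touches a Cyrillic letter, so that
# plain Latin text or numbers are left alone.
_PATTERN = re.compile(rf"(?<=[{_CYR}])[{_LOOK}]|[{_LOOK}](?=[{_CYR}])")


def normalize_palochka(word):
    """Replace palochka look-alikes next to Cyrillic letters with U+04C0."""
    return _PATTERN.sub(PALOCHKA, word)
```

With this, both "Къатӏар" (lowercase palochka) and a spelling typed with Latin I would come out with the uppercase palochka, while a purely Latin string is untouched.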

Should definitions and glosses reflect the lemma form of a lemma?

For Latin verbs, most of our entries show the definitions in the first person singular. For example abdico. This is presumably because the lemma form is the first person singular present. I'm not sure if this practice makes sense, though, because the lemma represents all forms and not just the first-person singular present. The Latin lemma should be translated with an English lemma, and in English, it is usual to lemmatise verbs in the infinitive, so we should use it in definitions also. Compare verb lemmas for Bulgarian, Irish, Macedonian, Welsh and probably many other languages that have no infinitive, where we already do this. —CodeCat 18:51, 22 April 2016 (UTC)

I agree that lemmas should be translated and glossed with lemmas. For example, dīcō should be defined as "to say". Just like Arabic قَالَ ‎(qāla), which is the 3rd person singular past tense, is defined as "to say", and Macedonian каже ‎(kaže), which is the 3rd person singular present tense, is defined as "to say". --WikiTiki89 19:01, 22 April 2016 (UTC)
I agree in principle, but it will be difficult to get people to comply, perhaps especially for Latin and Ancient Greek. This may be due in part to how these languages are taught. When I was in school and competing in certamen at Junior Classical League conventions, your answer would be considered wrong if you said something like "dīcō – to say". We were constantly reminded to answer along the lines of either "dīcō – I say" or "dīcere – to say". —Aɴɢʀ (talk) 19:16, 22 April 2016 (UTC)
I disagree, as we are defining the word and dico, for example, does not mean "to say". Lemma-to-lemma makes sense on some levels, but I think it's better to provide an accurate translation of that form of the word, especially because beginners in the language might not be aware that the lemma of a Latin verb is not the same form of the verb as the English lemma, and this could be a potential source of confusion. Andrew Sheedy (talk) 11:51, 26 April 2016 (UTC)
Would you then agree with defining Arabic verbs with things like "said", "opened", "defined", etc? And Welsh verbs with "saying", "opening", "defining"? —CodeCat 13:20, 26 April 2016 (UTC)
@CodeCat: That is one reason why Welsh verbs' lemmata should not be the verbnoun. — I.S.M.E.T.A. 16:01, 5 May 2016 (UTC)
We aren't defining the word. We are defining the paradigm. In some languages, the lemma form does not even cleanly correspond to any English tense. Furthermore, many languages lack an infinitive, so what would we put in translation tables? --WikiTiki89 15:10, 26 April 2016 (UTC)
This is another good point. Since we translate from English lemmas to whatever lemma the other language uses, we should translate back into English from lemma to lemma too. It doesn't make much sense for a translation of a Latin verb to appear on the English infinitive page, only for that verb to define itself as a 1st person form. —CodeCat 15:49, 26 April 2016 (UTC)
This is an old subject with previous discussion, Wiktionary:Tea_room/2009/November#nāscor. There is even a vote draft that was intended to address this issue: Wiktionary:Votes/pl-2009-12/Definition layout. --Dan Polansky (talk) 14:14, 30 April 2016 (UTC)

Categorize ditransitive verbs by template label

Please do. Thanks.--Dixtosa (talk) 12:01, 23 April 2016 (UTC)

Done. - -sche (discuss) 02:27, 27 April 2016 (UTC)

Global ban of Liliana-60

FYI, I sent a message today to the e-mail in the global account log of Liliana-60 to ask exactly why she was globally blocked. I'm waiting for their response. --Daniel Carrero (talk) 19:06, 23 April 2016 (UTC)

Death threats. — Ungoliant (falai) 19:26, 23 April 2016 (UTC)
  • The transparency has been quite underwhelming. I'm curious to see how they respond to your query. —Μετάknowledgediscuss/deeds 19:33, 23 April 2016 (UTC)
    Contrast #Proposal to globally ban WayneRay from Wikimedia above; it links to guidelines that seem to have been ignored in this case. —Μετάknowledgediscuss/deeds 19:36, 23 April 2016 (UTC)
    I'm guessing you'll get no reply at all. Renard Migrant (talk) 20:41, 23 April 2016 (UTC)
  • Before you complain about transparency, note that the m:WMF Global Ban Policy clearly states "Also, to protect the privacy of all involved, the Wikimedia Foundation generally will not publicly comment on the reason for any specific banning action." and "Please note that questions about specific WMF global bans will not be addressed, to protect the privacy of all involved." I think this is perfectly reasonable. If the ban was wrongful, Liliana should be the one to sort it out. Also, I'm not sure whether this is significant or not, but three other accounts were globally banned on the same day. --WikiTiki89 15:20, 26 April 2016 (UTC)
    I don't know whether it's significant or not either, but it certainly wouldn't put anybody's privacy at risk to tell us. And yet they won't. (Also, as noted before, they already broke their own rules on global bans in this case.) —Μετάknowledgediscuss/deeds 15:25, 26 April 2016 (UTC)
    It seems there is a distinction between community-initiated global bans and WMF-initiated global bans. The rules for the former do not apply to the latter, so no rule was broken. --WikiTiki89 15:40, 26 April 2016 (UTC)
I am unhappy with the process the WMF is using, it seems to me that anyone can be blocked without any evidence being given to justify that block. The whole point of the WMF is that the community is self-governing. - TheDaveRoss 15:38, 26 April 2016 (UTC)
Just because evidence is not released to the public, doesn't mean none was presented. And you can't judge the effects of releasing that information if you don't know what that information is. I'm sure the WMF doesn't take any of this lightly and certainly would not ban just "anyone". --WikiTiki89 15:49, 26 April 2016 (UTC)
They do not need to reveal the contents of the edit to be at least partially transparent. They could, for instance, post something here (where Liliana was very active) saying that they were taking an action, and justifying that action. Unilateral and silent action is against the spirit of the project, and while I agree that they would probably not wantonly ban anyone, Liliana would likely disagree. She is unable to disagree here. - TheDaveRoss 16:59, 26 April 2016 (UTC)
Liliana is able to disagree in private conversation with the WMF. --WikiTiki89 17:32, 26 April 2016 (UTC)
There is a major flaw with that argument, they hold all of the cards and she holds none. There is a reason why trials in civilized society always include impartial third parties, either a jury or the state or both. When community members make blocks there are other community members who can review the block and judge whether or not it was justified. It happens all of the time on this project. When the WMF blocks someone there is no similar check, so the possibility for abuse is much higher. - TheDaveRoss 17:47, 26 April 2016 (UTC)
The WMF is the impartial third party. Equivalent somewhat to "the state" (but I would caution not to take that analogy too far, there are many differences between an organization and a state). --WikiTiki89 18:02, 26 April 2016 (UTC)
They are not the third party, they are not impartial. The WMF has interests of its own, and has recently shown to go to some lengths to hide those interests from constituents. That whole mess with board members etc. being removed and resigning recently demonstrates that the small group of people who represent the organization officially are not infallible and should work harder to operate in the open. The community writ large is the third party in this situation, the WMF is judge, jury and executioner. - TheDaveRoss 18:28, 26 April 2016 (UTC)
They are a third party. Whether they are impartial depends on what the topic is. As far as Liliana's block is concerned, they certainly are impartial. Liliana didn't do anything personally to them (at least as far as we know). --WikiTiki89 18:32, 26 April 2016 (UTC)
Prove it. You want to argue semantics, I would rather address the issue, which is that the WMF took unilateral action without providing justification or an open mechanism for appeal or community input. I am not the kind of person who distrusts authority on spec, but this process is hypocritical from an organization founded on the concept of openness and community control. I agree that the WMF needs to have the ability to take decisive action to protect the project and the community, but those actions should be very rare and be well justified. - TheDaveRoss 18:46, 26 April 2016 (UTC)
I'm not arguing semantics; if you can't see past the semantics to what I'm actually trying to say, then I'll try to explain it better. I can see that your issue is with the lack of openness, which has nothing to do with whether the WMF is an impartial third party (and I have explained why I think so). We are in agreement that there is a lack of openness, but the question remains whether the lack of openness is justified. You say "those actions should be very rare and be well justified". There have been 15 of these bans since 2012. I would say that makes it fairly rare, considering the size of the entire community, but you may disagree with that. Whether they are justified, and whether the lack of openness is justified, is not something you can judge without all of the information, which none of us have. But as long as they remain rare, I am willing to assume they are justified. And I'll reiterate that even if it's not justified, it's not as though Liliana has no recourse. --WikiTiki89 19:12, 26 April 2016 (UTC)
The only way I can judge whether the lack of openness is justified is if the WMF justify their actions, and they have not. Fifteen bans is relatively few; the four this week, however, are another story, but it does seem that the actions are infrequent. Even the infrequent actions, however, should be justified. By this I do not mean that they should have justification in the mind of the person taking action, I mean that the person taking action should explain the action to the community. You hit the nail on the head in stating that this "is not something you can judge without all of the information," and I completely agree, which is why more information should be made available. They do not need to reveal the specific offending text, but they should be able to cite the policy under which they acted, and describe the reason why they felt the action was necessary. Concerning Liliana's recourse, the entity to whom she can appeal is the same entity who she believes has acted in bad faith. If I block someone unjustly the person I block can appeal to you; Liliana is not afforded the same recourse. - TheDaveRoss 19:45, 26 April 2016 (UTC)
My point was that if you don't know what they're not saying, you can't judge whether or not they should be saying it. Now did Liliana say that the WMF acted in bad faith, or is it just you who's saying that? And don't forget that the WMF isn't just one person either. --WikiTiki89 19:53, 26 April 2016 (UTC)
Liliana said that the initial set of blocks, and the removal of her permissions elsewhere were unjustified, so I assume she also considers the further action similarly. I have not heard from her since this block came along. You are right that I can't know what they are not saying. I just can't imagine any scenario in which silence is necessary. If Liliana posted nuclear launch codes then they should say that she was blocked for violating whatever policy is in place for such an action. They don't have to give out the codes to explain their actions. Also, though the WMF is more than one person, this action was taken using the anonymous account, so I don't know who actually made the block. - TheDaveRoss 20:09, 26 April 2016 (UTC)
There's a difference between unjustified and done in bad faith. But anyway, I do think it would be reasonable and desirable for WMF to release some information. I just wanted to point out that their current policy itself states that they will not say anything, so we should complain about the policy rather than about the silence itself. I don't know how I even got sucked into this argument. Also, another point is that if WMF states their reason and then is later proven wrong about it, it could be seen as slanderous. --WikiTiki89 20:48, 26 April 2016 (UTC)
In the end it probably comes down to the fact that "the community" can't be sued or tried for a crime whereas WMF can. So WMF needs to have the ability to act quickly and decisively if they feel WMF is at risk. To disclose to the community the details of the allegations and evidence may run the risk of increasing WMF's legal risk.
I am reminded of a Saturday Night Live skit involving a press conference during the Gulf war. DCDuring TALK 16:14, 26 April 2016 (UTC)
As a stakeholder in the WMF, a suit brought against them is a suit against my interests as well. I understand that organizations need to protect themselves (I am on the board of a non-profit myself, so I have some [limited] experience with such considerations). As I said above, it is not that I oppose the action, it is the process by which this action was taken that I am in opposition to. I think that the community should be a safe place for all, and if Liliana's words truly violated that safety then perhaps a ban was called for. Without any form of evidence or justification presented I cannot be comfortable with the action. - TheDaveRoss 16:59, 26 April 2016 (UTC)
@TheDaveRoss Therein is why non-admins don't like admins. To a non-admin, the complaint of blocks being capricious seems hardly surprising. Purplebackpack89 16:21, 26 April 2016 (UTC)
I reject your premise. - TheDaveRoss 16:59, 26 April 2016 (UTC)
  • This block process, especially the global block, has struck me as DerHexer and others talking to the right people privately, rather than anything that approaches a community consensus. Also, aren't you supposed to be disruptive on all of your main projects (of which this is one for Liliana) before being globally banned? Purplebackpack89 17:25, 26 April 2016 (UTC)

Multiple times on Wikis including this, she gave info on where users she had disputes with lived and threatened to visit. On German Wikis, she said she would resort to vigilante justice and that only people who don't shy away from murder get ahead. Multiple people issued blocks over threats and noted the danger of her having privileges that require trust, give access to deleted personally identifying information and allow blocking. As she continued to make threats, people predicted she would be banned. It's sure inexplicable she has been banned. Thrwwy (talk) 20:49, 26 April 2016 (UTC)

This is the e-mail I've sent to them on April 23, they didn't reply:


My name is Daniel Carrero, I am an administrator at the English Wiktionary.
I see that user Liliana-60 has been blocked globally by the WMF, with the summary "WMF Global Ban, questions should be addressed to ca@wikimedia.org".

Up until that block, she was an active member of the English Wiktionary.

If possible, I would like to ask why she has been blocked.

Thank you,
Daniel Carrero

On second thought, I could have said "I know she has made death threats in another project, but really the transparency of this global block has been underwhelming and so and so". Oh, well. --Daniel Carrero (talk) 05:52, 4 May 2016 (UTC)

If WMF said why they banned a user, in some cases that official statement might open them up to legal consequences. ("Oh, they said why User:X was banned, but not why User:Y was banned, so what User:Y did must have been really really bad...") So it's best for them to say that they won't comment on why a user was banned, ever. --Rschen7754 02:54, 6 May 2016 (UTC)
  • A straightforward boilerplate reply, like "We cannot comment on the specifics of any case of banning due to legal considerations," would be much more preferable and professional than simple silence. ‑‑ Eiríkr Útlendi │Tala við mig 03:41, 6 May 2016 (UTC)
    As I've already pointed out, they already say that on the m:WMF Global Ban Policy page. I don't know why we should expect them to respond to individual emails with the same statement. --WikiTiki89 15:01, 6 May 2016 (UTC)
  • Because some kind of response, even boilerplate, is preferable to stonewalling. Flat-out ignoring incoming communications is not professional behavior. (I know a lot of WMF is volunteer; I use this term professional as I lack a better term for expressing all of the bases covered by upstanding, respectful, morally sound, forthright, impartial, etc.) ‑‑ Eiríkr Útlendi │Tala við mig 00:02, 7 May 2016 (UTC)
    Of course receiving a response is preferable to not receiving one. What I'm saying is that it is unreasonable to expect them to reply when they have a policy of not commenting on such issues. It's not all about what's preferable to us. --WikiTiki89 14:52, 9 May 2016 (UTC)
True I suppose. I also double-checked that I got the e-mail address right and I checked my spam folder just in case. --Daniel Carrero (talk) 03:44, 6 May 2016 (UTC)

Linking to the constituent words in the headword templates

I find links to

  1. "ex" and "libris" in ex_libris
  2. "Marie Antoinette" in Marie_Antoinette_syndrome

totally useless and potentially misleading.

In the first example libris is an orange link (useless). ex is blue but it has nothing to do with Latin ex (misleading).

In the second case Marie_Antoinette makes one link and if you follow the link you will see a definition that has nothing to do with the syndrome (useless).

So, the questions are

  1. do you agree that both (actually three) of the links should be delinked?
  2. is there any reasonable rule to follow to identify the parts that should be linked?
    and finally super high IQ editors can contemplate on the following
  3. What's the purpose of linking to the constituent words in headword templates at all?

--Dixtosa (talk) 19:54, 25 April 2016 (UTC)

I have always advocated for the auto-linking to be opt-in. Having lost that battle, I tried advocating for a reasonably easy way to opt out other than head=the entire long phrase. I lost that battle too. --WikiTiki89 20:00, 25 April 2016 (UTC)
@Dixtosa How could links from Marie Antoinette syndrome to Marie Antoinette and syndrome possibly be misleading? (I agree that links to Marie and Antoinette are not instructive.) If w:Marie Antoinette is not helpful then perhaps the entry needs an image of a painting of Marie Antoinette when she was relatively young, but had whitened hair. I think it is our job to figure out the best way of answering questions "How did those words come to mean that?"
The default linkage to individual terms is a labor-saver because AFAICT no one here has the AI skill to make a module that could do links better than a skilled editor and links to the individual components are probably the single most usually instructive links. The meanings of many headwords of MWEs do have a comprehensible connection to the meanings of some combination of the constituent terms, sometimes of the individual components. DCDuring TALK 21:30, 25 April 2016 (UTC)

Placing template l to all links

For the record, I dislike the recent bot-made placement of {{l}} to all synonyms and other sections. In a vote, I would have opposed. I believe there must be a better solution. This is ugly and, if tabbed interface is on, unnecessary. I guess I am in a minority and this is really just for the record. --Dan Polansky (talk) 05:50, 30 April 2016 (UTC)

Mewbot and its author are also not very careful to discriminate between correct and incorrect language labels. This is the result of arbitrary (impatient?) initiation of new classes of actions without bothering to ask about consequences. I noticed the problem when I found English vernacular names under Synonyms headers appearing with orange links. This was caused by their being entemplated with {{l|mul}}. It is quite tedious to find and selectively revert inappropriate Mewbot actions. Perhaps it should be tasked with reverting any class of such errors. DCDuring TALK 12:58, 30 April 2016 (UTC)
I agree with DCDuring. MewBot's code seems too "stupid" to do the job properly ({{l|zh|&c}}?). Given the rampant inconsistency in formatting across this project, this seems like a task that must be supervised if done at all. (Dang, at least maybe check if the language section exists on the page first; that could address what seems to be the core of the complaints I've seen about this task, namely that links are being made indiscriminately and inappropriately.) —suzukaze (tc) 14:53, 2 May 2016 (UTC)
Since words don't exist independently of language I'm in favor of always linking to the section of the language in question. But yeah, saying CodeCat is unintentionally breaking things by bot is a bit like saying rain's wet. Renard Migrant (talk) 10:49, 3 May 2016 (UTC)
One of several things I dislike about "l" is that it doesn't form links in edit summaries. It's nicer to see something like "added derived term amazingly" where it's either blue or red and can be clicked on. Equinox 12:16, 3 May 2016 (UTC)
You have to write edit summaries separately anyway; you can still use bare links in them. What I like about {{l}} even for English is that it will always take me to the English section, even when I've previously been looking at another language. In tabbed languages, at least, the software remembers what language you've been looking at, and will take you to the same language (if present) when you click on a bare link. So if I'm looking at aisling#Irish and then click on the bare link [[dream]], I will be taken to dream#Irish instead of dream#English. But if the gloss had {{l|en|dream}} instead, I will be taken to the right place. —Aɴɢʀ (talk) 12:43, 3 May 2016 (UTC)
I agree with Angr. I also like when the links use the format {{l|en}}, for the reasons he mentioned. --Daniel Carrero (talk) 12:45, 3 May 2016 (UTC)
I usually create edit summaries by copy-pasting the content I added. It's quicker. Equinox 12:55, 3 May 2016 (UTC)
https://xkcd.com/1172/ --WikiTiki89 14:40, 3 May 2016 (UTC)
Not relevant. I'm not doing something foolish; copying and pasting is a standard way to move material from one place to another. Equinox 14:58, 3 May 2016 (UTC)
But if you have added the derived term amazingly to an entry, you haven't actually written the words "added derived term amazingly" to the entry (I hope), so in the example you gave above, your edit summary isn't copied and pasted from your edit. —Aɴɢʀ (talk) 15:02, 3 May 2016 (UTC)
Right. That example is not an example of me copying and pasting. It is, though, an example of a clickable blue or red link. Equinox 15:04, 3 May 2016 (UTC)

I agree with Renard Migrant, Aɴɢʀ, and Daniel Carrero. — I.S.M.E.T.A. 16:05, 5 May 2016 (UTC)
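
The check suggested earlier in this thread (see whether the language section exists on the target page before templating a link) could be sketched roughly as below. This is only an illustration in Python, not MewBot's actual code: LANGUAGE_NAMES, has_language_section and template_link are hypothetical names, the language table is a tiny stand-in, and the only wiki convention assumed is that language sections are level-2 headers such as ==English==.

```python
import re

# Hypothetical, tiny code-to-name table; a real bot would need the full language data.
LANGUAGE_NAMES = {"en": "English", "ga": "Irish", "mul": "Translingual"}


def has_language_section(wikitext, lang_code):
    """Return True if the page text has a level-2 header for the given language."""
    name = LANGUAGE_NAMES.get(lang_code)
    if name is None:
        return False
    pattern = r"^==\s*" + re.escape(name) + r"\s*==\s*$"
    return re.search(pattern, wikitext, flags=re.MULTILINE) is not None


def template_link(term, lang_code, target_wikitext):
    """Wrap a bare link in {{l}} only when the target page has that language section."""
    if has_language_section(target_wikitext, lang_code):
        return "{{l|" + lang_code + "|" + term + "}}"
    # Otherwise leave the bare link alone so a human can review it.
    return "[[" + term + "]]"
```

With a check like this, an English vernacular name would not be wrapped in {{l|mul}} unless the target page actually had a Translingual section; whether leaving the bare link is the right fallback is a separate question.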

May 2016


Is there any particular reason why the snowclones are in appendices? I would have thought they fit the main namespace nicely, except the "X" and "Y" in the page titles are a little odd.

For reference, here are all the current snowclone pages:

Note: I edited all the snowclone pages to make them use the normal entry layout. For example, many had a "Origin" section which I renamed to "Etymology". Example diff: link. --Daniel Carrero (talk) 04:40, 2 May 2016 (UTC)

It's precisely because of the X's (and Y's and Z's) that they are in the appendix. --WikiTiki89 17:00, 2 May 2016 (UTC)
Who would look up I am (something), hear me (do something)? I would like to see how this could be made into something that was demonstrably useful. If not, the Appendix is fine. DCDuring TALK 17:27, 2 May 2016 (UTC)


Previous discussion: User talk:Romanophile#Latin

Can we add definitions to forms of various words? This would allow definitions of words to be more easily accessible, especially when the internet is slower. Bman1230 (talk)

We already give definitions to forms of words. For example, chairs is defined as the plural of chair. —CodeCat 18:54, 2 May 2016 (UTC)

I mean include some info on the derived term's page. Generally if someone knows what a chair is, they don't need to look up the definition for chairs. Maybe something like this. (Imagine better formatting) Bman1230 (talk)

That would duplicate all the information on every form-of page, and it would be a nightmare to maintain. —CodeCat 21:39, 2 May 2016 (UTC)
I was thinking maybe there would be some way to load info from the other page; it would probably have to be changed by Wiktionary. Bman1230 (talk)
It would be "easy" enough to substitute plural forms for the singular forms in the definitions, but one would also have to make sure that there was number agreement with any verbs, pronouns, or other nouns that required it. Either AI or extensive tagging would be required, I think. DCDuring TALK 23:35, 2 May 2016 (UTC)
To me it seems like in the majority of cases "An item of furniture used to sit on or in comprising a seat, legs, back, and sometimes arm rests, for use by one person." is a more useful definition than "Plural of chair" despite the disagreement of number. Bman1230 (talk)

Would a mouseover system like this work? Bman1230 (talk)



  1. plural of chair
Not as text. Someone would have to enter the text, and any edit to the entry for chair would make the two entries different. I suppose you could transclude the entry at chair into the title tag, but there are limits to how much you can put in a title tag (I only see up to the first half of item 6 in your example). There are also all kinds of complications such as multiple etymologies (wound and winded are past tense forms of different words spelled as wind, and wounded is the past tense of wound, but not of the past tense of wind) that would mean you would have to set things up carefully in the main entry so that only the relevant part would be transcluded, which would be subject to getting fouled up whenever someone rearranges the main entry.
The Achilles' heel of anything you could come up with is the inability to predict or control what is done to either or both entries after you've set everything up- people add, delete, rearrange and otherwise mess with just about every aspect of entries all the time. Multiply that times thousands and thousands of main entries, and it quickly becomes impossible to maintain. Chuck Entz (talk) 02:15, 3 May 2016 (UTC)


FYI, I created Appendix:Repetition. It feels to me that this is a concept with verifiable semantic value, like Appendix:Capital letter. That's why I formatted it like an entry, too. --Daniel Carrero (talk) 19:47, 3 May 2016 (UTC)

In linguistics, reduplication is considered a kind of affix, but its meanings are language-specific, not translingual. In Indonesian, for example, reduplication is used to form plurals. In some languages, it's used to indicate a diminutive; in others, a wide variety or a large number. In older Indo-European languages (and Proto-Indo-European itself) reduplication of the initial consonant or consonant cluster of a verb root is used to form the present stem of some verbs as well as the perfect stem of most verbs that have a perfect stem. I like the idea of having an entry for the reduplication morpheme, but I doubt it should have a Translingual section. —Aɴɢʀ (talk) 21:11, 3 May 2016 (UTC)
Also, don't confuse reduplication with lengthening. Daaaaaniel is an example of lengthening of the vowel, which in writing is indicated by repeating a letter. --WikiTiki89 01:59, 4 May 2016 (UTC)
Disclaimer: This is just a first draft, I wouldn't mind having a separate language section for plural-forming use in Indonesian that @Angr mentioned, or explaining any better the difference between lengthening of the vowel and actual reduplication that @Wikitiki89 mentioned. --Daniel Carrero (talk) 03:37, 4 May 2016 (UTC)

Tagging entries missing a headword template

There are still some entries without a headword template that linger around here and there. I would like to tag these for cleanup, using the template {{rfc-head}}. For the bot, I'll use a relatively simple heuristic to determine if a part-of-speech section is missing a template. If the header is not immediately followed, on the start of the next line, by any template, then insert the cleanup template at the start of that line. This may result in false negatives (entries without a headword template that aren't tagged) but it shouldn't give any false positives that I can think of.

To determine which headers are part-of-speech headers, I'll use a precompiled list of all the headers I've come across in a dump, and (painstakingly) split between POS and non-POS headers. If a POS header is used erroneously, like say a Verb header where it's used as something other than a POS header, it will also be tagged as missing a template (false positive), but since we are presumably going to fix the tagged entries manually, the erroneous usage will be noticed during this process.

The bot run will not actually fix any entries, but it should be relatively easy to fix some of the entries with a bot, once they are tagged. —CodeCat 21:07, 5 May 2016 (UTC)

It wouldn't surprise me to find a large number of entries with a file ("[[File:" or "[[Image:") on the line right after the POS header, before the headword template. I consider this suboptimal, but it'd be a false positive in your work, I think. Other than that, this sounds like a good idea. Many other entries put {{wikipedia}} after the POS header, before the headword template, but this won't affect the script you describe. - -sche (discuss) 05:31, 6 May 2016 (UTC)
On a related note, there are a large number of Latvian adjectives with two headword-template lines but only one POS header, as described here, complete with proposed bot fix. - -sche (discuss) 05:31, 6 May 2016 (UTC)
Similar careful logic could be used to revert the erroneous mass insertion of {{l}} with inappropriate language codes. DCDuring TALK 10:57, 6 May 2016 (UTC)
I oppose such tagging since they are fairly easy to identify for anyone being serious about making the list shorter. I generally oppose tagging without fixing, especially in huge volumes. --Dan Polansky (talk) 08:19, 7 May 2016 (UTC)
If done, I suggest it is done "quietly", e.g. adding entries to a category but not having red warning text showing up in the entry. Equinox 11:25, 7 May 2016 (UTC)
I fully support the proposal, with the suggestion that it'd be nice if the script automatically recognized [[File: and [[Image: if possible. --Daniel Carrero (talk) 23:26, 7 May 2016 (UTC)
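
For anyone who wants to try the heuristic out on a page's wikitext, here is a minimal Python sketch of what was described at the top of this thread. It is not the bot's actual code: POS_HEADERS is a short stand-in for the precompiled header list, tag_missing_headwords is a made-up name, and the skip_files switch is just one possible way to handle the [[File: / [[Image: case raised in the replies.

```python
import re

# Assumption: a short stand-in for the precompiled list of part-of-speech headers.
POS_HEADERS = {"Noun", "Verb", "Adjective", "Adverb", "Proper noun"}

# Matches level-3 and deeper headers such as ===Noun=== or ====Verb====.
HEADER_RE = re.compile(r"^(={3,})\s*(.+?)\s*\1\s*$")


def tag_missing_headwords(wikitext, skip_files=False):
    """Prefix {{rfc-head}} to the line after any POS header not followed by a template."""
    original = wikitext.split("\n")
    result = list(original)
    for i, line in enumerate(original):
        match = HEADER_RE.match(line)
        if not match or match.group(2) not in POS_HEADERS:
            continue
        next_line = original[i + 1] if i + 1 < len(original) else None
        if next_line is not None and next_line.startswith("{{"):
            continue  # some template follows the header; assume it is fine
        if (skip_files and next_line is not None
                and next_line.startswith(("[[File:", "[[Image:"))):
            continue  # optional refinement: do not tag when an image comes first
        if next_line is None:
            result.append("{{rfc-head}}")  # header was the last line of the page
        else:
            result[i + 1] = "{{rfc-head}}" + next_line
    return "\n".join(result)
```

As noted in the replies, a {{wikipedia}} box right after the header counts as "some template" here, so such entries are silently skipped (a false negative), while an image line gets tagged unless skip_files is turned on.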


vCat is a tool on wmflabs that can graphically display category structures. More specifically it can show all ancestor categories (that is, parents and parents' parents and so on) or all descendants. We could have two links for the generated parents and children images on category pages.

Look what it generates for ka:Sports - descendants, Georgian language - ancestors.

I do not have very strong opinion about this but others may have. --Giorgi Eufshi (talk) 09:45, 6 May 2016 (UTC)
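
To make "all ancestor categories" concrete, the traversal behind such a graph might look something like the Python sketch below. This is not vCat's code; the ancestors function and the child-to-parents mapping are assumptions for illustration, and any category names fed to it would be made up rather than taken from the live category tree.

```python
from collections import deque


def ancestors(category, parents):
    """Return every ancestor of `category`, given a child -> list-of-parents mapping."""
    seen = set()
    queue = deque(parents.get(category, ()))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also protects against category loops
        seen.add(current)
        queue.extend(parents.get(current, ()))
    return seen
```

Descendants can be computed the same way after inverting the mapping (parent to list of children).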

Gothic words that are attested only in Runic inscriptions

Was wondering whether a Runic or Gothic script entry should be created for Gothic words that are attested only in Runic inscriptions. There are not many words of this sort which have one mostly undisputed reading, but there certainly are some. When I added ᚱᚨᚾᛃᚨ ‎(ranja) a while ago I had not yet found any others of this sort, so I went ahead and added the entry in Runic script, the word only being attested in that script. However, recently I came across 𐌷𐌰𐌹𐌻𐌰𐌲𐍃 ‎(hailags), which is also attested only in one Runic inscription (Wulfila, interestingly, prefers 𐍅𐌴𐌹𐌷𐍃 ‎(weihs) to mean holy - if anyone knows why, let me know!) but was added by another user in the Gothic script. Which are preferable here, Gothic or Runic lemmata? Would be interested to hear your thoughts. — Kleio (t · c) 17:38, 11 May 2016 (UTC)

Anything that's attested can be added, so there's no preference. However, Gothic doesn't currently have Runic listed as one of its scripts, so autodetection won't work. —CodeCat 17:47, 11 May 2016 (UTC)
It does now. --WikiTiki89 17:58, 11 May 2016 (UTC)
Good call! — Kleio (t · c) 18:15, 11 May 2016 (UTC)
But if there is no preference, could you not end up with two lemma entries, or, well, situations like this, where it is inconsistent and feels messy? Seems to me it is best to settle for one lemma (imo for the attested script), with, if the attestation is only in a rare script like Runic, a redirect in the Gothic script à la the romanization of redirects we have for most regular Gothic entries. — Kleio (t · c) 18:15, 11 May 2016 (UTC)
When there are multiple lemmas that represent the same basic word, we call them alternative forms. So we can call the runic lemma an alternative form of the gothic lemma. —CodeCat 18:26, 11 May 2016 (UTC)
But the Gothic script forms are not attested in these listed cases. Should there not, in any case, be a consistent approach to what form in these cases (got words only attested in Runic) should be the main entry, and which should be considered the alternative form? Because if I understand you correctly, whoever creates the lemma for this kind of Runic-only Gothic word would decide whether to make the main entry a Runic or Gothic script one, and the other script would then be considered the alternative form. We would, then, have 𐍂𐌰𐌽𐌾𐌰 ‎(ranja) as an alternative form of the lemma ᚱᚨᚾᛃᚨ ‎(ranja), and ᚺᚨᛁᛚᚨᚷᛊ ‎(hailags) as an alternative form of the lemma 𐌷𐌰𐌹𐌻𐌰𐌲𐍃 ‎(hailags) - a completely opposite way of dealing with this, despite them being both attested only in Runic. I might be a bit OCD about this, but it just seems messy to leave that up to an arbitrary decision by the editor, instead of having a consistent approach. — Kleio (t · c) 18:44, 11 May 2016 (UTC)
  • Chiming in from the sidelines, I agree that messy and inconsistent is undesirable. My 2p would be to have all Gothic lemmata in the Latin script, with notes in the etymologies to indicate if a given term is only attested in Runic. Gothic entries in the Runic script would all be soft redirects to the Latin-script entries, much as we have for pinyin entries for Chinese, or romaji entries for Japanese, etc. We have a possible analogous precedent in the handling of Pali entries, which historically were not written in the Latin script until relatively recently, but for which (I think) all EN WT lemmata are given in the Latin script. ‑‑ Eiríkr Útlendi │Tala við mig 18:55, 11 May 2016 (UTC)
  • Why should we do it for this one non-Latin language and not for all others? Korn [kʰũːɘ̃n] (talk) 19:45, 11 May 2016 (UTC)
  • Admittedly, most contemporary students of Gothic and indeed virtually all books published on Gothic use the Latin script, unlike for example Ancient Greek where the Greek alphabet is consistently used. See also this discussion and the past votes that are linked there. Generally speaking though, the issue of Latin script usage, romanization and so forth seems to be a bit of a mess. — Kleio (t · c) 20:01, 11 May 2016 (UTC)
  • @Korn [kʰũːɘ̃n] -- if, by "this", you mean "we should use the Latin script for all lemmata, for all languages", that presents some real organizational difficulties for some languages. For Japanese, have a look at [[せい]] (sei) -- the sheer number of homophones is overwhelming, and our entry doesn't even include all of the entries applicable to this phonetic rendering. Were we to move all of the applicable lemmata to a single heading under [[sei#Japanese]], we would have a substantial challenge in organizing and presenting all of this information in a useful fashion. This is a big part of why Japan has not retired kanji, the borrowed Chinese characters used in writing: these provide much-needed disambiguation. Written and spoken Japanese can be quite different in terms of style and vocabulary, largely because of the writing system. ‑‑ Eiríkr Útlendi │Tala við mig 01:02, 12 May 2016 (UTC)
  • We should record all languages in the language that they are printed in. That is, for Gothic, the Latin script. I see no reason we should be messing around with manuscript traditions at all.--Prosfilaes (talk) 08:08, 12 May 2016 (UTC)
  • Because that is what the Goths wrote their language in and because we rely on primary sources where we can rather than copies from non-native speakers who take editorial liberty. And I'm not saying we should use Latin for all lemmas, I'm saying when something gets special treatment it needs a two way justification for why it is done in this of all cases and why it is not done in the others. Korn [kʰũːɘ̃n] (talk) 13:33, 12 May 2016 (UTC)
  • The Gothic script is what Bishop Ulfilas wrote Gothic in, at least. We should rely on printed copies because printed copies are normalized script-wise and can be reliably transcribed and searched for, whereas Pepys' diary as-is can't, and because printed copies are what the users of Wiktionary are going to be using, not the manuscripts.--Prosfilaes (talk) 07:51, 13 May 2016 (UTC)
  • Ad absurdum: If non-native speakers of Russian would start using exclusively romanised versions of Russian books, would that affect your stance on where to lemmatise Russian? (Assuming that it wouldn't, that would be the point for giving the argument for the exception that I asked for.) Korn [kʰũːɘ̃n] (talk) 19:20, 14 May 2016 (UTC)
ps.: Shorthand diary writing is not meant for consumption by others. I hope we can all agree that our decisions on writing should be made on the basis of those texts which were actually meant to be read by people. Korn [kʰũːɘ̃n] (talk) 19:22, 14 May 2016 (UTC)
pps.: We have romanisations. The argument to be made is why they should be lemmas rather than links to Gothic as was used by Goths. Korn [kʰũːɘ̃n] (talk) 08:22, 15 May 2016 (UTC)
  • If the speakers of Russian started using exclusively Latin script for reading Russian, we should record Russian in the Latin script. If Russian in Cyrillic script was found only in museums, and most universities had only examples of Russian in Latin script, then yes, we should record Russian in the Latin script. All of the speakers of Gothic who might use Wiktionary, including all the non-existent native speakers, use Latin to write it.
  • I think editorial liberty is a red herring. Latin transcription of Gothic is letter for letter. Whether we cite Gothic in Latin script or Gothic script makes no difference to the accuracy of the original. Whether or not a text is meant to be read by other people doesn't change the fact that bringing Pepys' Diary in a Latin script changes it more than transliterating any Gothic work.--Prosfilaes (talk) 06:59, 17 May 2016 (UTC)
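
The "letter for letter" point can be illustrated with a small Python sketch. It assumes the conventional Latin values for the Gothic Unicode block (U+10330 to U+1034A) and is not Wiktionary's own transliteration module, which may apply further conventions; LATIN_VALUES, transliterate and to_gothic are names invented for this example.

```python
# Conventional Latin values for the Gothic block U+10330..U+1034A, in code point order.
# "?" marks the two signs used only as numerals (90 and 900), which have no letter value.
LATIN_VALUES = "abgdeqzhþiklmnjup?rstwfxƕo?"

GOTHIC_TO_LATIN = {chr(0x10330 + i): latin for i, latin in enumerate(LATIN_VALUES)}
LATIN_TO_GOTHIC = {latin: gothic for gothic, latin in GOTHIC_TO_LATIN.items() if latin != "?"}


def transliterate(text):
    """Replace each Gothic letter with its Latin value; leave other characters alone."""
    return "".join(GOTHIC_TO_LATIN.get(ch, ch) for ch in text)


def to_gothic(text):
    """Inverse mapping, one Latin letter back to one Gothic letter."""
    return "".join(LATIN_TO_GOTHIC.get(ch, ch) for ch in text)
```

Because the mapping is one-to-one for the letters, to_gothic(transliterate(word)) gives back the original spelling, which is the sense in which either script records the same information. Nothing comparable exists for Runic-only attestations, where the Gothic-script form is itself a reconstruction.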

French capital letters with diacritics

Is diff correct? If so, the usage note should be changed or removed from all entries; there's no sense changing just one entry. - -sche (discuss) 22:14, 11 May 2016 (UTC)

The usage note should be kept, but modified to provide more information. I don't know the details, but I'm pretty sure the practice varies by country and that the common practice in France of dropping accents on capitals originated from typewriters not having keys for the accented capitals. I wouldn't be surprised if the Academy recommends keeping the accents. --WikiTiki89 22:32, 11 May 2016 (UTC)
It is a general typographic rule to keep the diacritics on capital letters. Some bad newspaper title errors arose from the lack of diacritics, e.g. « UN POLICIER TUE »: is it tue (" A policeman kills") or tué ("A policeman killed")? It only remains a common issue because current French keyboards can't write capitalized letters with diacritics easily (at least on Windows), and some people can't be bothered to learn how to use diacritics properly. — Dakdada 09:25, 12 May 2016 (UTC)
I'll also add that when I started taking French in middle school, our French teacher actually taught us that we were supposed to drop the diacritic on capital letters (although, I personally found it illogical and refused to comply). --WikiTiki89 15:02, 12 May 2016 (UTC)
@Wikitiki89 My own high school French teacher had been telling students that diacritics were optional on capitals. I didn't believe it either. Hillcrest98 (talk) 23:33, 15 May 2016 (UTC)
I've read a novel or two in French that did not contain a single accent on a capital letter, so it's certainly optional. The only diacritic that isn't is the cedilla, and even that gets dropped sometimes. Andrew Sheedy (talk) 04:40, 16 May 2016 (UTC)
Before the age of computers and Unicode fonts, it was usual practice that French Canadian retained diacritics on all-caps, while European French preferred caps without diacritics. I have not kept up with trends since 2000. —Stephen (Talk) 09:17, 22 May 2016 (UTC)
Not only French? It was said for a long time for Spanish, but I checked the RAE recommendation: only acronyms and siglation don't take graph acc point 7, at the end. Sobreira (talk) 09:31, 27 May 2016 (UTC)

Creating standards for GML

Over here, DCDuring pointed out that I shouldn't just decide to put the lemma of Middle Low German words on their attested rather than normalised form without discussion. To my knowledge, I'm the only current editor of Middle Low German, so discussion honestly didn't occur to me, despite my Wiktionariandom. So here it is, go forth and discuss.
The marking of umlauts happens in the early MLG period with ø, y and slashed u (see also this question), as well as digraphs, but for the longest part of the period is so overwhelmingly absent that the leading authority of the 19th century (Lübben) was stoutly convinced that umlaut didn't occur in the language. The next standard work on the language (Lasch) does prove him wrong, but points out that "ü is hardly/likely not, ö rarely to be taken as an umlaut" ("ü ist wohl kaum..."), and the examples she gives for ö are actually spelled oe, without superscript. So following our conventions for Latin, I figured to change the lemma from e.g. vögen to vogen. Korn [kʰũːɘ̃n] (talk) 23:15, 11 May 2016 (UTC)

Just to make it clear: you would have the lemma entry be at [[vogen]], but with the inflection line and conjugation table on that page showing vögen. Would [[vögen]] be an alternative form entry for vogen? DCDuring TALK 00:42, 12 May 2016 (UTC)
You understood me correctly, yes. The circumflex and the trema are a modern scholarly annotation for clarity, standardly applied like macron to Latin texts. I would say that anything attestable can be an alternative form, anything unattested can not. From what I understand from the grammars (I don't have access to corpora myself), the rendition of an umlaut as Ö and Ü was generally unknown in the period, though, and can be expected to be unattestable. Korn [kʰũːɘ̃n] (talk) 13:30, 12 May 2016 (UTC)
Hence your analogizing to how we handle Latin macrons etc. rather than to how we handle German tremas, which is what I thought the analogy would be. If I actually knew anything substantive in this area, I probably would not have resorted to procedure. I suppose a discussion, preferably with more than just the two of us participating, my contribution being minimal, would be good for Wiktionary talk:About Middle Low German to memorialize the decision. DCDuring TALK 15:09, 12 May 2016 (UTC)
I support the lemmas being the non-diacriticized forms. I wish we would do this with Ancient Greek as well. --WikiTiki89 15:15, 12 May 2016 (UTC)
It's this way too in many cases for Middle English, where "u" ( = /y/), is nowadays oftentimes written "ü" to clarify the pronunciation, but "ü" was never actually used in Middle English orthography (tmk). Middle English "u" could also represent /u/. I would support the same for gml Leasnam (talk) 15:39, 13 May 2016 (UTC)

I am copying this debate to Wiktionary talk:About Middle Low German. Please give further input there. Korn [kʰũːɘ̃n] (talk) 19:27, 14 May 2016 (UTC)
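
For reference, the proposed split (page name without the editorial diacritics, display form with them) could be computed along these lines. This is a hedged Python sketch, similar in spirit to how macrons are dropped from Latin page names but not taken from any existing module; EDITORIAL_MARKS and page_name are names chosen for this example, and the set of marks to strip is an assumption based on the trema and circumflex mentioned above plus the macron.

```python
import unicodedata

# Combining marks treated as modern editorial additions:
# diaeresis (U+0308), circumflex (U+0302), macron (U+0304).
EDITORIAL_MARKS = {"\u0308", "\u0302", "\u0304"}


def page_name(display_form):
    """Strip editorial diacritics from a normalised form to get the lemma page name."""
    decomposed = unicodedata.normalize("NFD", display_form)
    stripped = "".join(ch for ch in decomposed if ch not in EDITORIAL_MARKS)
    return unicodedata.normalize("NFC", stripped)
```

So the headword line could show vögen while the entry lives at vogen. Letters such as ø, which are attested in early texts and have no decomposition into a base letter plus mark, pass through unchanged.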

Standard layout of adjective tables?

For inflection tables of languages with cases, our normal practice is for the singular to appear in one column, then the plural in another column to the right. For languages with a dual, like Slovene, there are three columns. But when a language also has gendered adjectives, there are two dimensions to the table: number and gender, giving a total of 9 combinations in the case of dual languages, and 6 for most others. There are different ways to make the layout of adjective inflection tables in this case:

  • Put them all into one row, singular in all genders first, then plural in all genders. This is what we use for Latin, Russian, Polish, German and it tends to get rather wide especially in the Latin case. Not really good for mobile.
  • Put them all into one row, masculine in all numbers first, then feminine in all numbers, etc. This is apparently what we use for Proto-Germanic. The downside to this layout is that it doesn't keep singular and plural forms together, which often resemble each other in different genders.
  • Have number distinguished by row, and gender by column. This is used for Serbo-Croatian and Slovene.
  • Have gender distinguished by row, and number by column. This is essentially three noun inflection tables stacked on top of each other, so it's more consistent in that way. However, for languages with two numbers, you end up with 3 rows and 2 columns, which gives a rather tall table without using the width much. This is definitely the best layout for mobile though, for this reason.

My question is whether there is a preferred layout for these. Specifically, what should be used for the Proto-Indo-European adjective table I intend to develop? PIE has three numbers and three genders, so it would be either a 3x3 table or a table with 9 (!) columns. —CodeCat 20:54, 13 May 2016 (UTC)
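
To make the four options above easier to compare for the three-number, three-gender case, here is a rough Python sketch that just enumerates the column order or grid each one implies. The names columns and grid and the layout labels are invented for this illustration; cases would form the rows of each resulting table or sub-table.

```python
from itertools import product

NUMBERS = ["singular", "dual", "plural"]
GENDERS = ["masculine", "feminine", "neuter"]


def columns(layout):
    """Column order for the two single-row layouts (9 columns for PIE)."""
    if layout == "number-first":  # singular in all genders, then dual, then plural
        return [(n, g) for n, g in product(NUMBERS, GENDERS)]
    if layout == "gender-first":  # masculine in all numbers, then feminine, then neuter
        return [(n, g) for g, n in product(GENDERS, NUMBERS)]
    raise ValueError(layout)


def grid(rows_by):
    """Arrangement for the two stacked layouts: a 3x3 grid of case sub-tables."""
    if rows_by == "number":  # number by row, gender by column
        return [[(n, g) for g in GENDERS] for n in NUMBERS]
    if rows_by == "gender":  # gender by row, number by column
        return [[(n, g) for n in NUMBERS] for g in GENDERS]
    raise ValueError(rows_by)
```

For a language without a dual, dropping "dual" from NUMBERS gives the 6 combinations mentioned above.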

I wonder if there's a way to create tables like these with some CSS magic so that the ultimate layout is determined by the browser based on window size. Anyway, I think without space considerations, having cases on one side and everything else on the other makes the most sense to me, because each number/gender combination can sort of be interpreted as its own sub-lemma with its own declension. --WikiTiki89 21:20, 13 May 2016 (UTC)
But would the columns be the numbers, or the genders? —CodeCat 21:29, 13 May 2016 (UTC)
To clarify, if screen space is not a problem, then I would prefer if cases were rows and numbers and genders were columns. If that makes the table too wide, then you can make the numbers sort of independent tables like for Serbo-Croatian and Slovene. --WikiTiki89 21:38, 13 May 2016 (UTC)
So you think the columns should be the genders, then? —CodeCat 22:04, 13 May 2016 (UTC)
Not necessarily. I think the columns should be gender/number combinations. How those combinations are arranged depends on the particular set of combinations that a language has. --WikiTiki89 22:08, 13 May 2016 (UTC)
There's 9 combinations for PIE... —CodeCat 22:27, 13 May 2016 (UTC)
First of all, there should be no standard layout, since different languages have different considerations, and there are differences in lexicographic traditions and educational standards. The community of editors for a specific language should be allowed to decide on the layout that's best for their languages. Consider also that, once you leave the more archaic post-Anatolian Indo-European languages, you're not going to be able to use any of this consistently across languages.
Now, as far as Indo-European, you have no strong lexicographic traditions to deal with, so you can start from scratch. I think your sorting should be gender first, subdivided by number:
  1. The PIE genders are more like separate declension classes, coming as they do from different origins.
  2. Morphologically, gender morphemes (if you can call them that) tend to occur between the root and the number endings, which makes them more basic- you have a derived stem, to which all the number endings are added.
  3. Semantically, gender is more closely tied to the identity of the referent: numbers can be changed by combining referents in groups, but each of those referents will still have the same gender. That makes gender a more basic category: when you're talking about something you will use the same gender for it in all the different forms you use to talk about it, so you will want to have all the forms for that gender in the same place (of course, something feminine can become grammatically masculine by being grouped with something masculine, but it's still intrinsically feminine).
Of course there are also aspects where number is more basic, and there might be some way to minimize or hide the dual columns if they're next to each other, which would save space in the basic display (the dual number is rather secondary, in many ways, and has mostly disappeared in the daughter languages).
As for horizontal vs vertical arrangements: I wonder if there's any way to get the groups of columns to wrap as a block instead of by line? That is, the neuter singular, dual and plural, with all of their cases, would be a separate table that would move below the tables for the masculine and feminine with their respective numbers and cases if the page wasn't wide enough, and the feminine table would move between the masculine and the neuter if the page was only wide enough for one gender block. If you could do that, you would have both the horizontal arrangement for wide screens and the vertical arrangement for narrow screens. In the vertical configuration, it would be somewhat like the arrangement at Sanskrit विशाल ‎(viśāla), to give an arbitrary example. Chuck Entz (talk) 03:05, 14 May 2016 (UTC)
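One way the block-wise wrapping asked about above might be achieved is to put each gender's table in its own inline-block container, since browsers move such containers to the next line as whole units when the page is too narrow. The following is only a minimal sketch under that assumption: `gender_blocks_html` and its input format are hypothetical, and whether a given wiki template may emit these inline styles is not established here.

```python
from html import escape


def gender_blocks_html(blocks: dict[str, str]) -> str:
    """Wrap each per-gender table in its own inline-block container.

    `blocks` maps a gender label to ready-made table HTML (hypothetical input).
    Each container wraps as a unit, so a narrow page stacks the genders
    vertically while a wide page shows them side by side.
    """
    parts = []
    for label, table_html in blocks.items():
        parts.append(
            '<div style="display:inline-block; vertical-align:top; margin-right:1em">'
            f"<b>{escape(label)}</b>{table_html}</div>"
        )
    return "\n".join(parts)


# Placeholder tables; a real template would fill in the case rows.
page = gender_blocks_html({
    "masculine": "<table>...</table>",
    "feminine": "<table>...</table>",
    "neuter": "<table>...</table>",
})
print(page)
```

With three blocks in the order masculine, feminine, neuter, a page wide enough for two would show the neuter block below the other two, and a page wide enough for one would stack all three in that order.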
I cast my lot with "let the editors decide". Korn [kʰũːɘ̃n] (talk) 19:46, 14 May 2016 (UTC)
I am an editor. I'm asking others to decide. :p —CodeCat 19:54, 14 May 2016 (UTC)
Wir sind das Editor! Wir sind das Editor! - Rephrasing my stance: I have no problem with a non-uniform layout across different languages and thus think discussion of the respective tables should take place in the communities of those actually working on the language. Coming from a Germanic language, for Germanic languages I prefer this way: With a uniform plural it should be case by row, rest by column. That is male/neuter/female/plural × cases ordered NAGD. Languages with gendered plural should follow the same pattern but have two separate tables for singular/plural. Korn [kʰũːɘ̃n] (talk) 08:38, 15 May 2016 (UTC)
The most commonly spoken languages (e.g. German) have a single plural form for all genders, so they need 4 columns, not 9. How many languages need more than 4 columns? And how many words do we currently cover in such languages? Are the weird cases few or many? -- LA2 (talk) 19:35, 15 May 2016 (UTC)
Many Slavic languages preserve distinct plural forms for each of the genders. Slovene is an extreme case because it has dual forms too, so it needs 9 columns (see dober), but Serbo-Croatian is much more widely spoken and needs 6 columns (dobar). The two Baltic languages have no more neuter gender, but the plurals of the two genders remain distinct, so there are 4 columns (geras). Icelandic, a Germanic language, does not have full syncretism of the genders in plural, so 6 columns remain necessary (góður). —CodeCat 20:36, 15 May 2016 (UTC)


I edited the entry my to replace the small list of 4 senses with a link to the more complete list of senses at Appendix:Possessive. See diff. What do you think?

First, I moved User:Msh210/English possessives to Appendix:Possessive and edited the page further. Previous discussions: Wiktionary:Tea room/2011/April#English possessives, Wiktionary:Beer parlour/2016/January#English possessives. What do you think, @msh210?

I'd like to do the same for other entries at some point, like your, etc. --Daniel Carrero (talk) 02:17, 14 May 2016 (UTC)

The Appendix should be a supplement not a replacement. I'd simply revert the elimination of the definitions and add a reference to the Appendix under See also. DCDuring TALK 11:57, 14 May 2016 (UTC)
Whoa. "A possessive used before a noun" doesn't distinguish my from other possessives like your and her. Equinox 11:42, 15 May 2016 (UTC)
I reverted my edit. --Daniel Carrero (talk) 18:26, 16 May 2016 (UTC)

Surprising Homographs?

As I was explaining our redirection policy earlier, I used my favorite example of a word that has the same spelling, but is completely unrelated in any way: Indonesian air, which means water. I thought it might be fun to come up with a list of these to use in our documentation. I found a few others to start the list. Can anyone think of more?

  1. air ‎(water) (Indonesian)
  2. ball ‎(organ) (Irish)
  3. beach ‎(bee) (Irish)
  4. bean ‎(woman) (Irish)
  5. fear ‎(man) (Irish)
  6. here ‎(testicle) (Hungarian)
  7. millet ‎(nation) (Turkish)
  8. take ‎(bamboo) (Japanese- Romanization of たけ)
  9. pint ‎(penis) (Low German)
  10. teach ‎(house) (Irish)

Different capitalization and place names:

  1. Gift ‎(poison) (German)
  2. Lizard ‎(peninsula in Cornwall) (English)
  3. Mist ‎(manure) (German)
  4. Speck ‎(bacon) (German)
  5. Split ‎(city in Croatia) (Serbo-Croatian, English)
  6. Sexmoan

Feel free to edit/add to my list. Thanks! Chuck Entz (talk) 05:41, 16 May 2016 (UTC)

  • ball ‎(organ)
  • bean ‎(woman)
  • teach ‎(house)

(All Irish)

  • Gift ‎(poison)
  • Mist ‎(manure)

(Both German) --Catsidhe (verba, facta) 06:28, 16 May 2016 (UTC)

I've added those to my list, with the German ones in a separate list for different capitalizations. @Korn: Thanks for pint. There are traces of the same word in English: see cuckoopint. Chuck Entz (talk) 12:46, 16 May 2016 (UTC)
Wiktionary:Foreign Word of the_Day/2013/April#2, Wiktionary:Foreign Word of the_Day/2013/July#15, Wiktionary:Foreign Word of the_Day/2013/October#26, Wiktionary:Foreign Word of the_Day/2014/April#5, Wiktionary:Foreign Word of the_Day/2014/September#2, Wiktionary:Foreign Word of the_Day/2015/April#3, Wiktionary:Foreign Word of the_Day/2015/November#1. — Ungoliant (falai) 14:10, 16 May 2016 (UTC)
Also one of my personal favourites: Mirandese you ‎(I) (coupled with Danish I ‎(you)). — Ungoliant (falai) 14:11, 16 May 2016 (UTC)
I think Gift and Speck deserve a category of their own, since they are in fact exact cognates with the English homographs, whose meanings diverged quite far. --WikiTiki89 14:37, 16 May 2016 (UTC)
कट ‎(kaṭ) /kəʈ/ sounds like "cut" and has the same meaning. —Aryamanarora (मुझसे बात करो) 18:18, 23 May 2016 (UTC)

OK, my gift (not Gift!) to User:Chuck Entz: a chain reaction of false friends....

meaning   Galician
"hand"    man
"man"     home

This actually meant that, when the first Zara Home shops opened in the company's country of origin, (some) people thought that they were the men's department.

And you may wonder... how to render English "home" into Galician? casa. Galilove... Sobreira (talk) 09:47, 27 May 2016 (UTC)

This reminds me, although these are homophones and not exactly homographs, of this saying in English about Hebrew: [aˈni] is [mi], [mi] is [hu], [hu] is [hi], and [hi] is [ʃi]. --WikiTiki89 14:55, 27 May 2016 (UTC)

Wiktionary:About Akkadian

I just created Wiktionary:About Akkadian, but I do not actually know very much about our practices for Akkadian. Can someone who works with Akkadian help fill in the information? --WikiTiki89 15:41, 16 May 2016 (UTC)

@ObsequiousNewt, JohnC5, DerekWinters, Angr: Pinging people who expressed some knowledge of cuneiform in a recent discussion on Hittite lemmas. --WikiTiki89 19:40, 16 May 2016 (UTC)
I'd love to comment, but the discussion above was never satisfactorily resolved, and most of my concerns with my various proposals were never actually addressed. Since the orthography is similar here, my comments apply also. —ObsequiousNewt (εἴρηκα|πεποίηκα) 15:36, 19 May 2016 (UTC)
As before, I would like to know what to do with determinatives (are they part of the lemma or not?). I don't believe they should be because they have no phonetic realization. Should we have a vote about cuneiform lemmatization? —JohnC5 18:39, 19 May 2016 (UTC)
I believe forms with determinatives should be alternative forms, unless the forms without the determinatives are much less common (OTOH for Ancient Egyptian, I would say determinatives should be part of the lemmas, since they were essentially required for most phonetically spelled words). Whether they have a phonetic realization or not is irrelevant. But anyway, there are a lot more basic things to decide first, such as which dialect to use. I'm biased towards Old Babylonian, because I'm going through Huehnergard's grammar, but most of our transliterations seem to be in later dialects without the final -m. Also, it seems that most of our entries are for logograms, whereas logograms were usually less common than phonetic spellings for most words. --WikiTiki89 18:55, 19 May 2016 (UTC)
I think this discussion would benefit by looking at the example of the Han characters used in Chinese, Japanese, Korean and Vietnamese: they are, like the cuneiforms, a very long-lived mixed logographic/phonemic system adopted and used by unrelated languages. The Chinese and Japanese editors, especially, have had a great deal of experience working with variations on some of these very issues.
While you won't find purely semantic independent characters to correspond to determinatives, the vast majority of the characters contain some combination of recognizable semantic and phonemic elements. A very transparent example is 他 (originally both he and she), and 她, which was created by replacing the semantic element meaning "person" with one meaning "woman".
As for Sumerograms and other variations in the ways that cuneiforms can be interpreted: in Japanese there are usually a number of readings for any given character, which are classified into a series of etymologically-based named types: the on readings are borrowed from Chinese, with several subdivisions for the topolect and/or period of Chinese the word was borrowed from, and the kun readings are native Japanese.
Of course, with cuneiforms we don't have a corresponding body of native scholarship nor the knowledge of modern speakers to draw from, so they're a lot messier. Chuck Entz (talk) 03:13, 20 May 2016 (UTC)

A new "Welcome" dialog

Hello everyone. This is a heads-up about a change which has just been announced in Tech News: Add the "welcome" dialog (with button to switch) to the wikitext editor.

In a nutshell, later this week this will provide a one-time "Welcome" message in the wikitext editor which explains that anyone can edit, and every improvement helps. The user can then start editing in the wikitext editor right away, or switch to the visual editor. (This is the equivalent of an already existing welcome message for visual editor users, which suggests the option to switch to the wikitext editor. If you have already seen this dialog in the visual editor, you will not see the new one in the wikitext editor.)

  • I want to make sure that, although users will see this dialog only once, they can read it in their language as much as possible. Please read the instructions if you can help with that.
  • I also want to underline that the dialog does not change in any way the current site-wide configuration of the visual editor. Nothing changes permanently for users who chose to hide the visual editor in their Preferences or for those who don't use it anyway, or for wikis where it's still a Beta Feature, or for wikis where certain groups of users don't get the visual editor tab, etc.
    • There is a slight chance that you see a few more questions than usual about the visual editor. Please refer people to the documentation or to the feedback page, and feel free to ping me if you have questions too!
  • Finally, I want to acknowledge that, while not everyone will see that dialog, many of you will; if you're reading this you are likely not the intended recipients of that one-time dialog, so you may be confused or annoyed by it, and if this is the case, I'm truly sorry about that. This message should also save you from having to explain the same thing over and over again: just point to this section. Please feel free to cross-post this message at other venues on this wiki if you think it will help keep users from being caught by surprise by this change.

If you want to learn more, please see https://phabricator.wikimedia.org/T133800; if you have feedback or think you need to report a bug with the dialog, you can post in that task (or at mediawiki.org if you prefer).

Thanks for your attention and happy editing, Elitre (WMF) 16:47, 16 May 2016 (UTC)

Would it really have been impossible or hard to switch this off for registered and logged-in users? DCDuring TALK 17:16, 16 May 2016 (UTC)
The task says so. I'm also here for a reminder—this wiki features a Single Edit Tab system; if you're not sure you know or remember how that works, you can read the guide (which details, among other things, how to switch between editors from the buttons on the toolbar); you can change your editing settings at any time, by the way. (I had also written a very quick intro to the visual editor, in case anyone is interested). Best, --Elitre (WMF) (talk) 14:36, 17 May 2016 (UTC)

Looking for someone to help with FWOTDs

Hey folks. Due to personal and health issues, I’m unable to spend as much time on Wiktionary as I’d like to. Sooner or later, I won’t have time to keep track of foreign words of the day consistently anymore. For this reason, I really need someone who can share the burden of maintaining the project. While technically anyone who wants to set FWOTDs is free to do so, within the limits of the guidelines, if you want my sanction as an “official” maintainer, you must meet the following criteria:

  • know how to find your way through linguistic literature: half our featured words are pilfered from various articles published in god-forsaken periodicals and magazines, and you will have to be able to find this stuff if we are to prevent FWOTD from being a rotation of major western European languages;
  • have common sense: if you think it would be funny to feature penis or something like that, you’re out. FWOTD is serious motherfuckin’ business!
  • willingness to take the blame: why do you think I’m asking for help anyway? I need someone who can be officially blamed for not noticing my mistakes!

Note that maintaining FWOTD involves a lot more than just picking words to feature and updating the templates. Spontaneous nominations are only enough for 10-20% of all featured words, despite my bias towards choosing them; the rest is words that I find or create myself, and this takes a lot of time. I also have to create and upload images for words with unusual scripts, and keep an eye out for vandalism on featured words during their day of featuring.

If anyone is interested, reply here and I will send you detailed instructions. — Ungoliant (falai) 02:54, 20 May 2016 (UTC)

Those of you who remember the original FWOTD vote will know that I have slacked off to an embarrassing degree, and I feel very guilty about this. If someone who really wants to work on this and has the requisite skills speaks up, I would be happy to let them do it, but I should take up this burden to make up for leaving so much work on Ungoliant's plate. I do have enough time, and I knew how to run it (although I may have forgotten some of the details by now). —Μετάknowledgediscuss/deeds 03:58, 20 May 2016 (UTC)
I'm happy to help out as time permits, but I definitely don't want to be the primary responsible for FWOTD. —Aɴɢʀ (talk) 14:25, 20 May 2016 (UTC)
Likewise, could for example help to clean up nominated articles, do research etc. Thanks Ungoliant for your work on this, FWOTD is one of the things I enjoy most here, both as a reader and contributor. – Jberkel (talk) 10:02, 21 May 2016 (UTC)
Thank you so much guys! I think you’re already familiar with the technical and regulatory aspects, so I’ll just mention the unwritten rules that I try to follow:
  • hard limit: no more than 2 FWOTDs in the same language per month;
  • soft limit: no more than 1 FWOTD in the same language per month (I’ve only been able to pull this off a few times);
  • prefer featuring other people’s nominations over your own;
  • keep FWOTDs that are in the same language, or in chronological variants (e.g. Spanish and Old Spanish), somewhat far apart;
  • check the history page of entries; if the information was added by someone who you’re not sure is trustworthy, try to check the references and citations to see if they’re accurate;
  • no more than one focus week per month (I’ve only had this option once though);
  • wait for at least a day before featuring words that are posted in the nominations; (sometimes I had to ignore this rule because there were no other options that wouldn’t break the hard limit);
  • add {{was fwotd}} to the page immediately after featuring the word. You are going to forget it otherwise, have no doubt about it;
Thanks again. — Ungoliant (falai) 16:16, 21 May 2016 (UTC)
Thanks, Ungoliant. I'll start tidying up and setting words. @Angr, Jberkel: Please feel free to add words as you wish, or just nominate more if you prefer. I'm fine with being responsible, as long as you guys help out! —Μετάknowledgediscuss/deeds 16:52, 21 May 2016 (UTC)

Stress positioning in Estonian IPA

I changed something in küll only to notice there's the same thing in tool#Estonian. Am I correct that [t'oːl] implies that there is a syllable break between /t/ and /oː/ and hence this IPA practice is wrong? Korn [kʰũːɘ̃n] (talk) 09:35, 21 May 2016 (UTC)

There is no syllable break, but the difficulty for the template (and humans) is knowing where the syllables are broken up. —CodeCat 18:09, 21 May 2016 (UTC)
The stress sign isn't there by accident, it gets triggered by the actual input, which was "k`üll". So I think someone wanted that. Korn [kʰũːɘ̃n] (talk) 18:11, 21 May 2016 (UTC)
That's the pronunciation format used by ÕS, specifically to avoid having to be specific about where the syllable breaks are. The backtick ` indicates an overlong syllable, and is placed before the vowel. Why do we need to know syllable breaks to indicate stress in IPA? The vowel is the nucleus of the syllable, that's where stress should be placed. —CodeCat 18:12, 21 May 2016 (UTC)
See the documentation of {{et-IPA}} for the notation, btw. It's a simplification of what ÕS uses. —CodeCat 18:18, 21 May 2016 (UTC)
We don't need to know it, but in IPA, the character ˈ does not imply a long vowel but primary stress. So what we're currently displaying is not an overlong vowel but a long vowel carrying a primary stress which starts after the initial consonant. So the display ending up with the user doesn't make sense. Korn [kʰũːɘ̃n] (talk) 21:59, 21 May 2016 (UTC)
The overlength is not displayed in IPA because it's not obvious how. It's not only the vowel that gets lengthened, but the syllable coda as well. Even consonant clusters can be lengthened, though I don't know exactly what that entails phonetically. In any case, the feature is suprasegmental: it exists not on the phoneme level but on the syllable level. That said, overlength is always accompanied by stress, so it's always ok to assume that a syllable indicated as overlong is stressed. That's what the template does. —CodeCat 01:24, 22 May 2016 (UTC)
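The point raised in this thread, that IPA ˈ marks the start of the stressed syllable rather than the vowel, could be handled roughly as sketched below. This is an illustration only, not the code behind {{et-IPA}}: `place_stress` is a hypothetical helper, it performs no phoneme conversion, and it assumes that only the last consonant before a marked non-initial vowel belongs to that syllable's onset.

```python
VOWELS = set("aeiouõäöü")


def place_stress(notation: str) -> str:
    """Move the mark from before the vowel to the start of its syllable.

    Input uses a backtick before the vowel of the marked syllable,
    as in "k`üll". Output carries IPA ˈ at the assumed syllable onset.
    """
    marker = notation.find("`")
    if marker == -1:
        return notation  # nothing marked, nothing to move
    before, after = notation[:marker], notation[marker + 1:]
    # No vowel before the mark: the marked syllable is word-initial,
    # so the stress mark goes at the very start of the word.
    if not any(ch in VOWELS for ch in before.lower()):
        return "ˈ" + before + after
    # Assumption: in a medial cluster only the last consonant is the onset.
    if before[-1].lower() not in VOWELS:
        return before[:-1] + "ˈ" + before[-1] + after
    # Vowel hiatus: the marked vowel starts its own syllable.
    return before + "ˈ" + after


print(place_stress("k`üll"))      # ˈküll
print(place_stress("konts`ert"))  # kontˈsert
```

Under these assumptions the input "k`üll" yields ˈküll rather than kˈüll, which is the placement argued for above.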

New logo 2

I created Wiktionary:Votes/2016-05/New logo 2, to start in a week. It proposes a derivative of the tile logo for the English Wiktionary logo. A rationale is at Wiktionary talk:Votes/2016-05/New logo 2#Rationale. Let us postpone the start of the vote if required by discussion. --Dan Polansky (talk) 08:01, 22 May 2016 (UTC)

Merge all Prakrits

I think all the Prakrits should be merged into a single language for organizational purposes; do we really need ~5 languages all with the same entry and meaning at 𑀅𑀕𑁆𑀕𑀺 ‎(aggi)? For all intents and purposes, the Prakrits are just dialects. —Aryamanarora (मुझसे बात करो) 18:51, 23 May 2016 (UTC)

We'd need more information to decide this. How different are they? Mutual intelligibility? —CodeCat 19:12, 23 May 2016 (UTC)
[3] (see bottom of page 8, top of page 9) – they are mutually intelligible, but learning a little Sanskrit greatly helped communication. They were similar enough to be used interchangeably in the same works; see Dramatic Prakrits. Of course, there were minor orthographical differences in inflection, but we can settle on Maharashtri Prakrit as a standard (it's the best documented) and build off of it. —Aryamanarora (मुझसे बात करो) 19:28, 23 May 2016 (UTC)
A good analogy is Vulgar Latin, spoken by the common people and thus having many dialects and varying spelling systems. —Aryamanarora (मुझसे बात करो) 19:31, 23 May 2016 (UTC)
How do other sources handle it? I'm reminded of the situation with Ancient Greek, where there are sometimes quite striking differences between dialects (Doric -onti vs Attic -ousi(n)). But for Ancient Greek, Attic is mostly the standard form, except in a few cases (τέσσαρες ‎(téssares), which is apparently not the form of any dialect?). —CodeCat 20:11, 23 May 2016 (UTC)
Maybe not the form of any older dialect, but it is the Koine form (it's in both LXX and NT). —Aɴɢʀ (talk) 21:11, 23 May 2016 (UTC)
@CodeCat: Most dictionaries and grammars focus on Maharashtri Prakrit and detail the Dramatic Prakrits second, and often exclude the lesser Prakrits. We can use {{lb}} to differentiate between dialects. —Aryamanarora (मुझसे बात करो) 23:31, 23 May 2016 (UTC)
  • I don't really have an opinion, but I presume that @-sche would probably like to be made aware of this discussion. —Μετάknowledgediscuss/deeds 04:06, 24 May 2016 (UTC)
Thanks for the ping. I'm more knowledgeable of the other kind of Indian language than this kind. I'm intrigued that Wikipedia's article on w:Prakrit says Ardhamagadhi is the definitive Prakrit, but the literature supports Aryamanarora's statement that it is rather "Maharashtri, which [...] with orthodox Jain scholars generally, is Prakrit proper" (Ramananda Chatterjee, 1927, in The Modern Review, volume 41), "Maharashtri [is] considered the Prakrit par excellence" (Thomas R. Trautmann, 2006, Languages and Nations: The Dravidian Proof in Colonial Madras, ISBN 0520244559). Does the Wikipedia article need to be corrected?
A. C. Woolner (in his 1986 Introduction to Prakrit) says "it may be understood that the different Prakrits were mutually intelligible among the educated"; G. C. Pande (1990, Foundations of Indian Culture) says "the Prakrits were mutually intelligible". - -sche (discuss) 07:27, 24 May 2016 (UTC)
@-sche: I think I understand the discrepancy now; Maharashtri is the main Prakrit of Jainism, Ardhamagadhi is for Hinduism, and Pali is for Buddhism. (Yes, Pali is a Prakrit, but is considered a separate language for sectarian reasons). —Aryamanarora (मुझसे बात करो) 13:46, 24 May 2016 (UTC)

@CodeCat, -sche, Metaknowledge So is this a yes? —Aryamanarora (मुझसे बात करो) 00:14, 25 May 2016 (UTC)

Yes, merge the ones which have "Prakrit" in their names once we decide on a code. Should Pali also be merged, in your view? Authorities have traditionally treated Pali differently from the Prakrits, but for non-linguistic reasons, as you note. - -sche (discuss) 02:49, 25 May 2016 (UTC)
We should leave Pali separate; there are too many entries and Pali has some of its own developments that set it apart from the rest of the Prakrits (multiple scripts, strong East Asian Buddhist influences, etc). —Aryamanarora (मुझसे बात करो) 16:37, 25 May 2016 (UTC)

Also, we could use pra as a language code; it is the collective code in the ISO standard for all Prakrits. —Aryamanarora (मुझसे बात करो) 13:50, 24 May 2016 (UTC)

I'll point out that currently the Prakrit languages are acting as the ancestor languages for several different branches of the Indo-Aryan family (seen here if you scroll way down). We've had some issues in the past of people trying to say words are inherited from Sanskrit when Sanskrit has no direct descendants. If we do merge them, we definitely should have etymology only languages for them. —JohnC5 14:48, 24 May 2016 (UTC)
@JohnC5: Um, (Vedic) Sanskrit is the direct ancestor of all the Indo-Aryan languages; Classical Sanskrit seems to be what you're talking about. Anyway, we definitely need the current codes to remain intact, as many entries reference certain Prakrits (CAT:Hindi terms derived from Sauraseni Prakrit). —Aryamanarora (मुझसे बात करो) 00:14, 25 May 2016 (UTC)
Sorry, yes, Vedic is apparently what I meant. —JohnC5 00:29, 25 May 2016 (UTC)
Is pra a family code? If so, we shouldn't reuse it as a language. —CodeCat 00:43, 25 May 2016 (UTC)
pra is both an ISO-639-5 family code and an ISO-639-2 language code. If we merge the Prakrits, do we still need it as a family code? If not, we could use it as a language code, like nah. Otherwise, how about "inc-pra"? - -sche (discuss) 02:49, 25 May 2016 (UTC)
Both of them would work for me, but pra is shorter, and a family code wouldn't be needed if we merged all of the Prakrits. —Aryamanarora (मुझसे बात करो) 16:37, 25 May 2016 (UTC)
We already have Proto-Indo-Aryan inc-pro for general ancestor needs (and which is marginally distinguishable from Vedic in a few features); I am not sure how much benefit there is in maintaining further ancestor stages? Ardhamagadhi as the ancestor of Eastern IA (Assamese et al.) and Maharashtri as the ancestor of Southern IA (Marathi et al.) is probably at least defensible, but my understanding is that there's not a whole lot of consensus on the genetic classification of the New IA varieties, including also the exact definition of the Eastern and Southern groups. --Tropylium (talk) 08:23, 27 May 2016 (UTC)
@Aryamanarora I would like to weigh in on the matter and say that the Prakrits should not be merged all together. They are independent languages with different grammars, even though they are very similar. The old Sanskrit plays that often incorporated all the Prakrits did so because they knew that their audience was of the class that would have knowledge of the various languages and their differences. There is a reason that these prakrits have been named separately and given individual grammatical treatises by the various Indian grammarians. And, to argue against merging them over mutual intelligibility, Scots is kept as a separate language despite extremely high levels of intelligibility with English. DerekWinters (talk) 21:06, 25 May 2016 (UTC)
@DerekWinters Their grammars are not that different; they have the same cases, numbers, genders, and inflections. The only differences are spelling, e.g. third-person singular indicative in verbs is marked by -aï in Maharashtri but with -adhi in Sauraseni. They are more similar to each other than the Ancient Greek dialects. There were no "old Sanskrit plays"; the plays were all Prakrit (see Dramatic Prakrits), but certain characters spoke different dialects. Finally, it would make entries so much easier if we merged all of them; do we really need 5-6 entries with the same meaning at "aggi" and "hattha"? —Aryamanarora (मुझसे बात करो) 22:41, 25 May 2016 (UTC)
Also, Prakrit was a vernacular; the people who spoke Sanskrit (Brahmins) simply ignored it as a lower-class language; they would have little knowledge of it. —Aryamanarora (मुझसे बात करो) 22:43, 25 May 2016 (UTC)
@Aryamanarora Sorry I meant the old Indian plays (but also, do look at Sanskrit drama, I believe the Mṛcchakatika is quite famous). And also one could say the same about the cases and numbers and all regarding Avadhi, Braj Bhasha, Kannauji, Hindi, etc. yet they are certainly separate languages. And again, we maintain Scots as separate regardless of its similarities to English. And it's not really a valid reason to say it would make the editor's job easier, because nothing is required of the editor. If you wish, all you need add are the Maharashtri prakrit ones, and someone else someday will add the others. But I do maintain that they are indeed separate languages. DerekWinters (talk) 00:51, 26 May 2016 (UTC)
Also I do believe that Magadhi was quite divergent from the other two Dramatic ones. DerekWinters (talk) 00:52, 26 May 2016 (UTC)
Also, regarding the Brahmins and the prakrits being a vernacular, they were thus spoken by the people, which would include a lot, if not all of the Brahmins. Classical Sanskrit was a very artificial register and during the prakrit era was most certainly only taught as a second language. Also, there are numerous grammars on the prakrits by native grammarians, so they certainly were not ignored. DerekWinters (talk) 00:57, 26 May 2016 (UTC)
@DerekWinters All your points are very good, and I realize some of my claims were false. However, I still think we should merge them. This is analogous to the situation of the Ancient Greek dialects, where many dialects diverge from the traditional form (Attic Greek) but ultimately we classify them as one language. Our entries are very well organized as a result; see τέσσαρες ‎(téssares), which is what I think a good unified Prakrit entry would look like. Yes, Magadhi diverges quite a bit, and Gandhari uses a different script, and Elu somehow made it to Sri Lanka. However, they all have such similar characteristics that they would be decently comprehensible among monolingual Prakrit speakers. See the text example at Magadhi Prakrit#Pali and Ardhamāgadhī; even though Pali is a wholly earlier and more divergent stage from Prakrit, the two texts are nicely comparable.
Also, the inflection is more than just similar; it is often the same: See this grammar comparing Sauraseni and Maharashtri declensions of "putta" (son, < Sanskrit पुत्र ‎(putra)). —Aryamanarora (मुझसे बात करो) 01:56, 26 May 2016 (UTC)
@Aryamanarora I see where you are coming from with the inflections, but I believe this may be something like the unification of Chinese. Written, they seem similar (although I would argue that the grammars are much more divergent for the Chinese languages), but spoken, a monolingual speaker of one prakrit would have difficulties understanding the speech of a monolingual of another prakrit. I personally believe this is grounds enough to keep them separate, but I understand if the community doesn't agree. But I must caution, if we are to have entries for unified prakrit, we should have inflection tables for all the varieties attested, and we are likely to have citations for the various varieties. Furthermore, with the phonetic differences among certain words, I believe this would lead to very cluttered and messy entries. I believe that all of this information could be better handled in individual entries. DerekWinters (talk) 02:51, 26 May 2016 (UTC)
@DerekWinters While there would be some difficulty in comprehension, I doubt a monolingual Prakrit speaker wouldn't be able to at least understand the gist of another Prakrit. Literature agrees with me; see -sche's references above. We should definitely make inflection tables for all the Prakrits; we have enough information to do so. The phonetic differences aren't too bad. Mainly, there's a little bit of consonant dropping and sibilant mergers between Prakrits, but IMHO it isn't so bad. I can make some inflection tables right now if needed. —Aryamanarora (मुझसे बात करो) 22:26, 27 May 2016 (UTC)
@Aryamanarora I agree that there are similarities, but we also maintain such differences in several languages here, like Portuguese, Galician, and Fala; Spanish, Asturian, Leonese, and Extremaduran; Persian and Tajik; German and Yiddish, etc. You could definitely argue that they are individual languages, but one could also argue that they are simply dialects of one larger language. And you are correct, a monolingual prakrit speaker would probably understand another prakrit somewhat, especially among the educated, but I do not think that is a fair metric, as the educated would have learned Sanskrit, enabling them significant comprehension of any of its immediate daughter languages. But, regardless, we have no way of truly knowing, and as such I think we should maintain the separation that has been held by the writers of the prakrits. They viewed them as separate, and I believe for decent enough reason. DerekWinters (talk) 02:21, 28 May 2016 (UTC)
@DerekWinters Primary sources aren't always reliable for language distinction; look at modern day Serbo-Croatian, Romanian-Moldovan, Hindi-Urdu, etc. You're right though, we really have no way of knowing. I'll stick with the status quo for Prakrit, and continue to treat them as separate languages. —Aryamanarora (मुझसे बात करो) 15:22, 28 May 2016 (UTC)


Since the proposal at Wiktionary:Beer parlour/2016/March#Etymology section for non-lemmas was inconclusive, I've instead created this template to place in non-lemma etymology sections. The displayed text may need improvement; feel free to propose or make changes. —CodeCat 20:23, 23 May 2016 (UTC)

Perhaps it could say something less jargony like "See etymology on main entry/entries." rather than just "Non-lemma forms." (which wouldn't mean much to most readers) Pengo (talk) 10:45, 24 May 2016 (UTC)
I wholeheartedly concur with Pengo. I like their phrasing as well. —Μετάknowledgediscuss/deeds 07:45, 26 May 2016 (UTC)
I think you could say "See etymology on main entry." as a user has only one in mind. Why wouldn't there be a link to the appropriate L2 section or even the appropriate Etymology section? Presumably there is a language parameter in the template. DCDuring TALK 10:53, 26 May 2016 (UTC)

Rename Category:Fictional abilities to Category:Metaphysical abilities

The title states one half of the proposal. The other is to move it to Category:Parapsychology. --Lo Ximiendo (talk) 21:22, 26 May 2016 (UTC)

Parapsychology is a real (pseudo)science that investigates actual events; that term would not be applied to deliberately fictional superpowers like those in comic books. Equinox 21:55, 26 May 2016 (UTC)
By the way, "metaphysical" means "beyond physical". And performing a metaphysical ability, such as telepathy, IS a paranormal activity. --Lo Ximiendo (talk) 21:59, 26 May 2016 (UTC)
One of the best places to hide things you don't want people to take seriously is in fiction. --Lo Ximiendo (talk) 22:00, 26 May 2016 (UTC)
@Equinox: Posted a belated reply. --Lo Ximiendo (talk) 10:26, 27 May 2016 (UTC)

Initialisms of proper nouns that wouldn't meet CFI

What are our criteria for including these? Should they be in lemma categories? DTLHS (talk) 23:49, 26 May 2016 (UTC)

Initialisms are lemmas regardless. They are full noun lemmas after all, and can have their own inflections. —CodeCat 00:16, 27 May 2016 (UTC)
They are not SoP; someone unable or too impatient to work it out from context might want to know what they mean. Whether they are truly useful is more questionable, but by that criterion many entries would be in trouble. DCDuring TALK 01:03, 27 May 2016 (UTC)

cuprum from Cyprium or from Κύπρος

Shouldn't we have entries for expressions like aes Cyprium? Cyprus does come from Κύπρος, but cuprum does not come from it directly; it actually comes from aes Cyprium, or at least from Cyprium. I listed cuprum as a derivative at Cyprium. Sobreira (talk) 08:50, 27 May 2016 (UTC)

I see nothing wrong with having an entry for aes Cyprium. It's not SOP, as "Cyprian brass" does not obviously mean "copper". —Aɴɢʀ (talk) 09:14, 27 May 2016 (UTC)
I think, given the difference in the vowel, that cūprum is an older borrowing. —CodeCat 12:36, 27 May 2016 (UTC)