Wiktionary:Beer parlour/2018/July


Entries for hyphenated attributive forms?

We have entries such as transitive-verb, at-sign, open-book, criminal-law, shoulder-blade, sea-urchin (see a more complete list here). The hyphen being a mere spelling device, I think these are pointless, and I would like to see them deleted.

However, people have argued that using a hyphen turns something into a single word automatically; I disagree with that, and I'm not aware of any policy to that effect. Has there been a vote, or might we need one?

My point is that we should restrict ourselves to creating lexicalised attributive-form entries, such as cookie-cutter (idiomatic meaning, adjectivisation). Per utramque cavernam 15:16, 1 July 2018 (UTC)

I have suggested a vote to the effect that hyphens that are added when a phrase is used attributively would be treated as spaces for the purposes of determining whether the phrase is SOP. It was in the middle of a long discussion last month that you may not have read, and it was marginally off-topic to that discussion. The universe of possible attributive phrases is just too unlimited for us to cover: "6-inch bolts", a "27-foot boat", "Reform-Jewish-rabbi-officiated weddings", etc. Chuck Entz (talk) 16:06, 1 July 2018 (UTC)
@Chuck Entz: I don't remember reading that discussion, no. Where was it?
Found it. Per utramque cavernam 13:59, 2 July 2018 (UTC)
Yes. The actual (attested) attributive forms will only be a tiny subset of all possible combinations, but even then it might be a huge set. Attestation is a necessary condition for having an entry, but I don't think it should be a sufficient one. Per utramque cavernam 17:20, 1 July 2018 (UTC)
OT: I have enough trouble with hyphens appearing in 'vernacular' organism names. Is it blue moor grass, blue moorgrass, or blue moor-grass (all attestable at Google Books) (just to mention one I just ran across)? At least I'm fairly sure that blue-moor grass, bluemoor grass, blue-moorgrass, blue-moor-grass, bluemoor grass, and bluemoorgrass can be ignored. DCDuring (talk) 17:48, 1 July 2018 (UTC)
You've got my vote. DCDuring (talk) 17:48, 1 July 2018 (UTC)
For reference, this was discussed at Talk:transitive-verb#RFDE: All English attributive forms (with hyphens) of noun phrases. Note that treating hyphen as space for the SOP determination is a separate issue; here, transitive verb is kept, yet someone wants to delete transitive-verb, where the sum is not transitive + "-" + verb but rather transitive verb + hyphenation-operator, or the like. --Dan Polansky (talk) 18:54, 1 July 2018 (UTC)
In general I think we should delete entries that are purely for attributive-noun uses, like transitive-verb, but keep entries that can function as non-modifying nouns themselves, e.g. at-sign and probably shoulder-blade (which might be a legitimate British spelling of shoulder blade, as suggested by being listed as an alternative form under shoulder blade). In such entries, I'm undecided about whether to list the attributive use as a possible definition (as it is done under at-sign), and also undecided about cases like open-book, which has both a definition as a non-SOP adjective and a definition as an attributive noun. The logic here is that the hyphen in attributive-noun uses is purely a typographic convention and shouldn't be treated differently from a space. It should be similar to German compounds, where words that function only as SOP compounds aren't included even though written as a single word. Benwing2 (talk) 15:08, 2 July 2018 (UTC)
If "The logic here is that the hyphen in attributive-noun uses is purely a typographic convention and shouldn't be treated differently from a space", then transitive-verb should be kept no less than transitive verb, since, again, "hyphen ... shouldn't be treated differently from a space". --Dan Polansky (talk) 08:40, 3 July 2018 (UTC)
"It should be similar to German compounds, where words that function only as SOP compounds aren't included even though written as a single word": That is not our practice, as per e.g. Talk:Zirkusschule. --Dan Polansky (talk) 08:42, 3 July 2018 (UTC)

Slovenian Pleteršnik orthography

@Atitarev, Guldrelokk, Dan Polansky Slovenian orthography is very confusing, as there are at least two incompatible diacritic systems (see Appendix:Slovene pronunciation). On top of this, Pleteršnik's dictionary uses yet another system that I don't understand; see [1] for an example. Apparently this system encodes a lot of additional dialectal information, but I haven't been able to find a description of this system and I can't read Slovenian. Can anyone help discover what the Pleteršnik symbols mean? Thanks! Benwing2 (talk) 17:53, 1 July 2018 (UTC)

BTW, see [2] for a somewhat blurry image of the page that explains the symbols. It's in Slovenian; maybe someone can read it? Benwing2 (talk) 17:58, 1 July 2018 (UTC)
The description is here. It seems to say:
ẹ and ọ signify close vowels, ę and ǫ signify diphthongs /ie/ and /uo/, which are always long and only occur in stressed syllables. e and o stand for open vowels.
ɐ is [ə].
ł is [u̯].
Three kinds of accent exist: two for long vowels, falling, marked with circumflex, and rising, marked with acute, and one on short vowels, marked by grave. Guldrelokk (talk) 18:12, 1 July 2018 (UTC)
(with e/c) I know very little about Slovenian. I am not sure what you are trying to do. Your link[3] shows "zdẹ̀"; are you trying to figure out how to render that in IPA? I suspect that "zdẹ̀" is not an actual attested form but rather a dictionary-only form adorned to show pronunciation, and that the attested usual form is "zde"; but I don't really know. For example, tukaj is shown in en wikt as "túkaj" and is shown in Pleteršnik as "tȗkaj" per Fran[4]. If I am right, we are not talking orthography but rather forms adorned to show pronunciation. --Dan Polansky (talk) 18:20, 1 July 2018 (UTC)
@Benwing2: It also says that macron in loanwords only signifies length and that these vowels are pronounced as ‘pure’. In the dictionary it seems to work like another kind of accent (there is only one per word), apparently it was pronounced as a long vowel with flat tone. Guldrelokk (talk) 18:52, 1 July 2018 (UTC)
From their description, can you figure out how "ȗ" is to be pronounced, used in "tȗkaj"? --Dan Polansky (talk) 18:56, 1 July 2018 (UTC)
@Dan Polansky: [ûː], i.e. [ú͜u]. Written identically in the tonal orthography from the Appendix, it’s also in the entry: Tonal orthography: tȗkaj (why are the lemmas in the ‘stress orthography’?). Guldrelokk (talk) 19:05, 1 July 2018 (UTC)
@Dan Polansky Perhaps "orthography" is the wrong word; maybe "notation" is better. In this case, the etymology for the Russian entry for здесь mentions Slovenian zde. I would rather cite Slovenian words in etymologies in the tonal orthography if possible, as it conveys more etymological information. However, some words (like this one) are available in Fran only in the Pleteršnik notation, and in that case my choices are either to cite it directly in that notation along with a note indicating that it's Pleteršnik's notation (which links to a page explaining that notation), or to try to convert it to normal tonal orthography. Cf. templates like Template:l/sl-tonal, which is used to cite Slovenian words in the tonal orthography and adds a note indicating that the word is in the tonal orthography, with a link to Appendix:Slovene pronunciation, the page that explains the diacritics. This is necessary because, unlike with Serbo-Croatian, there are (at least) two different possible notations, which are incompatible with each other, so without the note, it would be unclear which notation is being used. Benwing2 (talk) 19:06, 1 July 2018 (UTC)
@Guldrelokk IMO we should always be using the tonal orthography, but I've heard that nowadays most Slovenians pronounce words non-tonally, so they may be more familiar with the non-tonal notation. Benwing2 (talk) 19:08, 1 July 2018 (UTC)
@Benwing2: It seems that the tonal orthography is basically Pleteršnik’s notation, except that the pronunciation somewhat changed: there is (apparently) no more short or unstressed ẹ/ọ, no diphthongs and no ‘flat tone’. No idea what happened to them. Guldrelokk (talk) 19:18, 1 July 2018 (UTC)
@Benwing2, Guldrelokk, Dan Polansky: Late response. I'm not too familiar with the Slovene tonal notation either and when adding Slovene terms in translations, etymologies or reconstructions, I mostly just use the plain spelling, unless it's already defined here by native/advanced speakers. Rather than making/copying a mistake, I prefer to use what can be confirmed. A good Slovene dictionary is [5] - no tonal notations. You can also use monolingual [6] with some tonal notations. --Anatoli T. (обсудить/вклад) 07:12, 2 July 2018 (UTC)
@Benwing2, Atitarev I wonder what monolingual dictionaries use stress notation? Guldrelokk (talk) 13:13, 2 July 2018 (UTC)
@Guldrelokk: If you learn the notation and the phonology a bit, it will give you the stress as well. The notation à la Ali govoríte slovénsko? is probably enough to know how to pronounce Slovene correctly, if you're already familiar with basic phonetic rules. And [7] I mentioned above is probably your best bet online. --Anatoli T. (обсудить/вклад) 13:35, 2 July 2018 (UTC)
@Atitarev: Yes, exactly, the ‘stress notation’ is redundant once you have the ‘tonal’, and the only reason it’s there is its alleged use by natives. However, I see that the monolingual dictionary you pointed out ([8]) uses the ‘tonal’ notation. What do other monolingual dictionaries use? Guldrelokk (talk) 13:45, 2 July 2018 (UTC)
@Guldrelokk: SSKJ2 appears to use both; headwords are in the stress notation but then they put the tonal notation in parens after. See for example [9], which has a whole bunch of dictionaries including SSKJ2 and Pleteršnik. Benwing2 (talk) 14:49, 2 July 2018 (UTC)
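For anyone wanting to experiment with the notation, Guldrelokk's summary above can be sketched as a small lookup over combining marks. This is only an illustration of that summary, not a converter to IPA or to the tonal orthography: the function name `describe_marks` is made up here, and treating the inverted breve of "tȗkaj" as the falling accent is an assumption based on the [ûː] reading given in this thread.

```python
import unicodedata

# Combining marks and the readings given in the summary above.
# The inverted breve (U+0311) is assumed to be the "circumflex" falling
# accent, since that is how "tȗkaj" is written in this discussion.
MARKS = {
    "\u0323": "close vowel",          # dot below, as in ẹ / ọ
    "\u0328": "diphthong",            # ogonek, as in ę / ǫ
    "\u0300": "short accent",         # grave
    "\u0301": "long rising accent",   # acute
    "\u0302": "long falling accent",  # circumflex
    "\u0311": "long falling accent",  # inverted breve (assumption)
}


def describe_marks(word):
    """Return the readings of all known marks found in `word`, in order."""
    decomposed = unicodedata.normalize("NFD", word)
    return [MARKS[ch] for ch in decomposed if ch in MARKS]


if __name__ == "__main__":
    # "zdẹ̀" written with explicit code points: e + dot below + grave
    print(describe_marks("zde\u0323\u0300"))
    # "tȗkaj" with precomposed U+0217 (u with inverted breve)
    print(describe_marks("t\u0217kaj"))
```

The macron, ɐ and ł are left out on purpose, since the thread only partly settles how they should be read.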

Default title for column templates

Views are sought on what the default title for column templates such as {{der2}}, {{der3}}, {{der4}}, {{rel2}}, {{rel3}}, and {{rel4}} should be. @Dan Polansky feels it should be the same as the relevant section heading (e.g., "Derived terms"), whereas I am of the view that it is more useful for the title to be "Terms derived from xyz", "Terms related to xyz", for two reasons:

  • It is (marginally) more useful for the template title to display the root term rather than simply repeat the section heading.
  • Where it is necessary to manually add a title, the practice is to put it in the form "Terms derived from xyz (noun)", "Terms related to xyz (verb)", and so on. Thus, for consistency, the default title should be in the same format.

SGconlaw (talk) 10:33, 2 July 2018 (UTC)

A discussion is at Wiktionary:Beer parlour/2018/June#Display text of Template:der3 and others. In that discussion, there is a post by -sche that makes a lot of sense. I acknowledge that repeating "Related terms" after "Related terms" is not so nice, so I propose other possibilities, like leaving the collapsible bar empty, or saying "Items:", or "List:", or coming up with other options that are user-friendly and non-repetitive. In that discussion, I give an example of how the entry party looks; not so nice. --Dan Polansky (talk) 08:48, 3 July 2018 (UTC)
As for "Terms derived from xyz (noun)", that is another cruft that should ideally be reduced. The derived term section in question is in the noun section, so this does not need to be repeated. By my estimate, the practice originates from some people's taking pleasure in expressly stating obvious things and things of marginal relevance. These are two very different aesthetics. --Dan Polansky (talk) 09:38, 3 July 2018 (UTC)
I have no strong views on the latter point, but would just like to highlight that having a phrase like "Terms derived from xyz (noun)" does make the section easier to locate in a long entry. — SGconlaw (talk) 10:39, 3 July 2018 (UTC)

News from French Wiktionary

Sorry that we skipped two months: we lacked people available to translate our publication into English. This edition is ready, though, and it is a great pleasure to invite you to read the June issue of Wiktionary Actualités!

A lot of content this time! Articles are about examples, a new tool to record pronunciations, the integration of a specialized lexicon offered by its authors, a dictionary of popular words in the 19th century, the linking with Wikipedia, and a story about fishermen and fish. As usual, there are also plenty of metrics, a short summary of some newspaper articles, and unidentified pictures!

This issue was written by nine people and was translated for you by Dara. This translation can still be improved by readers (wiki-spirit). I hope you will find it interesting to know what's up in your neighborhood! Noé 09:11, 3 July 2018 (UTC)

@Noé: Merci! Was wondering what happened. I'm always willing to help out with translation work if needed. – Jberkel 09:39, 3 July 2018 (UTC)
The April and May issues are now translated too! @Jberkel: too bad I only saw your message now; you can check these issues for mistakes, and maybe we could call on you for the next one! DaraDaraDara (talk) 11:12, 4 July 2018 (UTC)

July LexiSession: sauces

This month, the suggested topic is sauces! Because of Caesar sauce, maybe.

In French Wiktionary, we have just started by creating a thesaurus.

LexiSession in short: a collaborative transwiktionary experiment. Several wikis, one shared topic, learning by looking at what our colleagues do. You're invited to participate however you like and to suggest next month's topic. The idea is to look at other communities' improvements on the same topic to improve our own pages and learn foreign ways of contributing. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession. If you can spread the word to other Wiktionaries, you are welcome to do so. Also, sorry, I was very busy these last two months and forgot to notify you. Noé 09:16, 3 July 2018 (UTC)

Improving sorting of items in categories via Mediawiki customization

Currently, in categories, items starting in č are sorted after items starting in z. This is very unconventional. For instance, in Category:cs:Amphibians, čolek is after salamandr. The example is Czech, but a similar problem probably exists for other languages as well.

As a remedy, it seems we could customize the en wikt MediaWiki instance to use uca-default for sorting of items in categories. Czech Wiktionary has this, created via cs:Wikislovník:Hlasování/Změna abecedního řazení v kategoriích. An example Czech category is cs:Kategorie:Česká substantiva; an example Russian category is cs:Kategorie:Ruská substantiva.

One consequence would be that, for instance, instead of č being after z, it would be collated together with c. That is still not the conventional Czech order, in which č is after c rather than being on equal footing, but still seems to be an improvement.

A relevant page is https://www.mediawiki.org/wiki/Manual:$wgCategoryCollation.

Maybe someone knows more and can explain the impact across languages.

--Dan Polansky (talk) 10:14, 3 July 2018 (UTC)

There's no perfect solution: ä, for example, is sorted with a in German and after z (and å) in Swedish. But that would get it closer to how an English speaker would expect it.--Prosfilaes (talk) 03:39, 7 July 2018 (UTC)
Ideally what we need is a mechanism to enable per-category collation: that is, Czech collation for Czech categories, German collation for German categories, and so on; multiple sortkeys per page, for Japanese; and a way to write our own collation algorithms for languages that do not have collation algorithms available, such as Ancient Greek, Egyptian, and Coptic. (See Module:egy-utilities for a makeshift collation algorithm used for Egyptian words in Module:columns, Module:cop-sortkey for one that is used in Coptic categories. Module:zh-sortkey provides a sortkey for Chinese categories, but MediaWiki might have an equivalent collation algorithm that would be available if we had per-category collation.) At least per-category collation has been proposed (see phab:T30397), but I don't know what's happened with it since 2012.
Besides changing the default collation, a workaround is to add some sort_key replacements to the language table. It would be ugly, but I think you could impose the order c < č < d by replacing c with cc and č with . — Eru·tuon 00:06, 8 July 2018 (UTC)
Surely the "natural kludge" would be to use cˇ instead of č, o´ instead of ó, etc. (though of course this doesn't really work for cases like ł). --Tropylium (talk) 10:10, 14 July 2018 (UTC)
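The sort-key workaround discussed above can be tried outside MediaWiki with a few lines of Python. This is a rough sketch under stated assumptions: `category_sort_key` is a hypothetical name, it uses Unicode canonical decomposition (NFD) so that č becomes c followed by a combining caron, which approximates the "cˇ instead of č" idea, and plain code-point comparison stands in for whatever collation the wiki actually applies.

```python
import unicodedata


def category_sort_key(term):
    """Hypothetical sort key: decompose accented letters (NFD) so the
    diacritic follows its base letter, e.g. "č" -> "c" + combining caron."""
    return unicodedata.normalize("NFD", term.lower())


WORDS = ["zebra", "\u010dolek", "cukr", "dub", "salamandr"]  # "\u010dolek" is čolek

if __name__ == "__main__":
    # Plain code-point order puts čolek after zebra, as in the category today.
    print(sorted(WORDS))
    # With the decomposed key, čolek lands after cukr and before dub.
    print(sorted(WORDS, key=category_sort_key))
```

As noted in the thread, this does nothing for letters such as ł, which have no canonical decomposition and so keep their original position.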

Most searched-for entries

Do we have a list of the most searched-for – or even viewed – pages here? I couldn't see anything under the Special Pages list. It would be nice to make a concerted effort to work on things that most people are looking up. Ƿidsiþ 11:43, 4 July 2018 (UTC)

[10]. DTLHS (talk) 16:45, 4 July 2018 (UTC)
Thanks for both the question and the answer. DCDuring (talk) 17:58, 4 July 2018 (UTC)
Nice! Thanks. Ƿidsiþ 06:42, 5 July 2018 (UTC)
Well, it was a good thought, but I think that I'd rather not waste your efforts on improving our pornographic content. —Μετάknowledgediscuss/deeds 06:54, 5 July 2018 (UTC)
:-D — SGconlaw (talk) 06:48, 6 July 2018 (UTC)
I think this is part of the same phenomenon as the rash of bogus "xx" content being added: as far as I can figure out, Wiktionary is being bundled with mobile operating systems in Africa and Asia, and there are lots of users who don't speak English well enough to realize that it's a dictionary and not part of the user interface. They apparently think they're searching the web for porn sites, but they're actually searching Wiktionary. Chuck Entz (talk) 07:31, 6 July 2018 (UTC)
What's odd is that people are searching for Roman numerals like XXXIX and XXIX. Must be typos, I guess! — SGconlaw (talk) 08:05, 6 July 2018 (UTC)
Not odd at all, considering the search engine's auto-suggestion feature. Chuck Entz (talk) 18:28, 6 July 2018 (UTC)
These are views, not searches. So if someone searches for "XXXIX definition" in google (because it's the number of the current Superbowl or something) they may end up here. DTLHS (talk) 18:38, 6 July 2018 (UTC)
In context, this is clearly not about SuperBowls... —Μετάknowledgediscuss/deeds 18:39, 6 July 2018 (UTC)
The February 2018 Super Bowl was LII. DCDuring (talk) 18:54, 6 July 2018 (UTC)
Look at the logs for Abuse Filters 54, 70, and 74 (to start with). Obviously people (probably horny teenagers) from the areas I mentioned are entering a lot of "x"es in the search engine, and when the auto-suggest doesn't land them in actual entries, they're going to the "not found" page, which gives them plenty of buttons for creating entries. Chuck Entz (talk) 19:50, 6 July 2018 (UTC)

Livonian alphabet

After some discussion it seems Livonian (e.g. the word Lețmō) should use ţ with cedilla (like Latvian), and not the Romanian ț (T with comma below). The Latvian-Livonian-English Phrase Book (gramata.pdf[11]) uses the cedilla, but the Tartu University Estonia-Lețkēļ dictionary[12] uses the Romanian Ț in its entries for some reason. --Mikko Paananen (talk) 13:20, 4 July 2018 (UTC)

I think the usage of the Romanian ț is due to technical reasons. I think that Latvian ţ should be used here though, since there are no restrictions like that here. SURJECTION ·talk·contr·log· 23:00, 5 July 2018 (UTC)
I can't imagine what technical reasons would require the use of ț instead of ţ. Unicode-wise, ț was added in a later edition (3.0), whereas ţ was there from the start. Though I'm confused about Latvian; w:Latvian alphabet doesn't show any modified t's. w:ţ says it's only used in a Turkic language.--Prosfilaes (talk) 03:30, 7 July 2018 (UTC)
This is probably an artifact related to how k g n l r with cedilla as also used in Livonian (and Latvian) are rendered with a comma-like diacritic in most fonts; it seems clear that a single palatalization diacritic is what's aimed for here. --Tropylium (talk) 10:16, 14 July 2018 (UTC)
See w:T-comma#Software_support. It was only added in a later Unicode version, and as a result, many fonts did not support it initially, replacing all instances with T-cedilla instead. That is still done for many Romanian texts (at least according to the article). SURJECTION ·talk·contr·log· 18:52, 14 July 2018 (UTC)
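Since the two letters look nearly identical in many fonts, it may help to tell them apart by code point. The sketch below is illustrative only: `to_t_cedilla` is a hypothetical helper, not an existing template or bot, and it simply maps T with comma below (U+021A/U+021B) to T with cedilla (U+0162/U+0163), the direction proposed in this thread.

```python
import unicodedata

T_CEDILLA = "\u0163"      # LATIN SMALL LETTER T WITH CEDILLA
T_COMMA_BELOW = "\u021b"  # LATIN SMALL LETTER T WITH COMMA BELOW

# Map comma-below forms to cedilla forms (capital and small).
COMMA_TO_CEDILLA = {0x021A: 0x0162, 0x021B: 0x0163}


def to_t_cedilla(text):
    """Hypothetical helper: replace T-comma with T-cedilla.

    NFC is applied first so that a decomposed "t" + combining comma
    below (U+0326) is also caught."""
    return unicodedata.normalize("NFC", text).translate(COMMA_TO_CEDILLA)


if __name__ == "__main__":
    for ch in (T_CEDILLA, T_COMMA_BELOW):
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    # "Lețmō" typed with the comma-below letter
    print(to_t_cedilla("Le\u021bm\u014d"))
```

Whether such a replacement should be applied to Livonian entries is exactly the open question above; the code only shows how the two spellings differ.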

Discord server

Hi. I just want to announce again that the English Wiktionary has a Discord server. If you are a Discord user and a Wiktionary editor, we would very much appreciate if you join in via this permanent invite. We would love to have you there. Cheers, and happy editing! PseudoSkull (talk) 05:44, 6 July 2018 (UTC)

Multi-stage borrowing

When a word is borrowed from language A into B, and then from B into C, would you say that C has borrowed the word from both A and B, or just from B? So for example, at kakao (Nahuatl > Spanish > Danish), one could say

From {{bor|da|es|cacao}}, from {{der|da|nah|cacahuatl||cocoa}}.

as I've put, or

From {{bor|da|es|cacao}}, from {{bor|da|nah|cacahuatl||cocoa}}.

depending on whether one thinks Danish can be said to have borrowed from Nahuatl.__Gamren (talk) 14:48, 6 July 2018 (UTC)

The first one. Whether Spanish borrowed from Nahuatl or not doesn't really matter. —AryamanA (मुझसे बात करेंयोगदान) 15:21, 6 July 2018 (UTC)

Entries descending from themselves

On reconstruction pages the different Persian varieties are now normally grouped as descendants of Classical Persian (see, for example, Wiktionary talk:About Persian#Tajiki_Persian_is_not_descended_from_Iranian_Persian.). An example of such a layout is at *wŕ̥kah.

However, Classical Persian has (understandably) not been given separate headings and entries; instead Classical and Modern Persian words are united under the ‘Persian’ heading. See, for example, گرگ. The only difference between links to fa and fa-cls is the language name before the lemma.

Now, a page آهو listed itself as its own descendant. On the talk page @Victar argued that it is correct; it represents the same inheritance of the Modern Persian word from the identical Classical Persian word. However, I don’t think that the layout of reconstruction pages should allow entries to list themselves as their descendants, for the following reasons:

  1. It is confusing; the heading says Persian and one of the descendants is Persian as well, being no different. Moreover, Dari and Tajik words are first listed as regional variants, and then again as descendants; I understand that it’s supposed to represent different instances of the word, one as a modern Persian, one as a Classical, but it is still confusing.
  2. Nor am I aware of any other language that does so. As for Persian, @Victar says it is normal, but I haven’t been able to find any other entries that name themselves their descendants, not even among ones that are descended from Classical Persian on the reconstruction pages, like گرگ‎.
  3. @Victar argued that such descendants are there to be included into Reconstruction pages with {{desctree}}. However, in reality they break {{desctree}}, as @Chuck Entz pointed out on the talk page. Even if this can be fixed, I fear that such an unusual layout may cause other technical issues.

In my opinion, either Modern and Classical Persian should be separated, with one inheriting from another, or Persian entries shouldn’t list the same Persian entries as their descendants. Continuity with Classical Persian can be implied whenever a word is inherited; a modern word descending from Middle Persian or further must have gone through the Classical Persian stage. Likewise, English entries don’t list themselves as descendants meaning that they come from Early Modern English; it is sufficient to provide a further etymology, or label words that have come out of use since EMnE obsolete. In either case the current layout of (Indo-)Iranian reconstruction entries can be kept – they would simply be the only place to distinguish consistently Modern and Classical Persian in the latter case, as they have reasons for that.

Alternatively, if a آهو-like layout be the accepted one, then I think a bot can be made to automatically list every inherited Persian entry as its own descendant, together with useful notes like @Chuck Entz has added.

Guldrelokk (talk) 21:48, 6 July 2018 (UTC)

Continuing off the original discussion linked above, here is an example, building off what was discussed there. On the Old Persian entry 𐎠𐎰𐎥 (a-θ-g) we have the descendent tree constructed as so:
* Middle Persian: (/sang/)
*: Book Pahlavi: 𐮽𐮵𐯋𐮲 (sng), [script needed] (KYPA)
** Bakhtiari: سنگ (sang)
** {{desctree|fa-cls|سنگ|tr=sang}}
Now on the Classical/Modern Persian entry سنگ, we have the descendents like this:
* Iranian Persian: سنگ (sang)
* Tajik: санг (sang)
* Coptic: ⲃⲁⲥⲛϭ (basnc)
* → Hindustani:
** Hindi: संग (sang)
** Urdu: سنگ (sang)
* Ottoman Turkish: سنگ (seng)
  1. I've added a fa-ira etymology code, per @Calak's example in the previous discussion. If that isn't clear enough, I'm not vehemently opposed to having some text in Persian descendant sections that reads (Descendents listed reflect that of Classical Persian). Most borrowings from Persian are from the Classical period, which means that virtually all Persian descendents sections would have that note above, so I do find it a tad excessive.
  2. The example of this, which is the basis of the Persian model we currently use, can be seen on Latin entries where we find descendants in the form of Medieval Latin, Late Latin, etc.
  3. There is nothing mechanically "broken" about this method, if that's what you mean, as you can see it working just fine on both pages.
--Victar (talk) 00:28, 7 July 2018 (UTC)
I see 𐎠𐎰𐎥 (a-θ-g) works fine, but *HaHĉúkah doesn’t. Guldrelokk (talk) 00:36, 7 July 2018 (UTC)
@Guldrelokk:, it's broken because an {{rfc}} tag got thrown in there. --Victar (talk) 00:39, 7 July 2018 (UTC)
Good. Even if the technical issues are resolved, others are not. Latin entries do not list ML and LL forms as their descendants when they are identical, so I don’t see how it’s comparable. Guldrelokk (talk) 00:40, 7 July 2018 (UTC)
And yet we do do that, especially on reconstructed entries, like *blavus. --Victar (talk) 00:53, 7 July 2018 (UTC)
What about non-reconstructed entries? These are very different cases: *blāvus and blāvus are different entries, Classical Persian: سنگ and Iranian Persian: سنگ are one and the same. Guldrelokk (talk) 01:00, 7 July 2018 (UTC)
Another point: your new solution makes {{fa-regional}} redundant. The ‘descendants’ will always be the same; either of these will have to go. Guldrelokk (talk) 01:00, 7 July 2018 (UTC)
Both Latin headers, both using the same language code la, both identical. There are non-reconstructed examples, as I know I've made some, but hard to sift through thousands of entries; I'll look though. It makes a difference, for example, in Latin descendants from a -w- form, and those from a -v- form.
Not really. It would generally only be used on pages with Classical Persian borrowings. --Victar (talk) 01:12, 7 July 2018 (UTC)
If the younger descendants are only for pages with Classical Persian borrowings and for others {{fa-regional}} will suffice, then I don’t see what additional info they provide by duplicating {{fa-regional}} and introducing confusion by descending from themselves. If it’s important to show that the borrowings are from Classical Persian (if it indeed is important in general in Persian entries), then a note can be added that ‘the following words were borrowed at the Classical Persian stage’. Similar entries having distinct layouts are bad for consistency: it can confuse readers as well as less experienced editors, who may list New New Persian descendants in entries without borrowings and omit them in entries with borrowings. Guldrelokk (talk) 01:21, 7 July 2018 (UTC)
And no, *blāvus and blāvus are not identical: mind the link colours. Guldrelokk (talk) 01:26, 7 July 2018 (UTC)
You've already given your opinion, and I mine. Let others chime in so we're not just discussing this in a circle. --Victar (talk) 01:29, 7 July 2018 (UTC)
I don't think that showing descendants that are all the same as the headword and have no descendants themselves is at all useful- it doesn't explain anything, and the same information is included in the regional template- it feels like a tautology. In cases where dialectal variation at the Classical Persian stage is reflected in differences among the regional forms, or where there's borrowing from one of the descendants into another language, that's something you would want to show. Chuck Entz (talk) 02:44, 7 July 2018 (UTC)
I agree, @Chuck Entz. I'm certainly not advocating adding Iranian Persian, Tajik, and Dari to the descendents section of all Persian entries, as that would be needlessly redundant. This only becomes an issue when we have borrowings from Classical Persian or from Dari and Tajik, i.e. whenever a Persian entry merits a descendents section. --Victar (talk) 03:57, 7 July 2018 (UTC)
My main concern was with the state of the entry when I saw it and the complete lack of explanation in it of what was descending from what. Guldrelokk's initial reaction is basically what I would expect from any of our readers who don't know the finer points of Persian's historical stages and dialectology. Changing the language name in the descendants to "Iranian Persian" was helpful, and better than the qualifier method I used, but I'm still concerned that it's a bit opaque to the average reader. My edits were just a quick mock-up to show what I was talking about- the usage note, especially, is probably overkill.
As for my comment that "{{desctree}} can't use such things": it was based on a quick (mis)reading of the code and the comment about substituting to prevent template loop errors. When I looked at it again it was obvious that it was just substituting a module invocation for a template invocation that did the same thing. I never said anything about it being broken, just that (I thought) the code had a safety mechanism that prevented it from working. I still don't see how it avoids infinite recursion, but I'm not all that good with Lua. Chuck Entz (talk) 02:31, 7 July 2018 (UTC)
@Chuck Entz: The constraint that the module is avoiding is in the parser: a template isn't permitted to contain another instance of itself (for instance, if you put {{sandbox}} on its own page), and a module that's invoked on a page can't expand another instance of that page. But apparently you can have an invocation of a module function print the result of preprocessing another invocation of the same function. (I tested this with Module:doublet table. It just made the tables in Appendix:English doublets disappear; the preprocessing generated the empty string once for each invocation of the function. No recursion. Maybe I did something wrong, or the developers made the loop terminate somehow.) — Eru·tuon 05:49, 7 July 2018 (UTC)
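To make the recursion worry concrete, here is a toy model of a descendant-tree expansion in which a page lists itself among its descendants. It is not how {{desctree}} or its Lua module is implemented; the `PAGES` table and the `expand` function are invented for illustration, and the guard shown (skip any page already on the current expansion path) is just one generic way to keep such a walk finite.

```python
# Toy data: page title -> list of (language name, page title) descendants.
# The سنگ page lists "Iranian Persian: سنگ", i.e. itself, as a descendant.
PAGES = {
    "𐎠𐎰𐎥": [("Classical Persian", "سنگ")],
    "سنگ": [("Iranian Persian", "سنگ"), ("Tajik", "санг")],
}


def expand(pages, title, depth=1, path=()):
    """Return wikitext-style list lines for the descendants of `title`.

    A descendant whose page is already on the expansion path is printed
    but not expanded again, so a self-listing page cannot loop forever."""
    lines = []
    here = path + (title,)
    for language, child in pages.get(title, []):
        lines.append("*" * depth + f" {language}: {child}")
        if child not in here:
            lines.extend(expand(pages, child, depth + 1, here))
    return lines


if __name__ == "__main__":
    for line in expand(PAGES, "𐎠𐎰𐎥"):
        print(line)
```

Without the `child not in here` check, expanding the self-listing page would recurse until Python's recursion limit is hit, which is the kind of loop the parser restriction described above is meant to rule out.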
It seems to me like there is a technical gap. {{alter}} could perhaps accept additional parameters which would mark alternative forms as being parents, children, or siblings. Forms created by {{fa-regional}} are probably too loosely connected and should go into the {{fa-noun}} template so they can be picked up appropriately. Perhaps this would be realized in a way generalized for pluricentric languages by saving siblings into language data (being effective aside from Persian for Hindustani, Serbo-Croatian, perhaps Aramaic …).
Without wise technical solutions discord will stay real. What is wanted in the end is a “semantic web” where the logic as imagined by the dictionary editor can also be picked up by machines and displayed in a different fashion (as in descendant trees). (A drawback would be that correct wikitext would become increasingly byzantine for new editors.)
In any case of course nobody wants to read duplicated entries. Fay Freak (talk) 02:34, 7 July 2018 (UTC)
I'll sit on the fence for now - interested in the outcome, though. Notifying ZxxZxxZ, Dijan, Irman, Kaixinguo~enwiktionary, @Vahagn Petrosyan. --06:49, 7 July 2018 (UTC) —This unsigned comment was added by Atitarev (talkcontribs).
Personally I think listing Modern Persian descendants on Modern Persian entries is somewhat redundant. However, if a term is only used in Classical Persian and has a different Modern Persian descendant, then the Classical Persian entry should have the Modern Persian listed as a descendant. But yeah, IMO having "entries descending from themselves" is unnecessary. —AryamanA (मुझसे बात करेंयोगदान) 16:08, 7 July 2018 (UTC)
@AryamanA: The true redundancy is having all the borrowings from Classical Persian manually duplicated on the Persian and OP/PIr entries. Imagine if we had to do that for Sanskrit. Also, when we have borrowings from CP, Dari, Tajik, Tati, etc., we run into the same problem as in the original discussion, in having it look as though Tajik and the CP borrowings descend from Modern Persian. I think it is better to just have the descendants section on Persian entries represent CP, so we can be consistent and clear. And again, this only applies when we have borrowings from chiefly CP and Tajik; otherwise there would be no descendants section. --Victar (talk) 18:30, 7 July 2018 (UTC)
Repeating my stuffed-up call to Persian speakers and Vahagn, people who might be interested: ZxxZxxZ, Dijan, Irman, Kaixinguo~enwiktionary, @Vahagn Petrosyan. --Anatoli T. (обсудить/вклад) 02:08, 10 July 2018 (UTC)
As noted by others, Latin is not a good model in this matter for other languages; this new system causes redundancy for Persian as borrowings from language variants other than Classical Persian are rare (the same goes for Hebrew and Arabic). In those rare cases, we can simply use something like {{qual|via Iranian Persian}} in the descendants section.
Entries having the header "Persian" correspond to Classical Persian, Iranian Persian, Dari, and other regional forms of Persian that use the Perso-Arabic script. {{fa-regional}} should be probably deprecated in favor of {{alter}}. This model has already been used for other languages as well, e.g. Ancient Greek. Descendants section would cover the descendants from all of these variants of Persian, in practice, it reflects mostly descendants of Classical Persian. Rare cases from other forms can be indicated using {{qual}} as I noted earlier or by other similar means. --Z 08:50, 10 July 2018 (UTC)
@ZxxZxxZ We are currently using the Latin model, which is treating Classical Persian and Modern Persian as the same language. Did you mean something else?
You haven't addressed the problem of CP borrowings appearing as from Modern Persian and the original discussion of Tajik appearing as a descendant of Modern Persian when it has borrowings and we add it to the descendants section, which is surprisingly quite common. You also haven't addressed the content duplication of borrowings on PIr/OP entries and Persian ones without the use of {{desctree}}. Do you have any thoughts on those points? --Victar (talk) 16:39, 10 July 2018 (UTC)
Let me clarify my previous comment: "Persian" in wiktionary should refer to all forms of Persian (including Classical, Iranian, and Dari Persian), except Tajik. This has been our practice for a long time. This includes the descendants section, so seeing "Persian" in this section does NOT refer to for example modern Iranian Persian unless otherwise stated (a rare situation). So in my view the problem you mentioned does not actually exist.
Latin is good as a model in that particular area you mentioned right above, I meant it is a bad model in the descendant section, because, as I understood, it's not uncommon to have many descendants from each form of Latin for a single lemma. This is not the case for most other languages, and causes redundancy.
I'm not exactly familiar with the functionality of {{desctree}} and how and why it is causing problem here, so I can't comment on this. --Z 17:27, 10 July 2018 (UTC)
@ZxxZxxZ, OK, so let's use a real world example of Persian خرما (xormâ).
  1. As you can see, all the root borrowings are actually from CP, but to the untrained reader, they appear to be borrowed from Modern Persian.
  2. Tajik is also listed because of its borrowing into Uzbek, which again, makes it appear as descended from Mod. Persian. Now you could argue that the Uzbek borrowing should just be on the Tajik page, but that would a) hide the borrowing away from readers, and b) be contrary to the premise that Persian reflects both Mod. Persian and CP.
The fact is, readers are always going to assume Persian means Mod. Persian because we give them little to no indication otherwise. Personally, I think the only solutions are to
a) treat all Persian descendants lists as CP,
b) treat all Persian descendants lists as Mod. Persian,
c) have a note at the top of all Persian descendants lists specifying that it reflects one or the other, or
d) split Mod. Persian entries from CP, aka the Armenian method.
For the functionality of {{desctree}}, see 𐎠𐎰𐎥 (a-θ-g) and سنگ. Hopefully, that is enough for you to understand its use and the problem at hand. --Victar (talk) 17:52, 10 July 2018 (UTC)
On the other hand, that untrained reader may also be confused by seeing the names "Dari" and "Iranian Persian". Indeed, many times we follow this practice of mentioning the language variants simply as a poor replacement for more accurate information regarding the time of the borrowings. Instead, I suggest adding a new feature to {{desc}} to add this information (year or century, e.g. "before 14th century"). We ultimately should be adding such information in Wiktionary. Doing that would automatically eliminate such problems with Persian and other languages. --Z 18:27, 10 July 2018 (UTC)
@ZxxZxxZ, I'm not understanding. Could you give us an example of your suggestion in the context of the Persian خرما (xormâ) and {{desctree}} problems mentioned? --Victar (talk) 19:40, 10 July 2018 (UTC)
See my last edit there. This way it becomes clear to all readers that it's not a modern borrowing. --Z 11:52, 11 July 2018 (UTC)
Thanks, @ZxxZxxZ. So basically you're recommending that we should add the date prefix [11-15th century] to every CP borrowing in descendants lists. Although I do think adding the date of the earliest attestation of a borrowing is a good idea (I do that for Frankish borrowings), I don't think it's a very elegant solution, nor does it address the Tajik or {{desctree}} issues. --Victar (talk) 14:53, 11 July 2018 (UTC)
Why is it even important to indicate that the borrowing is from Classical and not Modern Persian? If we don't make this distinction explicit in our Persian entries why make it explicit in descendant trees? Crom daba (talk)
For the same reasons made here. --Victar (talk) 17:54, 11 July 2018 (UTC)
Political correctness? That's a drag. Dating prefixes don't sound so bad, if we have the necessary data I'd say go for it. Crom daba (talk) 18:19, 11 July 2018 (UTC)
Accuracy, clarity to readers, etc. As I pointed out, date prefixes don't address the various other issues listed above. --Victar (talk) 18:54, 11 July 2018 (UTC)

More entries than English Wikipedia

As of now, and for the first time ever, we have more mainspace entries than Wikipedia. That makes us better than them. Now who can delete their main page to show them who's really in charge around here? DTLHS (talk) 02:51, 7 July 2018 (UTC)

THIS. --Victar (talk) 04:11, 7 July 2018 (UTC)
They say the main page is undeletable. They say it can't be done. But *handing out briefing materials, playing suspenseful music* we're assembling a team to do it.
  • SemperBlotto: the lookout. Ever-watchfully patrolling RecentChanges here, he has the skills to keep a lookout for any admins over on 'pedia who might get in our way.
  • Wonderfool: the demolition specialist. He knows how to delete pages that shouldn't be deleted. Pages that "can't" be deleted. He can evade any blocks and get us inside, especially with the help of...
  • BD2412: the inside man. He's been an admin on WP since 2005. Studying them. Gaining their trust (and sometimes their ire, like any rouge admin). He can edit and unprotect protected pages.
  • Equinox: the getaway driver. I don't know how we're gonna incorporate a getaway car into this, but most of the movies I've seen about this kind of thing have one, so we're bringing one. :)
  • Other spaces are still available: volunteer below.
The devs have made it impossible to use the "delete" function on Wikipedia's Main Page, but our haxx0rs have found a backdoor: replace the text of the page with the text of MediaWiki:Noarticletext.
</joke> don't ban me WMF...
- -sche (discuss) 05:49, 7 July 2018 (UTC)
Wow! It's slightly unfair though because we have one or two non-English entries and they have none. Equinox 12:47, 7 July 2018 (UTC)
No it is very unfair because we have a lot of bot-created entries that only contain non lemma forms. Dixtosa (talk) 13:20, 7 July 2018 (UTC)
Hmm, I wonder if there should be an entry for rouge admin? Imaginatorium (talk) 14:40, 7 July 2018 (UTC)
If it's attested outside WP. Otherwise, w:Wikipedia:Rouge admin covers it. - -sche (discuss) 17:28, 7 July 2018 (UTC)
Conversely, Wikipedia has lots of entries like w:List of Indian states and union territories by literacy rate, w:List of Indian states and union territories by GDP, w:List of Indian states and union territories by access to safe drinking water, w:List of Indian states and territories by highest point and w:List of Indian states and territories by Human Development Index, in addition to its entries on the actual states themselves. - -sche (discuss) 17:28, 7 July 2018 (UTC)
Wikt entries that have made me laugh today: National Teacher Appreciation Week. Equinox 17:29, 7 July 2018 (UTC)
  • -sche...you forgot to add <joke>. BTW, apparently WF has already deleted the WP Main Page, if this article is to be believed. --Harmonicaplayer (talk) 15:14, 9 July 2018 (UTC)
  • Also, I hope we have told the whole world about this feat - on our Twitter page, Facebook page, Instagram feed, in the Online Club of Dictionaries, on Wikicommons, and the Wikipedia Signpost itself. I'll do my bit and try to have it published in El Pais. --Harmonicaplayer (talk) 15:18, 9 July 2018 (UTC)
  • Yes, I could literally delete the Wikipedia main page. It is highly unlikely that I would do that, as I have zero familiarity with Equinox's getaway driving skills. By the way, Dixtosa, Wikipedia has millions of bot-created entries that have nothing but, e.g., census data for obscure localities. bd2412 T 19:42, 8 July 2018 (UTC)

Are comparative or superlative forms lemmas?

There seems to be some inconsistency when it comes to entries: in Finnish alone, there are ones marked as lemmas: katalin, and ones that are not: kallein. (The exceptions would naturally be if the forms are themselves used in some idiomatic way) SURJECTION ·talk·contr·log· 10:20, 7 July 2018 (UTC)

I don't think they should be. Big is the lemma; bigger/est are inflections. Equinox 12:46, 7 July 2018 (UTC)
It is probably better to unify either way for most languages. One could argue that because they can be inflected (in Finnish at least), they could be classified as lemmata, although I also think that they shouldn't be classified as such. Anyone got a bot lying around? SURJECTION ·talk·contr·log· 12:47, 7 July 2018 (UTC)
It seems Finnish and Spanish are primarily affected; I tried to check the comparative and superlative categories of other languages, and they do not seem to be set as lemmas. SURJECTION ·talk·contr·log· 13:59, 7 July 2018 (UTC)
Update: Also affects the adverbs. Russian adverb comparatives are also set to be lemmas, when they should not be. SURJECTION ·talk·contr·log· 14:03, 7 July 2018 (UTC)
I have started working on the Finnish entries - Russian and Spanish seem more numerous, so it is probably better to automate the conversion there. The head templates are what needs to be changed. SURJECTION ·talk·contr·log· 16:06, 7 July 2018 (UTC)
In Ancient Greek, many comparative and superlative adjectives are treated as lemmas. I think this makes more sense than it does in English, because they have inflected forms of their own, and a few adjectives have more than one comparative associated with them, sometimes with a different range of meaning. For an extreme example, see the bottom of the declension table for ἀγαθός (agathós), which currently lists six comparatives and five superlatives. I agree that consistency is a good idea, but would request that you get agreement from the editors who've worked hardest on a language before making any changes. For the record, I prefer treating Ancient Greek comparative and superlative adjectives as lemmas. — Eru·tuon 18:26, 7 July 2018 (UTC)
I did actually point this out a bit earlier: "One could argue that because they can be inflected -- , they could be classified as lemmata". The reason why I do not believe so, though, is that how to actually derive the comparative and superlative forms is usually predictable and very much resembles how inflection works, making them inflected forms instead. SURJECTION ·talk·contr·log· 18:32, 7 July 2018 (UTC)
@Surjection: I guess comparatives and superlatives are usually predictable (in English and Ancient Greek at least), but I'm not sure if that is a feature that is used to distinguish inflected forms from derived forms. — Eru·tuon 20:02, 7 July 2018 (UTC)
The difference is made based on the words you can logically do it to. Most adjectives have comparatives and superlatives, with uncomparable adjectives being the exception. Being comparable is the status quo. That is the opposite for derived forms, where not being able to derive from a specific word is the status quo. The existing categories too say that comparatives are "adjectives that are inflected to display relative degrees of given qualities between nouns". SURJECTION ·talk·contr·log· 20:06, 7 July 2018 (UTC)
Okay, that makes more sense. I am not sure how to verify "most adjectives are comparable" though. I'm guessing that, for English at least, that would have to include phrasal comparatives like more fun, most fun (as opposed to the silly-sounding funner, funnest). — Eru·tuon 20:21, 7 July 2018 (UTC)
Based on the English entry for fun listing those as the comparative and superlative, I would assume they are included. SURJECTION ·talk·contr·log· 20:24, 7 July 2018 (UTC)
Well, despite that, I think funner and funnest usually sound silly, as if they are almost ungrammatical. I have no idea why, because short adjectives usually can have totally normal-sounding comparatives. But longer adjectives such as intelligent usually don't have synthetic comparatives. (Intelligenter, intelligentest sound even sillier than funner and funnest. That is, they are felt as more ungrammatical.) So, while I do think synthetic comparatives and superlatives in English can be considered inflectional forms, or at least that it is traditional to do so, and would be most practical to categorize them as such on Wiktionary, I'm not sure about the generalization that adjectives are comparable by default. — Eru·tuon 20:36, 7 July 2018 (UTC)
The fact that Category:English comparable adjectives is not a thing but Category:English uncomparable adjectives is should be sufficient evidence to say that comparable adjectives are the default. (This also applies to other languages) SURJECTION ·talk·contr·log· 20:41, 7 July 2018 (UTC)
There are various factors involved: I'd say, roughly, -er, -est are likelier to "sound right" on words that are older, Germanic, and have fewer syllables. Comparability is certainly not the default for long, modern, Latinate scientific words as found in biology/chemistry. Equinox 20:45, 7 July 2018 (UTC)
Scientific words tend to be uncomparable due to their rigorous definition, as well as the fact that many describe a "set" and you cannot really compare the degree something is included in a black-and-white set like that. SURJECTION ·talk·contr·log· 20:48, 7 July 2018 (UTC)
I don't agree: if something can be "rounder", why not "*subovater"? If "smaller", why not "*microscopicer"? Equinox 20:50, 7 July 2018 (UTC)
"more subovate", "more microscopic". I did not say all scientific words are uncomparable, but that they tend to be. SURJECTION ·talk·contr·log· 20:51, 7 July 2018 (UTC)
Well, category structure is based on a variety of concerns besides the linguistic concern of which state is the default. If the number of entries is any guide, uncomparable is the default in English because there are somewhat more adjectives in the uncomparable category (63,451) than outside it (116,594 - 63,451 = 53,143). — Eru·tuon 21:15, 7 July 2018 (UTC)
The large number of uncomparable adjectives is due to two distinct reasons: 1. large number of uncomparable scientific terms and 2. nationality and language terms (which are naturally not comparable). For other languages, there are more comparable than uncomparable adjectives. Beyond that, most basic adjectives in everyday use are comparable. SURJECTION ·talk·contr·log· 21:22, 7 July 2018 (UTC)
Okay, I guess that makes sense. (Though some nationality adjectives are given as comparable, like English and Russian: after all, one can display more of the typical characteristics of a nationality.) — Eru·tuon 22:24, 7 July 2018 (UTC)
It actually would seem Ancient Greek is different - no "adjective comparative form" categories, but "comparative adjective" categories. Whether that is done should be decided on a language-by-language basis, and if we are going to do that, this is probably the time to decide for some languages. SURJECTION ·talk·contr·log· 18:38, 7 July 2018 (UTC)
No, there actually are adjective comparative forms and adjective superlative forms categories for Ancient Greek: see Ancient Greek adjective comparative forms and Ancient Greek adjective superlative forms. Remember to look under adjectives for comparative adjectives and superlative adjectives and under adjective forms for adjective comparative forms and adjective superlative forms. — Eru·tuon 19:37, 7 July 2018 (UTC)
I did actually find the former category later, and it only has a single entry, which is not an adjective comparative form but a comparative adjective form; it's an inflected form of a comparative adjective. As to the latter category, it seems inconsistent; I cannot find a rule to differentiate between the entries at Category:Ancient Greek adjective superlative forms and ones under Category:Ancient Greek superlative adjectives. SURJECTION ·talk·contr·log· 19:43, 7 July 2018 (UTC)
@Surjection: Oh, you're right. μεῖζον (meîzon) is the only entry in Ancient Greek adjective comparative forms, and it is the neuter form of μείζων (meízōn), the comparative of μέγας (mégas). That reminds me of another concern: if comparatives and superlative adjectives are categorized as adjective comparative forms and adjective superlative forms, what will we name the category for their inflected forms? (And are there practical difficulties with having a three-link chain: inflected forms of comparative or superlative forms of adjectives? Not sure.) — Eru·tuon 19:53, 7 July 2018 (UTC)
All of that will depend on whether we will consider comparatives or superlatives lemmas or not. If we do, comparative adjectives > comparative adjective forms, while if we don't, leaving only comparative adjective forms, we will probably have to rename to something else, like adjective comparatives. SURJECTION ·talk·contr·log· 19:56, 7 July 2018 (UTC)
And then there's words like northernmost, which can be turned around to read "most northern". DonnanZ (talk) 21:09, 7 July 2018 (UTC)
I meant, if comparatives and superlatives are not categorized as lemmas, then inflected forms of comparatives are a non-lemma form of a non-lemma form of a lemma. That is confusing. I think it is less confusing to treat Ancient Greek comparatives and superlatives (not English ones though) as lemmas. A similar case is participles, which are inflected forms of verbs, but in some languages have their own inflected forms. But actually participles are categorized as non-lemma forms that have their own non-lemma forms; there is no lemma–nonlemma split for participles. — Eru·tuon 21:15, 7 July 2018 (UTC)
I do not really find it that confusing - polysynthetic languages could go even further than that. Drawing the line between lemmas and non-lemmas based on whether they can be inflected comes across as somewhat disingenuous, as English adjectives are an exception here - in many languages comparative and superlative forms at least have plural forms. I would be completely okay with having comparatives and superlatives be adjective forms, while those categories would have their own subcategories for inflected forms of those. SURJECTION ·talk·contr·log· 21:30, 7 July 2018 (UTC)
Yeah, well, I can't comment on how to treat polysynthetic languages, because I haven't really studied any. — Eru·tuon 21:47, 7 July 2018 (UTC)
I've studied a couple, but not in the depth to help much here (and not recently- I'm fuzzy on the details)- @Stephen G. Brown could give you chapter and verse. Basically you may have one undisputed lemma, and then you have concentric layers of affixes that, depending how you look at it, could be derivation, inflection, or even parts of complete sentences- in lots and lots of different combinations. I remember Dr. Bright pronouncing for our class many years ago a string of 13 consonants, which he said was a single Bell Coola "word" for "I saw those two women come this way out of the water". Suffice it to say, you don't want to even try a binary distinction like this for polysynthetic languages- that way lies madness! Chuck Entz (talk) 23:28, 7 July 2018 (UTC)
As for participles, weelllllll... SURJECTION ·talk·contr·log· 21:33, 7 July 2018 (UTC)
Based on this, my proposal is: Category:LANGUAGE comparatives and Category:LANGUAGE superlatives, both of which are under Category:LANGUAGE adjective forms (and therefore not lemmata), with both having their respective Category:LANGUAGE comparative forms and Category:LANGUAGE superlative forms subcategories for inflected forms of such. SURJECTION ·talk·contr·log· 21:37, 7 July 2018 (UTC)
Hmm, but then where do you put comparative and superlative adverbs? — Eru·tuon 21:42, 7 July 2018 (UTC)
That is a good point, maybe the categories need the actual part-of-speech after the language to get Category:LANGUAGE adjective comparatives and Category:LANGUAGE adverb comparatives. I will admit that is a bit of a mouthful (especially the forms subcategories), but it is still better than the status quo or classifying comparatives or superlatives as lemmata. SURJECTION ·talk·contr·log· 21:44, 7 July 2018 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Well, comparative adverb and comparative adjective sound better. The reverse order seems quite awkward; I doubt it's very often used, if at all. — Eru·tuon 21:55, 7 July 2018 (UTC)

That is also good. Based on a quick Google search, it is actually used somewhat often, albeit comparative adjective is not as common as adjective comparative. SURJECTION ·talk·contr·log· 22:01, 7 July 2018 (UTC)
Wait no, that was all wrong. comparative adjective is more common and is the better option here. So LANGUAGE comparative adjectives and LANGUAGE comparative adjective forms. SURJECTION ·talk·contr·log· 22:02, 7 July 2018 (UTC)
Since this would be quite a major change, it is probably a good idea to create a vote. SURJECTION ·talk·contr·log· 22:10, 7 July 2018 (UTC)
Created: Wiktionary:Votes/2018-07/Restructure comparative and superlative categories. SURJECTION ·talk·contr·log· 22:28, 7 July 2018 (UTC)
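The naming scheme settled on above ("LANGUAGE comparative adjectives" plus "LANGUAGE comparative adjective forms") can be written out as a small helper. This is a sketch only: the function name is hypothetical, and extending the same pattern to superlatives and adverbs is an assumption drawn from the discussion, not something the vote text is quoted as saying here.

```python
def proposed_categories(language, pos, degree):
    """Return (category for the comparative/superlative, category for its inflected forms).

    Hypothetical helper illustrating the naming pattern discussed in the thread.
    """
    if pos not in ("adjective", "adverb"):
        raise ValueError(f"unsupported part of speech: {pos}")
    if degree not in ("comparative", "superlative"):
        raise ValueError(f"unsupported degree: {degree}")
    base = f"Category:{language} {degree} {pos}"
    # e.g. "... comparative adjectives" and "... comparative adjective forms"
    return f"{base}s", f"{base} forms"
```

For example, `proposed_categories("Finnish", "adjective", "comparative")` gives the pair "Category:Finnish comparative adjectives" and "Category:Finnish comparative adjective forms".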


Do we want systematic names of isotopes, like uranium-235 and oxygen-18? Including theoretical ones, a great deal of these can be attested, but I don't see them as being of lexical interest. (Some isotopes, like deuterium, have a special name that should obviously be kept.) —Μετάknowledgediscuss/deeds 04:00, 10 July 2018 (UTC)

  • I think that the hyphen makes them a "word in a language" so we should keep them. I went through a phase of creating lots of them some years ago but got bored. SemperBlotto (talk) 04:04, 10 July 2018 (UTC)
The contents of the ones that are formed systematically as element - number seems to be predictable from the entry name, so they seem lexically uninteresting; even pronunciation information is coverable by the entries for the element and the number. They seem as (mostly) useless as 58-degree (angle or day), 59-degree, etc (other hyphenated strings), so I am inclined towards deleting them. They also seem (mostly) harmless, so I don't feel too strongly about deleting them. (But it would be absurd, IMO, to keep these but delete attributive-hyphen forms.) - -sche (discuss) 05:20, 11 July 2018 (UTC)
Yes, they are predictable - so are most plurals. All words should be treated alike. Either keep them as well as attributive-hyphen forms (if they could pass RfV) or delete both. SemperBlotto (talk) 05:27, 11 July 2018 (UTC)
I'd favor not including them. They are predictable and uninteresting. It is also hard to imagine a human looking them up on Wiktionary. Almost any compound of a word and any of a range of numbers would seem unworthy of inclusion, though there could conceivably be exceptions. Obviously an expression like cloud 9 would be different, but 9 is, I think, the only number that can occupy that slot to create an expression with a distinctive meaning. DCDuring (talk) 14:20, 11 July 2018 (UTC)

Global preferences are available

19:19, 10 July 2018 (UTC)

Live vlogging fr.WT

Lyokoï has been occasionally live-editing fr.WT on video as an introduction and contributor recruiting project. The next event is scheduled for 12 July at 20:30 (not sure if that is UTC, url has a countdown) on YouTube. Commentary and editing in French, of course. - Amgine/ t·e 16:52, 11 July 2018 (UTC)

Do German participles get their own inflection tables?

I tried looking it up in the archives, but I couldn't find a clear answer. I'm talking about regular attributive verbs, where the adjectival form has the same meaning as the verb.

  1. Do German participles get their own (non-comparative) inflection tables?
  2. If so, does the inflection table go in the existing Verb section or a new Adjective section?

Mofvanes (talk) 20:05, 11 July 2018 (UTC)

I tried to check what Finnish does, and it seems inconsistent... some participles have no declension tables, others do under the Verb section, others have their own adjective section and a declension table there, some have a "Participle" section... SURJECTION ·talk·contr·log· 15:50, 12 July 2018 (UTC)
Examples of all four: juotu, ajettu, keitetty, hakkeroitu. SURJECTION ·talk·contr·log· 15:54, 12 July 2018 (UTC)

Moving all Volapük entries to the appendix

A few months ago all Lojban entries were moved to the appendix. I think this should also happen to all Volapük entries. In the category Category:Volapük lemmas there are 2643 entries, but only one of them has any citations. There are currently 27 entries on the page Wiktionary:Requests for verification/Non-English and I doubt any of them will pass. Maybe it would be better if everything would be moved to the appendix instead. Robin van der Vliet (talk) (contribs) 15:46, 12 July 2018 (UTC)

My small contribution to this is that you should do a great job of communicating this if you do so. I felt hurt that the Lojban words were moved when I was busy with other non-Wiktionary business. I was actively editing, came back after a few months, and couldn't find out what had happened to my hard work. I understand the desire to not spend energy and time on a language that you don't know and don't use, I just would like to register that rare languages mean the number of people actively working on them is small, and there needs to be a LOT of communication to make sure the ones who care can find out about changes. Jawitkien (talk) 17:51, 12 July 2018 (UTC)
@DtheZombie, Lingo Bingo Dingo, Lunaris filia, Malafaya, Nielsheur, Pereru, Raekmannen: I would like to invite you all to this discussion, as you all have indicated Volapük in your Babel box. Robin van der Vliet (talk) (contribs) 18:02, 12 July 2018 (UTC)
The fact that a lot of Volapük words have had to be sent to RFV is not a reflection of the corpus so much as the fact that they were all created by a single problematic editor who seems to have made many of them up on the spot. Volapük does indeed have a corpus on Google Books that shows that a lot of vocabulary is indeed attestable, as was pointed out to me by User:Mx. Granger. The constructed languages that really need moving to the appendix are, in my opinion, Interlingua, Interlingue (Occidental), Novial, and potentially Ido. —Μετάknowledgediscuss/deeds 19:39, 12 July 2018 (UTC)
I agree that there is a literature in Volapük which simply does not exist in (e.g.) Lojban where all of the "literature" is purely used in experimental contexts amongst the dozen or so speakers exclusively to extend the language and test its underlying philosophy. There are several thousand lemmas in Volapük that can be attested from literature and that is not true for most constructed languages. —Justin (koavf)TCM 21:27, 12 July 2018 (UTC)

Before we do such a thing, I think we should solve the issues that have been raised at Wiktionary:Beer parlour/2018/June § On the placement of constructed languages, and on the attestation of appendix-only languages. Per utramque cavernam 21:31, 12 July 2018 (UTC)

  • Thanks for the ping. I agree with Metaknowledge and Koavf that the Volapük corpus seems to be large enough for us to cover a good number of words under CFI (unlike Lojban). It's true, though, that I've been RFVing a lot of Volapük words, mainly because one prolific user has been adding a huge number of unattestable Volapük words. It's tedious getting rid of all these entries through RFV, though, and it might be better to do it faster. Here's one suggestion: temporarily give me (or some other administrator) permission to delete on sight any entry for a Volapük noun or adjective that gets no hits on Google Books or Wikisource. Or something along those lines. That would at least put a dent in the mountain of unattestable Volapük entries, and there wouldn't be much risk of losing good entries, because a Volapük word that has no hits on Google Books or Wikisource would be unlikely to pass RFV if nominated. —Granger (talk · contribs) 01:13, 13 July 2018 (UTC)
    • Yes, the same editor of whom we speak has also done similar things to other languages, including Esperanto. Anytime someone in an RFD discussion says "it's not like someone is going to create entries for all possible permutations", I think of him and say, honestly: "if you open that door, we have people who will try to bring the entire universe through it". Chuck Entz (talk) 03:34, 13 July 2018 (UTC)
      • I support the idea of giving you permission to delete under those criteria, for that editor's entries only. It's not unlike mass-deleting a vandal's contribs. —Μετάknowledgediscuss/deeds 04:17, 13 July 2018 (UTC)
        • Does anyone else support this idea? —Granger (talk · contribs) 14:02, 16 July 2018 (UTC)
          • @Mx. Granger: I do, but I'm not an admin nor do I work with Volapük. Per utramque cavernam 19:09, 17 July 2018 (UTC)
            Thanks. I'm happy to undertake the task myself. I just want to make sure I have the support of the community before I start. —Granger (talk · contribs) 00:11, 18 July 2018 (UTC)
            I think that if nobody airs a problem with it in another day or so, you should take that as good enough to run with it. We could ping people who have already commented here if you want, but it's not like they don't have this page watchlisted. —Μετάknowledgediscuss/deeds 00:24, 18 July 2018 (UTC)
        • I think it's a reasonable idea, for the reasons Metaknowledge said. The user is known to have created a lot of unattested terms. - -sche (discuss) 01:26, 18 July 2018 (UTC)

See Also vs Related Terms

Could someone tell me if it is better to have a sub-header of "See Also" or "Related Terms"? I'm seeing both used in the Lojban entries, and I'd like to standardize if it has already been decided. Jawitkien (talk) 17:51, 12 July 2018 (UTC)

Related terms are for terms that are somehow etymologically related. "See also" is not really defined, you can put whatever you want in it. DTLHS (talk) 17:55, 12 July 2018 (UTC)
@DTLHS I see entries using "Derived terms" also.
My current usage will be:
if it is syntactically derived, I use "===Derived terms==="
if it is etymologically derived, I use "===Related terms==="
if it is related but not derived, I use "===See Also==="
Does this sound reasonable? Jawitkien (talk) 23:30, 12 July 2018 (UTC)
Sure. DTLHS (talk) 04:18, 13 July 2018 (UTC)
What I do is put all derived terms in the Derived terms section, terms that are etymologically related but not derived (like etymological sisters, cousins, aunts, or nieces) in the Related terms section, and random odds and ends in See also (though I probably haven't used See also as much as the other two). I'm not sure what syntactically derived means. If it means phrases that contain the term in the current entry, then I put those in Derived terms. I think it's misleading to put etymologically related terms in See also rather than Related terms! But sometimes people put terms that are really derived in Related terms, or do other odd things. — Eru·tuon 04:41, 13 July 2018 (UTC)
One example of See also is in the Spanish entry gallo, meaning "rooster", where "pollo" (chicken meat) is listed. The words themselves aren't related and they aren't synonyms, but there is a clear connection between the two words. Andrew Sheedy (talk) 21:48, 15 July 2018 (UTC)

Eau, blast!

Is there a way to nominate pages with only unattestable entries for deletion? I am thinking of eaublast. See also Wiktionary:Requests for verification/English#eaublast.  --Lambiam 09:38, 13 July 2018 (UTC)

{{speedy}}, if you're sure the page is too bad to merit discussion through normal channels. Equinox 12:34, 13 July 2018 (UTC)
I feel it was sufficiently discussed at WT:RFVE.  --Lambiam 14:20, 13 July 2018 (UTC)

Eye-dialect phrase alternative form entries

In 2017, there was a deletion discussion for thank ya so much. The result of this discussion was to delete the page, along with others like thank u so much, etc. Also in 2017, there was a deletion discussion for fer cryin' out loud. The result of this discussion was different; the page was not deleted, but was instead redirected to the entry for crying out loud.

These two discussions had different results. This is inconsistent; we need a consistent way to deal with these entries, a clear community consensus on it. The problem with entries like these is that many of them have overabundant possibilities; see the comments by User:Mihia in the discussions. By current rules, technically, as I summarize some of Chuck Entz's statements in Talk:thank ya so much, these entries are not sums of their own parts, since you're inserting an eye-dialect variable (or more than one) into a phrase that is already not a sum of its own parts. Thus, I've brought up this discussion to propose that we modify WT:Criteria for inclusion to make a brief statement about these eye-dialect phrase entries, based on the consensus reached by this discussion.

Consensus from both deletion discussions clearly is that trivial eye-dialect forms of phrases should not have dictionary entries. However, there's also a similar discussion for for cryin' out loud. The result was to keep it as a dictionary entry because this alternative form is so common. So the exception to the CFI policy I propose would presumably be if a phrase was particularly common in its eye dialect form (e.g. see ya < see you).

However, we can go one of two ways with this. 1.) Entries such as let's get dis party started should hard-redirect to let's get this party started. 2.) Entries such as let's get dis party started should be deleted completely.

This might be a tough one to figure out. So, before starting a policy vote, I'm gonna need help forming such a vote, as I'm not even sure which direction this should necessarily go. For instance, how should we treat entries that are particularly common as eye dialect forms (such as see ya)? Should another exception be that phrases with only two words in them (X Y) or three (X Y Z) should allow as many eye dialects as possible for entries, or redirects for the second proposal? Please help me out here, much appreciated. Thanks for any input.

I'll go ahead and make some subsections here for some pre-support votes for either side of this debate. (As usual, if there was already a similar discussion to this, I don't recall it and wasn't able to find it, so don't pounce on me if there was.) PseudoSkull (talk) 22:35, 13 July 2018 (UTC)

Trivial eye dialect forms should redirect

Put support votes here if this is your opinion. PseudoSkull (talk) 22:35, 13 July 2018 (UTC)

  • Support. If someone goes to the trouble of typing an attested variation of a CFI-worthy phrase into the search bar, it should take them someplace useful. I do not trust the search function to produce useful results. I would require citations first (and shoot on sight the uncited), and use the citations page to gather citations for all incoming variations. bd2412 T 01:46, 14 July 2018 (UTC)
  • Support hard redirects, with exceptions for the eye dialect form being equally or more common than standard spellings. If someone wants to go through the effort of creating these, that's fine with me. Andrew Sheedy (talk) 02:13, 14 July 2018 (UTC)
    [thank|fank] [you|ya|yer|ye] [very|verra] much would be 2*4*2 = 16 combinations for just one phrase (and I'm sure each word has many spellings I've not thought of). The issue here is not disk space... Equinox 02:37, 14 July 2018 (UTC)
    Are each of these variations, as phrases, attestable? How many of them will ever be created if we require attestation in advance? bd2412 T 14:38, 14 July 2018 (UTC)
  • I doubt they are all attestable but I disagree with creating any of them. Eye dialect/nonstandardness IMO should be dealt with at word level and not phrase level, since one word with n variants will otherwise (potentially, depending on attestation) multiply the number of derived phrases by n. I bet there are reasons other than "space on paper" why professional dictionaries wouldn't countenance this. Equinox 17:28, 14 July 2018 (UTC)
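To make the arithmetic in the example above concrete, here is a minimal sketch that enumerates the combinations. The variant lists are only the ones given in that comment (not an attested or complete inventory), and the names `slots` and `phrases` are made up for illustration.

```python
from itertools import product

# Spelling variants per word slot, copied from the example above.
# The last word, "much", has only one spelling in that example.
slots = [
    ["thank", "fank"],
    ["you", "ya", "yer", "ye"],
    ["very", "verra"],
    ["much"],
]

# Every phrase is one choice per slot, so the total is the product
# of the slot sizes: 2 * 4 * 2 * 1.
phrases = [" ".join(words) for words in product(*slots)]

print(len(phrases))
for phrase in phrases:
    print(phrase)
```

Adding one more variant to any slot multiplies the total rather than adding to it, which is the word-level versus phrase-level point made above.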

Trivial eye dialect forms should be deleted

Put support votes here if this is your opinion. PseudoSkull (talk) 22:35, 13 July 2018 (UTC)

  • Delete entries like "let's get dis party started" and "thank ya so much" (obvious bullshit, nobody would ever search for them), keep "for cryin' out loud" since that's how the expression is usually written and said. It's important to be as restrictive as possible here since someone will inevitably add thousands of these. DTLHS (talk) 01:53, 14 July 2018 (UTC)
Support or somebody's going to go cray-cray and slippery-slope a billion stupid (but citeable) phrases in here. Oh, I just glanced upward and DTLHS said exactly what I am saying. Equinox 01:56, 14 July 2018 (UTC)
Support based on the examples, but what exactly is a “trivial” eye dialect? Something like cah or masta could be described as trivial. — Ungoliant (falai) 02:21, 14 July 2018 (UTC)
I understand the Skull to be talking about phrases, not individual words. Equinox 02:27, 14 July 2018 (UTC)

(edit conflict)

Stuff like let's get dis party started would be trivial, and that's just assuming it happens to be attested at all. By trivial I meant the phrases, not the words themselves. for cryin' out loud is a particularly common one, and it's even more often said that way than "for crying out loud". Also see ya is a very common collocation of this nature, so it should be kept as is too.
But that's part of the problem with this proposal; we need a way to measure by consensus how useful any particular one of these phrases is, but obvious trivial ones should be deleted/redirected according to either of these two proposals. Perhaps we should make the policy with these similar to how we treat misspellings (as in, if "desaire" is not a particularly common misspelling of "desire" it is not kept, regardless of whether it has 3 citations as would normally be accepted). PseudoSkull (talk) 02:34, 14 July 2018 (UTC)

What questions concerning the strategy process do you have?


I'm Tar Lócesilion, a Polish Wikipedia admin and a member of Wikimedia Polska. Last year, I worked for Wikimedia Foundation as a liaison between communities and the Movement Strategy core team. My task was to ensure that all online communities were aware of the movement-wide strategy discussion. This year, my task is similar. Phase II of the strategy process was launched in April. Currently, future Working Groups members are being selected, and related pages on Meta-Wiki are being designed.

I’d like to learn what questions concerning the strategy process you would like to be answered on the FAQ page. Please answer here, on my talk page, or on a dedicated talk page on Meta-Wiki. Thanks!

If you have any questions or concerns, please, do ask!

Thanks, SGrabarczuk (WMF) (talk) 18:29, 14 July 2018 (UTC)

I'm live streaming my editing!

I'm live streaming my Wiktionary activity on YouTube right now, if anyone wants to watch. https://www.youtube.com/watch?v=r3-rNoIA7cU PseudoSkull (talk) 22:06, 14 July 2018 (UTC)

The stream is over but I might do it again sometime perhaps. However, you can still see the contents of the stream. I timed out at 56 minutes. PseudoSkull (talk) 23:04, 14 July 2018 (UTC)
Just remember that being an admin shows you things that shouldn't be visible to the public; be careful to limit the kinds of things you do while streaming, and make sure what you're working with is clear of any vandalism so you won't be giving it undue attention. Chuck Entz (talk) 20:48, 15 July 2018 (UTC)
Thanks for the video. It is interesting to see how other people contribute and especially on another Wiktionary (I contribute mainly on the French Wiktionary). Pamputt (talk) 06:12, 16 July 2018 (UTC)

Wiktionary:Foreign Word of the Day/Nominations

What's with the new layout? Are we in Europe?? Wyang (talk) 22:05, 16 July 2018 (UTC)

The layout is by User:Per utramque cavernam (see his talk page for recent discussion of it). It reflects the approximate division of FWOTDs, which is in turn based on our strengths at Wiktionary. Hopefully more non-European languages can be featured in the future, but that also means I'll need more such words to be nominated. —Μετάknowledgediscuss/deeds 22:31, 16 July 2018 (UTC)
This ‘strength’ at Wiktionary is something to be ashamed about. < 10% of the world’s population is in Europe, yet we still pride ourselves on this Eurocentrism. All words in all languages (in Europe)... with a smattering of words elsewhere? Wyang (talk) 23:20, 16 July 2018 (UTC)
Remember this is en.wikt and has a user base that somewhat reflects that. There is no automatic way to pull in all the content from other Wiktionaries. Equinox 23:23, 16 July 2018 (UTC)
The layout of the nominations page has nothing to do with the proportion of words from different regions that are featured. DTLHS (talk) 23:31, 16 July 2018 (UTC)
Then what's the point? Don't forget that the project's main page reads "Welcome to the English-language Wiktionary, a collaborative project to produce a free-content multilingual dictionary. It aims to describe all words of all languages using definitions and descriptions in English." NOT all words of all European languages. Some editors have been working very hard to increase the coverage of the world's major languages, such as Chinese ― the language with the most native speakers in the world, outnumbering the rest by a wide margin. Yet there are some who view Europe as the centre of the world and actively try to suppress the rest: National European language vs Minor or extinct European language vs Non-European Language. Are you kidding me??? Might as well split it into Wiktionary:European Word of the Day and Wiktionary:Non-European Word of the Day. Wyang (talk) 03:34, 17 July 2018 (UTC)
I have no idea what the fuck you're talking about. Again, how does the layout of the nominations page affect what words are chosen? Are you volunteering to run the FWOTD project? Are you actually complaining about the distribution of words that are actually featured, in which case why are you talking about the nomination page? DTLHS (talk) 03:47, 17 July 2018 (UTC)
I have no fucking interest in editing in this system either. Wyang (talk) 03:49, 17 July 2018 (UTC)
During the 60s in the US South, I'm sure there were white southerners who were asking "what's the big deal about separate lunch counters? The colored folks get served the same food as everyone else?" Chuck Entz (talk) 14:05, 17 July 2018 (UTC)
When I took linguistics at UCLA, we were required to take at least one year of a non-Indo-European language in order to graduate (I chose Mandarin). The fact is that European languages were so dominant that it was hard to find courses outside of major universities in other languages, so even linguistics students tended to have no exposure to other language families before they came to UCLA, and it was too easy to stick with what was already familiar (things have improved since then, but it's still true to some extent). In that case, it was necessary to address the bias explicitly in order to do something about it. Chuck Entz (talk) 14:05, 17 July 2018 (UTC)
The current layout is definitely wrong. Not only because it really is Eurocentric but because it doesn't reflect the huge contributions in some non-European languages, such as Chinese or Japanese, etc., the current or any future true distribution of lemmas and it shouldn't. I don't approve of Wyang's slamming the doors, though. It doesn't achieve anything.
The layout has to change back to what it was. --Anatoli T. (обсудить/вклад) 13:49, 17 July 2018 (UTC)
What about convenience to the FWOTD caretaker (i.e. Metaknowledge)? Unless he says otherwise, I think it might help him run the thing.
However, I agree with -sche below that it shouldn't send an undesirable message either, and if it does it's a problem (Maybe I should have named the headers "Type 1", "Type 2" and "Type 3" :p). Per utramque cavernam 15:45, 17 July 2018 (UTC)
Yes, the split into European and non-European, while probably well-intentioned, is sending an undesirable message/effect and should be undone... the current layout with all the continents seems like an improvement...? What do you think? And though we're constrained by what words people enter in enough detail to feature, a la Equinox's and other people's point, maybe we could try to explicitly counter the preponderance of Indo-European a la Chuck's point by featuring one word from each continent per week? (So people might realize they could copy the formatting of when adding more words from that language?) With two days leftover for constructed languages and repeats of continents? Or at least we could try to feature, say, at least four different continents per week? - -sche (discuss) 15:09, 17 July 2018 (UTC)
We really don't have the ability to do one word per continent per week. You ran WOTD, so you know how hard it is already to avoid burnout. If anyone volunteers to help with these issues, I'd be happy, but I haven't seen any volunteering yet in this thread. —Μετάknowledgediscuss/deeds 16:14, 17 July 2018 (UTC)

What a lame discussion. And what a twisted accusation! Obviously, the layout has only reflected what had already amassed for long, not to segregate, just to sort, bringing what the mildest system of order has to comprise. It could even help to get away from Eurocentrism, but that progressive dogma whereby disparities disappear when they aren’t exposed is apparently too attractive. No, @Atitarev, that page, as a medium, cannot just simply reflect contributions across the Wiktionary, people are still invited to post them thither, and if the managers don’t have a secret agenda, then apparent unevennesses are just. And they are also expected, a priori, for a Wiktionary of a European language attracts users of European ties and the economic and even individual probabilities (who gets educated in which languages, becomes computer-literate and has the spare leisure to come hither) play an innegligible role too. Fay Freak (talk) 01:04, 18 July 2018 (UTC)

Replace {{unreferenced}} with {{rfr}}

Hey, could we replace {{unreferenced}} with {{rfr}}? It would fit to the scheme we use for {{rfe}} and {{rfv}}. --Victar (talk) 01:28, 17 July 2018 (UTC)

I agree the templates should be merged. If you can make {{rfr}}'s parameter 1 default to en or und when not specified (so existing uses of {{unreferenced}} don't break), we could just redirect {{unreferenced}} to {{rfr}}. (We should keep the redirect, of course, because why not? Some people might be used to typing it.) - -sche (discuss) 01:33, 18 July 2018 (UTC)