Wiktionary:Beer parlour/2016/November


What is the difference between uncountable and singularia tantum nouns?

I'm not sure what the difference is, but we categorize them separately. Benwing2 (talk) 02:56, 1 November 2016 (UTC)

Wikipedia, citing the Shorter Oxford English Dictionary, suggests that any "unique singular object" (something like Eiffel Tower?) could be a singulare tantum, even though it's not uncountable. Equinox 02:59, 1 November 2016 (UTC)
en.WP link, other examples from there include dust, wealth. - Amgine/ t·e 04:54, 1 November 2016 (UTC)
The WP article, following the SOED, holds that mass nouns (actually mass-noun definitions) are types of singulare tantum (definitions). Unless we include proper nouns, there are not very many "singulare tantum" definitions in English that are not mass-noun definitions. There is a clear grammatical/behavioral distinction between mass nouns and countable nouns and virtually no such distinction between "singulare tantum" nouns and standard nouns, besides that captured in the name. As one of the most common indicators that a speaker using English is not a native speaker is misuse of determiners, including those that distinguish mass nouns from countable nouns, it seems to me that this is an important aspect of English to emphasize.
BTW, using singulare tantum in a category name or a usage label is yet another example of the misappropriation of English Wiktionary by a language "elite" mostly concerned with linguistic technicalities rather than one attempting to serve the general dictionary-using community. DCDuring TALK 10:54, 1 November 2016 (UTC)
In my mind, a singulare tantum noun is one that refers to a singular object but happens to have no plural (so to refer to more than one you have to resort to synonyms). I can't think of any in English off the top of my head, but I'm sure they exist. --WikiTiki89 12:57, 1 November 2016 (UTC)
Would you say that any of the entries in Category:English singularia tantum meet the criteria? DTLHS (talk) 16:56, 1 November 2016 (UTC)
A mass noun can't be used with a singular article. Not only can you not have 'dusts' (as a noun), you can't have 'a dust'. So something that has countable singular use but no plural could be a singulare tantum. Examples to follow, I hope. Renard Migrant (talk) 17:11, 1 November 2016 (UTC)
I'm not saying this test covers every singulare tantum. Bite to eat might qualify. Renard Migrant (talk) 17:16, 1 November 2016 (UTC)
"a dust of" and "the dust" collectively have hundreds of thousands of hits on BGC, even when one excludes "the dust bowl". - Amgine/ t·e 20:59, 1 November 2016 (UTC)
The definite article can be used with uncountable nouns, like "the water". That's not special. The indefinite article is the unusual one. —CodeCat 00:02, 2 November 2016 (UTC)
There are, as I mentioned, many thousands of durably archived instances of both the indefinite and definite articles on BGC, and on SGC. (Although, to be fair, the first refers to a sense we do not have in en.WT but which would be countable.) - Amgine/ t·e
Both citations 2 and 3 also seem to use dust with a sense we do not have: "a kind of dust (sense 1)". (Almost?) all mass noun definitions have such a corresponding countable definition. DCDuring TALK 12:36, 2 November 2016 (UTC)
Perhaps you mean like soil, pollen, or hair? the exemplars in our sense #1? - Amgine/ t·e 18:37, 2 November 2016 (UTC)

Monotonic Katharevousa?

@Dimboukas, Eipnvn, Omnipaedista, Svlioras, Xoristzatziki, Rossyxan and anyone else who knows Modern Greek: Is/was Katharevousa ever written in monotonic orthography? I notice most of the entries in CAT:Katharevousa are monotonic and wonder whether they're actually attestable as such. —Aɴɢʀ (talk) 14:35, 1 November 2016 (UTC)

I understand why all this confuses you! I'll try to explain it to you. Normally, no, Katharevousa isn't written in monotonic orthography. Even if we analyse this chronologically, Katharevousa was abolished in 1976 and the polytonic orthography in 1981, so theoretically Demotic (or Standard Modern Greek) can be written in polytonic orthography, but Katharevousa cannot be written in monotonic. Keep in mind that by 'abolished,' I mean that it stopped being taught at schools; by extension legal documents followed soon, albeit at a slower pace.
However, keep in mind that Katharevousa is still Modern Greek, albeit an archaic fabricated variety. Katharevousa and Demotic have blended into Standard Modern Greek. Finally, any archaic word (with archaic suffixes or an archaic declension, etc.) that can still be used in colloquial Greek is often said to be derived from Katharevousa. As a result I can write in Standard Modern Greek using words of Katharevousa in the monotonic orthography.
So these entries are attestable as such. Dimboukas (talk) 18:54, 1 November 2016 (UTC)
In durably archived sources? I have no interest in sending any of them to RFV, but in theory, if someone did, would they be likely to pass? —Aɴɢʀ (talk) 19:45, 1 November 2016 (UTC)

Katharevousa is a "variation" that can only be written in polytonic. Many words "created" or "modified" during the "civil existence" of Katharevousa (just like words coming from ancient Greek) are used in modern Greek and are written in monotonic. Therefore, if the lemma is in monotonic and the polytonic version is accented differently, it cannot be included in Katharevousa. E.g. the word θάλασσα existed in ancient Greek, existed in Katharevousa, and exists in modern Greek, but αηδών is modern Greek (ok, it is archaic, but Greek) since the Katharevousa version is not the same (namely ἀηδών, with ψιλή). I think the confusion comes from the fact that there is no need to list all words used during the existence of Katharevousa (and therefore words that "might" be listed in a separate dictionary specific to Katharevousa). --Xoristzatziki (talk) 05:31, 6 November 2016 (UTC)

Fourth LexiSession: wine

Wine Aroma in English.

Dear all,

The Tremendous Wiktionary User Group, an official user group of Wiktionarians, is happy to introduce the fourth step of our collective experiment named LexiSession.

So, what is a LexiSession? The idea is to incentivize contributors from different languages to contribute on a focus topic, to enhance all projects at the same time! Previous LexiSessions covered cat, roads and ways, and police. This new one is dedicated to Dionysus and we suggest a month - until the end of November - to fill the wine glasses! With the help of this multilingual Wine Aroma wheel, we hope to add several languages on several projects!

If you're up for this LexiSession, please indicate your contributions here! You can also have a look at what other Wiktionarians are doing, on the LexiSession Meta page. We will discuss the processes and results in Meta, so feel free to have a look and suggest topics for the next LexiSessions. And if you have some knowledge of other languages, you can translate this message for other beer parlours!

Thank you for your attention, and I hope you will be interested in this new way of contributing. Noé (talk) 23:58, 1 November 2016 (UTC)

Hey, feedback: we had a good session in French Wiktionary with a brand new thesaurus on wine, a dozen creations and a dozen pages improved. It's quite a good result. If you have made some improvements, please let us know. Noé (talk) 14:38, 1 December 2016 (UTC)
I edited the following articles relating to wine (in no particular order): (history) יין, (history) חמרא, (history) גפן, (history) ענב, (history) عنب, (history) عنبة, (history) סתם יינם, (history) מבושל, (history) mevushal. --WikiTiki89 15:53, 1 December 2016 (UTC)

Suggestion: treat taxonomic names as a dialect of Translingual

My reason is: because "Translingual" is Wiktionary jargon-ish. I'm thinking that maybe using "taxonomic name" (or a synonym) instead should be clearer, and with new categories because Category:English terms derived from Translingual contains a few derivations from chemical symbols.

EDIT: I forgot to say one thing. If we use a new code "mul-tax", we can make the resulting text be italic, no matter if we are adding mentions in etymologies ({{m}}), items in lists ({{l}}) or use other templates. Personally, I don't like it when an entry has a "Synonyms" or "Derived terms" section with a list of Translingual taxonomic entries using {{m}} (I don't remember any examples), because these are not mentions; they're just italicized list items.

For example, in lycaenid:

Code
  Current: mul
  Proposed: mul-tax
Etymology (wikitext)
  Current: From {{der|en|mul|Lycaenidae}}.
  Proposed: From {{der|en|mul-tax|Lycaenidae}}.
Etymology (result)
  Current: From Translingual Lycaenidae. ("Translingual" is currently a redlink in Wikipedia)
  Proposed: From taxonomic name Lycaenidae.
Category
  Current: Category:English terms derived from Translingual
  Proposed: Category:English terms derived from taxonomic names

--Daniel Carrero (talk) 07:11, 2 November 2016 (UTC)

Then all we Anglo-Taxons would need is to raise an army. DCDuring TALK 10:58, 2 November 2016 (UTC)
LOL what? Also your input is valuable because you work on taxon names a lot -- do you think what I suggested is a bad idea, DCDuring? --Daniel Carrero (talk) 11:03, 2 November 2016 (UTC)
"Translingual" is probably the most heterogeneous of the 'languages' we have. Taxonomic names is a relatively homogeneous sublanguage with more than ten thousand entries. So it is an obvious choice to begin the process of lexiconic cleansing of Translingual by moving taxonomic entries to their own enclave.
Italicization is not universal within the class of taxonomic names. It only applies to genera and sub-generic taxa. Subgenera, subspecies, varieties, (sub)sections, and possibly other types of names may have words, usually abbreviations (subg., subsp., var., sect., subsect., cultivar), that are not italicized.
A further problem is that italicization is supposed to occur only in contrast to surrounding text. If surrounding text is italic, then such names should appear without italics. It seems to me that some of our templates have not allowed this possibility, I imagine because of the use of CSS to impose formatting.
Whether the kludge of using the language-dialect structure is appropriate is not for me to say, though I dread discovering what unforeseen consequences will be in the short run and, even more, in the long run. Perhaps once the Anglo-Taxon army has been raised, Anglo-Taxon can become a language. DCDuring TALK 12:29, 2 November 2016 (UTC)
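The italicization rules described in the comment above can be sketched in code. This Python snippet is only an illustration, not how any Wiktionary template or module works: the function name `style_taxon`, the (word, italic) output format, and the example name Brassica oleracea var. capitata are assumptions introduced here. It assumes the caller already knows the name is of a rank that takes italics, and it further assumes that connecting terms simply follow the style of the surrounding text.

```python
# Connecting terms listed in the discussion; they are not italicized.
RANK_ABBREVIATIONS = {"subg.", "subsp.", "var.", "sect.", "subsect.", "cultivar"}


def style_taxon(name, surrounding_italic=False):
    """Return (word, italic) pairs for a genus-or-lower taxonomic name.

    Hypothetical helper: name parts contrast with the surrounding text,
    while connecting terms keep the surrounding style (an assumption).
    """
    segments = []
    for word in name.split():
        if word in RANK_ABBREVIATIONS:
            italic = surrounding_italic      # same style as the running text
        else:
            italic = not surrounding_italic  # contrast with the running text
        segments.append((word, italic))
    return segments


if __name__ == "__main__":
    for word, italic in style_taxon("Brassica oleracea var. capitata"):
        print(word, "italic" if italic else "upright")
```

Returning absolute styles rather than wrapping words in markup keeps the contrast rule explicit: calling the function with `surrounding_italic=True` yields upright name parts, which is the case the comment says CSS-imposed formatting may not allow.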
I don't think "a dialect of Translingual" makes any real-world sense and so I don't think we should treat anything as being it. Equinox 19:23, 2 November 2016 (UTC)
If anything shouldn't they be a dialect of Latin? DTLHS (talk) 21:01, 2 November 2016 (UTC)
I wouldn't object to that, but others have in previous discussions of the matter. We might consider medical and legal Latins as well, both of which are also principally used in running text of contemporary languages, mostly European, but also Chinese, e.g., herbals. DCDuring TALK 22:23, 2 November 2016 (UTC)
'From taxonomic name' isn't good English. Perhaps 'from taxonomic Latin', that's a bit more like it. Renard Migrant (talk) 23:00, 2 November 2016 (UTC)
I prefer to keep taxon names away from the Latin section, for two reasons:
  • I assume there are people on Earth who don't speak Latin but have an interest in taxon names. So, I believe that burying taxon names together with normal Latin terms would be doing a disservice to these readers.
  • Taxon names are really used (or usable) in basically all languages, like SI units and chemical symbols. Citations:Homo sapiens has citations in English, Portuguese, Spanish, etc. Arguably, we could almost say: "Homo sapiens is a Portuguese word, and a Spanish word too, and an English word too, the citations prove it!" -- I see the Translingual section as a shortcut for specific types of terms and symbols that could have a ton of separate language sections for the exact same thing.
Then again, as I said above, the problem I'm trying to solve is mainly that "From Translingual Homo sapiens" does not make a lot of sense in etymologies. Alternative suggestion: @Renard Migrant mentioned "taxonomic Latin". Do you think we could move all taxon names to a new separate language called "Taxonomic Latin" (code: qtx)? --Daniel Carrero (talk) 02:40, 3 November 2016 (UTC)

Category:Currencies has very few entries in it. Should it be merged with Category:Currency? DonnanZ (talk) 14:23, 2 November 2016 (UTC)

The former is for terms related to currency, the latter is for names of currencies. —CodeCat 14:28, 2 November 2016 (UTC)
I didn't know the latter category existed until today, and the fine distinction you mentioned isn't made clear in either category, and most users have been using Category:Currency for names of currencies, including myself. DonnanZ (talk) 14:34, 2 November 2016 (UTC)
Many of our category pages do not have an explanation of what they are supposed to contain in concept, nor of how they are populated. Perhaps we should facilitate the identification of instances where this leads to confusion by applying a specialization of {{rfc}} or {{attention|lang=nolang|topic=category}}. Or perhaps the equivalent of "Inquire within" directing users to members of the appropriate list of category mavens. DCDuring TALK 14:55, 2 November 2016 (UTC)
If we are going to introduce a "topic" parameter in templates to choose the correct categories, I'd prefer if the categories themselves had "topic" (or a synonym) in the name somehow. --Daniel Carrero (talk) 14:59, 2 November 2016 (UTC)
As long as something positive is done, I don't care how you do it. In the meantime I have put the Norwegian house in order. DonnanZ (talk) 17:15, 2 November 2016 (UTC)
Actually, Category:Currencies wasn't created until 7 March 2016 by (guess who?) CodeCat. No wonder other users, as well as myself, didn't know of its existence. DonnanZ (talk) 17:23, 2 November 2016 (UTC)
Consider this: on each category page "Inquire within." (perhaps in a box) or equivalent directing users to one or all of the creator of the category page, the person(s) who hard-coded category membership in the category, or the creator of any template that populates the category. That way user questions about the category could be directed to a presumably knowledgeable party. Even if there were intelligible explanations, it would still be possible for users to misunderstand in unanticipated ways, so such "Inquire withins" could be permanent features of all category pages. It's a shame that it is so hard to find the right contributors' names to insert in the "Inquire within".
I think I'll do this for the taxonomic name categories to see whether they confuse anyone. DCDuring TALK 19:16, 2 November 2016 (UTC)
I think we could easily include a link to the data module where the category information is defined. DTLHS (talk) 22:02, 2 November 2016 (UTC)
I'm pretty sure we already do. —CodeCat 23:39, 2 November 2016 (UTC)
Huh I somehow never noticed that. DTLHS (talk) 23:51, 2 November 2016 (UTC)

I created Wiktionary:Votes/pl-2016-11/One vote per week, based on these discussions:

Feel free to discuss, suggest something different, give feedback etc. --Daniel Carrero (talk) 07:57, 4 November 2016 (UTC)

<facepalm> - Amgine/ t·e 23:03, 4 November 2016 (UTC)
@Daniel Carrero: This is a really, really bad way to handle this. People are annoyed because there are too many votes on silly things... and then you create a vote on a silly thing to address it. —Μετάknowledgediscuss/deeds 23:36, 4 November 2016 (UTC)
Are you annoyed, and what silly votes? More to the point, what do you suggest should be done instead? The only serious way to edit a policy (in this case, Wiktionary:Voting policy) is with a vote; the other options are editing it without a vote and not editing it. --Daniel Carrero (talk) 23:45, 4 November 2016 (UTC)
I also don't know why you seemed to believe I created that vote to address what you described. We may be having a communication problem. The actual reason I created this vote is on the rationale of the vote page. I may be wrong, but I think @DCDuring is the only person on the planet Earth that is annoyed by my votes, and I've tried to talk with him repeatedly to see what he thinks should be done. --Daniel Carrero (talk) 23:51, 4 November 2016 (UTC)
I just intended to remove that 24-hour rule without creating any votes (diff), but @Equinox said: "Can we revert this please, or (sigh) have a vote on it?" (diff), so I'm just fulfilling his request (albeit not an enthusiastic one, it has a "sigh" on it).
If proper procedure is creating votes only about what had been already discussed, note that this vote has 2 discussions in the OP. --Daniel Carrero (talk) 00:07, 5 November 2016 (UTC)
In neither of those discussions do I see any indication that anyone would support a one-vote-per-week limit. --WikiTiki89 15:42, 7 November 2016 (UTC)
@Wikitiki89: Sorry, you are right. I was being stupid. The word "week" was mentioned in the first discussion a little, but it was presumptuous of me to create a whole vote without other people wanting to vote on it. I'll withdraw that vote now. --Daniel Carrero (talk) 01:07, 10 November 2016 (UTC)
FYI: In the talk page of the vote, there is some conversation as to whether the vote should proceed or not. --Daniel Carrero (talk) 21:01, 14 November 2016 (UTC)

Mandarin pinyin pages

Today I visited some Mandarin pinyin pages and found that both tone systems exist: diacritic marks and numerals. The problem is that editors must edit both pages for a new word, which is redundant extra work. I suggest it would be better to make the diacritic system primary and the numeral system an alternative form, as I did for yi1, yi2, yi3, yi4. What do you think? --Octahedron80 (talk) 08:51, 4 November 2016 (UTC)

I agree it'd be better to have one be the alternative form. Both have pros and cons. Diacritics are more commonly seen, numbers can easily be entered. From a user perspective I'd have to prefer the latter, but I personally prefer the first. Korn [kʰũːɘ̃n] (talk) 10:33, 4 November 2016 (UTC)
Pinyin with tone numbers should be made soft or hard redirects to pinyin with tone marks. There was some opposition to this in the past. In any case, we should reduce duplications. --Anatoli T. (обсудить/вклад) 10:50, 4 November 2016 (UTC)
Hard redirects may not be done because Cantonese jyutping shares some pages with pinyin. --Octahedron80 (talk) 00:16, 5 November 2016 (UTC)
Yes, that was one of the reasons for not going down that path. Soft redirects will require a new template, which would not be hard to make, if there is an agreement. Some (mostly unregistered IP users) object to converting numbered pinyin to redirects, but nobody looks after numbered pinyin any more. --Anatoli T. (обсудить/вклад) 01:38, 5 November 2016 (UTC)
Well, we already redirect simplified Chinese anyway; redirecting numbered pinyin (a non-standard form of pinyin) can't be any worse. —suzukaze (tc) 20:12, 5 November 2016 (UTC)

So, I will convert all numbered pinyin into alternate spellings. I will sort out the category later; for the moment I am just making sure they are all converted. --Octahedron80 (talk) 07:38, 15 November 2016 (UTC)

I have already cleaned up all numbered pinyin pages. Additionally, I added Category:Mandarin pinyin with tone numbers back to cmn-pinyin and then dropped the usage of cmn-alt-pinyin. --Octahedron80 (talk) 04:34, 26 November 2016 (UTC)
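To illustrate how a numbered form such as yi1 maps onto its tone-marked counterpart, here is a minimal Python sketch of the usual placement convention (mark a or e if present, else the o of ou, else the last vowel). It is not the code behind cmn-pinyin or any Wiktionary module; the function name `numbered_to_marked` and the choice to treat "v" as a typed stand-in for ü are assumptions made for the example, and it handles only single lowercase syllables.

```python
import re
import unicodedata

# Combining diacritics for tones 1-4 (macron, acute, caron, grave).
TONE_DIACRITICS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
U_UMLAUT = "\u00fc"  # the letter u with diaeresis, often typed as "v"
VOWELS = "aeiou" + U_UMLAUT


def numbered_to_marked(syllable):
    """Convert one numbered pinyin syllable (e.g. 'yi1') to tone-marked form.

    Hypothetical helper: input that does not look like a lowercase
    syllable followed by a digit 0-5 is returned unchanged.
    """
    match = re.fullmatch(r"([a-z\u00fc]+)([0-5])", syllable)
    if match is None:
        return syllable
    letters = match.group(1).replace("v", U_UMLAUT)
    tone = int(match.group(2))
    if tone not in TONE_DIACRITICS:
        return letters  # tones 0 and 5 are treated as neutral: no mark
    if "a" in letters:
        target = letters.index("a")
    elif "e" in letters:
        target = letters.index("e")
    elif "ou" in letters:
        target = letters.index("o")
    else:
        positions = [i for i, ch in enumerate(letters) if ch in VOWELS]
        if not positions:
            return letters  # no vowel to carry the mark
        target = positions[-1]
    # Compose the vowel and the diacritic into a single precomposed character.
    marked = unicodedata.normalize("NFC", letters[target] + TONE_DIACRITICS[tone])
    return letters[:target] + marked + letters[target + 1:]


if __name__ == "__main__":
    for form in ["yi1", "yi2", "yi3", "yi4"]:
        print(form, "->", numbered_to_marked(form))
```

A mapping like this is one-directional in practice: because Cantonese jyutping shares some numbered page titles with pinyin, as noted above, a numbered title alone does not say which romanization it belongs to.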

Cleanup category?

Could entries with an {{rfc}} tag (request for cleanup) go into a category, for easier finding? I sometimes spot entries that were RFCed years ago where nothing has happened, e.g. T-Play. Equinox 21:59, 4 November 2016 (UTC)

They do go in a category (Category:Requests for cleanup) DTLHS (talk) 22:01, 4 November 2016 (UTC)
Oh, seems odd that it's not visible in entries. Equinox 22:04, 4 November 2016 (UTC)
It's a hidden category. DTLHS (talk) 22:05, 4 November 2016 (UTC)
Hidden categories can only be found by clicking on "edit". Should this be changed? DonnanZ (talk) 12:39, 5 November 2016 (UTC)
Just go to WT:PREFS and you can check a box that allows you to see them. —Μετάknowledgediscuss/deeds 18:01, 5 November 2016 (UTC)

Lifetime blocks

Per community consensus, please indef block User:MglovesfunBot, User:Renard Migrant and User:Mglovesfun. Obviously using the full name for a template instead of a shortcut is a very serious offense and anything other than a lifetime is sending the wrong message. Renard Migrant (talk) 12:41, 5 November 2016 (UTC)

Please stay. I don't think anyone wants to block you. --Daniel Carrero (talk) 13:33, 5 November 2016 (UTC)
This appears to be an overreaction. Can the request be refused? DonnanZ (talk) 13:51, 5 November 2016 (UTC)
@Renard Migrant: Using the full name of a template is not in any way a serious offense--it is trivial. Why are you requesting a lifetime block of yourself? —Justin (koavf)TCM 14:08, 5 November 2016 (UTC)
If the authorities say that it's a serious offense, it must be. After all, we elected them because they know what they are doing. DCDuring TALK 14:54, 5 November 2016 (UTC)

Victar getting away with vandalism

NOTE: Moving this here as Victar tried to hide it, despite replying in the first place.

@Angr, Anglom, CodeCat, JohnC5 He's started butchering entries by adding butchered Latin forms like *Urboɣen and quoting some unsourced paper written by some random person. It seems someone else pointed out he was wrong (I think? The whole discussion was bizarre) but he just responded with "-_-". UtherPendrogn (talk) 10:00, 5 November 2016 (UTC)

Your ignorance is complete if you can say Anna Morpurgo Davies is "some random person". --Victar (talk) 10:07, 5 November 2016 (UTC)
You just called at least seven and a half billion humans ignorant. Also, thanks for not denying what you're doing is essentially vandalism. UtherPendrogn (talk) 10:09, 5 November 2016 (UTC)

Verifying reconstructions

Last year, Wiktionary:Votes/2013-10/Reconstructions need references passed with consensus that reconstructions can be challenged and sent to a verification process. Today, User:UtherPendrogn tried to challenge Reconstruction:Proto-Brythonic/Urboɣen by adding an {{RFV}} tag and starting a discussion at WT:RFV,[1][2] but User:Victar and User:Chuck Entz said that reconstructions can't be RFVed.[3][4] Do other editors agree that reconstructions shouldn't be taken to RFV? If so, where should users start a discussion if they want to challenge a reconstruction? —Mr. Granger (talkcontribs) 20:03, 5 November 2016 (UTC)

The referenced vote suggests to me a process like RFV should be used for challenging reconstructed terms. While WT:RFV has traditionally been used to handle WT:ATTEST, it could be extended in scope to handle challenging of reconstructions. Those who oppose this use of RFV should clarify which other process page should be used. A downside of using WT:RFV for reconstructions is that it will then be driven by two quite distinct manners of verification. --Dan Polansky (talk) 12:40, 6 November 2016 (UTC)
Reconstructions are really part of the etymologies, and should be verified in the Etymology scriptorium. We should develop a separate template (rfv-r?) so we aren't asking "Can this etymology be sourced?", but otherwise the procedures for verifying etymologies would work fine for verifying reconstructions. Chuck Entz (talk) 15:35, 6 November 2016 (UTC)
Yeah, we generally do them a bit less formally at the WT:ES. I don't think we need special templates for them. --WikiTiki89 16:02, 7 November 2016 (UTC)
Recent events have shown that debates about the validity of reconstructions sometimes end up being quite heated... would it not be prudent to have some clear guidelines in place for such cases? — Kleio (t · c) 16:09, 7 November 2016 (UTC)
Reconstructions are term-like items with their own pages such as Reconstruction:Proto-Brythonic/Urboɣen, and as such are not just part of etymologies. It is not obvious that etymology scriptorium is more appropriate than RFV. --Dan Polansky (talk) 12:38, 12 November 2016 (UTC)
Reconstructions are always going to require a lot of dialog, citing of sources, and most importantly, bringing together the people that work in that area, each reconstructed language having their own guidelines and practices. Also, unlike other entries at WT:RFV, many times they simply need to be modified, and are not candidates for deletion. As such, I think taking the discussion to the larger community should only be the very last course of action, behind simply using the entry's talk page, followed by the reconstructive language's about talk page. --Victar (talk) 04:07, 9 November 2016 (UTC)

Toneless final syllable variants and erhua

As a Chinese learner, I am never sure whether to use the original pronunciation or the final syllable variant pronunciation of words that I come across here on Wiktionary. I find the expanded table in certain entries which lists the "Beijing" pronunciation and then the "Taiwan" variant very useful, but the "toneless final syllable variant" system is not as black and white. Of course, pronunciation isn't black and white, and it may be unconstructive to simply always pronounce the original or always pronounce the variant pronunciation, as pronunciation is based on context and region. However, I wish there were some sort of system that let the reader know whether the original or the variant is considered "standard" in Putonghua, or if the variant is simply a regional variant that isn't standard. The same goes for erhua. Beijingers use a lot more erhua than what is considered standard, though Standard Chinese does consider some erhua to be standard. Again, context is important, since it may be more formal even in Standard Chinese to say 这里, for example in a speech, than to say 这儿, but we should be able to point out these differences and let the reader know. If you go to the entry for 一点儿, you'll see that it is nothing but "erhua form of..." That doesn't seem right to the reader. I understand that this probably seems like a petty issue and one that probably doesn't have an easy solution, but I just thought I'd bring it up, because it definitely affects learners like me. WikiWinters ☯ 韦安智 00:47, 6 November 2016 (UTC)

@Wyang? WikiWinters ☯ 韦安智 01:50, 9 November 2016 (UTC)

Like you said, this is not black and white. Beijingers probably pronounce all of those words as toneless, the Taiwanese would retain the tone in about 80% of them, and a Mainlander not from Beijing would pronounce anywhere between 0 and 100% of them as toneless. A non-native learner is probably more likely to be understood if (s)he doesn't reduce the tones at all initially, and only does so after being able to converse with natives fluently. The problem is when Beijingers pronounce a word as toneless, and non-toneless sounds unnatural, and this is recognised as Standard Chinese, but the majority of speakers outside of Beijing don't fully pronounce it as toneless. The correctness of the toneless pronunciation would need to be marked in {{zh-pron}}. BTW, a Chinese-specific discussion page (Wiktionary:About Chinese/discussions) may be better for centralised language-specific discussions. Wyang (talk) 02:45, 9 November 2016 (UTC)

Suggestion: Mention on WT:EL the fact that external links ≠ references

The current WT:EL#External links was voted and approved in Wiktionary:Votes/2011-07/External links. I'd like to edit it.

Current text:

Any line of text whose only purpose is linking to an external website (for example, a link to an encyclopedia, such as Wikipedia, or 1911 Encyclopædia Britannica), should be placed within an "External links" section, and never within a "See also" section.

Proposed text:

Use the "External links" section to link to external websites, such as other dictionaries and/or encyclopedias (including Wikipedia, or 1911 Encyclopædia Britannica). Usually, the link is to look up the same word in the alternative source. Don't use the "See also" or the "References" sections for the sole purpose of looking up the same word in another dictionary or encyclopedia.

Rationale:

  • I have the impression that quite a lot of entries use "References" merely to point to another dictionary, whereas I believe that "References" serves to back up claims in the etymology and usage notes. I'd also like to avoid this notion: "another dictionary has a certain word, so it serves as a reference proving that the word exists, so we should have it too". This would be wrong because we use citations, not references, to attest new words and senses.

EDIT: This proposal is based on Wiktionary:Votes/pl-2015-12/References, in which the references section itself in EL was edited, but some people pointed out that the distinction between "External links" and "References" is not clear.

--Daniel Carrero (talk) 11:15, 6 November 2016 (UTC)

I support this. I also think that it would make existing references sections neater because we'd no longer mix two types. —CodeCat 13:11, 6 November 2016 (UTC)[reply]
There seem to be cases where specific definitions are footnoted, which leads to the need for a References section in which dictionaries or similar sources appear. Even after we get rid of copyright violations for which such footnotes might indicate, there will be legitimate uses of footnotes in definitions, IMO. I personally use footnotes to indicate a source of taxonomic hypernym and hyponym information when there are multiple lists of one and/or the other in the entry, usually corresponding to different subsenses. I'd hate to lose the ability to use footnotes in definitions and the lists. DCDuring TALK 14:06, 6 November 2016 (UTC)[reply]
@DCDuring: I searched for Translingual entries using References (I typed Translingual reference on Special:Search) but did not find any taxon entry using references like you described. Could you please provide any examples? --Daniel Carrero (talk) 12:16, 7 November 2016 (UTC)[reply]
[[Jungermanniopsida]], for one, but many higher taxa have current and older definitions, some few of which we have. As I am systematically entering specific definitions for taxa in Ruggiero MA, Gordon DP, Orrell TM, Bailly N, Bourgoin T, Brusca RC, et al. (2015) A Higher Level Classification of All Living Organisms. PLoS ONE 10(4): e0119248. PMID 25923521, doi: 10.1371/journal.pone.0119248, using {{R:Ruggiero}}, there will be more of these. DCDuring TALK 13:05, 7 November 2016 (UTC)[reply]
@DCDuring: I understand. Are we accepting (or should we accept) the word of authorities to create new entries and senses for taxon names, instead of 3 citations in running text? For example, if the sense "A taxonomic class within the phylum Marchantiophyta." of Jungermanniopsida were sent to RFV and there were no citations in running text, just the page referenced "A Higher Level Classification of All Living Organisms.", should that sense be deleted or kept? --Daniel Carrero (talk) 13:17, 7 November 2016 (UTC)[reply]
Of course not. The lists used for sources of definitions contain what are essentially mentions. Any taxonomic name can be challenged. But almost any correctly spelled taxonomic name mentioned in a list or reference is highly likely to have at least three uses in online sources.
This search contains all the Translingual entries that have such footnotes. Marchantiophyta has multiple uses of the same reference. DCDuring TALK 13:38, 7 November 2016 (UTC)[reply]
@DCDuring: All right. Apart from Translingual taxon names, do you think there are any other groups of entries that benefit from having references for definitions and lists? I have the impression that many entries should have the "References" section replaced by "External links". For example, water#References (within the English section) simply contains 2 links to other dictionaries, which are not footnoted and seem to be mere suggestions of other sources to look at instead of serving as "references" to build upon the current senses. I believe it could be changed to "External links". Am I mistaken? --Daniel Carrero (talk) 13:56, 7 November 2016 (UTC)[reply]
I would be very surprised if there weren't other groups. That really seems the wrong way around to address the matter, if, indeed, it needs addressing at all. DCDuring TALK 14:28, 7 November 2016 (UTC)[reply]
I was not assuming that other groups don't exist; I was just asking if you know any other groups that do exist. Here's an actual suggestion, originally presented by @Dan Polansky in Wiktionary talk:Votes/pl-2015-12/References#References are especially encouraged for more obscure words:
  • External links = answers the question "Where else do you recommend me to look?"
  • References = answers the question "How do you know the information that you are presenting?"
--Daniel Carrero (talk) 14:36, 7 November 2016 (UTC)[reply]
We very commonly put {{wikipedia}} outside of any such section. Is that bad? Equinox 14:50, 6 November 2016 (UTC)[reply]
We could deprecate it in favour of {{projectlink}}. That might be cleaner. —CodeCat 14:53, 6 November 2016 (UTC)[reply]
@Equinox: If I'm not mistaken, I think you already know that it's not bad at all, and you asked it to check if the proposed policy text is consistent with what we actually expect in entries. We should probably add it in the proposed text: "this does not apply to boxes" or something. Incidentally, the current text does not mention floating boxes either, even though it was added by the vote Wiktionary:Votes/2011-07/External links which did specifically mention them, in the "Note C".
@CodeCat: I don't mind having both {{wikipedia}} boxes and {{projectlink}} randomly placed in millions of entries, except I tend to use the latter when I add a floating image. It may be just me, but I think it's ugly when there's an image directly above, or below, a {{wikipedia}} box. Other than that, the proposal of nuking all {{wikipedia}} boxes and using only {{projectlink}} appeals to the part of me which seeks consistency in all entries, so I feel there's at least a slight possibility that I could support it. But in the defense of {{wikipedia}}, using boxes pointing to other Wikimedia wikis seems to be normal in the Wikipedia itself, Commons and probably other Wikimedia projects, so we may want to stay on the bandwagon and allow it to be used in case other people expect us to have it. --Daniel Carrero (talk) 15:06, 6 November 2016 (UTC)[reply]
Can we please get rid of the floating boxes? Deprecating would be a good start. They are ugly and often placed inconsistently. We don't need to copy all practices from other Wikimedia projects. – Jberkel (talk) 21:56, 6 November 2016 (UTC)[reply]
The benefit of floating boxes is that you get more content per horizontal portion of page. I am concerned by sites being increasingly designed for mobile phones and not taking advantage of the width of desktop screens. On many modern news sites you actually have to hit Page Down just to get past the enormous headline and reach the story text! Incredible. Equinox 21:40, 7 November 2016 (UTC)[reply]
@Daniel Carrero: I meant to write earlier. I would point out that there seems to be a consensus in Latin and Greek for using the References section for external links. This leads to the unsightly mixing of numbered footnotes with bulleted hyperlinks, but please see here: [5]. As User:Chuck Entz writes: For what it's worth, I've always used the References header for LSJ, because that seems to be what everyone else was using. Since we don't rely on authoritative references for sourcing (except for etymologies), the References section tends to be more like the Further Reading header on Wikipedia rather than strictly bibliographic. I accordingly standardised the Latin and Greek sections by this precept. If you wish to change it, please make sure there is agreement amongst the Latin and Greek editors to change their convention. If necessary I am willing to make any changes those editors should feel are necessary. Thanks. Isomorphyc (talk) 14:53, 18 November 2016 (UTC)[reply]
@Isomorphyc: All right. I'll start by pinging some people who might be interested: @Erutuon, Saltmarsh, Atitarev, Metaknowledge, I'm so meta even this acronym, Angr, Embryomystic, Fsojic.
Yes, I'd like to propose this in all languages, including Latin and Greek: using the External links for simple suggested links to other dictionaries and encyclopedias, and References for sources that answer the question "How do you know the information presented in the entry?". Feel free to discuss this idea. I feel that one benefit of this proposal would be making the References "stronger": an External links section containing links to the same word in other dictionaries may not offer much added value to readers, provided that the Wiktionary entries themselves are good enough, but a References section with evidence for claims in the entry may be more likely to offer added value. The proposed categories are very different things that may appeal to readers with different interests. --Daniel Carrero (talk) 22:14, 20 November 2016 (UTC)[reply]
That's fine with me, although the fact that we have a lot of templates for online dictionaries beginning with "R:" and categorized as "reference templates" suggests that we do consider online dictionaries to be references. —Aɴɢʀ (talk) 11:20, 21 November 2016 (UTC)[reply]
Good point. But even if people have considered these "R:" templates to be references up until now, I feel it's a good idea to split and clarify the purposes of "References" and "External links" sections as proposed above. If there's no prior agreement, these two sections are basically interchangeable: the (online) references are external links, and the external links are (online) references. (Before we had any agreement, even the "See also" section could have links to external links, or references -- until Wiktionary:Votes/2011-07/External links, that is.)
If this proposal passes, I assume that all our online dictionary templates will be able to be freely used either in the EL section or the References section, depending on the current purpose, correct? For example, {{R:Webster 1913}} seems like a template likely to be used in "external links" sections only (or mostly). But if we find reason to place it in the references section too, I would not like to split it into 2 templates, called {{R:Webster 1913}} (just for references) and maybe {{E:Webster 1913}} (just for external links), because that template returns the same contents (a link to Webster), no matter the section where it's used. We may want to keep a single list of templates to be used in either section, unless someone objects. I only fear that if we don't want to use the "R:" prefix anymore, then having to rename a lot of templates would make this proposal more cumbersome to implement and thus probably harder to pass. --Daniel Carrero (talk) 21:53, 21 November 2016 (UTC)[reply]

FWIW, I created Goldbach's conjecture, in which the EL section is just a link to Wikipedia, and the References section is a link to another website serving as a source proving what is said in the definition. This is what I have in mind. --Daniel Carrero (talk) 21:55, 21 November 2016 (UTC)[reply]

Now I'm at a loss for where to put {{R:ga:Dinneen}}. It's a reference to a paper dictionary, so there's no link, so it can't really go under External links. But it's also just a dictionary entry that isn't proving anything in the etymology or definition other than the existence of the word and, in some cases, the word's pre-reform spelling, so strictly it shouldn't go under References either. —Aɴɢʀ (talk) 23:49, 21 November 2016 (UTC)[reply]

────────────────────────────────────────────────────────────────────────────────────────────────────
@Angr: Ok. I've been thinking about what you said yesterday. This is a real problem: where to place suggestions of printed dictionaries? Even if we found a good online copy of Foclóir Gaeḋilge agus Béarla (it's in the public domain, right?), the problem remains as long as we have 1 printed dictionary to mention in an entry. Presumably, we want a single section (like "External links") to add suggestions of dictionaries and encyclopedias too.

Could we rename "External links" to something else? "External sources", maybe?

Other possibilities:

  • "In other dictionaries" has a nice ring to it IMO, but we want to link encyclopedias too.
  • "In other dictionaries and encyclopedias" is too long, so I don't like it very much.
  • "In other reference works" is too close to "References", so it appears to make the proposed distinction less meaningful.

Example:

==English==

===Noun===
{{en-noun}}

# Sense 1
# Sense 2

====External sources====
* {{R:whatever}}

--Daniel Carrero (talk) 18:34, 22 November 2016 (UTC)[reply]

I'm not sure what this policy means exactly for Ancient Greek entries. In some entries, there is a footnote in the etymology section, but no footnotes in the POS section, even though the definitions are usually based on the LSJ's entry, or sometimes on Cunliffe's entry. (I'm speaking mainly for the entries I've created.) Would the proposed policy prohibit us from putting the LSJ or Cunliffe in the References section when that's the source of the definitions? I've never seen footnotes used in the POS section, yet in many cases the source is clearly one of the dictionaries in the References section. — Eru·tuon 18:35, 22 November 2016 (UTC)[reply]
@Daniel Carrero: I like "External sources". The only online version of Dinneen's dictionary I know of is [6], which is difficult to use, full of scannos, and incomplete. There are scans of the much shorter first edition at [7]. The first edition is public domain in both the EU and the US; the second edition is public domain in the EU for sure but might not be in the US, depending on what Ireland's copyright laws were in 1996. @Erutuon: I'd say References sections can be used for footnotes, but that's not all they can be used for. They can be used to link to pages where you can find information corroborating any claims we make that might not be immediately intuitive. If LSJ and Cunliffe have been used as the source of definitions, I'd still put them in External links, because since this is a wiki, the definitions may get changed at some point in the future, and then they won't be taken directly from those sources anymore. And ideally (but admittedly totally unrealistically), we should be writing our own definitions "from the bottom up", i.e. on the basis of citations, rather than taking them from other dictionaries. For example, we should be saying that μῆνις (mênis) means "wrath" not because LSJ tells us that's what it means, but because we observe that that's what it means in "Μῆνιν ἄειδε, θεά, Πηληιάδεω Ἀχιλῆος οὐλομένην". —Aɴɢʀ (talk) 20:32, 22 November 2016 (UTC)[reply]

Proposed EL change[edit]

@CodeCat, DCDuring, Equinox, Jberkel, Isomorphyc, Angr, Erutuon, Wikitiki89

This is a proposed WT:EL change that is supposed to implement what has been discussed here.

I'm not really sure if one single vote should cover it all or if we should create multiple votes. For example, we could have a first vote specifically about only converting "External links" into "External sources" in all entries, and then vote on the other stuff later. --Daniel Carrero (talk) 09:33, 8 December 2016 (UTC)[reply]

The EL proposal is below this line. Feel free to edit it if you want. I'm not saying it's definitely perfect yet, even though it's the best I could do for now.

Project scope and rationale:

  1. Replace "External links" by "External sources", because external sources may be available in print, not always online.
  2. Organize and clarify the exact purpose of "See also" (for internal pages), "References" (reference works that prove what is being stated in entries) and "External sources" (all other external pages, simple recommendations that are not meant to prove what is being said). -- For consistency, so people don't have to guess which section to use and what to find in them.
  3. While we are at it, mention numbered references in the "References" section in WT:EL.

Step 1:

  • In WT:EL#List of headings, the section "External links" appears three times. They should be replaced by "External sources".

Step 2: Remove these two sections from WT:EL, to be replaced by the revised versions below.

External links

Any line of text whose only purpose is linking to an external website (for example, a link to an encyclopedia, such as Wikipedia, or 1911 Encyclopædia Britannica), should be placed within an "External links" section, and never within a "See also" section.

References
Main article: Wiktionary:References

The References section contains external sources where the information available on our entries can be verified. This improves the reliability and usefulness of Wiktionary. References are especially encouraged for unusual or disputable claims in etymologies — such as the etymology of windhover — or usage notes.

References are listed using bullet points (the character *). References may be given in a normal bibliographic format showing author, title, place of publication, publisher and year of publication. Reference templates (beginning with “R:”) are used for some of the most common sources. See the example below for two references used in the entry water:

Code:

* {{R:Century 1911}}
* {{R:Webster 1913}}

Result:

Step 3. Add these revised sections on WT:EL. (the "See also" is new)

See also

The See also section may be used to link to entries and/or other pages on Wiktionary, including appendices and categories. Don't use this section to link to external sites (such as Wikipedia) or other external sources.

External sources

The External sources are simple recommendations of further places to look.

  • This section may be used to link to external dictionaries and encyclopedias (for example, Wikipedia, or 1911 Encyclopædia Britannica), which may be available online or in print.
  • This section is not meant to prove the validity of what is being stated on the Wiktionary entries. (the References section serves that purpose)
  • If a definition was taken directly from an external dictionary in the public domain, the external dictionary may be linked in the External sources section. (Our definitions reflect how words are used in real life. Don't use the References section to link back to the public domain dictionary, because its appearance on the external dictionary is merely incidental to how the word is actually used. Words and senses are attested using citations.)

Examples:

References
Main article: Wiktionary:References

The References section contains reference works where users can verify the information available on our entries. This improves the reliability and usefulness of Wiktionary. References are especially encouraged for unusual or disputable claims in etymologies — such as the etymology of windhover — or usage notes.

  • References may be listed using bullet points (the character *) or numbered lists (the character # or generated automatically by the <references> tag).
  • References may be given in a normal bibliographic format showing author, title, place of publication, publisher and year of publication.
  • Reference templates (beginning with “R:”) are used for some of the most common sources.
I would like an end put to the mixing of bullet-pointed references and <references/>-style references. This looks quite ugly. —CodeCat 14:48, 8 December 2016 (UTC)[reply]
@Daniel Carrero: I agree with User:CodeCat about the unaesthetic look of mixing bullets with numbered items and think the ideas proposed by User:Angr and User:Erutuon for Latin and Greek answer my uncertainties also. I'll edit the classical sections accordingly if the vote passes, unless anyone else wants to, and I'm glad to help with any other repetitive tasks. Thanks for your work on this cleanup effort. Isomorphyc (talk) 16:27, 8 December 2016 (UTC)[reply]
I don't like the "External sources" section idea. --WikiTiki89 16:33, 8 December 2016 (UTC)[reply]
@Wikitiki89: Is there a better way to separate bibliographic from non-bibliographic references, or do you instead think the aesthetic value of separating bullets from numbers is too little, or involves an unwanted trade-off? Isomorphyc (talk) 17:36, 8 December 2016 (UTC)[reply]
I don't think readers care whether a particular reference is bibliographic or not, as long as they point the reader to more information and/or more reliable sources. --WikiTiki89 18:15, 8 December 2016 (UTC)[reply]
About the mixing of bullet-pointed references and <references/>-style references: in the proposal above, I suggested mentioning on WT:EL the fact that we mix these two types, but my intention was just describing our current state of affairs. I don't think having mixed types in the first place is a great idea, either. I have a proposal: maybe the References section should have only the second type (<references/>-style), because if the references are always proving something, then presumably we can always point the reference to something in the entry, as a footnote. Am I wrong? Plus, the tag itself is called "references". --Daniel Carrero (talk) 19:56, 8 December 2016 (UTC)[reply]
Everyone I know who works in etymologies is conscientious about footnotes when it matters; and this can be said with more emphasis for Reconstructions sections. I still feel that since the system treats References sections as an exception, with respect to ref and references semantics, it would look far less sloppy to respect this by putting non-footnote links elsewhere as originally intended. The existing treatment is not a major problem, but if we are working on this, I do think it is currently at least modestly broken. Isomorphyc (talk) 03:08, 10 December 2016 (UTC)[reply]
Can we get rid of or rename "See also"? (Internal sources?) It seems to get used as a catch-all heading for a lot of things which should be in synonyms, related terms etc. Linking to appendices seems to be a valid use case though. – Jberkel (talk) 12:26, 9 December 2016 (UTC)[reply]
@Jberkel: At least in the first vote, maybe we should just try to organize/separate "External sources"/"References" as suggested above, without trying to change or remove "See also" too, because the more proposals a single vote has, the less likely it is to pass. --Daniel Carrero (talk) 14:38, 11 December 2016 (UTC)[reply]

I created the vote: Wiktionary:Votes/2016-12/"References" and "External sources".

The vote includes the idea of requiring all references (in the "References" section) to use the <references/> tag to generate only numbered lists and avoid mixing bulleted/numbered lists. --Daniel Carrero (talk) 19:10, 31 December 2016 (UTC)[reply]
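
For anyone wanting to gauge how many entries would be affected, here is a minimal sketch of a check for References sections that mix bulleted items with the <references/> tag. It is only an illustration, not an existing bot or gadget: the function names `references_sections` and `mixes_styles` are hypothetical, and it assumes the entry's wikitext is already available as a string and that the section heading is spelled exactly "References".

```python
import re

# Matches wikitext headings such as "===References===" (any level).
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def references_sections(wikitext):
    """Yield the body lines of each section whose heading is 'References'."""
    body = None
    for line in wikitext.splitlines():
        m = HEADING.match(line)
        if m:
            # A new heading closes any References section we were collecting.
            if body is not None:
                yield body
            body = [] if m.group(2) == "References" else None
        elif body is not None:
            body.append(line)
    if body is not None:
        yield body


def mixes_styles(body):
    """True if a section has both bulleted lines and a <references/> tag."""
    has_bullets = any(line.lstrip().startswith("*") for line in body)
    has_tag = any(re.search(r"<references\s*/?>", line) for line in body)
    return has_bullets and has_tag
```

Running `mixes_styles` over each section yielded by `references_sections` would flag the entries that need manual attention; it deliberately makes no attempt to rewrite anything.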

Something is wrong there, most entries do have a plural form. – Jberkel (talk) 21:59, 6 November 2016 (UTC)[reply]

French plurals are generally very trivial to generate, anyway. Anti-Gamz Dust (There's Hillcrest!) 15:01, 7 November 2016 (UTC)[reply]
I think the point is: these nouns are being automagically added to the category, but many *do* have plurals already generated. Therefore hidden away in a module/template somewhere there is a bug. - Amgine/ t·e 15:30, 7 November 2016 (UTC)[reply]
Should be fixed now: diff – Jberkel (talk) 15:40, 7 November 2016 (UTC)[reply]

Looking for beta testers for new Wiktionary iOS app[edit]

Hope you don't mind a bit of self-promotion, but I thought some of you might be interested in this. I've been using and editing Wiktionary for quite some time now and always wanted to be able to carry an offline version of it around. I wasn't able to find a good existing application for this so decided to build my own. The result is motî (from Walloon motî), an iOS app which offers offline access to entries from a subset of 10 different languages, all extracted from a recent database dump. I plan to add more languages later but want to get the initial feature set right first. The code is not open-source (yet) and iOS-only, but the database format has been designed in a portable way, so other platforms can follow later.

If you're interested to help test prior to the launch leave your email address on the website linked above, or directly via Special:EmailUser/Jberkel. Thanks! – Jberkel (talk) 17:38, 7 November 2016 (UTC)[reply]

Why only 10 languages? What's stopping you from letting us choose any subset of all 2500 languages? Does the processing take too long? --Dixtosa (talk) 18:50, 7 November 2016 (UTC)[reply]
Yes, running the extraction is quite CPU intensive at the moment, that's why. I also expect some problems with languages which have a slightly different template setup, such as Japanese and Chinese. Jberkel (talk) 19:05, 7 November 2016 (UTC)[reply]
I'm curious how you're running your extraction process. I'm able to completely parse all templates of the site in a few hours with a low-end computer and without even putting the entire dump into memory. DTLHS (talk) 21:04, 7 November 2016 (UTC)[reply]
Parsing is done with a library called JWKTL, this is usually quite fast. The slow part is the template expansion. For this I use a library called bliki, which works well but is a bit slow. Maybe there are better options for this now (Parsoid?) – Jberkel (talk) 11:38, 8 November 2016 (UTC)[reply]
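
To illustrate the "without putting the entire dump into memory" approach mentioned above, here is a minimal sketch that streams a MediaWiki XML export page by page and lists template names with a regular expression. It is not the pipeline either poster actually uses: the helper names and the tiny `SAMPLE` document are made up for the example, it only finds template names (it does not expand templates, which is the slow part), and a real dump would be opened as a file, possibly through a decompressor.

```python
import io
import re
import xml.etree.ElementTree as ET

# Captures the name in "{{name}}" or "{{name|...}}"; no expansion is attempted.
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")

# Hypothetical stand-in for a dump file, using the export XML namespace.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>water</title><revision><text>==English==
{{en-noun}}
* {{R:Webster 1913}}</text></revision></page>
</mediawiki>"""


def local(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, wikitext) pairs one page at a time."""
    title, text = None, ""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            # Drop the finished page's children and text; only an empty
            # stub element stays attached to the root.
            elem.clear()
            title, text = None, ""


def template_names(wikitext):
    """Return the template names found in a page, in order of appearance."""
    return [m.group(1) for m in TEMPLATE_NAME.finditer(wikitext)]


for page_title, page_text in iter_pages(io.BytesIO(SAMPLE)):
    print(page_title, template_names(page_text))
```

Because each page is processed and cleared as soon as its closing tag is seen, memory use depends mostly on the largest single page rather than on the size of the dump.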
This could be pretty cool, I have an Android phone but otherwise I'd have loved to test it! — Kleio (t · c) 18:55, 7 November 2016 (UTC)[reply]
In case you care, you're missing a word on your front page: "it’s also for anyone with an interest in words or who just loves dictionaries." Equinox 21:42, 7 November 2016 (UTC)[reply]
I think it's better to rephrase that as "it’s also for anyone who has an interest in words or just loves dictionaries." --WikiTiki89 21:53, 7 November 2016 (UTC)[reply]
Updated, thank you for your suggestions. – Jberkel (talk) 22:03, 7 November 2016 (UTC)[reply]

2016 Community Wishlist Survey[edit]

Hi,

Non-official information. Sorry if you already have received this information, but there is an on-going Community Wishlist Survey from November 9 to 20, and there is specific attention to small projects this time. I invite you to write proposals for Wiktionaries, asking for improvement or for a pony, as you prefer. I think there is an amount of tech improvement to be done, including the ones already made locally and not integrated into MediaWiki. So, I wrote a suggestion to add an Insert Citation button in VisualEditor and I invite you to endorse it. Well, I hope to read interesting suggestions emerging from this community and I hope Wiktionary will be definitely up in the Tech Community task flow. -- Noé (talk) 10:06, 8 November 2016 (UTC)[reply]

Adding numbers to the CFI[edit]

We are finishing up an RFD for yet another set of numbers; it would be nice if we codified which numbers we are interested in keeping and which can be deleted without prejudice. I would propose that we include only non-compound numbers, with the obvious exception for idiomatic usage. There may be languages in which this criterion is inappropriate; I would be interested in hearing about those and discussing possible solutions. - TheDaveRoss 14:01, 9 November 2016 (UTC)[reply]

If by including only non-compound numbers, you mean only numbers from 0 to 9, (plus exceptions for idiomatic usage) I'm completely OK with that. --Daniel Carrero (talk) 19:37, 10 November 2016 (UTC)[reply]
Sorry, I was not clear. I meant for long-form numbers (one, two ... hundred, million). Thus one million would be no good, nor would ninety-three. For digits I think a similar principle is good though. - TheDaveRoss 21:30, 10 November 2016 (UTC)[reply]
I believe that we are close to consensus on this. It might be useful to vote on the inclusion/exclusion principle to enable speedy deletion. DCDuring TALK 23:50, 10 November 2016 (UTC)[reply]
Also to be included, ordinal numbers. - TheDaveRoss 01:01, 11 November 2016 (UTC)[reply]
As I interpret it, numbers like one hundred and one are already prohibited as SOP. Chuck Entz (talk) 01:43, 11 November 2016 (UTC)[reply]

Transitivity and non-accusative verbal arguments[edit]

I'm not sure if the beer parlour is the right place for this question, but I have been filling in missing information in Ancient Greek verb entries, labeling transitivity and sometimes listing the cases that the verb takes as arguments. To this point, I have labeled verbs that have dative objects as transitive. Some examples are ἁνδάνειν (handánein, to delight) and ἀκολουθεῖν (akoloutheîn, to follow). But I could be doing this wrong. The Wikipedia article says that a transitive verb has a direct object. I assume that means accusative case-form in Ancient Greek, but I wish the article would say this explicitly. This question is also relevant for many other Indo-European languages; Icelandic, for instance. How is transitivity defined for them? — Eru·tuon 18:01, 9 November 2016 (UTC)[reply]

This is also relevant for Finnish, which has many cases in which a verbal argument can appear. So pinging @Hekaheka, Tropylium too. —CodeCat 18:06, 9 November 2016 (UTC)[reply]
In German, verbs that take dative objects, e.g. helfen, are traditionally called intransitive; only verbs that take accusative objects are called transitive. —Aɴɢʀ (talk) 18:19, 9 November 2016 (UTC)[reply]
That's interesting, because now that I look back at the Transitive verb article, it says that the verb trade is tritransitive because it takes a direct object, indirect object, and prepositional phrase as arguments (I'll trade you₁ this bicycle₂ for your binoculars₃). This definition of transitivity under which trade is tritransitive is wider than the one you describe in traditional German grammar (and the one that I thought the Wikipedia article was describing): the number of obligatory verbal arguments. For that matter, the verb give that is described as ditransitive would ordinarily take an accusative and dative in German, Latin, Ancient Greek, and many other languages, so I guess ditransitivity is defined with this wider definition of two obligatory verbal arguments. Perhaps, then, Wiktionary needs to decide which definition of transitivity (obligatory verbal argument or accusative argument) the transitive label should refer to. Under the first definition, helfen would be transitive, but under the second, it wouldn't; and the same with the Ancient Greek verbs I mentioned above. — Eru·tuon 19:05, 9 November 2016 (UTC)[reply]
In languages without cases, it may well be impossible to decide if a single object is direct or indirect. —CodeCat 19:10, 9 November 2016 (UTC)[reply]
In fact, in some languages that do have cases (or some way of explicitly marking direct and indirect objects), some verbs can take two direct objects; so it would be naive to assume that it is always the case in languages without explicit marking of direct and indirect objects that if a verb has two objects one must be direct and one indirect. --WikiTiki89 19:16, 9 November 2016 (UTC)[reply]
That's true. German has some verbs that take two accusative objects; my favorite example is das geht dich[ACC] einen Scheißdreck[ACC] an "that's none of your damn business". A more prosaic example is Sie nannte ihn[ACC] einen Lügner[ACC] "she called him a liar". —Aɴɢʀ (talk) 21:38, 9 November 2016 (UTC)[reply]

fr.WT is throwing a party[edit]

Wiktionnaire is throwing a party to celebrate passing three million pages. - Amgine/ t·e 18:23, 9 November 2016 (UTC)[reply]

Ho, yes. You're welcome if you are in Paris November 20th! Noé (talk) 14:24, 10 November 2016 (UTC)[reply]
Congratulations on 3M! Wonder if en.wikt should do something for our next milestone, which is coming up very soon... — Kleio (t · c) 19:38, 10 November 2016 (UTC)[reply]
I will happily host an event for en.WT or fr.WT, but let's talk about dates. 20th is a Sunday. And I am in Vancouver… - Amgine/ t·e 19:45, 10 November 2016 (UTC)[reply]
Maybe not very feasible to do something IRL, considering how the en.wikt community is literally all over the place and most people'd have to travel pretty far just to get to Vancouver :/ — Kleio (t · c) 14:15, 11 November 2016 (UTC)[reply]
I hereby promise I will arrange some Wiktionary thing in London and/or Oxford when we hit 5 mil. Maybe buy a beer or two for people who correct my awful typos. Most of you guys are in North America so it won't help much. Anyone who wants a beer can remind me when the time comes. If it's just me and Wonderfool I guess I can live with that. Equinox 03:27, 12 November 2016 (UTC)[reply]
Maybe we should create a page for those of us who want to meet other Wiktionarians, giving our general locations (more specific than WT:Wiktionarians, though). I'd like to grab a beer with you next time I'm in London (which will likely be years from now, unless I move to Europe). —Μετάknowledgediscuss/deeds 03:32, 12 November 2016 (UTC)[reply]
Wouldn't mind having a drink with some of y'all at some point, kinda curious as to what sort of people (other than me) edit this place. But, seeing as I'm a student and thus permanently sorta broke, I don't think I'll be traveling very far from Amsterdam for meetups, unless I'm already on vacation or it's in the Benelux somewhere. — Kleio (t · c) 16:44, 15 November 2016 (UTC)[reply]
Isn't it really easy to get from Amsterdam to Paris? Nothing like getting there from the US at least. --WikiTiki89 16:54, 15 November 2016 (UTC)[reply]
When are we throwing a party? — Ungoliant (falai) 16:07, 10 November 2016 (UTC)[reply]
I suppose the easiest/laziest way to celebrate the upcoming 5 million would be to join one of Wikipedia's regular local gatherings. Equinox 20:25, 10 November 2016 (UTC)[reply]
Does anyone live on the eastern seaboard of the US? I'm in DC and would love for once to actually meet another Wiktionarian. —JohnC5 03:48, 12 November 2016 (UTC)[reply]
I shall be celebrating 5 million pages on my own. --Derrib9 (talk) 11:49, 17 November 2016 (UTC)[reply]
WT:Main_Page, it was nice to know you. - TheDaveRoss 12:42, 21 November 2016 (UTC)[reply]
Hey, we had a good time! Preliminary announcement: We are planning a conference in June about Dictionaries, Wiktionaries and Francophonie (French language network) with a day dedicated to talks on lexicography (including prestigious people) and a second day with a workshop to share contribution habits and to train newbies on editing Wiktionary. We are still in a very early phase but we are thinking of linking this event with the first WiktiCon, a global meeting for Wiktionarians. We aim to ask the Wikimedia Foundation to help people get flight tickets through a specific grant. It is doable, as Wikisource already did a similar event two years ago. What do you think about this idea? Are there any people interested in coming to Lyon, France in June to meet other contributors? -- Noé (talk) 22:00, 22 November 2016 (UTC)[reply]

Automatic interwiki links : you can test the feature now[edit]

Hello all,

The first part of our plan about improving lexicographical data in Wikidata and improving structured data in Wiktionary, is to create automatic links between page on different Wiktionaries, that have the exact same name.

This feature (which is not technically related to Wikidata) is now available in a test environment; you can learn more and try it here. Feel free to add feedback and specific examples; this will be very helpful for us to understand your needs.

Thanks, Lea Lacroix (WMDE) (talk) 10:36, 10 November 2016 (UTC)[reply]

Symbol redirects and infoboxes

FYI, sometimes when I find two symbols that are basically the same thing with differences merely in typography or appearance, I redirect one to the other and add multiple character boxes ({{character info/new}}) in the main entry.

For example, I redirected 💲 to $. I added 3 character boxes in 🛇, and in 🌎 too. I added 2 charboxes in . See also and for entries with 6+ boxes. --Daniel Carrero (talk) 14:11, 11 November 2016 (UTC)[reply]

OrphicBot edits

This bot is carrying out a lot of edits, replacing alternative forms with {{also}} at the top. I would rather have "Alternative forms". This is also happening with "alternative form of" entries like non-suspicious, which is even more puzzling. DonnanZ (talk) 16:03, 11 November 2016 (UTC)[reply]

It doesn't replace anything, it adds a template. This is not related to alternative forms at all (the word added may be an alternative form in some cases, but it's not the intention). Lmaltier (talk) 20:35, 11 November 2016 (UTC)[reply]
I admit I was wrong in the first instance, but why do it with "alternative form of" entries? It's giving the same information twice, which is silly. DonnanZ (talk) 00:15, 12 November 2016 (UTC)[reply]
It doesn't harm anything, and it would be too complex to try to parse the alternative form sections, especially if there was more than one language on the page. DTLHS (talk) 00:16, 12 November 2016 (UTC)[reply]
Hmm, OK, I agree no harm is caused, nor would there be any harm in manually reverting those particular bot edits. DonnanZ (talk) 00:23, 12 November 2016 (UTC)[reply]
It will just get added again the next time the bot is run. DTLHS (talk) 00:24, 12 November 2016 (UTC)[reply]
The same problem as MglovesfunBot. I've come to hate same bots.
@Isomorphic: Can this bot be modified so it ignores entries using {{alternative form of}}? DonnanZ (talk) 10:31, 12 November 2016 (UTC)[reply]
@Donnanz: This is not hard, but not advisable. It was discussed during the initial iteration in September, also. Imagine a very long entry with multiple languages, and an alternative-forms template somewhere in the middle, within the entry of a language you do not know. This would occlude the also-template for a homonymous word from a different language which is not an alternative form.
I have implemented functionality that prevents the robot from adding the same item twice to the same entry. This prevents edit-warring with human editors. At present, I have it enabled for my references template updating, but not for the also-templates, because I feel the value of the also templates is diminished the more exceptions are admitted. Isomorphyc (talk) 12:51, 12 November 2016 (UTC)[reply]
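The no-duplicate behaviour described above could look roughly like the following. This is only an illustrative sketch, not OrphicBot's actual code: the function name `add_also_once`, the in-memory `seen` set, and the regex are all assumptions made for the example.

```python
import re

# Matches a simple {{also|a|b|c}} template with no nested templates (assumption).
ALSO_RE = re.compile(r"\{\{also\|([^{}]*)\}\}")

def add_also_once(seen, page, wikitext, target):
    """Add `target` to the page's {{also}} template unless the bot
    has already added that same item to that same page before.

    `seen` is a set of (page, target) pairs; a real bot would persist it.
    """
    key = (page, target)
    if key in seen:
        # Added on an earlier run; if it is gone now, a human removed it,
        # so leave the page alone instead of edit-warring.
        return wikitext
    seen.add(key)
    m = ALSO_RE.search(wikitext)
    if m is None:
        # No {{also}} yet: put a new one at the top of the page.
        return "{{also|%s}}\n%s" % (target, wikitext)
    items = m.group(1).split("|")
    if target in items:
        return wikitext
    merged = "{{also|" + "|".join(items + [target]) + "}}"
    return wikitext[:m.start()] + merged + wikitext[m.end():]
```

With such a log enabled, a manual revert would stick; with it disabled (as described for the also-templates), the next run would simply add the item again.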
I disagree with making these exceptions and encourage you to continue as before. —CodeCat 13:07, 12 November 2016 (UTC)[reply]
I'm a little confused. Does this mean I can revert an edit to, say non-suspicious which uses {{alternative form of}}, without it being edited by the same bot again, or not? DonnanZ (talk) 13:13, 12 November 2016 (UTC)[reply]
@Donnanz: You cannot. It is technically possible, but my experience in the few months I have done this is that arbitrary exceptions create more confusion for users than they save clutter.
@CodeCat: Mainly I implemented a no-duplicate-edits option because I wanted to respond cooperatively to events on the recent changes stream for the classical references, not for also templates. I don't have all of the concurrency issues worked out yet, though. Isomorphyc (talk) 02:16, 13 November 2016 (UTC)[reply]
I'm not terribly impressed by that, but as a small cog in a big wheel I'll have to accept it. DonnanZ (talk) 11:18, 13 November 2016 (UTC)[reply]
Your contributions to Wiktionary have been much more significant than mine, over a longer time. Moreover, you are not the only major contributor with whom I have discussed this; you might see here if you would like: User_talk:Isomorphyc#Overkill. However, my sense is that there is approximately an 80/20 balance in favour of not treating alternative forms as exceptions in also templates. Drawing a line somewhere isn't hard, and I have made more complicated changes by broad request, such as the letters equivalences table. But I don't think a consensus exists for the policy you propose; and I have trouble advocating for it, since I think it is slightly reckless to make lexical exceptions in an orthographical, language-agnostic template. If I am wrong about this, though, I will implement something that can satisfy as many people as possible. I'm sorry if we can't agree about this. Isomorphyc (talk) 15:08, 13 November 2016 (UTC)[reply]
I am very much in favour of {{also}} despite this. I also use it as a means of creating a word (today it was Norwegian Bokmål macheten, which has the same spelling as German Macheten), even though there's another way round, {{also}} still has to be added to both entries, so I kill two birds with one stone. DonnanZ (talk) 16:09, 13 November 2016 (UTC)[reply]
As I understand it, looking at a few pages that have English sections and {{also}}, there are significant numbers of entries that should have an Alternative forms header (and content) and do not or have an Alternative forms header and omit some content. I limit myself to English, but the same should be true of some other languages.
Do I understand correctly that we can take advantage of what OrphicBot is doing or has done to add some missing Alternative forms content?
It seems to me that it should be possible to create lists of English lemma entries that have {{also}}, for which {{also}} contains an English alternative form, and for which either there is no Alternative forms header or an existing Alternative forms header does not include the alternative form. Though the logic is not very complicated, it seems to me to be beyond what can be done effectively with search. This seems like a job for dump-processing to identify the candidate entries. The lists of candidate entries that need Alternative forms headings and content or need additional content under an existing Alternative forms heading could be processed using AWB to allow human review. DCDuring TALK 18:58, 15 November 2016 (UTC)[reply]
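The dump-processing step proposed above might be sketched as follows. This is a rough illustration under stated assumptions, not an existing tool: it assumes the dump has already been loaded into a `pages` dict mapping title to wikitext, it treats an {{also}} target as a possible English alternative form whenever the target page has an English section, and it uses a crude substring test to decide whether a form is already listed. All function names are hypothetical.

```python
import re

ALSO_RE = re.compile(r"\{\{also\|([^{}]*)\}\}")
L2_RE = re.compile(r"^==([^=].*?)==[ \t]*$", re.MULTILINE)
ALT_HDR_RE = re.compile(r"^={3,}\s*Alternative forms\s*={3,}[ \t]*$", re.MULTILINE)
ANY_HDR_RE = re.compile(r"^=+.*=+[ \t]*$", re.MULTILINE)

def english_section(wikitext):
    """Return the body of the ==English== section, or None if absent."""
    headers = list(L2_RE.finditer(wikitext))
    for i, m in enumerate(headers):
        if m.group(1).strip() == "English":
            end = headers[i + 1].start() if i + 1 < len(headers) else len(wikitext)
            return wikitext[m.end():end]
    return None

def alternative_forms_text(section):
    """Return the text under an Alternative forms header, or '' if none."""
    m = ALT_HDR_RE.search(section)
    if m is None:
        return ""
    nxt = ANY_HDR_RE.search(section, m.end())
    return section[m.end():nxt.start() if nxt else len(section)]

def candidates(pages):
    """Map each candidate title to the {{also}} targets missing from
    its English Alternative forms section (for later human review)."""
    result = {}
    for title, text in pages.items():
        eng = english_section(text)
        also = ALSO_RE.search(text)
        if eng is None or also is None:
            continue
        listed = alternative_forms_text(eng)
        missing = []
        for target in also.group(1).split("|"):
            target = target.strip()
            other = pages.get(target)
            # Assumption: only targets that themselves have an English section count.
            if other is not None and english_section(other) is not None:
                if target not in listed:  # crude substring check
                    missing.append(target)
        if missing:
            result[title] = missing
    return result
```

The output is only a candidate list; many {{also}} targets are unrelated homographs rather than alternative forms, which is why the human review step mentioned above would still be needed.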

Boilerplate to new translation requests

How can I improve the boilerplate to new translation requests for effectiveness so that the problem of requests without destination language can be alleviated? --kc_kennylau (talk) 13:17, 12 November 2016 (UTC)[reply]

I'm not sure. The new boilerplate seems good, but in practice it looks like it's just been causing more confusion among anons who leave translation requests. —Μετάknowledgediscuss/deeds 19:20, 13 November 2016 (UTC)[reply]
I hope this will improve things. --WikiTiki89 16:58, 28 November 2016 (UTC)[reply]

Adding sublanguages to Module:etymology languages/data

Do I need some consensus to add the following to etymology languages? :

  1. Common Turkic (non-Oghur Proto-Turkic).
  2. Old Kirghiz (language of the Yenisei inscriptions, subsumed under Old Turkic).
  3. Buddhist, Manichean and Christian Sogdian (these can be inferred from the script, but the distinction seems to be philologically important).
  4. Written Mongolian (under Mongolian) for when an exact periodization (Middle, Classical or Modern) can't be found.
  5. Ordos, Khalkha, Chakhar Mongolian (All Cyrillic entries are Khalkha so this won't see much use, but Ordos has a very conservative phonology so it comes up often in etymologies).
  6. Khamnigan Buryat (this should probably be a language of its own at least according to Janhunen).

Also,

  • Some Greek lects seem to be wrongly subsumed under el instead of grc, someone knowledgeable about the details should check it.
  • Philistine is placed under datax, but I doubt we plan to have entries for it.

And what is the precedent for selecting codes? Does ISO cover sublanguages too? Crom daba (talk) 17:11, 12 November 2016 (UTC)[reply]

I corrected some of the obvious errors in the Greek section, but I wasn't sure about some of them (Thessalian, Arcadian, for instance), which could be Ancient or Modern. — Eru·tuon 17:35, 12 November 2016 (UTC)[reply]
Most of these would be straightforward. "Written Mongolian" on the other hand sounds iffy: do I gather that this is not a specific variety, but instead just a cover term? We would not add entries like "some variety of Slavic" or "Greek, dunno if Modern or Ancient", so this sounds like a bad idea as well.
As for script variants of Sogdian, are these needed just for technical reasons, or is there some other motivation? --Tropylium (talk) 13:44, 14 November 2016 (UTC)[reply]
I thought Philistine was discussed somewhere, but I can't find it. The rationale for adding it as a full language and not an etymology-language is that it is not a dialect of anything else the way the etymology-languages are, but rather its own language, even if only preserved in loans — and Wikipedia says that there are some inscriptions of it in addition to the loans, anyway.
Is Khamnigan Buryat the same as Khamnigan Mongol, proposed for possible inclusion as a full language at WT:RFM#Even_more_languages_without_ISO_codes.2C_part_3?
The rest, aside from Written Mongolian, seem straightforward, as Tropylium says. Regarding Written Mongolian, we do sometimes have etymologies saying "From a Slavic word *foo", don't we (using the family code of "Slavic")? so "Written Mongolian" is not much different. - -sche (discuss) 17:05, 14 November 2016 (UTC)[reply]
Yes, Khamnigan Buryat is the same thing as Khamnigan Mongol.
There are supposed to be some subtle variations between various forms of Sogdian (monophthongization of diphthongs in Manichean and Christian texts for example), but I myself am not convinced of the utility of having this distinction.
I wasn't aware of Philistine inscriptions. We have languages that aren't dialects among etymology lects, like pre-Roman substrate of the Balkans for example.
The thing about Written Mongol is that its orthography reflects a very archaic form of Mongolic (Janhunen would call it Pre-Proto Mongolic, but I'd call that an abuse of terminology), but an attestation of a specific spelling in Middle or Classical Mongolian might not be found (or I might not be able to find it). So the simplest way to indicate that the word's spelling might be the best key to the word's history in absence of comparanda from other languages or historical inscriptions is to note something like "note Written Mongol [such and such spelling]". Crom daba (talk) 16:59, 15 November 2016 (UTC)[reply]
"Pre-Roman substrate" is not a language, though, it's an umbrella for potentially several unidentified languages, whereas Philistine is thought to have been a language. AFAICT the three substrate umbrellas are the only things in Module:etymology languages/data that aren't dialects. - -sche (discuss) 04:07, 16 November 2016 (UTC)[reply]


I've added all of these save Written Mongol and plus Solon (under Evenki). Crom daba (talk) 03:01, 3 December 2016 (UTC)[reply]

Wonderfool's back

This (on en.wikipedia) is obviously Wonderfool. KATMAKROFAN (talk) 19:04, 12 November 2016 (UTC)[reply]

Fascinating. DTLHS (talk) 19:04, 12 November 2016 (UTC)[reply]
Not everyone who deletes the main page is Wonderfool- it's the obvious thing one would do to show that one has complete privileges at a site. I'm glad you brought this to our attention, though. We all need to make sure our passwords are secure. Chuck Entz (talk) 22:14, 12 November 2016 (UTC)[reply]
WF is happy to take the blame for that. --Derrib9 (talk) 11:47, 17 November 2016 (UTC)[reply]
OurMine is Wonderfool. - TheDaveRoss 13:32, 17 November 2016 (UTC)[reply]

Password reset

I apologise that this message is in English. ⧼Centralnotice-shared-help-translate⧽

We are having a problem with attackers taking over wiki accounts with privileged user rights (for example, admins, bureaucrats, oversighters, checkusers). It appears that this may be because of weak or reused passwords.

Community members are working along with members of multiple teams at the Wikimedia Foundation to address this issue.

In the meantime, we ask that everyone takes a look at the passwords they have chosen for their wiki accounts. If you know that you've chosen a weak password, or if you've chosen a password that you are using somewhere else, please change those passwords.

Select strong passwords – eight or more characters long, and containing letters, numbers, and punctuation. Joe Sutherland (talk) / MediaWiki message delivery (talk) 23:59, 13 November 2016 (UTC)[reply]

Any idea who’s behind this? — Ungoliant (falai) 00:55, 14 November 2016 (UTC)[reply]
@Ungoliant MMDCCLXIV: Apparently a group called OurMine hacked several accounts and vandalised the ENWP main page twice, leaving the comment "Just testing your security". --Yair rand (talk) 02:12, 14 November 2016 (UTC)[reply]

Adding to the above section (Password reset)

Please accept my apologies - that first line should read "Help with translations!". Joe Sutherland (WMF) (talk) / MediaWiki message delivery (talk) 00:11, 14 November 2016 (UTC)[reply]

Favicon proposal

The current favicon is basically just ['w], which was compatible with the previous logo, not with the current logo.

I suggest using File:Wiktionary-favicon.svg as our favicon, which is the red slanted "W" from the current logo.

I requested the creation of this favicon in this diff from Commons. --Daniel Carrero (talk) 07:06, 15 November 2016 (UTC)[reply]

I'd completely forgotten about this, although I proposed it here. In any case, I obviously support; I'm not sure if we need a vote for this (historically, we have voted before changing the favicon, but there's no reason we have to). —Μετάknowledgediscuss/deeds 07:12, 15 November 2016 (UTC)[reply]
Something [like this], more distinct from Wikipedia's [than the current logo] would be nice. Something askew seems appropriate. DCDuring TALK 13:21, 15 November 2016 (UTC)[reply]
Hm, I foresee my bookmarks menu changing appearance rather dramatically! I suppose we should change it for consistency, yes. Equinox 13:35, 15 November 2016 (UTC)[reply]
I made a more modern looking alternative. What do you think?—Enosh (talk) 20:03, 15 November 2016 (UTC)[reply]
I think it should look like the actual W in our logo (i.e. gradient darkest at top left). Equinox 20:08, 15 November 2016 (UTC)[reply]
Honestly, I think it would look better if it were not angled. --WikiTiki89 20:09, 15 November 2016 (UTC)[reply]
I second what Equinox said: I think it should look like the actual W in our logo. I prefer the "W" slanted like in the logo, too. --Daniel Carrero (talk) 20:36, 15 November 2016 (UTC)[reply]

I don't quite like this; why is the W red, and why does it look likely to tilt or spin around? It makes me think about some politics. Just my opinion. --Octahedron80 (talk) 04:06, 16 November 2016 (UTC)[reply]

I feel the same way as Octahedron. Why red? It is reminiscent of dreaded redlinks. I’d prefer seeing it in blue, or at least in the same shade of red used by other WM projects (commons). — Ungoliant (falai) 11:30, 16 November 2016 (UTC)[reply]
There is something to be said for consistency between the various logos, it is red and tilted because the main logo has a red-tilted W. - TheDaveRoss 12:52, 16 November 2016 (UTC)[reply]
The Wikipedia W is also tilted in the logo. — Ungoliant (falai) 13:02, 16 November 2016 (UTC)[reply]
Not on their favicon, which is very similar to what we have now. I would prefer the third option among those presented. I think I prefer the current over the new, however. - TheDaveRoss 21:30, 16 November 2016 (UTC)[reply]
The Wikipedia logo is basically a globe made of various jigsaw pieces, each piece with one character, including that "W". The "W" is not tilted as in a 2D image; it's just a bit distorted by our perspective, because it's part of a 3D object.
If anything, Wikipedia's favicon is a normal "W" (non-slanted, non-distorted) apparently because it's the W from "Wikipedia" written below the globe.
(Incidentally, theirs is an awesome logo; I think it would fit Wiktionary better than it fits Wikipedia.) --Daniel Carrero (talk) 21:54, 16 November 2016 (UTC)[reply]
Then how do you explain the favicon of fr.wikt, for example? The tile is tilted in the opposite direction from the one in the logo, and yet it still looks good, but only because it is a tile; and even so, if the tile had been tilted the way it is in the actual logo, that would be too far. Favicons shouldn't have such a tilt. Ours probably shouldn't have any tilt because by itself it's just a letter, not a 3D tile. --WikiTiki89 21:58, 16 November 2016 (UTC)[reply]
Another version per Ungoliant.
—Enosh (talk) 13:24, 16 November 2016 (UTC)[reply]
I'd go for either the first or the third. DCDuring TALK 16:09, 16 November 2016 (UTC)[reply]
I'd go for either the first or the third, too. I still think that the 1st has a good advantage for being exactly the same as what we find in the logo, but the 3rd is one beautiful alternative option. --Daniel Carrero (talk) 21:21, 16 November 2016 (UTC)[reply]
I agree with those who've said the favicon W doesn't need to be tilted. Tilting it looks unprofessional (looking at other websites, they have upright letters and other logos) and makes it harder to recognize as a W. OTOH, an upright W is not very distinct from Wikipedia's, and almost seems to suggest that the main difference between us and Wikipedia is that we're "red" in some way, which makes little sense. (Perhaps a tilted red W suggests that the difference is that we're red and slanted...) We could derive our favicon from fr.Wikt's the way we derived our logo, by removing the tile, but I don't know if that would look any good or not. I don't mind the current favicon, personally. - -sche (discuss) 04:39, 17 November 2016 (UTC)[reply]
What if we made it go from red on the left to blue on the right, like how we turn red links into blue links? --Derrib9 (talk) 11:45, 17 November 2016 (UTC)[reply]
I prefer not having a logo with the idea of turning red links into blue links, because when Wiktionary is complete (let's try to finish it before 2017) we won't have any more red links. --Daniel Carrero (talk) 23:08, 20 November 2016 (UTC)[reply]

"From Translingual"

Was just looking at stevia, but this affects all the words in Category:Terms derived from Translingual. Is it just me that finds it rather ridiculous to see a word's etymology given as "from Translingual", as though "translingual" is a language, or for that matter means anything at all to anyone outside the Wiktionary community? I have no problem with using m|mul|foo to link to the source word in question, but the text itself should really say something like "from scientific Latin" (which is what it is in 90 percent of cases). Maybe we already talked about this somewhere. Ƿidsiþ 07:46, 16 November 2016 (UTC)[reply]

I don't think we've covered this matter thoroughly, but see these Nov-Dec 2015 and June 2016 discussions.
There don't seem to be too many of such entries: a regex search for {{etyl}} followed immediately by "|mul" found only 102, but relatively common ones. There are many more entries (494) that have "From the genus name", but omit categorization. Both (and other similar) conditions need correction and we need to prevent recurrence.
Sadly, this is yet another way that taxonomic entries (and other Translingual entries?) don't fit the standard that works for other entries. Further, a change requires differentiating etymologies in taxonomic entries from those (few?) in other translingual entries, so it cannot be instantly implemented by a "simple" change in a template or module. Perhaps we can use a kludge workaround like having "tax" or "taxo" as a "language" displaying different text when used in {{etyl}}. Candidates for the text include "the taxonomic name" as well as "Scientific Latin" and "New Latin".
A more radical solution that may give more degrees of freedom for other changes would be removing taxonomic name entries from "Translingual" and placing them in their own pseudo-language. I doubt that a majority of Latinists would welcome taxonomic names as "Latin". DCDuring TALK 16:08, 16 November 2016 (UTC)[reply]
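The two searches mentioned above (an {{etyl}} call with "mul" as its first parameter, and the phrase "From the genus name") could be run over dump wikitext roughly like this. The patterns are an approximation for illustration, not the exact query that produced the counts of 102 and 494, and the function name `classify` is hypothetical.

```python
import re

# {{etyl|mul ...}}: "mul" must be the whole first parameter,
# so codes such as "mul-tax" are deliberately not matched.
ETYL_MUL_RE = re.compile(r"\{\{etyl\|mul\s*(?=[|}])")
GENUS_RE = re.compile(r"\bFrom the genus name\b")

def classify(wikitext):
    """Report which of the two etymology patterns occur in a page."""
    return {
        "etyl_mul": bool(ETYL_MUL_RE.search(wikitext)),
        "genus_phrase": bool(GENUS_RE.search(wikitext)),
    }
```

Pages where only `genus_phrase` is true would be the ones that mention a genus name but omit the categorizing template.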
@DCDuring: I imagine that the proposed taxo-pseudo-name would only compound the problem. I agree that no term is "derived from Translingual"--terms are only considered to be translingual and then it's basically only because it's a scientific term or an individual character. —Justin (koavf)TCM 16:18, 16 November 2016 (UTC)[reply]
@Koavf Those are the most numerous cases by far, but, like every junk drawer, Translingual contains other things. Being a pragmatist and nominalist as well as a fallibilist, I disagree with your invidious use of considered. Are you objecting only to the "more radical solution" or to the less radical ones as well? DCDuring TALK 16:31, 16 November 2016 (UTC)[reply]
@DCDuring What's offensive about that...? Since "Translingual" isn't a language, no terms can be derived from it. It's also just a local convention here--can you find any source which says that "[x] was derived from Translingual" other than a curiosity of some modules and templates at this project? I think your initial proposal is probably best and is a logical way to think of how certain terms are derived. —Justin (koavf)TCM 16:36, 16 November 2016 (UTC)[reply]
Yeah "from the taxonomic name" also works. To be honest the consistency doesn't even bother me, as long as we avoid "from Translingual". Ƿidsiþ 16:32, 16 November 2016 (UTC)[reply]
To reduce the number of future reintroductions of the objectionable wording, I think we could have {{etyl|Tax.}} categorize as {{etyl|mul}} does now, but display "(f/F)rom the taxonomic name". The "from/From" probably should not be part of the template display text. DCDuring TALK 17:12, 16 November 2016 (UTC)[reply]
I believe the following would implement the "Tax." proposal in Module:etymology_languages/data
-- Translingual varieties / m["Tax."] = { / CanonicalName = "taxonomic name" / parent= "mul" } [ "/" indicating line breaks ]
AFAICT, this would lead {{etyl|Tax.|en}} Rosa to display "taxonomic name Rosa" and categorize the entry into Category:English terms derived from Translingual.
This would mean that the candy "From" and, possibly "of" or "of genus" would need to be added, but that {{etyl|Tax.}} could be used flexibly. DCDuring TALK 20:21, 16 November 2016 (UTC)[reply]
I did propose that last month in Wiktionary:Beer parlour/2016/November#Suggestion: treat taxonomic names as a dialect of Translingual. But I prefer "taxonomic Latin", instead of "scientific name" or "taxonomic name", because otherwise people may be tempted to say "from Late Latin" in addition or instead of "Translingual". How many taxonomic names say "From Late Latin"? I recall I've seen quite a few. --Daniel Carrero (talk) 21:07, 16 November 2016 (UTC)[reply]
@Daniel Carrero: Three of the troubles with "taxonomic Latin" are:
  1. It is not only the least common of "scientific name", "New Latin", "scientific Latin", "taxonomic name", and "taxonomic Latin" (order of decreasing contemporary frequency), but does not appear at all in Google Ngrams, and appears only about 20 independent times in Google Books with preview.
  2. It implies that the taxonomic name is Latin and will be found in a Latin language section, which is a notion repeatedly rejected here by Latinists and by others, for pretty good reasons.
  3. It addresses a problem that does not exist.
There are 16 instances of "(f/F)rom Late Latin" appearing on pages that use {{taxoninfl}}, i.e., virtually all and only taxonomic name pages. Of these, there are exactly 0 that are followed by a taxonomic name; they are all followed by - wait for it - a Late Latin word that usually was found in my copy of Souter's A Glossary of Later Latin to 600 A.D..
The other "Latin" names share troubles 2 and 3 and each have their own troubles. "Scientific Latin" implies that other scientific fields besides biological taxonomy have terms in scientific Latin that are not in other vintages or sorts of Latin. This may be true, but we have approximately none of them. "New Latin" is principally used in Translingual-entry etymologies to account for the alteration (change of script, inflectional endings, etc) of terms from other languages into the Latinate forms used in taxonomy.
"Scientific name", like "Scientific Latin", implies that there are similar 'scientific names' from fields other than biological taxonomy. DCDuring TALK 23:06, 16 November 2016 (UTC)[reply]
Making mul-tax (which could have the above-proposed Tax. as an alias, but not tax which is already the ISO code of the language Tamki) an etymology-only variant of mul, as proposed a few sections up, seems like an elegant solution. - -sche (discuss) 04:30, 17 November 2016 (UTC)[reply]
I implemented "Tax." and tested it. It still has one problem, that ought to be fixed: see https://en.wiktionary.org/w/index.php?title=stevia&oldid=41681829 (it's the revision of stevia using code "Tax.") The derivation category currently is Category:English terms derived from taxonomic name, it should be Category:English terms derived from taxonomic names (with "s" in the end). --Daniel Carrero (talk) 22:38, 20 November 2016 (UTC)[reply]
I think that mul-tax should be the code, as proposed by -sche. I also think that we should retire the other irregularly-named codes like "LL.", but that's another discussion. —CodeCat 22:49, 20 November 2016 (UTC)[reply]
Sure, for the sake of supporting the project of converting all codes like LL. -> la-lat, I'd support using mul-tax (actually I suggested using "mul-tax" in the previous discussion). --Daniel Carrero (talk) 22:53, 20 November 2016 (UTC)[reply]

A tool for language learners

Hi all,

Are there any students here who are learning foreign languages? Well, it is not easy, but sometimes making thematic lists of vocabulary may help to memorize the words. Wiktionary does provide the vocabulary, along with extensive categories that contain too many verbs, too many animals and so on. The idea is to provide a tool, maybe a core feature, that allows readers to register a page in a custom list page and, during the registration process, offers a list of subsections or the option to create a new one. I cannot do that myself, but I wrote a proposal in the 2016 Community Wishlist Survey and I invite you to provide advice, criticism and suggestions on this idea. There are other nice proposals, few of them for Wiktionaries, but this is one. Noé (talk) 09:46, 16 November 2016 (UTC)[reply]

Hi @Noé: It would be intriguing if you could get WMF to agree to store user data of this sort. Perhaps your users will agree to store it in public, albeit that to do so in Wiktionary user space would not be very efficient. Even so, I suspect this is more of a policy question than it would be technically difficult, even for ordinary users to implement. One could imagine quite a few applications that could be built on top of such lists that would make Wiktionary more useful to its users. Isomorphyc (talk) 20:50, 16 November 2016 (UTC)[reply]
You could conceivably create a tool that would store the lists client-side. --WikiTiki89 20:53, 16 November 2016 (UTC)[reply]
This is true; users would hate the device dependency. Since WMF suggests a 30-day expiry for user data, it's possible to invent a hybrid protocol. For example, one could store in browser `permanent' local storage in the short term, encrypted on the server with a key which only the user knows in the middle term to seed devices, and in user-controlled local or cloud storage (with a key we don't know) in the long term. Storing encrypted data in user space would be attractive, but it would horribly bloat the xml-serialised database archives, and therefore is impossible. I would want to clear the protocol and intent with WMF first, but I think it is technically feasible, if cumbersome. The main regulations which concern me are here: wikitech:Wikitech:Labs_Terms_of_use. Isomorphyc (talk) 21:46, 16 November 2016 (UTC)[reply]
What about storing them as files the old-fashioned way? That way they can be shared at will. --WikiTiki89 21:53, 16 November 2016 (UTC)[reply]
It won't attract anything but very dedicated users; remember that we get about thirty pageviews per second, but only one non-robot update per minute. Almost all Wiktionary users never enter data into Wiktionary and most are not likely using devices with keyboards. Only users who are already active contributors will be likely to use something which requires moving files around to keep devices in synch. A monthly download, however, I think is not too much to ask in exchange for privacy. That said, what you propose (with a browser local storage cache to avoid a new download for every action) is certainly correct for a first version. Isomorphyc (talk) 23:18, 16 November 2016 (UTC)[reply]
Many of us keep word lists in "sandbox" pages. Some users who contribute via an IP address keep such lists on their talk page (see User talk:218.150.200.171 as a current example). SemperBlotto (talk) 07:38, 17 November 2016 (UTC)[reply]
Great idea, I use Wiktionary as one of my main learning tools. It will be tricky to implement something on top of the existing infrastructure, as has been pointed out. I think it would have to be public, and ideally collaborative. Maybe it could work similarly to the watchlist, where you can add and remove items with a simple click. Or it could work like the "HotCat" gadget, except that the changed data + categories are local. I don't have any good ideas for an easily implementable solution right now, though. – Jberkel (talk) 11:45, 17 November 2016 (UTC)[reply]
It would almost certainly have to be done with a gadget front end, at least on the web side. Another option could be an extension, but I don't prefer it. @SemperBlotto: Lists are interesting; I like the possibility of creating someplace slightly more structured than sandboxes, for people who already make lists and for those who don't, which other gadgets could interface with, for example. Thanks for getting involved in this discussion. Isomorphyc (talk) 12:14, 17 November 2016 (UTC)[reply]
Interesting, just read a proposal to extend watchlists which might be useful for creating word lists: meta:2016_Community_Wishlist_Survey/Categories/Watchlists#Watchlist_Folders – Jberkel (talk) 12:51, 17 November 2016 (UTC)[reply]

Nominate a Foreign Word of the Day!

This is just a reminder that we always need more nominations for Foreign Words of the Day. We're especially lacking interesting words in languages other than the biggest ones here (like Spanish, German, or Latin) that have both IPA (or an audio file) and a quotation (or a reference, for LDLs). That said, any words you can nominate would help tremendously.

Nominate here: WT:Foreign Word of the Day/Nominations. Thanks! —Μετάknowledgediscuss/deeds 02:13, 18 November 2016 (UTC)[reply]

Suggestion: Not accepting wikis if we accept citations from the internet

If we accept citations from the internet for attestation purposes (as proposed in this noticeably huge discussion from last month), I think it would be a good idea if wikis were an exception, at least for now. That is, we would not attest words by using citations from Wikipedia, Wikia and whatever other wikis out there.

Maybe it's just me, but I feel that excluding wikis would be a good idea because their content is "dynamic" in some sense; that is, a given wiki page is likely to be an unfinished page, and the current revision is likely to change at any moment. An "alternative" spelling or misspelling may be changed into the "main" spelling, and the current wording is likely to disappear and be replaced by a different wording.

I'm thinking of this as a form of "baby steps". If attesting words by using citations from the internet minus wikis is accepted and works well, we may want to discuss accepting citations from wikis in the future. --Daniel Carrero (talk) 11:47, 20 November 2016 (UTC)[reply]

I would say nearly all Web pages are "dynamic" in that sense, likely to change at any moment. Not just wikis! Equinox 12:39, 20 November 2016 (UTC)[reply]
@Daniel Carrero: We can just link to permanent versions of pages that are made with MediaWiki. —Justin (koavf)TCM 19:25, 20 November 2016 (UTC)[reply]
So, wikis should not be an exception then? Ok I guess. --Daniel Carrero (talk) 11:29, 22 November 2016 (UTC)[reply]
The main problem with wikis is verifying independence: it's not uncommon for one person to go all over the internet to promote their pet protologism. I regularly see the same IP adding wording to Wikipedia so they can refer to it on Wiktionary, and vice versa. I can only guess about logged-in edits, because we have no way to know if the same person is using two different accounts. Chuck Entz (talk) 14:43, 22 November 2016 (UTC)[reply]
  • We obviously need special templates with date stamps that flag possibly short-lived internet-only ("IO") citations. We would also benefit from software that periodically confirmed that the link was live and was to a page that still contained the citation in its original context. In the new day that is dawning this should be easily constructed from readily available software tools. We should proceed in baby steps, first developing the template, then using it for any IO citations we already have and for new RfVs. We should initially only test for validity of older IO citations, but we could aspire to first complete annual validity checks, proceeding to greater frequency, and eventually real-time change monitoring of all web pages linked to by the IO citation template. DCDuring TALK 15:03, 22 November 2016 (UTC)[reply]
We should not wait too long for this, but rather be proactive. The US NSA probably already has this capability. It is just a matter of time before Wikileaks makes the source code available. DCDuring TALK 15:06, 22 November 2016 (UTC)[reply]
We do have {{quote-web}}. It already has fields for date, author, site name, page name, etc. so it looks complete enough to me; but if we need more information or anything else, I'd suggest just implementing the new stuff on the existing template. A crawler seems OK to look for the presence of text in the websites, but if we want to use quotes from images, comics and videos available on the Internet, then presumably the crawler would have to check for the presence of the original file, as opposed to checking for text. Incidentally, I'd probably oppose using random memes with anonymous authors as sources of quotes, because of the problem of verifying independence that Chuck Entz described above. Maybe we should implement this rule: all quotes from the internet must have the "author" parameter filled in, otherwise they are void. --Daniel Carrero (talk) 15:21, 22 November 2016 (UTC)[reply]
Given that we're so far along on this, the remaining issues may already be solved too. Are file names guaranteed to have the same content? Could the crawler run on MW servers? DCDuring TALK 15:49, 22 November 2016 (UTC)[reply]
I really prefer citations to Wikimedia sites, since with fairly rare exceptions, a revision link will be as permanent as we are, whereas a web page is likely to disappear at some point.--Prosfilaes (talk) 06:15, 24 November 2016 (UTC)[reply]
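The periodic validity check discussed above could be sketched roughly as follows. This is only a minimal illustration in Python, not an existing Wiktionary tool: the names `citation_still_present` and `normalize` are hypothetical, fetching the page is left to the caller, and the check only looks for the quoted text in the visible text of an HTML page (it says nothing about images, comics or videos).

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def normalize(text):
    """Collapse runs of whitespace and ignore case differences."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def citation_still_present(page_html, quotation):
    """Return True if the quotation still occurs in the page's visible text."""
    wanted = normalize(quotation)
    if not wanted:
        return False
    parser = _TextExtractor()
    parser.feed(page_html)
    parser.close()
    return wanted in normalize("".join(parser.chunks))
```

A crawler along these lines would fetch each cited URL on a schedule and flag the citation when the function returns False. Text split across block-level elements, or text rendered by scripts, would need more careful handling than this sketch provides.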

IPA superscript parentheses?[edit]

Currently Chichewa Mmalawi is transcribed as /m.maˈɽá.w⁽ᵝ⁾i/ (superscript parentheses, not recognized as valid characters) or /m.maˈɽá.w(ᵝ)i/. Should Module:ny-IPA generate the second transcription (Wyang stated that superscripts without superscript parentheses look odd), or is there a third option that would be better? DTLHS (talk) 15:01, 20 November 2016 (UTC)[reply]

Neither one. Even without parentheses, ᵝ is not a valid IPA character. According to Wikipedia, the sound formerly spelled ŵ is a "closely lip-rounded [w] with the tongue in the close-i position" that has completely merged with /w/ in most dialects. It seems to be commonly transcribed with /β/ in the literature, so I'd prefer the module to generate "/m.maˈɽá.wi/, /m.maˈɽá.βi/" as two separate forms, rather than trying to merge them. Alternatively, if people feel that /β/ doesn't adequately represent the sound in question, a "closely lip-rounded [w] with the tongue in the close-i position" could probably be transcribed /w̹ʲ/. —Aɴɢʀ (talk) 15:37, 20 November 2016 (UTC)[reply]
Thanks. Metaknowledge came up with the transcription scheme, let's see what they have to say. DTLHS (talk) 18:55, 20 November 2016 (UTC)[reply]
The use of the ŵ sound is very rare, and none of my Chichewa-speaking informants have it. It's simply that for those few that do have it, it is phonemic (it has minimal pairs), and some people still reflect it in their orthography even if they don't say it. The literature on Chichewa tends to be lazy and use simpler IPA characters, by the way, like /l/ instead of /ɽ/; I figure we might as well be closer to reality on that one. But using a new character will be confusing... In the end, I guess that the best solution is to generate two prons as Angr suggested, with the second labelled (now rare). —Μετάknowledgediscuss/deeds 19:19, 20 November 2016 (UTC)[reply]
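The "two pronunciations" outcome could be sketched like this. It is an illustration in Python rather than the actual code of Module:ny-IPA (which is a Lua module); the function name `pronunciations` and the convention of marking the rare sound with ŵ in the input are assumptions made for this example only.

```python
import unicodedata

RARE_PHONEME = "ŵ"  # placeholder symbol assumed for this sketch only


def pronunciations(transcription):
    """Return (IPA, label) pairs for a transcription that may contain ŵ."""
    base = unicodedata.normalize("NFC", transcription)
    marker = unicodedata.normalize("NFC", RARE_PHONEME)
    if marker not in base:
        # No rare phoneme: a single, unlabelled pronunciation.
        return [(f"/{base}/", None)]
    return [
        # Merged form with plain /w/, used by most speakers.
        (f"/{base.replace(marker, 'w')}/", None),
        # Separate form with /β/, labelled as suggested in the discussion.
        (f"/{base.replace(marker, 'β')}/", "now rare"),
    ]
```

For an input such as `m.maˈɽá.ŵi` this yields the two forms proposed above, the second carrying the label "now rare".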

Recent academic paper about Wiktionary[edit]

I just found this: How Many People Constitute a Crowd and What Do They Do? Quantitative Analyses of Revisions in the English and German Wiktionary Editions: [8]. Equinox 05:59, 23 November 2016 (UTC)[reply]

Here is a particularly charming passage from the conclusion: "We cannot expect Wiktionary to become a better dictionary on a wide basis than established dictionaries. In consequence, if no professional work is conducted on dictionaries, Wiktionary will be no long-term compensation. On the other hand, it is a pleasure to see that there is a language-interested community that works on dictionaries voluntarily. Is this not also a sign for the relevance of dictionaries?" Μετάknowledgediscuss/deeds 07:23, 23 November 2016 (UTC)[reply]
For the German wiktionary that's somewhat correct. For example, the German wiktionary requires references and sources or several examples for and in every entry. Without established dictionaries like Adelung, Grimm's DWB, Duden, DWDS, Canoo, wissen.de or for English dictionary.com, oxforddictionaries.com, merriam-webster.com, Pons etc., the German wiktionary would have much less content. But on the other hand, information could then be more correct. Duden and canoo contain auto-generated forms which sometimes are incorrect - for the German wiktionaries that's acceptable as it's sourced. -84.161.52.152 08:12, 23 November 2016 (UTC)[reply]
@Metaknowledge: I don't really have the time to read the article--can you summarize why the authors think that Wiktionary "cannot be expected to become a better dictionary on a wide basis than established dictionaries"? @84.161.52.152: The German speakers' higher standards are also true on de.wp and de.wq. On the latter project, they've adopted a policy that a quotation can only be included if it is also quoted by another source. I think that's actually a good policy in many respects. Generally, their tack seems to be really emphasizing quality over quantity and they are aggressive about deletionism. —Justin (koavf)TCM 08:14, 23 November 2016 (UTC)[reply]
It's not about having higher standards. It's just that we attempt to be a primary source while they apparently don't. DTLHS (talk) 08:18, 23 November 2016 (UTC)[reply]
@DTLHS: Can you elaborate on that? I'm actually really interested in the discrepancies between language communities in WMF projects but it's very difficult to get a grasp on it without knowing 276 languages. —Justin (koavf)TCM 08:31, 23 November 2016 (UTC)[reply]
@Koavf: As for Wiktionary: The English wiktionary is usually based on real usages as found in texts, not on dictionaries and mentionings. The German wiktionary instead also uses dictionary entries as sources. So instead of finding usages in books and finding out the meaning, one could simply go to en.oxforddictionaries.com and rephrase their definitions. In de:back for example the definition could even be copied from reference duden.de. Furthermore, the German wiktionary even uses wikipedia articles as sources. There might be a rule like the wikipedia article has to have sources, but some wiktionary entries are based on unsourced wikipedia articles, e.g. de:Jarimani. | For wikiquote that sounds like a good idea, but for wiktionary it's not. And considering news and current events, it wouldn't be a good idea for wikipedia too. -84.161.52.152 10:33, 23 November 2016 (UTC)[reply]
English Wiktionary's policy allows it to beat other dictionaries to the punch for new words, whereas de.wikt's does not. Whether we actually do depends on the breadth of regular contributors' connections to word-producing communities and our openness to attempts by others to add terms to Wiktionary. The "hotwords" idea seems really good, as it allows us to fulfill this potential without violating good attestation rules if we follow up to add cites that span a sufficient period. The hotwords exception to our attestation rules may not go far enough in allowing new terms. DCDuring TALK 12:25, 23 November 2016 (UTC)[reply]
Another difference in methodology between the German and English Wt and Wp communities is that when the English find dubious information, they discuss it or try to improve it and leave it be until a verdict is found. The Germans are very strict and prone to delete at once without attempts to improve. Korn [kʰũːɘ̃n] (talk) 13:14, 23 November 2016 (UTC)[reply]

Great! Publications on Wiktionary are increasing. I try to keep track of them in Meta. I invite you to update this page if you find more publications. Noé (talk) 12:47, 23 November 2016 (UTC)[reply]

"character info/new" in all single-character entries[edit]

Greetings! FYI, I believe I added {{character info/new}} in all single-character entries that are not redirects. (unless there are any entries for private use or otherwise undefined codepoints, I didn't check for those)

See the subcategories of Category:Unicode blocks.

I'm pretty sure I added that template to at least 28,000 entries for Han script characters that did not have any charbox at all.

I deleted all the older charbox templates, which required manually filling in important fields like "name", "codepoint", "previous" and "next". I did not count, but I think over 90% of the pages that had an older charbox template did not have the "previous" and "next" filled at all. A few entries like diff, diff, and diff had incorrect block names. All the character information should be properly filled now (as long as the modules are correct).

P.S.: Actually, is an unassigned codepoint, but it exists and has a charbox because it's confirmed as a future Unicode character; 𪜁 does not have a box because it's a "no entry". --Daniel Carrero (talk) 14:44, 23 November 2016 (UTC)[reply]
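To illustrate why the manually filled fields are no longer needed: "codepoint", "name", "previous" and "next" can all be derived from the character itself. The sketch below uses Python's standard `unicodedata` module and is not the code behind {{character info/new}}; the function name `character_info` is hypothetical, and the block name is left out because `unicodedata` does not expose Unicode blocks.

```python
import unicodedata


def character_info(char):
    """Derive charbox-style fields from a single character."""
    if len(char) != 1:
        raise ValueError("expected exactly one character")
    codepoint = ord(char)
    return {
        "codepoint": f"U+{codepoint:04X}",
        # None when the character has no name in Python's Unicode database.
        "name": unicodedata.name(char, None),
        "previous": chr(codepoint - 1) if codepoint > 0 else None,
        "next": chr(codepoint + 1) if codepoint < 0x10FFFF else None,
    }
```

The neighbours here are simply the adjacent codepoints, which may be unassigned, so a real charbox would probably want to skip over those.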

Ideophones[edit]

Grammars of many Bantu languages have a separate part of speech for particles called ideophones (w:Ideophone), which can often behave as adverbs but carry the semantic load of adjectives and do not inflect. I propose that we add this as an allowed PoS to WT:EL so that those languages which have editor communities that determine that it should be used can use it. —Μετάknowledgediscuss/deeds 23:23, 23 November 2016 (UTC)[reply]

I can't say I've run into these in Zulu, but there is a part of speech called "Relative". Relatives act like adjectives but inflect somewhat more like a verb, though really it may be better to just treat them as adjectives with a different kind of inflection. —CodeCat 23:44, 23 November 2016 (UTC)[reply]
There are lots of ideophones in Zulu; IIRC, Doke derives many verbs from ideophones (sometimes inaccurately, as the ideophone may actually derive from the verb). There is an argument to add a header for relatives, but I think that they can be handled as adjectives. —Μετάknowledgediscuss/deeds 23:48, 23 November 2016 (UTC)[reply]
I'm not aware of ideophones as a separate part of speech, though. The term itself is new to me. —CodeCat 00:29, 24 November 2016 (UTC)[reply]
Yep, ideophones may be of onomatopoeic origin and can be frozen (grammaticalized) in a specific way in some languages, and are described as such. This is known for Bantu, and similarly for Tupi-Guarani languages. I can provide you with this quotation by a typologist on the matter. Noé (talk) 15:02, 25 November 2016 (UTC)[reply]

"In So. Bantu languages, hundreds of complex predicates may be built by combining a single light verb - say/do - with ideophones that carry a large variety of meaning, quite a few of them manner adverbial. Many ideophones are derived by suffixation from known verb stems. Others are clearly onomatopoeic sounds, and many are of undetermined origin. The light verb say/do is the only finite verb in the ideophonic clause. The ideophones themselves carry no verbal morphology." - from Talmy Givón, 2001, Syntax. An Introduction, Volume 1, John Benjamins Publishing Company, page 167.

Requesting indef block[edit]

Could you block my account, please? Thank you. --Fsojic (talk) 10:24, 24 November 2016 (UTC)[reply]

Wiktionary:Blocking policy makes no provision for blocking someone just because they ask us to. If you want to stop editing, just stop editing. —Aɴɢʀ (talk) 13:06, 24 November 2016 (UTC)[reply]
I don't want to stop editing, that's the problem. --Fsojic (talk) 15:19, 24 November 2016 (UTC)[reply]
I could insult someone (I have a person in mind). Could that work? --Fsojic (talk) 16:02, 24 November 2016 (UTC)[reply]
@Fsojic: What is the problem here? Do you have some compulsive disorder? I'm trying to figure out why you are having a problem just not editing. —Justin (koavf)TCM 16:41, 24 November 2016 (UTC)[reply]
I might have a compulsive disorder, I don't know. I'm depressive, and have a propensity to develop various addictions (particularly Internet-related, not the least of which is my addiction to Wiktionary). --Fsojic (talk) 20:01, 24 November 2016 (UTC)[reply]
Change your password to a very long random sequence of characters and log out. That's the working solution right there.--Dixtosa (talk) 20:56, 24 November 2016 (UTC)[reply]
And don't forget to change your email to something that you can't access (or you'll be able to reset your password). DTLHS (talk) 20:59, 24 November 2016 (UTC)[reply]
Fsojic was blocked as requested on French Wiktionary, in accordance with the Right to vanish policy. Noé (talk) 15:05, 25 November 2016 (UTC)[reply]

@Fsojic, get some help. Restricting yourself by force will only work in the short term. If this is as serious as you indicate, please avail yourself to finding a more long-lasting solution. Leasnam (talk) 17:01, 25 November 2016 (UTC)[reply]

Another suggestion is to go on a vandalism spree with that account. That way you're bound to get blocked. I've tried it before, it's very effective. --Derrib9 (talk) 11:09, 26 November 2016 (UTC)[reply]
That's exactly why we should be blocking people per their request.--Dixtosa (talk) 11:41, 26 November 2016 (UTC)[reply]
Yes, and I thought we had blocked some people by request in the past. I see no problem with it; as long as the user is not blocked from sending e-mail, they can e-mail an admin if they want to be unblocked for any reason, including e.g. that it's their little sibling who made the block request after running off with their computer. - -sche (discuss) 23:52, 26 November 2016 (UTC)[reply]
I agree that blocking people by request seems fine. —Granger (talk · contribs) 00:21, 27 November 2016 (UTC)[reply]
Yes, we have done so, e.g. User:Logomaniac. Equinox 03:23, 27 November 2016 (UTC)[reply]
I have blocked the user per their request and the above discussion. They can still use the "e-mail user" feature to e-mail an admin if they wish to be unblocked. - -sche (discuss) 07:05, 27 November 2016 (UTC)[reply]

Rethinking the approach to the presentation of senses[edit]

Currently, the practice on Wiktionary regarding the presentation of word senses is that the senses are split and grouped by parts of speech, and all other information relating to the word senses (e.g. synonyms, antonyms, translations) is presented outside the blocks of senses. The entire senses-related content is generated mostly by plain Wiki code, with some use of templates. This approach to the presentation of word senses (IMHO) has several characteristics that may hinder either efficient dictionary-building by us or clear presentation to readers. They include:

  • Separation of the senses from sense-specific see-also terms (synonyms, antonyms, etc.), and consequently required repetition of the senses outside of the senses block (e.g. machen).
  • Cluttered, unclear and fragmented presentation of sense information to readers, as a result of multiple headings (e.g. post) and duplication of sense-specific information in multiple places.
  • Using parts of speech as headings is less intuitive than a Definitions heading and may make definitions hard to find, especially if they are swamped by other text. A recent feedback was at Talk:bog-standard.
  • For less inflecting languages, arbitrariness in the choice of parts of speech, for a sense which can often be used in multiple speech environments (e.g. link). This results in disjointed and disorganised presentation of the word senses. An example is this – four headings, but essentially one meaning of “working hard”.

This may be the cause of the presence of a much lower proportion of synonyms and antonyms information on Wiktionary, compared to other dictionaries; only 49,890 pages (< 1%) contain “Synonyms”, and only 725/38,947 Russian lemma pages (1.9%) contain “Antonyms”. The repetition in layout discourages editors from adding in such information. I think an integrated approach to senses, which regards synonym, antonym and part of speech information as ancillary to the individual senses, will be more conducive to an efficient and structured dictionary layout. It also allows the easy generation of sense-specific ids ({{senseid}}), customised linking (e.g. to English sections of links) and unchallenging response to any requirement of sense organisation, presentation and formatting that may arise in the future (e.g. first letter capitalisation, final periods, expand all sense-related information, etc.).

Below is something I created as a test for the Chinese word 努力 (“to strive; hard-working; diligently; endeavour”). The code for this and more examples can be found at User:Wyang/zh-def.

{{User:Wyang/zh-def/努力}}

I'm interested to hear people's thoughts on this. Cheers! Wyang (talk) 14:52, 24 November 2016 (UTC)[reply]

I support listing synonyms and antonyms with the senses, but I'm not in favour of lumping all the PoS together. The "Definitions" proposal failed before for a reason. —CodeCat 15:01, 24 November 2016 (UTC)[reply]
  • @Wyang: The presentation you suggest would be a big surprise to anyone accustomed to English print dictionaries. Would their surprise be delight or horror? What classes of users (age, first language, level of education, contributor/noncontributor) would prefer each of the two main elements of the proposal:
  1. merging definitions across PoSes and
  2. placing synonyms and antonyms (but not hypernyms, hyponyms, coordinate terms, meronyms, etc) immediately beneath(?) the definitions?
What would this look like for an English entry like [[head]]?
How does this address the situation of the users who are just looking for English definitions?
How do Chinese monolingual dictionaries present lexical information? Chinese translating dictionaries?
Assuming that this would be superior for Chinese, for what other languages would this be a superior way to present lexical information? DCDuring TALK 15:33, 24 November 2016 (UTC)[reply]
Maybe I could still change my mind based on this discussion, but for now unfortunately I didn't like the new layout very much. IMO, each POS is important enough to merit its own header. Also, maybe it would look bad in entries with many definitions in the same POS. I'm wondering if we would have 30+ definitions of line starting with "noun". On the flip side, I like synonyms nested in each sense. --Daniel Carrero (talk) 15:58, 24 November 2016 (UTC)[reply]

Accessibility I think this proposal is generally fine--aesthetically and logically--but we really need to stop having content in collapsible lists, as it's not accessible to users who have scripts turned off or who have certain neuro-motor issues and have trouble with fine control of a mouse to click "expand". —Justin (koavf)TCM 16:44, 24 November 2016 (UTC)[reply]

I believe the script is what adds the box in the first place, so someone without scripts will see the content just fine. It's only when the script is interrupted after the box is added that content is unviewable. Also, there's an option to always show boxes as expanded. I think the damage to usability from having to page through some of the enormous tables for highly-inflected and agglutinative languages is pretty serious in itself. Chuck Entz (talk) 17:00, 24 November 2016 (UTC)[reply]
I like it, but I would want the expansion links next to the definition and not floating to the right (also the color could be improved). I also think you should make an example of what a complex English page like set would look like. DTLHS (talk) 17:15, 24 November 2016 (UTC)[reply]
Also I think it would be easier for humans to edit if each definition was in its own template, rather than lumping them all together. DTLHS (talk) 17:22, 24 November 2016 (UTC)[reply]
  • We have little factual information about users other than ourselves. We are hopelessly unrepresentative of most users, even of most contributors. But I think that we can safely assume a few things:
  1. The main things that normal users seek are definitions (monolingual) and translations (bilingual)
  2. Normal dictionary users are accustomed to the presentation of dictionary information in formats not entirely dissimilar from our current one. See any word in OneLook Dictionary Search for numerous examples of these formats.
I think a great deal follows from these simple premises, mostly contrary to the inclinations of some contributors here. They are apparently motivated by the occasional (or chronic) boredom that they suffer with the work that needs to be done to improve the basic content of Wiktionary (ie, definitions and translations). I really doubt that reformatting and reorganizing our mostly mediocre to poor content will do much to positively change our competitive position relative to other on-line dictionaries.
To me identifying and correcting specific types of content gaps and defects in existing entries would do more. What would be even better would be to understand the behavior of current users, and the reasons non-repeating users don't come back. DCDuring TALK 18:13, 24 November 2016 (UTC)[reply]
I like how it looks, but editing-wise, I'd hate to sift through all that data to get to the definition line I'm looking for (I already hate going through a jungle of {{ux}} templates whenever I want to find a definition), but if each sense gets an edit button that wouldn't be a problem.
Also this would be useful for Mongolian, where adjectives can be translated as nouns or adverbs in many cases, and verbs (their lemma forms) as abstract nouns.
But unfortunately I can't imagine how this could be done painlessly. Crom daba (talk) 19:08, 24 November 2016 (UTC)[reply]

Thanks everyone for the comments. I think I mainly looked at the issues from the angle of the issues I see in Chinese entries on Wiktionary. There are three issues that this can be decomposed into:

  1. Ectopic placement of see-also-type terms (such as synonyms, antonyms, hypernyms, coordinate terms, etc.) outside of the senses block, and compulsory back-reference to the senses due to this separation of sense-specific information.
  2. Lack of adequate aesthetic formatting for the presentation of senses.
    Compared to other online dictionaries, such as Oxford Dictionary (English) and MoeDict (Chinese), I would argue that the formatting layout for definitions on Wiktionary is outshone and quite unsatisfactory.
  3. The assumption that using parts of speech to group senses of a term is necessarily the most appropriate practice for every language. For less inflecting languages, this translates to the neglect of the fluidity of the concept of part of speech, and creating a much-greater-than-warranted gap between senses in the definition display (that is, using parts-of-speech headers to dissociate a group of closely-related meanings).
    For example, in Chinese, the part-of-speech information is important, but is truly not sufficiently important and suitable to warrant its role as a sense-divider. There are some pictures here demonstrating how various monolingual Chinese dictionaries present lexical information. None of the monolingual Chinese dictionaries, such as Hanyu Da Cidian (most comprehensive Chinese dictionary with some 370,000 words) and Taiwan's official Ministry of Education Dictionary, contained any part of speech information.

Each of these issues should be discussed and targeted separately. Feel free to change and improve on any technical details - the above snippet of User:Wyang/zh-def is only a crude demonstration of how the issues may be targeted, and there are many people here who are informatically more capable. We would all like to see a format that is more intuitive, aesthetic and more efficient, not one that will display 41 senses of set (Etymology 1, verb), then 41 senses of synonyms, then 41 senses of antonyms, etc. Wyang (talk) 03:38, 25 November 2016 (UTC)[reply]

I've always thought that contributors in a given language needed some autonomy, possibly more than they have now. This is more of a change than I ever contemplated. To me the issues become those relating to mapping between English and Chinese and other languages. DCDuring TALK 14:44, 25 November 2016 (UTC)[reply]
I think we should at least give the changes to -nyms a try. —CodeCat 14:52, 25 November 2016 (UTC)[reply]
I believe so too. Wyang (talk) 05:48, 26 November 2016 (UTC)[reply]
  • I definitely think we should have a way of collapsing citations, synonyms, antonyms and translations within each individual definition line. (This has actually been proposed before, with code, by…I forget who, a Dutch editor who has since stopped contributing.) The current system, where we essentially write every definition out multiple times, is silly. However I do not support lumping all parts of speech in together – that doesn't seem workable to me in anything other than the most simplistic entry. Ƿidsiþ 15:22, 25 November 2016 (UTC)[reply]

The idea to lump all parts of speech is obviously separate from the idea to group synonyms etc with senses. I oppose lumping parts of speech, at least in most languages.
I share the concern one user expressed above, that grouping synonyms with senses makes it harder for editors to find a particular definition; however, as that user mentions, "complete" entries with many citations, usexes, {{defdate}}s and maybe {{qualifier}}s already present this kind of "sea" of text to wade through to reach the definition, and we are always becoming more complete, so rejecting this proposal doesn't save us from that situation long-term. Whereas, grouping synonyms with senses has several attractive benefits, like preventing a translation table for a certain sense from being orphaned when the sense is deleted but the table is not, preventing things from going out-of-sync when senses are re-ordered but synonyms are not, etc. The suggestion of making each sense its own template, potentially on its own line, seems more appealing to me than putting all the definitions as parameters of one [invocation of a] template, from a standpoint of editing the wikitext (and more easily putting an "edit" button by each sense) and from a standpoint of not forcing a single template to accept ~84 parameters just for the definitions of take, plus 84 more parameters for each sense's synonyms, 84 more for each sense's antonyms, ... - -sche (discuss) 23:42, 26 November 2016 (UTC)[reply]
I note that the Spanish Wiktionary seems to have synonyms under senses.—suzukaze (tc) 19:20, 14 December 2016 (UTC)[reply]
Is someone willing to set up a vote? @Daniel Carrero? I'd recommend not putting too much detail into how -nyms would be presented in this proposal, because what we're voting on is the concept and not the exact realisation. The looks can always be changed with templates and CSS whenever we decide to. —CodeCat 20:26, 14 December 2016 (UTC)[reply]
I like the idea to have synonyms, antonyms, translations and all the stuff that is per sense info somehow grouped with the senses. This would improve machine-readability and editing. However I also see the danger of adding too much clutter to the senses block, which is what most readers might be looking for. Is there maybe any way to have everything grouped by senses in the wikitext but still show the additional info below the senses block, maybe via a template? Concerning the POS grouping I think we should stick to the layout found in other online and printed dictionaries, which is what readers expect and will understand most easily. Matthias Buchmeier (talk) 23:05, 14 December 2016 (UTC)[reply]
As mentioned by a couple people above, putting -nyms and translations under their corresponding senses would really help prevent them from falling out of sync with each other. This is something I run into a lot when working with translation tables. Another benefit is that it would be so much easier to find. With the way things are now, glosses are often inconsistent (some repeat the whole definition, sometimes they differ between the -nyms and the translation tables, sometimes they consist of the first part of the definition and sometimes the simplest part, sometimes they are a summary of the definition, etc.). That makes it far harder to find the right translations because the tables aren't numbered, and don't always correspond well with the definitions. Additionally, figuring out which translation one wants often requires looking back and forth between the definition and the translation table, especially if there are many definitions. All in all, I would welcome moving such information under the definitions, though I am very opposed to the other suggestion about parts of speech. Andrew Sheedy (talk) 23:58, 14 December 2016 (UTC)[reply]

I've now created {{syn}} and {{ant}}. The formatting is purposely left as plain as possible, but the entire thing is wrapped in class="nyms synonyms" and class="nyms antonyms" so that it's easy to change how it looks with CSS and JS if desired. —CodeCat 14:28, 4 January 2017 (UTC)[reply]

I love the idea of grouping semantically related terms together with the definitions. — Ungoliant (falai) 15:07, 4 January 2017 (UTC)[reply]
My main objection is to how it looks: currently it looks too much like the definition itself, and some design changes would probably be desirable. But the idea has merit; I think the way it looks in Wyang's original proposal above (with the use of colors and drop-down menus) is quite attractive. — Kleio (t · c) 22:09, 5 January 2017 (UTC)[reply]
@Wyang I support your idea for those minimalistic languages where meaning transcends the parts of speech. Have you any special thoughts on this concerning the average European language? Korn [kʰũːɘ̃n] (talk) 23:52, 5 January 2017 (UTC)[reply]
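The idea of keeping -nyms with their senses can be illustrated with a small data model. This is a hedged sketch in Python, not how {{syn}} and {{ant}} are implemented: the `Sense` class, the function names and the example words are invented for illustration, and only the class names "nyms synonyms" and "nyms antonyms" come from the discussion above.

```python
from dataclasses import dataclass, field
from html import escape

LABELS = {"synonyms": "Synonyms", "antonyms": "Antonyms"}


@dataclass
class Sense:
    """One definition together with its own sense-specific -nyms."""
    definition: str
    synonyms: list = field(default_factory=list)
    antonyms: list = field(default_factory=list)


def render_nyms(kind, terms):
    """Wrap the terms in a span carrying the 'nyms <kind>' classes."""
    if not terms:
        return ""
    listed = ", ".join(escape(term) for term in terms)
    return f'<span class="nyms {kind}">{LABELS[kind]}: {listed}</span>'


def render_senses(senses):
    """Render numbered definitions, each followed by its own -nyms lines."""
    lines = []
    for number, sense in enumerate(senses, start=1):
        lines.append(f"{number}. {escape(sense.definition)}")
        for kind in ("synonyms", "antonyms"):
            nyms = render_nyms(kind, getattr(sense, kind))
            if nyms:
                lines.append("   " + nyms)
    return lines
```

Because the -nyms live inside each `Sense`, deleting or reordering a sense moves its synonyms and antonyms along with it, which is the out-of-sync problem several commenters describe. The plain class-based wrapper leaves the appearance to CSS and JS.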

North American English disappeared[edit]

Category:North American English has been emptied of terms.

When I label something North American, it gets put into both categories Canadian English and American English. That’s wrong. Cat:Canadian English is for “terms and senses unique to Canada,” but the term in question is not.

If I wanted to label a term both US and Canadian, for some reason, then I would do so. Michael Z. 2016-11-24 18:24 z

It was converted to an alias for "US, Canada" quite a while ago because it was unhelpful that some words were categorized as "US"+"Canada" and some were categorized as "North America", so someone looking for words used in the US or Canada had to check both categories. (At the time of conversion, and still today, no e.g. Spanish words used the label with other countries in mind.) Nations are used as the largest unit of dialect specificity in almost all cases around here (e.g. one does not AFAICT say a term is used in "Australasia", one says "Australia, NZ"). The only big exception I can think of is the not-unproblematic use of "Commonwealth" for spellings that are used in a set of countries which does not exactly overlap with those that are in the Commonwealth. - -sche (discuss) 23:26, 26 November 2016 (UTC)[reply]

Abbreviations vote

FYI: I created Wiktionary:Votes/pl-2016-11/Abbreviations, as suggested in Wiktionary:Beer parlour/2016/October#Suggestion: Edit the abbreviation policy. --Daniel Carrero (talk) 19:51, 24 November 2016 (UTC)[reply]

Improvements needed on Category:en:Language families

I created this category and filled it up as far as I could, but there's probably still some missing. More pressing, though, is that many of the entries are missing a sense for the family as a (proper?) noun. So if someone needs something to do, here you go. —CodeCat 22:43, 24 November 2016 (UTC)[reply]

Five millionth entry

By my count it is Kurdish dibirim, although if we want a lemma we could say it's ანახშირებს (anaxširebs) or ათარიღებს (atariɣebs). DTLHS (talk) 15:39, 25 November 2016 (UTC)[reply]

Cuss! I only just got home. Good work, everyone. Equinox 16:41, 25 November 2016 (UTC)[reply]
Awesome ! Good work guys (and gals :) Leasnam (talk) 16:55, 25 November 2016 (UTC)[reply]
W00t! Could we get an example sentence there? --Derrib9 (talk) 17:03, 25 November 2016 (UTC)[reply]
Haha I also only just got here, oh well.. I'll have a shot at ten million in a decade or so :p Awesome work in any case, five million is an insane amount of entries! — Kleio (t · c) 17:14, 25 November 2016 (UTC)[reply]
Five millionth entry is Kurdish or Georgian? Fantastic, really does highlight the variety of languages represented here. <3Jberkel (talk) 17:22, 25 November 2016 (UTC)[reply]

Splitting pages by language (again)

(moved hence by request)

It is possible that, to the founders of Wiktionary, it made sense to let English come before other languages in entries; but I do not see the justification. The only reason given here is that "this is the English Wiktionary", but all that means is that English is a) the language of public discourse, b) the language on whose entries we centralize our translations and c) the defining language. As a language of study, it has no special status except being by far the best-documented one. I see that there was a vote, although that was apparently just meant to cement existing usage, so it's not clear to me whether there is any actual support for this convention.
If we want to signal that we are (or want to be) an omnilingual dictionary, I feel we should tone down our Anglocentrism.__Gamren (talk) 15:54, 24 November 2016 (UTC)[reply]

In my opinion, it makes sense placing English as the first language (not counting Translingual), because: 1) this is a dictionary for English speakers, so in an entry with multiple language sections such as sea, probably more people will be interested in English rather than the other languages, and 2) all languages point to English in their definitions, so even if you came looking for a foreign word in the first place, there's a significant chance you'll end up wanting to see an English word. Correct me if I'm wrong. --Daniel Carrero (talk) 16:02, 24 November 2016 (UTC)[reply]
It's been proposed numerous times before, but I still like the idea of having separate pages for each language's token of a word. Thus, instead of sea#English, sea#Irish, sea#Old Irish, sea#Old Swedish, and sea#Spanish, we'd have en/sea, ga/sea, sga/sea, gmq-osw/sea, and es/sea. That would totally eliminate the problem of how to sort languages on a page as well as making pages like a navigable again. —Aɴɢʀ (talk) 17:06, 24 November 2016 (UTC)[reply]
It's an interesting idea. Of course, it would make plain-linking ([[]]) in def-lines non-functional, but I've always found that to be sloppy anyway. The first drawback I can think of is that a user might not know from what language a word is; would it be conveniently possible to make e.g. sea a sort of disambig page that automatically knows what "instances" it has and lists those? If not, the scenario requires users to know the ISO codes, which seems a bit harsh.__Gamren (talk) 17:42, 24 November 2016 (UTC)[reply]
I also like this idea. Once I made a list of advantages; I think CodeCat has it saved somewhere. The problem is that the longer we keep using the current format, the harder it is to make the change. — Ungoliant (falai) 18:02, 24 November 2016 (UTC)[reply]
A bot could make the change pretty easily, actually. It's easy to split pages into sections and then save each section to a new page. The cons are mostly in other areas, such as what happens to sea when we've split it, and what happens when a user searches for "sea" in the box. —CodeCat 18:13, 24 November 2016 (UTC)[reply]
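A minimal sketch of the splitting step described above, assuming that each language section starts with a level-2 heading (==Language==) and that split pages would be titled langcode/term as in the en/sea example. The function name split_by_language and the small LANG_CODES table are hypothetical, made up for illustration; this is not an existing bot.

```python
import re

# Hypothetical name-to-code table; a real bot would need the full language data.
LANG_CODES = {"English": "en", "Irish": "ga", "Spanish": "es"}

# Matches a level-2 heading such as "==English==" on its own line,
# but not deeper headings such as "===Noun===".
L2_HEADING = re.compile(r"^==([^=\n].*?)==[ \t]*$", re.MULTILINE)


def split_by_language(title, wikitext):
    """Return {new_title: section_text} for each language section of a page."""
    pages = {}
    matches = list(L2_HEADING.finditer(wikitext))
    for i, match in enumerate(matches):
        language = match.group(1).strip()
        # A section runs until the next level-2 heading or the end of the page.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        body = wikitext[match.end():end].strip()
        # Fall back to the language name when no code is known (assumption).
        code = LANG_CODES.get(language, language)
        pages[f"{code}/{title}"] = body
    return pages
```

Saving each resulting page, and deciding what remains at the original title, is the harder part and is left out here.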
I like the idea of sea becoming a disambig page; in fact, it could be a disambig page for all the diacriticky variations too, thus taking over the function of the "Variations of..." appendices and obviating the need for {{also}}. See User:Angr/disambig for one possible scenario. —Aɴɢʀ (talk) 19:34, 24 November 2016 (UTC)[reply]
I like that idea, but for such a core function we'd need someone to run a bot daily or more, to make sure any new entries are included. Who can we trust to do that? —CodeCat 20:02, 24 November 2016 (UTC)[reply]
If it's that critical it should run on WMF servers (the toolserver?) DTLHS (talk) 20:07, 24 November 2016 (UTC)[reply]
Some of the content could be auto generated without a bot, under these conditions:
  • We limit it to only pages of the same title
  • We enable subpages on mainspace
  • The individual pages are named as (term)/(langcode) instead of the other way around.
Then we can use the Special:PrefixIndex/ magic word to get a list of all the subpages, and optionally postprocess this list with Lua to make it nicer and more suitable. The disambiguation page would still need to be created, but once created, the module would do the work. —CodeCat 20:13, 24 November 2016 (UTC)[reply]
Some terms have slashes in them, so for example { Special:PrefixIndex/and/ } would give not only and/en, and/da etc. but also and/or/en. Can this be fixed? Can it be made such that each link in our imaginary disambiguation page is followed (or preceded, really) by the canonical name of the language, and such that the links are alphabetized by that rather than by the ISO code?__Gamren (talk) 12:06, 25 November 2016 (UTC)[reply]
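One possible way to post-process a prefix listing so that it handles the slash problem and sorts by language name, sketched in Python rather than Lua. It assumes the term/langcode naming proposed above; the function disambiguation_entries and the LANG_NAMES table are hypothetical names introduced only for this example.

```python
# Hypothetical code-to-name table; the real one would cover every language.
LANG_NAMES = {"en": "English", "da": "Danish", "sv": "Swedish"}


def disambiguation_entries(term, titles):
    """Keep only direct term/langcode pages and sort them by language name."""
    prefix = term + "/"
    entries = []
    for title in titles:
        if not title.startswith(prefix):
            continue
        rest = title[len(prefix):]
        # Skip deeper titles such as "and/or/en" (they belong to the term
        # "and/or") and anything that is not a known language code.
        if "/" in rest or rest not in LANG_NAMES:
            continue
        entries.append((LANG_NAMES[rest], title))
    # Tuples sort by their first element, i.e. the canonical language name.
    return sorted(entries)
```

For the titles and/en, and/da, and/or/en and and/sv this keeps the three direct subpages and orders them Danish, English, Swedish. It does not remove every ambiguity: a term whose last component is itself a valid language code (and/or, if "or" is a code) could still be misread as a language subpage of and.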
I prefer using this format English/sea, Spanish/sea, because it is consistent with reconstructions such as Reconstruction:Gaulish/frognā. If we choose a complete system like "en/sea", I'd like if the reconstructions were renamed to follow suit. --Daniel Carrero (talk) 12:10, 25 November 2016 (UTC)[reply]
Horrible idea. We should do it only if all other language Wiktionaries do the same. SemperBlotto (talk) 07:28, 26 November 2016 (UTC)[reply]
As long as this is en.wikt, separate from other Wiktionaries, I think we should stick with an English focus. Equinox 07:33, 26 November 2016 (UTC)[reply]
I still don't like this idea. It solves a bunch of problems and creates a bunch more, but I'm not convinced that it would come out as a positive in the end. As someone who edits multiple language sections at once, it would make my editing work harder as well. —Μετάknowledgediscuss/deeds 07:42, 26 November 2016 (UTC)[reply]
It seems to be a good idea; however, we should keep in mind that we cannot afford to make the life of new and occasional contributors more complicated. Will finding the proper edit button be as easy as now if we switch to the new layout? Matthias Buchmeier (talk) 09:15, 26 November 2016 (UTC)[reply]
I oppose "Splitting pages by language (again)" (e.g. en/sea), as before. Furthermore, the rationale provided by OP does not support this proposal: it would only lead to change in the order of sections, placing English alphabetically. --Dan Polansky (talk) 09:32, 26 November 2016 (UTC)[reply]
No, because OP never intended this to be a discussion of splitting pages; however, that's what it turned into, so it's what I chose as header. I'll personally refrain from deciding until I've seen how this might actually be implemented, specifically whether it will, in fact, be harder to use for casual editors and readers. If we don't do this, I still don't think English being more massively researched is a good enough reason to warrant an exception from the rule; note, for example, that English has 701277 entries, Dutch 77899 and Anal 6, so the case for elevating Dutch above Anal is significantly stronger than elevating English above Dutch; except level of documentation has no relevance on alphabetical order, just as libraries don't sort their books based on the number of books by the author, and certainly not using a hybrid of productivity-sort and alphabetization.__Gamren (talk) 12:13, 26 November 2016 (UTC)[reply]
Oops, I am sorry; I did not check the revision history to see whether the discussion title changed. --Dan Polansky (talk) 12:16, 26 November 2016 (UTC)[reply]
Grouping languages in the same page is a huge benefit when you want to compare words between languages: it often happens that words in the same page are related (cf. Nepal, interjection, etc.), and you might want to compare pronunciations, precise meanings, etymologies, etc. Lmaltier (talk) 21:29, 1 December 2016 (UTC)[reply]
In my experience, libraries frequently sort their books by language, sometimes in complex mixed ways to make them more useful. If you use the English Wiktionary, you speak English, and thus it's quite likely English first will be useful. I don't see the advantage for simplifying the language order; alphabetical order for languages is effectively a random order that everyone is familiar with. I could make a case that Dutch should be a second group of languages, and Anal be at the tail in a group of minor languages, but (a) it might make it actually harder to find stuff, (b) there's lots of arguments about order (maybe the second group should be Chinese, Arabic, Russian, Spanish and French, and Dutch should be in a third group of national but subinternational languages), and (c) Anal is only on six pages, so it really doesn't matter.
The number of pages with multiple entries isn't huge, and putting English first makes finding the English definition easy; instead of everything jumbled together in quasi-random order, at least the most common language is pulled out top.--Prosfilaes (talk) 08:12, 28 November 2016 (UTC)[reply]
Splitting sounds like an abysmal idea. It hampers overview, hampers navigational speed and nobody has brought forth any benefits of it either. Further, its effect is effectively simulated by the tabbed languages option for registered users. This is an English language dictionary. I expect the average user to want to know what word X means in language Y and hence would go to the English entry and move on from there if need be. English is our hub. As such, I would even support moving it above Translingual. Korn [kʰũːɘ̃n] (talk) 15:03, 26 November 2016 (UTC)[reply]
Yes, arguments stated by Daniel Carrero are common sense. Lmaltier (talk) 21:29, 1 December 2016 (UTC)[reply]

Help test offline Wikipedia

Hello! The Reading team at the Foundation is looking to support readers who want to take articles offline to read and share later on their phones - a use case we learned about from deep research earlier this year. We’ve built a few prototypes and are looking for people who would be interested in testing them. If you’d like to learn more and give us feedback, check out the page on Meta! Joe Sutherland (WMF) (talk) 20:08, 29 November 2016 (UTC)[reply]