Wiktionary:Beer parlour/2020/January

Inconsistent place categories for first-level country subdivisions

Categories for first-level political subdivisions of countries are inconsistent:


I would like to set things up for the remaining major countries. I've already compiled the appropriate lists for several countries, and making the appropriate module changes isn't hard, but we need a consistent naming scheme. In practice, we have three competing schemes:

  1. Append the country name in some form (US, Brazil)
  2. Append the subdivision type (Japan, Russia)
  3. Don't append anything (Canada, China, Australia, UK, India)

Arguing in favor of "don't append anything" are

  1. the fact that this is the most common scheme,
  2. some subdivisions (e.g. "South Australia", "Central Finland") already contain the country name,
  3. it's the shortest form and the easiest to type,
  4. this is the convention followed by Wikipedia.

Arguing against it is that

  1. it can lead to ambiguities, which will have to be resolved by special-casing one of the categories ("Georgia" the US state vs. "Georgia" the country, "São Paulo" the Brazilian state vs. "São Paulo" the Brazilian city, "Victoria" the Australian state vs. "Victoria" the Canadian city, "Macedonia" with several meanings),
  2. it may be confusing (e.g. "New South Wales" is not in Wales, "Inner Mongolia" is not in Mongolia, and someone who has never heard e.g. of Henan may have no idea what Category:Henan refers to, and even what type of entity it is).

Arguing in favor of appending the country name is that

  1. it's unambiguous; this avoids problems with e.g. "Georgia", "Macedonia", "Victoria"
  2. it's clarifying, e.g. it would make it clear that e.g. "Henan" is a place in China.

Arguing against it is that

  1. it looks silly when the country name is already in the subdivision (e.g. "South Australia, Australia", "Central Finland, Finland")
  2. its disambiguating effect doesn't actually help in cases like "São Paulo" the Brazilian state vs. "São Paulo" the Brazilian city

Arguing in favor of appending the subdivision type is that

  1. it helps disambiguate in cases where appending the country name doesn't help ("São Paulo" the Brazilian state vs. "São Paulo" the Brazilian city); you'd only have problems when the same name is used for the same type of subdivision in different countries, which AFAIK is rare
  2. it's somewhat clarifying, e.g. it would at least make it clear that e.g. "Henan" is a province
  3. some subdivisions do traditionally contain the subdivision type in their most commonly used form (this is the case e.g. for "County Cork", "County Donegal", etc. in Ireland, and may be the case for Japanese prefectures)

Arguing against it is that

  1. it looks either strange or unnecessary in cases where this naming scheme isn't conventional (e.g. "Alabama State" or "State of Alabama"?)
  2. it adds unnecessary verbiage

On the balance, I think what we should probably do is follow Wikipedia conventions, like this:

  1. if including the subdivision in the name is conventional, do it ("County Cork", maybe "Fukushima Prefecture", maybe also the Russian federal subjects)
  2. if including the subdivision or country is necessary to disambiguate, do it as a special case; Wikipedia does this, but isn't very consistent in their naming conventions, e.g. Washington (state) vs. Washington, D.C.; Georgia (U.S. state) vs. Georgia (country); São Paulo (state) vs. just São Paulo for the city; Victoria (Australia) vs. Victoria, British Columbia; etc.
  3. otherwise, don't include the subdivision or country.

I'm open to suggestions, though, particularly on how to handle ambiguous cases. Benwing2 (talk) 17:33, 1 January 2020 (UTC)
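The three-step order of preference proposed above can be sketched in code. This is only an illustration in Python, not the actual place-category module: the `Subdivision` record, its field names and the `category_name` function are hypothetical, and the per-name data (which names are conventional, which need a special case) is assumed to be maintained by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subdivision:
    """Hypothetical record for one first-level subdivision."""
    name: str                            # bare name, e.g. "Henan"
    conventional: Optional[str] = None   # conventional form including the type, if any
    disambiguated: Optional[str] = None  # hand-assigned special case, if any


def category_name(sub: Subdivision) -> str:
    """Pick a category name using the three-step order of preference."""
    # 1. A conventional form that includes the subdivision type wins.
    if sub.conventional:
        return sub.conventional
    # 2. Otherwise use a hand-assigned disambiguated form, if one exists.
    if sub.disambiguated:
        return sub.disambiguated
    # 3. Otherwise use the bare name.
    return sub.name


print(category_name(Subdivision("Cork", conventional="County Cork")))
print(category_name(Subdivision("Georgia", disambiguated="Georgia (U.S. state)")))
print(category_name(Subdivision("Henan")))
```

Note that step 2 depends on knowing in advance which names are ambiguous; the sketch does nothing to detect collisions that nobody has listed.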

I prefer to include the country in all cases, so that it's clear and consistent. It also means that if someone wants to create another category for a different subdivision with the same name, that we didn't foresee, then we don't have to rename everything. We can't know up front all the categories that need disambiguating. —Rua (mew) 19:34, 1 January 2020 (UTC)
  • Generally I am against lengthening category names for states, provinces etc. by adding the country name unless absolutely necessary for disambiguation purposes. Washington does pose a problem however, with Category: Washington, USA it is not absolutely clear that it isn't Category:Washington, D.C., USA. I don't see Victoria as a problem either; apart from in B.C., Victoria is a big railway terminus in London. There is only one state by this name. Having to add longer category titles to new entries is a chore that should be avoided.
I think it is fair to point out that Rua's MewBot added USA to all US states in November 2017 - I opposed that at the time, but my protests were ignored. DonnanZ (talk) 22:52, 1 January 2020 (UTC)

Request for changes to WT:ELE

Since I cannot edit this page myself, I request the following changes.

1. I would like to request that the line:

See also: Style_guide

be added to the top of the section Wiktionary:Entry_layout#Definitions. Per the instructions on the "View Source" page, I requested this some time ago at the Information Desk, but my request there has gone unanswered. I believe that this change is uncontroversial and clearly beneficial, so I wonder if someone with the necessary permission could make it.

2. Per Wiktionary:Beer_parlour/2019/December#Policy_on_usage_examples, whose proposals I have partially implemented myself, I request the following change to Wiktionary:Entry_layout#Example_sentences:

Existing text:

"Generally, every definition should be accompanied by a quotation illustrating the definition. If no quotation can be found, it is strongly encouraged to create an example sentence."

New text:

"Generally, every definition should be accompanied by one or more quotations illustrating that definition. Quotations are supplemented by example sentences, which are devised by Wiktionary editors in order to illustrate definitions."

I am taking the linked discussion as Support for this change, apart from what I consider to be a minor quibble about the use of the word "should". In fact, the word "should" is used in a similar way numerous times throughout WT:ELE. If anyone wants, as a separate exercise, to go through and change all these to what they prefer, then they may do so. Mihia (talk) 18:56, 2 January 2020 (UTC)

Done. - -sche (discuss) 08:13, 3 January 2020 (UTC)

Revisiting pinyin capitalization of words derived from proper nouns

@Atitarev, Suzukaze-c, Tooironic, Dine2016, Geographyinitiative, I would like to revisit our rules on pinyin capitalization of words derived from proper nouns. There are some inconsistencies across entries, which is why we should address this again. I remember Atitarev argued for not capitalizing because Chinese should have its own rules (or more specifically, not follow English capitalization rules). However, it seems to me that the outcome is that Chinese is then following the rules of other languages, like French or Russian. I think we should follow Chinese conventions instead. Article 6.3.3 of the Basic rules of the Chinese phonetic alphabet orthography says that a derived word should be capitalized if it is a proper noun or is considered one. I think we can make it slightly easier by determining capitalization based on whether a word (or part of the word) should have a proper name mark (專名號). This would mean words like 尼泊爾人 Níbó'ěrrén should be capitalized. — justin(r)leung (t...) | c=› } 05:46, 3 January 2020 (UTC)

For reference, the important dictionary Xiandai Hanyu Cidian includes capitalized forms of pinyin for proper nouns like surnames, whereas Taiwan's Guoyu Jianbian Cidian & Guoyu Chongbian Cidian as well as the little-known Xiandai Hanyu Guifan Cidian abstain from including any capitalized pinyin whatsoever. --Geographyinitiative (talk) 05:49, 3 January 2020 (UTC)
@Geographyinitiative: Well, I think it's good to include capitalized forms so we can safely put the other dictionaries outside of our discussion since they choose not to deal with capitalization at all. — justin(r)leung (t...) | c=› } 05:54, 3 January 2020 (UTC)
I have had a preference for reducing capitalisation because Chinese belongs to the languages whose native script doesn't have a case distinction. Foreign learners often capitalise pinyin based on their native language, often English. French capitalises nationalities/ethnicities but not the adjectives. Français is a French male person but français is an adjective. I thought it would make sense to capitalise proper nouns but not their derivations, if they are written as one word.
In any case, I won't object to whatever decision is made as long as there IS a decision, if it's consistent and we have rules rather than following whatever capitalisation is attestable elsewhere. I'm glad these rules are finally getting addressed.
Related old discussion: Wiktionary_talk:About_Chinese#Capitalisation_of_demonyms_and_language_names_-_a_mini-vote --Anatoli T. (обсудить/вклад) 06:00, 3 January 2020 (UTC)
We can set up options to vote on, but my first question is: are ethnic group names (words with country names and ending in 人 (rén)) and language names (ending in (), 文 (wén) or 話 (huà)) really proper nouns in Chinese? --Anatoli T. (обсудить/вклад) 06:07, 3 January 2020 (UTC)
More info for the decision making process: the official Putonghua Proficiency Test test prep book (which is what they use at the test site) has Hanyu Pinyin sentences with capitalized forms for location names- the example sentence at 浩瀚- both the simplified Chinese and the Hanyu Pinyin- were copied from the test prep book if I remember correctly. --Geographyinitiative (talk) 06:36, 3 January 2020 (UTC)
@Atitarev: I think the (rén) ones are common nouns, but I'm not sure about the language ones. Anyhow, capitalization doesn't necessarily have to do with the part of speech of the word. There are plenty of common nouns that should be capitalized because they derive from proper nouns, such as 薛定諤的貓 (cf. English Schrödinger's cat).
@Geographyinitiative: The capitalized words at the example sentence at 浩瀚 are unambiguously proper nouns, which are not the issue I'm talking about. Are there examples from the book of how something like proper noun + 話 or proper noun + 人 is treated? — justin(r)leung (t...) | c=› } 07:34, 3 January 2020 (UTC)
  • I think we are doing OK. The entries are more or less consistent. My question is what to do when an entry has both a proper noun reading and a common noun reading (or other parts of speech). Compare, for example, 周禮 and 周官; the former has both a pn and a cn and remains uncapitalized, while the latter only has a pn and is thus capitalized. Is this ideal? Granted, it seems silly adding something like: zhōulǐ, Zhōulǐ in the pronunciation box considering they don't differ at all in terms of pronunciation. On the other hand, this practice is a form of inconsistency. ---> Tooironic (talk) 06:41, 3 January 2020 (UTC)
    @Tooironic: In most cases like these, I would just have it in lowercase. However, for 周禮, I would want to have these capitalized because they are derived from 周, a proper noun. — justin(r)leung (t...) | c=› } 07:36, 3 January 2020 (UTC)
    To be honest, I would prefer it if all the pinyin forms were in lower case. I don't see much point in capitalizing any of them. It's not like capitalization of a particular transcription system has any relevance to a Chinese dictionary. However, if we were to abandon upper case completely, we would have to come up with a workable bot solution. And I suppose some of you wouldn't support it either. ---> Tooironic (talk) 13:59, 3 January 2020 (UTC)

I would like to say that I don't agree with the capitalization schemes for ENGLISH on Wikipedia and Wiktionary! haha- Anyway, here are some examples of capitalized pinyin from Xiandai Hanyu Cidian, 7th edition:

  • 汉语 Hànyǔ p513
  • 维吾尔族 Wéiwú'ěrzú p1363
  • 维也纳 华尔兹 Wéiyěnà huá'ěrzī p1363 Viennese waltz lowercase & uppercase
  • 潍 Wéi p1363
  • 魏 Wèi p1369
  • 藏 Zàng p1632
  • 藏传佛教 Zàngchuán-Fójiào p1633
  • 藏历 Zànglì p1633
  • 藏戏 zàngxì p1633 lowercase
  • 藏药 zàngyào p1633 lowercase
  • 藏族 Zàngzú p1633
  • 中国话 zhōngguóhuà p1695 lowercase
  • 中国梦 Zhōngguómèng p1695
  • 中药 zhōngyào p1697 lowercase

--Geographyinitiative (talk) 10:58, 3 January 2020 (UTC)

I don't think we need to go through all possible ways to capitalise, hyphenate or space hanyu pinyin. We need to decide, adopt a standard and stick to it. --Anatoli T. (обсудить/вклад) 11:11, 3 January 2020 (UTC)

I like Justin's proposal that uses 專名號 as a basis. —Suzukaze-c 01:48, 4 January 2020 (UTC)

Test case: should (yān)'s pinyin be capitalized? I think probably/maybe so. --Geographyinitiative (talk) 10:54, 4 January 2020 (UTC) Another potential test case: --Geographyinitiative (talk) 05:48, 9 January 2020 (UTC)

Request edit to protected template "archive-top"

The following sentence that is coded in the template "archive-top" and displayed in "RFV failed" notices is bad English (unparallel elements following "either"):

"Failure to be verified may either mean that this information is fabricated, or is merely beyond our resources to confirm."

I request that it be changed to:

"Failure to be verified may mean that this information is fabricated, or it may simply mean that the information is beyond our resources to confirm."

(Please let me know if this is a suitable place to put requests like this, or whether they should go somewhere else.) Mihia (talk) 18:51, 5 January 2020 (UTC)

I agree that it's bad style. Another option is a shorter version: "Failure to be verified means that this information is either fabricated or beyond our resources to confirm." — Eru·tuon 22:13, 5 January 2020 (UTC)
I think I'd prefer this shorter version. - dcljr (talk) 03:10, 6 January 2020 (UTC)
I agree and changed the wording [1]. - -sche (discuss) 16:19, 8 January 2020 (UTC)
The slight problem I have with the shorter wording is that it seems too definite that those are the only two possibilities (I know that my proposed wording did not actually mention any other possibilities, but it was slightly softened by the "may"s, as indeed was the original faulty wording). In fact, definitions that failed RFV could well have been added due to genuine mistake/misunderstanding. Mihia (talk) 21:58, 8 January 2020 (UTC)
Perhaps we could say something like "Failure to be verified means that this information was added in error or fabricated, or simply that it is beyond our resources to confirm." Mihia (talk) 22:00, 8 January 2020 (UTC)

Can someone confirm if Hindi related edits by an IP user are good?

Apologies if this isn't the correct place to post this. I've noticed numerous edits from the IP that are related to Hindi terms, all of which seem to lack edit summaries. Can someone familiar with Hindi look at the edits to see if they are good/constructive? Thanks. (Link to the IP user's contributions). —The Editor's Apprentice (talk) 19:15, 5 January 2020 (UTC)
I've found another IP with a similar editing pattern; this one seems to be editing only English terms. I'm going to look and see if I can find more IP users with similar behaviour. —The Editor's Apprentice (talk) 19:38, 5 January 2020 (UTC)

I've checked some of Special:Contributions/ and they mostly remove manual transliterations or make use of templates. They are minor edits and in good faith. I only checked a few recent ones. --Anatoli T. (обсудить/вклад) 21:44, 5 January 2020 (UTC)
That is similar to what I've seen. Thanks for acting as a second pair of eyes. —The Editor's Apprentice (talk) 23:40, 5 January 2020 (UTC)
I agree, all their edits seemed good, aside from one in which they removed Middle Persian transliterations, which I reverted. This seemed to be the only such edit that the IP made around that particular time. (I looked through some of their contributions and checked for edits that added pages to Category:Automatic Inscriptional Pahlavi transliterations containing ambiguous characters in Special:RecentChanges and this was the only one.) — Eru·tuon 23:59, 5 January 2020 (UTC)

A Wiktionarian in Residence


I am glad to inform you that the French Wiktionary hosts the first Wiktionarian in Residence, Sebleouf!

This is a full-time, one-year position at a university, starting today, to take part in an ongoing project called Dictionnaire des francophones, led by the Ministry of Culture of France. I am in charge of this project at the university level. I was hired because of my experience with Wiktionary, and I convinced my funder and Wikimedia France to co-fund a residency with several goals:

  • Fix some old templates and document them to facilitate the reuse of Wiktionary content.
  • Enhance the content with regional uses of French around the world.
  • Support the activities of the university and specifically the host, the Institut international pour la Francophonie [International institute for Francophonie], a component of Université Lyon 3, based in Lyon, France.

And finally, some words about our great Wiktionarian in Residence. Sebleouf is one of the most prolific and meticulous contributors to French Wikipedia, a Commonist, an Open Food Facts board member, and more. I am glad he accepted this challenge, and to have him as a colleague. He will contribute with a dedicated account, Seb en Résidence (page in French Wiktionary). He may contribute here eventually.

If you have any questions, feel free to ask me or him. We'll be glad to support anyone wanting to set up a similar initiative in another community! Noé 16:00, 6 January 2020 (UTC)

Location of the "Glyph origin" section for CJKV entries

For those of us working on Chinese-character entries, a suggestion:

  • Shall we move the ===Glyph origin=== section to the topmost ==Translingual== portion of the page?

Currently, the ===Glyph origin=== details appear to be placed under the language that coined the glyph. Compare the page structures for (“gland”) coined in Japanese, and (“thread; line”) coined in Chinese. However, if the glyph is indeed translingual, this information is not specific to any one language, and presumably should go under the ==Translingual== header.

Curious what others think. ‑‑ Eiríkr Útlendi │Tala við mig 20:09, 6 January 2020 (UTC)

The way I see it, Japanese and Korean borrowed them from Chinese. —Suzukaze-c 21:07, 6 January 2020 (UTC)
There are clear cases of glyphs invented in Japan, however, such as (sen, gland) or (sasa, bamboo grass). We even have a whole category of these, Category:Japanese-coined CJKV characters. These were definitely not borrowed from Chinese. There are also glyphs invented in Korea, known as 국자 (gukja), such as (dap, rice paddy).
This kind of information is about the etymology of the glyph, as it were, rather than the etymology of the word(s) represented by the glyph. And since, in many cases, the glyph is used by multiple languages, any information about the glyph itself should presumably go into the ==Translingual== section.
Does anyone have any opposition to moving ===Glyph origin=== sections to be under the ==Translingual== header? ‑‑ Eiríkr Útlendi │Tala við mig 22:30, 7 January 2020 (UTC)
Oh yes, those are coined in Japan. But what about ? There is a Chinese obsolete term, the Japanese simplification of 傳, and a native Zhuang term. I oppose moving Glyph origin (back to) Translingual. —Suzukaze-c 23:55, 7 January 2020 (UTC)
  • Somewhat off-topic, but what are the advantages of having Translingual sections in sinograph entries at all? —Μετάknowledgediscuss/deeds 00:32, 8 January 2020 (UTC)
    The only merit I see in having a Translingual section is for comparing glyph shapes across the languages, but I'm not sure if even that is really necessary. A lot of the explanations are based on the regional standards, which are not necessarily followed to the iota in practical use. — justin(r)leung (t...) | c=› } 00:42, 8 January 2020 (UTC)
    I also oppose putting glyph origins under Translingual. Instead, we should probably have a template for glyph borrowing. — justin(r)leung (t...) | c=› } 00:44, 8 January 2020 (UTC)
    A dedicated template is a great idea. Meanwhile, if the Translingual section doesn't serve much of a purpose, we might consider phasing it out... —Μετάknowledgediscuss/deeds 00:47, 8 January 2020 (UTC)
    Pinging @KevinUp, because he edits the Translingual sections a lot. — justin(r)leung (t...) | c=› } 03:25, 8 January 2020 (UTC)
    The translingual section allows casual users to search and view characters beyond CJK Extension B that are not accessible via handwriting OCR software, e.g. someone who does not know how IDS works can search for complex characters such as 𢁋 by looking at the derived characters section of or . It also works as a one-stop center for readers that are only interested in the shape of the glyph and not the meaning of the glyph. If we were to remove the translingual section, I'm not sure where to include descriptive information such as the differences between Chinese and Japanese - both share the same codepoint but the latter has one stroke missing. Generally, variations between Taiwan, Hong Kong and mainland China standards are permissible, but Chinese and Japanese forms are different and should not be confused with one another. The Japanese form is also duplicated as 𨺓 (U+28E93) and I think this sort of information might be interesting to some people. KevinUp (talk) 09:25, 8 January 2020 (UTC)
    In light of your comment, I redirected 𨺓 (previously a redlink) to , like fullwidth letters redirect to their normal counterparts. I do this without prejudice to the creation of e.g. a soft redirect instead, thinking only that there should at least be a hard redirect. - -sche (discuss) 02:37, 9 January 2020 (UTC)
    If we were to distribute the cangjie, stroke number and IDS values to the respective language sections, it would be more difficult to spot mistakes, e.g. the Cangjie value of Japanese is not the same as Chinese (hǎi) (I just added the Japanese value to that entry). I think it would be better to work on converting existing information provided by {{Han char}} to a more readable format, e.g. table format, rather than working on redistributing this information to their respective languages. KevinUp (talk) 09:25, 8 January 2020 (UTC)
I also think it is a bad idea to move the "Glyph origin" section back to the Translingual section. "Glyph origin" is mainly used in the language (or region) where the glyph was first invented. If the glyph was inherited from the seal script, then it would appear in the Chinese section, and contain phonetic information from Old Chinese. If the glyph was subsequently borrowed into other languages along with its contemporaneous reading, then the glyph and its borrowed reading would appear under the "Descendants" section of Chinese. I don't think it is necessary to have another "Glyph origin" section if the glyph, meaning and reading were all simultaneously borrowed into another language as it is understood to be a descendant. KevinUp (talk) 09:25, 8 January 2020 (UTC)
In some languages such as Korean, Vietnamese and Zhuang, new glyphs are created by combining the phonetic and semantic elements of two different glyphs, so the "Glyph origin" section would contain information that is specific to one of these languages, which is another reason why I don't think it is a good idea to move this section to the Translingual section.
Meanwhile, {{obor}} already exists and is perhaps relevant for Japanese kun'yomi readings. KevinUp (talk) 09:25, 8 January 2020 (UTC)
See what I did here: 𬖾
In my ideal world, most of the glyph origin sections would be located under an "Ancient Chinese" (or similar) header. In the case that a glyph was actually created in a different language/dialect/etc (like the above example), then the glyph origin for that character would be under the header of whichever language the glyph actually originated in.
The translingual section is not ideal and I think it should be eliminated eventually. All the material would be moved under the appropriate language headers or onto specialized pages.
--Geographyinitiative (talk) 09:58, 8 January 2020 (UTC)

Some thoughts after reading the thread so far.

  • For CJKV glyphs with multiple origins such as the example mentioned by Suzukaze-c above, each language has its own specific derivation for the glyph. In this case, there is a strong sensible argument for having separate glyph origin descriptions for each language.
  • However, for CJKV glyphs with one clear origin, I am not convinced: the glyph origin is shared by all languages that use that glyph, which is pretty clearly "translingual". By way of comparison, other non-CJKV single-character glyph entries have a Translingual section with Etymology describing the origin of the glyph. See A, ð, þ, etc. Single origin, used by multiple languages. We don't put the glyph origin details for A under ==Ancient Greek==, nor do we put the glyph origin details for þ under ==Old Norse== (or ==Proto-Germanic==, depending on how one dates this character).
In terms of usability, how would a user know where to look for glyph information? Most of this is currently located in Chinese entries, granted, but not always. It's even harder to figure out where to look when using tabbed languages, as the user can't just skim the page for the ===Glyph origin=== header.

‑‑ Eiríkr Útlendi │Tala við mig 20:43, 8 January 2020 (UTC)

Ban the use of ⟨ɚ⟩, ⟨ə⟩, and ⟨ə(ɹ)⟩ for phonemic /əɹ/ in English phonemic transcription

I have noticed many ostensibly phonemic English transcriptions on Wiktionary employing the symbols ⟨ɚ⟩ and ⟨ə⟩ to represent the phonemic segment /əɹ/. For example, the pronunciation section for the word “later” looks like this:


  • (UK) IPA(key): /ˈleɪ.tə/
  • (US) enPR: lāʹtər, IPA(key): /ˈleɪ.tɚ/, [ˈleɪ̯.ɾɚ]
    • (file)
  • Rhymes: -eɪtə(ɹ)

/ɚ/ is not a phoneme in any dialect of English; it is an allophone of the phoneme /ɹ/ in postvocalic position in General American English. The exact same is true of the ⟨ə⟩ used in the RP transcription. The phonemic transcription of “later” as /ˈleɪ.tɚ/ and /ˈleɪ.tə/ is equivalent to transcribing the phonemes of "toilet" as /ˈtʰɔɪ.lət̚/ and /ˈtʰɔɪ.ləʔ/ instead of the correct /ˈtɔɪ.lət/. [ˈleɪ.tɚ] and [ˈtʰɔɪ.lət̚] are phonetic transcriptions which mark environmentally conditioned allophony that is subject to dialectal variation. These transcriptions belong inside brackets instead of slashes. The fact that the RP pronunciation of the lemma form of words like “later” elides the final /ɹ/ (as opposed to GA, which at least "colors" the preceding vowel) has caused some confusion on here. I can point to previous discussions of “Schrödinger's ɹ” and whether it is appropriate to transcribe /əɹ/ as /ə(ɹ)/ for RP. I should be clear that just because /əɹ/ is phonetically realized as /ə/ in the lemma form does not mean RP has lost the phoneme /ɹ/ in words like “later.” It is still 100% phonemically present. This elision is simply the effect of a phonetic rule of RP which states that:

  • /Vɹ%/ → [V∅%]

Or in other words, /ɹ/ is elided when in postvocalic position and followed by a prosodic boundary. /ɹ/ resurfaces as soon as it enters prevocalic position, as in the phrase “later on,” which is realized in RP as [leɪtʰəɹɒn]. The mystery here is explained by the fact that the lemma form implies an unmarked pausa before and after the initial and final phonemes:

  • /%ˈleɪ.təɹ%/.

We can represent the phonemic structure of the lemma form of “later” then as:

  • /%CV₁$CV₀C%/
    • where ⟨C⟩ = consonant, ⟨V⟩ = vowel, ⟨%⟩ = prosodic boundary, ⟨$⟩ = syllable boundary, ⟨₁⟩ = primary stress, and ⟨₀⟩ = zero stress.

Which allows us to apply dialect-specific phonetic rules to each phonemic segment in its environment:

Which altogether gives us:

This procedure can be repeated for any dialect provided that its phonetic rules are known. A proper pronunciation section for "later" with extensive dialectal phonetic treatment would then look something like this:


I propose that the erroneous use of ⟨ɚ⟩, ⟨ə(ɹ)⟩, and ⟨ə⟩ for transcribing English phonemic /əɹ/ be banned as the first step towards reorganizing the messy pseudo-phonemic transcription system currently in place into a real phonemic transcription system which we can use to derive dialect-specific phonetic transcriptions in a rule-based manner. Rhemmiel (talk) 01:25, 7 January 2020 (UTC)
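For readers who want to see the mechanics of the rule /Vɹ%/ → [V∅%] stated in the proposal, here is a minimal sketch in Python. It only mechanizes the rule as written and takes no position on whether the underlying phonemic analysis is right. The vowel inventory `VOWEL_CHARS` and the function name `apply_r_elision` are assumptions made for illustration, and `%` marks a prosodic boundary as in the proposal.

```python
import re

# Assumed, deliberately small inventory of vowel symbols (plus the length mark).
VOWEL_CHARS = "aeiouəɪʊɛɔɒæɑʌɜː"

# /ɹ/ immediately after a vowel symbol and immediately before a prosodic boundary "%".
R_ELISION = re.compile(f"(?<=[{VOWEL_CHARS}])ɹ(?=%)")


def apply_r_elision(phonemic: str) -> str:
    """Delete postvocalic /ɹ/ before a prosodic boundary; leave everything else alone."""
    return R_ELISION.sub("", phonemic)


print(apply_r_elision("%ˈleɪ.təɹ%"))  # lemma form: /ɹ/ is followed by a boundary
print(apply_r_elision("%leɪtəɹɒn%"))  # "later on": /ɹ/ is prevocalic, so it stays
```

The sketch covers only the boundary case given in the rule; it does not model aspiration, flapping, or /ɹ/ before a consonant.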

"I should be clear that just because /əɹ/ is phonetically realized as /ə/ in the lemma form does not mean RP has lost the phoneme /ɹ/ in words like “later.” It is still 100% phonemically present." That is far from clear. The linguist John Wells describes RP as having lost "/r/" (phonemic slashes used, in contrast to the phonetic brackets used elsewhere on that web page) except for before a vowel. Linking [ɹ] does not only show up after words with historical /ɹ/: sequences like "jaw on" may also be pronounced with [ɹ] in many UK accents. Aside from that, a phonetic realization with hiatus is often an alternate possibility, for words with or without historical /ɹ/. The criterion of using a "transcription system which we can use to derive dialect-specific phonetic transcriptions in a rule-based manner" results in a diaphonemic transcription, not a phonemic transcription.--Urszag (talk) 10:50, 7 January 2020 (UTC)Reply[reply]
I strongly agree with Urszag and disagree with the original proposal. /ə/ is a phoneme, 'later' is /ˈleɪtə/ in my BrE accent, and there is no hidden /r/ in it, because the insertion of [ɹ] between certain vowels is entirely at the synchronic phonetic level today, unrelated to the spelling or (same thing) whether the word historically ended in /r/. RP and GenAm have different phonemic systems, and while a diaphonemic abstraction /əɹ/ would 'simplify' (in some sense) rule-generation of present-day forms (of the kind taken to an extreme in SPE), it does not reflect the phonological reality. (I can't adjudge on the phonemic status of [ɚ], but it seems to satisfy the criteria in the same way.) --Hiztegilari (talk) 11:18, 7 January 2020 (UTC)
I support using /əɹ/ in place of /ɚ/ in phonemic transcriptions of General American and Canadian and similar rhotic dialects, though it would involve a lot of editing. I also support transcribing /ɝ/ that way because it's not a separate phoneme, as in birder /ˈbəɹdəɹ/, which would currently be transcribed /ˈbɝdɚ/ based on Appendix:English pronunciation.
As User:Urszag says, what you're describing sounds like a diaphonemic transcription system, like Wikipedia's English transcription system (w:Help:IPA/English). I'm very much against importing a system like that to Wiktionary to supplant our current systems because it does not assign one symbol to each phoneme that actually contrasts in a given dialect, does not fully accommodate all dialects (including Irish and Scottish and Welsh English), and does not manage to assign just one transcription to every word in the dialects that it does a fair job of accommodating (Received Pronunciation, General American, Canadian, Australian, New Zealand): words like bath require two transcriptions. I think it also causes more misunderstandings about the IPA than a well-designed phonemic transcription for a single dialect does, because it does not even attempt to choose symbols that represent the phonological features that distinguish phonemes in a given dialect, and that are close to the phonemes' typical pronunciations. It's not particularly helpful to an English language learner who wants to be able to see, in the transcriptions, which sounds actually contrast phonemically in a given dialect.
Wikipedia's diaphonemic system was designed so that there would not need to be multiple English IPA transcriptions in an article, and so that editors would not have to discuss which dialects should have IPA transcriptions in which articles. Those aren't concerns on Wiktionary. We can include phonemic transcriptions of as many dialects as we want, at the cost of a lot of inconsistency between entries. So a diaphonemic transcription system would not be useful for us. What would be useful is improvements to the transcription systems that we have, and ways to enforce consistency and make transcriptions easier to retrieve by automated tools. — Eru·tuon 10:46, 8 January 2020 (UTC)Reply[reply]
BTW, FWIW I've thought for a long time of writing an {{en-IPA}} template that takes exactly such a diaphonemic transcription and generates the appropriate IPA for various dialects, as opposed to having to input the IPA directly. I've also separately thought of auto-adding pronunciations to English words based on the Moby pronunciator or a similar freely available list of English pronunciations; this would be done with a special template so that such auto-added pronunciations are marked and can be manually checked and converted to standard templates if necessary. Benwing2 (talk) 15:35, 8 January 2020 (UTC)Reply[reply]
I would oppose changing British /-ə/ to /-əɹ/ unless our British editors were on board with it (hopefully more will comment). As I said when someone proposed that German words like hab be listed as ending in /b/ instead of /p/, I think that if two sounds are phonemically contrastive (in any position), it behooves us to distinguish them (wherever they occur). I would support standardizing the /əɹ/~/ɚ/ sound (in e.g. American accents) instead of some entries using /əɹ/ and others using /ɚ/. Much of the change seems like it could be either automated or semi-automated (i.e. done with AWB), e.g. just changing all instances of /ɚ/ to /əɹ/, except perhaps that instances where /ɚ/ is already followed by /ɹ/ should either be flagged for human inspection or located and reviewed right now (in case anyone has written /ɚɹ/ in a word like /ˈaɪɚɹlənd/, as opposed to in a word like /ˈfaɪɚɹum/). - -sche (discuss) 03:15, 9 January 2020 (UTC)
@-sche: There actually aren't that many cases of ɚɹ (or ɝɹ) in {{IPA}}, and they're all English transcriptions; here's a full list. — Eru·tuon 04:41, 9 January 2020 (UTC)Reply[reply]
Excellent. Thanks for the list. Most of those look easy to clean up. I've started cleaning them up. - -sche (discuss) 06:36, 9 January 2020 (UTC)
The following remain to do: answerer, barrel, blurt, circumlocution, current, diphtheria, encourage, hereinafter, kookaburra, linearithmic, offeror, overreactor, overridden, saturative, surety, urinate, worrisome, worry. - -sche (discuss) 09:35, 9 January 2020 (UTC)
@-sche: Fixed for now. —Mahāgaja · talk 10:15, 15 January 2020 (UTC)
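
The semi-automated pass suggested above (change /ɚ/ to /əɹ/, but hold back any transcription where /ɚ/ or /ɝ/ is already followed by /ɹ/) could be sketched roughly as follows. This is only an illustration in Python, not a description of how AWB or any bot was actually run; the function name `convert_transcription` is made up for the example, and it assumes the transcription has already been extracted from the {{IPA}} template as a plain string.

```python
import re

RHOTIC_SCHWA = "ɚ"   # U+025A
RHOTIC_ER = "ɝ"      # U+025D
TURNED_R = "ɹ"       # U+0279

# A rhotic vowel immediately followed by ɹ is ambiguous and needs a human.
SUSPICIOUS = re.compile(f"[{RHOTIC_SCHWA}{RHOTIC_ER}]{TURNED_R}")


def convert_transcription(ipa: str) -> tuple[str, bool]:
    """Return (transcription, needs_review).

    Hypothetical helper: flagged transcriptions are returned unchanged
    so that an editor can inspect them by hand.
    """
    if SUSPICIOUS.search(ipa):
        return ipa, True
    # Plain case: every ɚ becomes the sequence əɹ.
    return ipa.replace(RHOTIC_SCHWA, "ə" + TURNED_R), False


if __name__ == "__main__":
    for sample in ["/ˈleɪtɚ/", "/ˈaɪɚɹlənd/"]:
        print(sample, "->", convert_transcription(sample))
```

Returning flagged items untouched, rather than guessing, matches the suggestion that such cases be reviewed manually. Whether /ɝ/ should also be rewritten is left out here, since that point was still under discussion.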

Pretty much 95% of the (rather small) Old Saxon corpus consists of poetry as far as I can tell, so it seems to me to make little sense to put words in such a category. After all, we have not enough prose for comparison. For all we know these "poetic" terms were used in prose as well. Would anyone object to me emptying this category and removing it? — Mnemosientje (t · c) 14:05, 7 January 2020 (UTC)Reply[reply]

Probably a good idea. —Rua (mew) 10:28, 8 January 2020 (UTC)
Unrelated to Saxon, but: when I was "importing" stuff from Webster 1913 I always had a bit of side-eye towards poetic words. You have to ask, what kind of poetry do we mean. Usually it's old-skool epic or historical stuff. I wonder how much 20th-century poetry would get a poetic gloss. Auden? Plath? Larkin? Surely not. If they used words strangely we would probably call it a nonce, because they didn't write in the elevated style. Just a thought. Equinox 03:26, 9 January 2020 (UTC)Reply[reply]

Rarely used tone marks in Hanyu Pinyin & Bopomofo (Zhuyin)

Example of ㄊㄧㄢˉ ㄉㄧˋ (see other corroborating information at the Bopomofo page under 'Tonal marks')

As we have just shown on English Wikipedia's Bopomofo page, there are some variant forms of Mandarin Chinese transcription that we are ignoring on Wiktionary. Xiandai Hanyu Cidian puts a · in front of any Hanyu Pinyin syllable that can be or is to be read in the neutral tone. I therefore propose that English Wiktionary should incorporate this form into the drop-down box of Template:zh-pron like this: 蘿蔔 Pinyin: luóbo (also can be written as luó·bo). A similar situation arises with the first tone mark in Bopomofo (Zhuyin). I believe we should incorporate this form into the drop-down box of Template:zh-pron like this: 天地 Zhuyin: ㄊㄧㄢ ㄉㄧˋ (also can be written as ㄊㄧㄢˉ ㄉㄧˋ ).

In my opinion, doing this makes Wiktionary stronger, because it helps the readers understand these rarely-used / rarely understood forms for these transcription systems. Also, I would like to take the opportunity to invite comment on my recent edits to 陰平, 陽平, 上聲, 去聲 and 輕聲. --Geographyinitiative (talk) 23:16, 9 January 2020 (UTC)Reply[reply]

I don't think this is useful to readers. Instead, we should be presenting romanisations that actually appear in the wild, so to speak, like Wade-Giles. —Μετάknowledgediscuss/deeds 00:38, 10 January 2020 (UTC)Reply[reply]
@Metaknowledge Yes, and I want both to go to the Moon and to go to Mars. Multi-syllabic Wade-Giles is needed, but I imagine it is more complicated to add than the Tongyong Pinyin work that was done back in the heady days of the Summer of '19. The neutral tone dot in Hanyu Pinyin has been used since the 60s or 70s in the authoritative Xiandai Hanyu Cidian; I have a test edition from the 70s that includes the neutral tone dots. I imagine this change will be easier. As for the first tone marker in bopomofo, I've never seen it in wide use or used systematically in any dictionary or book, but it does seem to have some advocates like User Yejianfei on English Wikipedia, and it is technically permitted according to Taiwan's rules of Bopomofo. I need to find the 1918 rules of Bopomofo. --Geographyinitiative (talk) 03:30, 10 January 2020 (UTC)
Well, you could try working on the W-G problem yourself if you want. But adding something that provides no new information and which few anglophones (our readership) are likely to come across is not exactly what I'd call going to the moon. —Μετάknowledgediscuss/deeds 03:49, 10 January 2020 (UTC)Reply[reply]
@KevinUp, Metaknowledge I am tired of my endless whining and complaining and nothing getting done on this front. I have decided to accept the challenge and bring Wade-Giles into zh-pron ASAP. I will try to keep my problematic edits confined to a sandbox. (I accept that the users might be limited to the Anglosphere (don't have the stats on hand to prove it). But I do not accept that the user-base of this Wiktionary and its non-English versions should be limited to the Anglosphere. In my preparation for the test myself, I have laid a partial foundation for people in mainland China who are preparing for the Putonghua Shuiping Ceshi to use this site.) --Geographyinitiative (talk) 05:18, 10 January 2020 (UTC)Reply[reply]
My opinion regarding variant forms of Mandarin Chinese transcription is that the listing of these terms in the {{zh-pron}} box ought to follow WT:ATTEST as well. If it was used historically and can be found in at least three independent works that are on permanently recorded media, then it deserves to be included on Wiktionary. However, if it was never widely used and was only used by one or two people, then it shouldn't be included on Wiktionary. KevinUp (talk) 13:42, 10 January 2020 (UTC)Reply[reply]
I disagree, and seeing as nobody is suggesting that any romanisation of Mandarin besides Hanyu Pinyin get actual entries, I don't see how ATTEST is applicable. We provide four romanisations for Burmese and Korean, because they're useful to our readers. Wade-Giles was heavily used during its heyday, and ought to be there IMO. In fact, we could even have an autocollapsed sub-box of less commonly encountered Mandarin romanisations and put all of them in there (e.g. Sin Wenz, which might be obscure now, but apparently there were entire books printed in it). —Μετάknowledgediscuss/deeds 18:08, 10 January 2020 (UTC)
No, it would be vain; it is not useful to know or to document which romanizations are attested, and would distract from actually useful work, and on the other hand like Metaknowledge has said the romanizations are created because they are particularly useful (spelling being not derivable from hearing, and writing the native script’s signs on a computer being a skill by itself). That’s why I opposed Wiktionary:Votes/pl-2018-12/Allowing attested romanizations of Sanskrit. Moreover, it has not as a rule been the particular spelling that has to be attested, but always only the term itself, so one can quote a word from audio works. This of course does not go well with Chinese where a proper character is needed, but Chinese is the one exception. (English and French with their high orthographic depth where one can write in multiple ways too let hesitate but at least there one can.) Fay Freak (talk) 02:49, 12 January 2020 (UTC)Reply[reply]
I'd like to clarify that I'm not a fan of creating separate entries for romanized forms of Mandarin transcription. I don't think multisyllabic Pinyin entries with three or more syllables are useful, apart from bumping up one's total edit count. What I meant is that the display of these rarely used romanized forms in the automatically generated {{zh-pron}} box ought to be attestable, so if we can't find at least three independent works using some particular form of romanization, then that form of romanization shouldn't be displayed in the {{zh-pron}} box. KevinUp (talk) 18:54, 15 January 2020 (UTC)Reply[reply]
I clicked through random entry (and then going to the lemma form, if necessary), ten times before I found citations on a page, and that was one cite on one out of six senses. Nobody is going to search for cites for romanized forms until someone drags it to RfV, and that's just creating busy work. I don't believe that citing romanized forms is valuable in theory, but even if I did, from a time-value perspective, it's simply not worth it; it's just going to be ignored and occasionally add more work to rfv.--Prosfilaes (talk) 10:15, 20 January 2020 (UTC)Reply[reply]
Romanisation is only a tool for languages that don't use Roman (Latin) letters, which includes Mandarin Chinese. Once (carefully) selected, there is no need to display all possible flavours of the same romanisation type. We allow Mandarin Chinese and Japanese romanisations as entries, which continues to make people think that the romanisation is actually an alternative script and continue to hyperlink them as words. It's tiring to remove やり方 (やりかた, yarikata) to make them look normal: やり方 (やりかた, yarikata). No need to attest romanisation, as long as it's standardised or the accepted standard by the dictionary makers. Dictionary publishers are free to choose a romanisation type for languages or dialects that don't have a standard, or to tweak them to make them suitable for specific needs. If any of you have used dictionaries for languages with foreign scripts, you'll agree that this is always the case. What is important is consistency. Users should be able to use it efficiently, if the romanisation is described well, used consistently and is not overwhelming or confusing. --Anatoli T. (обсудить/вклад) 11:32, 20 January 2020 (UTC)

User:Stephen G. Brown is probably dead

Older discussion: Wiktionary:Beer_parlour/2019/June#User_Stephen_G._Brown.

User:Stephen G. Brown is probably dead. It's a great loss to our community. Should anything be done here? Should his user page be protected from vandals, as e.g. had to be done with User:Robert Ullmann's page? --Anatoli T. (обсудить/вклад) 00:08, 10 January 2020 (UTC)Reply[reply]

He lived in Dallas, Texas, USA. If someone could access obituaries for confirmations, it would be great too. --Anatoli T. (обсудить/вклад) 00:11, 10 January 2020 (UTC)Reply[reply]
Was that his actual name? DTLHS (talk) 00:23, 10 January 2020 (UTC)
Yes. —Μετάknowledgediscuss/deeds 00:27, 10 January 2020 (UTC)
I cannot find evidence that he is dead (e.g. obituaries). I also cannot find evidence he is alive (e.g. WMF activity, posts on Facebook). I hope he is doing okay, and I oppose any action taken on the assumption that he is dead (not that any vandals are going after his page anyway). A note on pragmatics: in English, the use of an exclamation point conveys excitement by default, and the use of it in the title here seems insensitive at best. —Μετάknowledgediscuss/deeds 00:27, 10 January 2020 (UTC)
@Metaknowledge: I take your point - removed exclamation marks. No excitement, please don't accuse me of being insensitive. That was insensitive of you. Knowing him, my assumptions for the worst are very strong. He could be badly ill as well but it has been a very long time. Stephen is my friend. --Anatoli T. (обсудить/вклад) 00:34, 10 January 2020 (UTC)
I was not accusing you, rather helping you. —Μετάknowledgediscuss/deeds 00:36, 10 January 2020 (UTC)
I prefer to be optimistic and consider other reasons for Stephen's inactivity such as the ones suggested by Chuck Entz at User talk:Stephen G. Brown#January 2020. KevinUp (talk) 13:55, 10 January 2020 (UTC)Reply[reply]
Interestingly, the only Stephen G. Brown I can find on Facebook also lives in Dallas, Texas. What are the odds... Andrew Sheedy (talk) 20:16, 11 January 2020 (UTC)Reply[reply]
The odds are 100%, because that's his Facebook account. —Μετάknowledgediscuss/deeds 23:44, 11 January 2020 (UTC)
This user may know more about him: @Seb az86556. See also [2]. Can he comment more about the user?--GZWDer (talk) 16:48, 12 January 2020 (UTC)Reply[reply]
Unless Stephen is/was a baptist pastor with young children (unlikely given that he was in his seventies), I don't think it was his account. The one I came across has also been quite active recently. Andrew Sheedy (talk) 05:25, 18 January 2020 (UTC)Reply[reply]
Wut? Stephen's Facebook account, which has been publicly inactive since December of 2018, is very easy to find and has nothing about a "baptist pastor with young children" on it. From what I know, it is very likely that Steve is in an assisted living facility or hospital, and has yet to pass. I can, with some certainty, confirm he was still alive as of September of last year. --{{victar|talk}} 06:16, 18 January 2020 (UTC)
  • I agree with Metaknowledge that we should not assume he's dead without evidence. He may just be taking a retreat from the Internet, though I acknowledge it would be uncharacteristic for him to do so without telling people on Facebook and Wikimedia of his plans ahead of time. —Mahāgaja · talk 09:47, 15 January 2020 (UTC)Reply[reply]

New category: Category:Chinese compound surnames

Hey all. I think 歐陽/欧阳 (Ōuyáng), 東宮/东宫 (dōnggōng), 第五 (dì-wǔ) and other similar multi-character Chinese surnames deserve a special category under the 'Category:Chinese surnames' category, something like 'Category:Chinese compound surnames'. Since they have their own English Wikipedia page (Chinese compound surname), I would say we ought to have a category on English Wiktionary that encompasses all of the Chinese compound surnames we have. Thoughts? Suggestions? Disagree? --Geographyinitiative (talk) 03:50, 10 January 2020 (UTC)

I went ahead and made the category- let me know what you think of it: Category:Chinese compound surnames --Geographyinitiative (talk) 03:56, 10 January 2020 (UTC)Reply[reply]
So far I've put seven compound surnames in the new category- let me know what you think of it. --Geographyinitiative (talk) 09:56, 10 January 2020 (UTC)Reply[reply]
Yes, I think this is a useful category. I prefer this sort of work (creating new categories where there was none) rather than arguing over the various shortcomings of this dictionary. KevinUp (talk) 14:04, 10 January 2020 (UTC)Reply[reply]

Request for mass edit

Currently all Thesaurus entries are only guaranteed to be in Category:Thesaurus. I'd like for all to be added to the relevant subcat of Category:Thesaurus entries by language. This can be achieved by providing the lang parameter of {{ws header}}. Currently, the vast majority, if not all, of the uncategorized entries are English. __Gamren (talk) 17:21, 12 January 2020 (UTC)
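
As a rough illustration of the requested mass edit, the sketch below adds a language parameter to any {{ws header}} call that lacks one. It is written in Python and works on raw wikitext strings only. The function name `add_lang` is invented for the example, the exact parameter form `|lang=en` is an assumption based on the request above, and a real bot run would still need to fetch and save pages through whatever tool the operator uses.

```python
import re

# Matches {{ws header}} with optional arguments (no nested templates).
WS_HEADER = re.compile(r"\{\{\s*ws header\s*(?P<args>(?:\|[^{}]*)?)\}\}")
HAS_LANG = re.compile(r"\|\s*lang\s*=")


def add_lang(wikitext: str, lang: str = "en") -> str:
    """Append |lang=<lang> to each {{ws header}} that has no lang parameter.

    Assumption: defaulting to English is safe only after a human has
    confirmed the entry really is English.
    """
    def fix(match: re.Match) -> str:
        args = match.group("args")
        if HAS_LANG.search(args):
            return match.group(0)  # already categorised by language
        return "{{ws header" + args + "|lang=" + lang + "}}"

    return WS_HEADER.sub(fix, wikitext)


if __name__ == "__main__":
    print(add_lang("{{ws header|A list of synonyms}}"))
```

Headers that already carry a lang parameter are left untouched, so the function can be run repeatedly over the same page without stacking parameters.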

Na'vi language

See Wiktionary:Votes/2020-01/Na'vi language
I think this should be included because it has a vocabulary of over 2000 words, and comes from a work that is very widely known. --Numberguy6 (talk) 18:38, 12 January 2020 (UTC)Reply[reply]

@Numberguy6: Have you taken a look at Wiktionary:Criteria_for_inclusion#Constructed_languages or any of the conversations linked there about included constructed languages? —Justin (koavf)TCM 18:43, 12 January 2020 (UTC)Reply[reply]
I'm talking about inclusion in the Appendix namespace, like Klingon or Quenya. — This unsigned comment was added by Numberguy6 (talkcontribs).
@Numberguy6: I know that and you didn't answer my question. —Justin (koavf)TCM 18:50, 12 January 2020 (UTC)
What about inclusion in the same way as Eloi or Mandalorian? I think it should be included like that.--Numberguy6 (talk) 18:55, 12 January 2020 (UTC)Reply[reply]
@Numberguy6: It already is allowed in the Appendix namespace. You've created a vote for something that's already approved. —Μετάknowledgediscuss/deeds 19:13, 12 January 2020 (UTC)Reply[reply]
But since Na'vi doesn't have an ISO 639 code, we need to create a code for it. How do we do that? --Numberguy6 (talk) 19:17, 12 January 2020 (UTC)Reply[reply]

Okay, I figured out how to add a new language. But now I'm worried about copyright. What do you think about it? --Numberguy6 (talk) 19:29, 12 January 2020 (UTC)Reply[reply]

@Numberguy6: It already has a Wiktionary code, art-nav. In the future, please ask around before creating a vote (and may I delete the vote now?). As for copyright, that has never been decided upon in a US court, so we don't know, but as long as you don't add everything, I expect we'll be fine. —Μετάknowledgediscuss/deeds 19:40, 12 January 2020 (UTC)Reply[reply]
@Metaknowledge Yes you can delete it. --Numberguy6 (talk) 19:42, 12 January 2020 (UTC)
Category:Na'vi language was deleted in 2014 by @-sche with the notice "copyright violation, see BP". Unless those concerns no longer apply, we shouldn't be recreating the entries. —Rua (mew) 20:51, 12 January 2020 (UTC)Reply[reply]
Thanks for the ping. The BP discussion that led to that deletion was this, in July 2014 (about Dothraki, Klingon, Na'vi, etc). In July 2018, some people opined that concerns about copyright seemed overstated, and started creating more Klingon. Would the folks behind Na'vi be more zealous than the folks behind Klingon in asserting copyright? I don't know. In general, it's not my impression that including any large number of words would be advisable. (Wikipedia says that the language consists of "more than 2200 words", which means even a couple hundred words would be a significant portion.) My understanding is that editors have previously argued that creating a small number of words is de minimis. - -sche (discuss) 21:17, 12 January 2020 (UTC)
Ultimately, the whole panic seems odd to me, because they wouldn't sue us — if for some inane reason they didn't like us hosting this content, they'd send a cease and desist letter, and we could comply by deleting it then. —Μετάknowledgediscuss/deeds 21:22, 12 January 2020 (UTC)Reply[reply]
For your information, it appears that Wikibooks does not have any copyright issue (or at least no one has questioned it). Pamputt (talk) 19:59, 18 January 2020 (UTC)

New code requests

The following need codes assigned:

  • the Mixtec family, and its protolanguage
  • the Cuicatec family, and its protolanguage
  • Proto-Trique (a family code already exists, omq-tri, so it would be omq-tri-pro)
  • Teposcolula Mixtec, an extinct Mixtec language

Some of these were brought up over a year ago but nobody responded. --Lvovmauro (talk) 10:33, 14 January 2020 (UTC)Reply[reply]

@Lvovmauro: I don't mind creating new codes for the families (I do that all the time), but I like to see evidence that linguists have actually done work on reconstructing a proto-language before creating a code for a new proto-language. Have Proto-Mixtec, Proto-Cuicatec, and Proto-Trique all been reconstructed (or at least has work begun on reconstructing them)? —Mahāgaja · talk 19:14, 14 January 2020 (UTC)Reply[reply]
Oh, and as for Teposcolula, are there words recorded in it, or merely mention of its existence in old records? —Mahāgaja · talk 19:44, 14 January 2020 (UTC)
Proto-Mixtec has been reconstructed, we already have an appendix for it (Appendix:Proto-Mixtec roots). Proto-Trique has been reconstructed by Kosuke Matsukawa ([3]). I don't think Proto-Cuicatec has been reconstructed.
Teposcolula is the best-attested variety in the colonial period. There's a grammar and a dictionary and various texts by native speakers. --Lvovmauro (talk) 03:13, 15 January 2020 (UTC)Reply[reply]
OK, the following codes now exist: —Mahāgaja · talk 09:28, 15 January 2020 (UTC)Reply[reply]
@Lvovmauro: There ya go, have fun with them! —Mahāgaja · talk 09:28, 15 January 2020 (UTC)

Word of the day: Brexiteer

This nomination is politically controversial, and particularly insensitive, and as bad as the government minting special 50 pence coins to celebrate it, adding insult to injury. I would prefer this event to go uncelebrated in WOTD. DonnanZ (talk) 11:37, 28 December 2019 (UTC)Reply[reply]

I second Donnanz, this WotD is a magnet for controversy and featuring it is just going to send an inordinate amount of shit to Wiktionary. ←₰-→ Lingo Bingo Dingo (talk) 08:52, 14 January 2020 (UTC)Reply[reply]
@Sgconlaw, so he sees this discussion. I third the above response: this is a very ill-considered idea, which will invite vandalism and make it seem as though Wiktionary has a political position. —Μετάknowledgediscuss/deeds 16:46, 14 January 2020 (UTC)Reply[reply]
I think it is a misconception that WOTD is always intended as a celebration of something. It merely marks notable events, some of which may be considered as celebratory, others certainly not (for example, International Holocaust Remembrance Day). It cannot be denied that Brexit is a notable event, and the message is neutral, neither celebratory nor denunciatory. The day when the result of the United Kingdom European Union membership referendum was known was also previously marked on WOTD. — SGconlaw (talk) 16:52, 14 January 2020 (UTC)Reply[reply]
Perhaps it would be better to transfer this discussion to the Beer Parlour? — SGconlaw (talk) 17:06, 14 January 2020 (UTC)
@Sgconlaw, Donnanz, Lingo Bingo Dingo: Sure, the WOTD does not endorse words or concepts by featuring them, but that is a subtle point that many of our readers will miss. There is no reason to invite vandalism and disgruntlement over an avoidable choice, and unlike the dictionary itself, WOTD is not meant to be comprehensive (we also wouldn't feature something with offensive language, for example). We could move this discussion to the BP, although I think it would be wiser simply to choose a new WOTD. —Μετάknowledgediscuss/deeds 01:14, 15 January 2020 (UTC)Reply[reply]
Discussion moved from Wiktionary:Feedback#Word_of_the_day%3A_Brexiteer. ←₰-→ Lingo Bingo Dingo (talk) 09:06, 15 January 2020 (UTC)Reply[reply]
Thanks, @Lingo Bingo Dingo. I will, of course, go along with whatever the consensus is, but would like to see whether there are editors who agree with my views on this matter. — SGconlaw (talk) 12:43, 15 January 2020 (UTC)Reply[reply]
  • I also oppose WOTD'ing Brexiteer. Even though we know doing so doesn't constitute an endorsement of Brexit on our part, casual readers may not know that. —Mahāgaja · talk 09:32, 15 January 2020 (UTC)Reply[reply]
I think it would be better to postpone the word to 31 Jan 2021 instead. KevinUp (talk) 18:30, 15 January 2020 (UTC)Reply[reply]
I wonder if we would have seen such backlash if the WOTD had been Remainer: do we dislike political WOTDs, or only the ones that aren't on our own side? (I dislike the political ones generally. I would like WOTD to be quite random, and based on the weirdness of words, not on real life. But I can see why people are tempted to add relevance. Something like pumpking on 31 Oct can't hurt for example. Hey someone add that one for me.) Equinox 00:06, 16 January 2020 (UTC)Reply[reply]
Well, Remainer would be even more stupid, because the UK isn't remaining, after all. But why invite all this trouble anyway? I agree with you that we ought to avoid political ones. (I remember the problematic idea to memorialise the 2016 US presidential election with a WOTD related to Trump and a FWOTD related to Clinton, or perhaps it was the other way around — a cute concept, but if someone only saw one of the two, which is quite possible, they would doubtlessly have assumed bias toward that candidate.) —Μετάknowledgediscuss/deeds 00:17, 16 January 2020 (UTC)Reply[reply]
Remoaner then? :)  --Lambiam 09:31, 16 January 2020 (UTC)
In case anyone cares, I voted to remain at the time but I've partially changed my mind since. More relevantly: I would add that we usually avoid politically charged usexes, for hopefully obvious reasons. One way around this is to find politically charged real citations. It hasn't been a problem yet, but some day we will probably have to legislate about it. Equinox 09:40, 16 January 2020 (UTC)
Setting the Word of the Day is a tedious, neverending and largely thankless task and I appreciate Sgconlaw doing it; it is also a public-facing feature that merits the oversight we're giving it now, and I do agree with those above that this word shouldn't be featured. I think featuring a British word or thing like cream tea for the UK's national day, or commemorating Holocaust Remembrance Day, or setting "nice" words for holidays like Christmas or the start of Ramadan, is fine even if some people would find it somewhat "political" (e.g. hardline atheists would object to the holidays, hardline Christians would object to the non-Christian holidays, anti-imperialists or whatever might object to Britain's national day, etc). But this word (and the way it would be received) seems to be entirely, or almost entirely, political and controversial, and best skipped.
While we're on this topic, I thought we had some guideline that words shouldn't be derogatory (which poncy may have been stretching the guideline on) or out-of-use (which wench, with all its senses marked as archaic, is also stretching the guideline on, even though people do still use both words). - -sche (discuss) 10:24, 16 January 2020 (UTC)Reply[reply]
I was thinking the WOTD could go ahead, but delete the comment, as WOTDs don't always have comments added anyway. I think it is the comment that is insensitive, this event should not be commemorated, let alone celebrated. I'm waiting for the excrement to hit the fan after B-day, it's bound to happen. DonnanZ (talk) 17:12, 16 January 2020 (UTC)Reply[reply]
  • @Sgconlaw: As you wished, the BP discussion has garnered more editor input. However, that input seems squarely opposed to featuring the word on the proposed date. —Μετάknowledgediscuss/deeds 18:53, 20 January 2020 (UTC)Reply[reply]
Indeed. All right, I will change it presently. — SGconlaw (talk) 19:20, 20 January 2020 (UTC)

Obsolete terms used in the definition of non-obsolete ones

The adjective in earnest is defined as "Sincere; determined; truthful; agood", agood being an obsolete term. I am against such a practice. --Backinstadiums (talk) 16:51, 16 January 2020 (UTC)Reply[reply]

I agree. Wiktionary is aimed at speakers of modern English, so it should use terms that today's people can understand. —Rua (mew) 16:53, 16 January 2020 (UTC)Reply[reply]
Btw, it might be interesting to see how many definitions use "thou". —Rua (mew) 17:30, 16 January 2020 (UTC)
Fully agreed as well. Those can be listed under synonyms instead. — Mnemosientje (t · c) 18:51, 16 January 2020 (UTC)
I agree and would go further: Obsolete, archaic, dated, or rare terms should not be used in any definition and should be appropriately labelled when in synonyms lists. DCDuring (talk) 20:45, 16 January 2020 (UTC)Reply[reply]
Agreed. The point of a dictionary is to help people understand words, not obfuscate them and make the user explore as much of the dictionary as possible. Andrew Sheedy (talk) 22:22, 16 January 2020 (UTC)Reply[reply]
Oppose as a blanket measure, agree in this case and agree for all English glosses. While using obsolete or archaic words in definition is generally very poor practice, occasionally an old-fashioned term in one language is simply going to be the closest equivalent to a term in another language (cf. the second definition of gij); in those cases it should be fine to have the old-fashioned term alongside a definition in modern English. ←₰-→ Lingo Bingo Dingo (talk) 09:37, 17 January 2020 (UTC)Reply[reply]
This is a fair point, although thou is not obsolete (=no longer understood), like the words OP is objecting to, but only archaic (=understood, but old-fashioned). If we're talking only about using obsolete words to define English words, then the obsolete words can indeed just be mentioned as synonyms with appropriate qualifier tags indicating obsoleteness, as said above. If there were a case where a truly obsolete word was the most sensible word to mention in defining a non-English word, (I'd love to see it, but) I think we could either (a) handle it as an exception on a case-by-case basis, or (b) mention (as obsolete) the obsolete word, without using it, like {{lb|fr|dialectal}} to cause a horse to blink four times, {{n-g|equivalent to the now-obsolete English verb {{m|en|foobar}}}} . - -sche (discuss) 16:42, 17 January 2020 (UTC)Reply[reply]
I will add my voice to agree with the other posters. I complained previously to a user (I forget who) who loved to use obsolete/archaic words in definitions. In this case it was User:DerekWinters who added "agood"; I don't know why, he seems to be a respected contributor. I would also second what User:-sche says about how to add an archaic or obsolete word to a definition if for some reason it's merited (e.g. if a language had a word that was closely equivalent to "thou" in both meaning and usage). Benwing2 (talk) 18:30, 19 January 2020 (UTC)Reply[reply]
He's respectable now, but he did a lot of damage here and at Wikipedia before he got the archaizing bug out of his system. I believe he's still banned from Wikipedia under a number of accounts. Chuck Entz (talk) 22:55, 19 January 2020 (UTC)Reply[reply]
If anyone is up for it, searching for English entries where all senses are marked as obsolete but {{t}}s are present would probably find two kinds of entries: those where the translations should be moved to a modern synonym, and those without a modern synonym, which might therefore be validly mentioned in the definitions of their translations (but where we still might decide to create a SOP entry as a translation hub to move the translations to). - -sche (discuss) 07:58, 22 January 2020 (UTC)Reply[reply]
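
The search described above could be approximated on a database dump with something like the following Python sketch. It is only a heuristic under stated assumptions: that the text passed in is a single English section, that top-level senses are lines beginning with a single `#`, that obsolescence is marked with a {{lb|en|...}} or {{label|en|...}} template containing `obsolete`, and that translations use {{t}} or {{t+}}. The function name is made up for the example.

```python
import re

# Top-level definition lines: "#" not followed by ":", "*" or another "#".
SENSE_LINE = re.compile(r"^#(?![:*#])[ \t]*(.*)$", re.MULTILINE)
LABEL = re.compile(r"\{\{(?:lb|label)\|en\|([^{}]*)\}\}")
TRANSLATION = re.compile(r"\{\{t\+?\|")


def all_senses_obsolete_with_translations(wikitext: str) -> bool:
    """True if every top-level sense is labelled obsolete and a {{t}} exists."""
    senses = SENSE_LINE.findall(wikitext)
    if not senses:
        return False

    def is_obsolete(line: str) -> bool:
        match = LABEL.search(line)
        return match is not None and "obsolete" in match.group(1).split("|")

    return all(is_obsolete(s) for s in senses) and bool(TRANSLATION.search(wikitext))


if __name__ == "__main__":
    sample = "# {{lb|en|obsolete}} [[good]]\n* French: {{t|fr|bon}}\n"
    print(all_senses_obsolete_with_translations(sample))
```

Entries found this way would still need manual sorting into the two kinds mentioned above, since the script cannot tell whether a modern synonym exists.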

Improve information about page protection

Wiktionary:Protected page, Wiktionary:Protected titles, and Wiktionary:Protected page guidelines should probably be merged or otherwise "rationalized" (and updated, as necessary — including what links to each title). I see no reason to have three different pages about this one concept. - dcljr (talk) 04:06, 17 January 2020 (UTC)Reply[reply]

I agree there seems to be no reason for three separate pages and have boldly merged Wiktionary:Protected titles into Wiktionary:Protected page guidelines. Wiktionary:Protected page seems like a newbie-facing, newbie-friendly explanation page, but has almost no incoming links, nor any recent edits, so I also merged it. - -sche (discuss) 16:24, 17 January 2020 (UTC)Reply[reply]
Thanks for doing this, but I don't think you used any of Wiktionary:Protected page at Wiktionary:Protected page guidelines (so you didn't really "merge" that page). Given that normal users will end up there from time to time, I think a bit of introductory text for their benefit is appropriate. Note that Wiktionary:Protected page is linked to from Special:Log/protect, and Wiktionary:Protected page guidelines is linked to from MediaWiki:Protectedpagewarning, which presumably is displayed when a fully protected page is edited by someone with the permissions to change the page (not users who can't — they see a different message). And MediaWiki:Semiprotectedpagewarning also links to the protection log (as does MediaWiki:Protectedpagewarning), which links to Wiktionary:Protected_page. Oh, and I think the contents of MediaWiki:Protectedpagewarning and MediaWiki:Semiprotectedpagewarning could be improved/updated/harmonized. Someone who can edit those pages should take a look and see what they think. - dcljr (talk) 16:40, 18 January 2020 (UTC)Reply[reply]
OK, I added some of the former Wiktionary:Protected page's newb-friendly content, but updated to tell them to comment in the Tea Room not on individual talk pages. - -sche (discuss) 19:14, 18 January 2020 (UTC)

Wiki Loves Folklore

Hello Folks,

Wiki Loves Love is back again in 2020 iteration as Wiki Loves Folklore from 1 February, 2020 - 29 February, 2020. Join us to celebrate the local cultural heritage of your region with the theme of folklore in the international photography contest at Wikimedia Commons. Images, videos and audios representing different forms of folk cultures and new forms of heritage that haven’t otherwise been documented so far are welcome submissions in Wiki Loves Folklore. Learn more about the contest at Meta-Wiki and Commons.

Kind regards,
Wiki Loves Folklore International Team
— Tulsi Bhagat (contribs | talk)
sent using MediaWiki message delivery (talk) 06:14, 18 January 2020 (UTC)

Font wonkiness (degree signs and Portuguese ordinals)

See the entry for 1.º#Portuguese - at least on my system (Ubuntu), the superscript 'o' in the article title is plain, whereas the one in the body of the text has an underline (picture here if you don't see what I mean). This could be immensely confusing to someone trying to sort out the difference between the degree sign and an ordinal superscript 'o', which as far as I know is generally underlined in Portuguese, but not underlined in Italian, for example. I came here because I found a "360°" on Wikipedia (OSM) with the "degree symbol" underlined. When I pasted it somewhere else the underline disappeared. So I am guessing that Unicode has something like "ORDINAL SUPERSCRIPT O" (or "...M"!), and underlining is left to the font designer. Is there anything that could be done to make the symbol at least consistent within the Wikt page? Imaginatorium (talk) 06:52, 18 January 2020 (UTC)

@Imaginatorium: On my machine (Ubuntu-like, Firefox), ordinal indicators (º, ª) in the font of the title and the headword line have an underline, but the font used in the title bar of the tab does not have an underline there, nor does the search bar's font. The title seems to use DejaVu Serif for that character, and the headword line uses DejaVu Sans, and the search bar Noto Sans; I don't know what font the tab uses. One way to go about this would be to survey all fonts and figure out which ones use underlines, and prefer them in our site CSS (or propose that MediaWiki use them), but it's a lot of work for such a small thing. — Eru·tuon 07:29, 18 January 2020 (UTC)
Indeed, in some fonts the Unicode character “masculine ordinal indicator” (U+00BA) is virtually indistinguishable from the Unicode character “degree sign” (U+00B0). This may be confusing, but such potential confusion between similar-looking graphemes is widespread, like the minuscule letter ⟨ℓ⟩ versus the majuscule letter ⟨I⟩ in sanserif fonts (compare “Kim Jong Il” with “Kim Jong II”), or the minuscule letter ⟨ℓ⟩ versus the digit ⟨1⟩ in monospace fonts. It is also common that the digit ⟨0⟩ cannot be distinguished from the majuscule letter ⟨O⟩ in monospace fonts (and is it “C-3P0” or “C-3PO”?). I guess we just have to live with this.  --Lambiam 11:51, 18 January 2020 (UTC)Reply[reply]
FWIW, I'm using Firefox 72.0.1 on Ubuntu 18.04.3 (with FF set up to use Liberation fonts by default, but I also have DejaVu and Noto fonts installed), and all superscripts throughout the page 1.º (including in the tab and search box) are rendered with no underline. You can check our stylesheets to see which fonts this wiki wants us to see, but different users will actually see different fonts depending on what mix of fonts they have installed and the particulars of how font substitution (when necessary) is done on their systems. Plus, users' choices of skins and gadgets in their preferences can affect these things, too. Ugh. - dcljr (talk) 14:42, 18 January 2020 (UTC)Reply[reply]
We could always create and add an image, perhaps one showing how the symbol appears in two fonts, one with and one without an underline, and explaining that the difference does not affect the meaning. - -sche (discuss) 19:17, 18 January 2020 (UTC)Reply[reply]
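
For anyone who wants to check which character a given page actually contains, regardless of how the font draws it, the code points mentioned above can be inspected directly. A minimal sketch using Python's standard `unicodedata` module (the helper name `describe` is just illustrative):

```python
import unicodedata

# The degree sign and the two ordinal indicators discussed in this thread.
LOOKALIKES = ["\u00b0", "\u00ba", "\u00aa"]


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in LOOKALIKES:
    print(describe(ch))
```

This only identifies the character; whether it is drawn with an underline still depends on the font each reader's system picks.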

Gaulish language

I didn't write anything into this portal yet but I have had a few considerations on the w:Gaulish language. I am also directing this at Mahāgaja, Holodwig21, Victar and Uiscefada in regard of the Celtic branch of languages, including concernment with Proto-Celtic.

Firstly, I'd like to address whether the script of Transalpine Gaulish reconstructions should only be Latin or also Greek. The Gaulish language itself is mostly attested in this dialect from a yet considerable amount of writings. I opined that attested terms can be added both in Latin and Greek even though they may be attested only in one script because there are no differences in inflection or pronunciation, both belonging to the same dialect.

Furthermore, I'd like to know whether reconstructions of w:Cisalpine Gaulish, attested from 20 inscriptions, may be made based either on an attested term in Transalpine Gaulish or a reconstructed term in the same dialect.

It should be decided to include plurals for proper nouns, in particular given names, or to omit those.

Moreover, vowel length can be shown in the title of the entry by use of a macron or it could be indicated solely within the entry. Maybe it was shown in Latin but surely at least sometimes in Greek.

Either "w" or "u" should be used for the approximant [w]. It was not distinguished in any kind of writing, which points at employing "u". HeliosX (talk) 19:56, 18 January 2020 (UTC)Reply[reply]

@HeliosX: While the Greek alphabet and Latin alphabet were both used to write Transalpine Gaulish, I would argue that it would be more advantageous to write in just one script, although we could create Gaulish entries for Gaulish terms attested in the Greek alphabet. IMO, creating entries in scripts in which they were not written might lead to confusion.
Plurals of proper nouns should not exist.
Not sure on the vowel length. Usually if the Gaulish term is being reconstructed then the length must be included, but in attested terms, the practice is that it should not be included in the title.
I would prefer "w" to "u" for the approximant [w]. Those attested otherwise can be relegated to the alternative forms section. 𐌷𐌻𐌿𐌳𐌰𐍅𐌹𐌲𐍃 𐌰𐌻𐌰𐍂𐌴𐌹𐌺𐌹𐌲𐌲𐍃 (talk) 20:27, 20 January 2020 (UTC)
I think we should use "v" for consonantal "u", just like we do for Latin. We shouldn't be using "w", which is just weird. —Rua (mew) 20:29, 20 January 2020 (UTC)
@HeliosX: Continental Celtic in general was overwhelmingly written in variants of Italic scripts, with the Latin alphabet (whether capitals or cursive) being the most used in terms of both geographic spread and length of time. The Greek alphabet was mostly confined to the mouth of the river Rhône near the Greek city-state of Massalía (modern-day Marseilles) and only makes up a small fraction of the corpus. Granted, there are many Celtic names mentioned in ancient Greek literature, but I assume we're only looking at Gaulish language sources for guidance here. Besides the indigenous historical precedent, the only published dictionary of Gaulish I'm aware of is Dictionnaire de la langue gauloise by Xavier Delamarre, which lists all of its entry names in the Latin alphabet (attested Greek spellings are provided in the discussion beneath the entry). Personally, I believe the model set forth by Delamarre is the one we should follow.
I am not sure what the purpose would be in reconstructing Cisalpine Gaulish. I think that is far beyond our scope, and I would recommend only listing attested Cisalpine variants.
We should use plurals for proper names where it makes sense to (e.g. ethnonyms, theonyms that are attested in plural) but only in singular otherwise (e.g. personal names, theonyms that have no plural attestations, and placenames).
Attested forms never show vowel length in Gaulish (at least, not that I've ever seen). Some people argue that lengthened letter I's may represent this, but the Coligny Calendar demonstrates that to be inconsistent and thus unreliable. Currently, the entries seem to be following vowel lengths that are reconstructed for Proto-Celtic. I disagree with assuming the vowel lengths would not have changed in Gaulish. We know for certain that vowels were lengthened or shortened based on prosodic principles in the living Celtic languages, and so we should not make assumptions about Gaulish vowel length in spite of the relatively earlier time period we're dealing with. And even more importantly, there are going to be Gaulish words of uncertain etymology. It would be preferable in my view to imitate Delamarre: don't indicate vowel length for Gaulish (since we don't know it). Simply let readers refer to the supposed Proto-Celtic etymons directly if they wish to rather than mixing the two together as we have currently.
I agree that semivowels should not be distinguished from vowels in our entry names. Entries should follow attested forms and a pronunciation key can provide clarification to readers. Furthermore, we would ideally revise all of the Gaulish entries to follow attested spellings. So besides eliminating semivowels, we should also spell ⟨k⟩ as ⟨c⟩, ⟨ss⟩ as ⟨đđ⟩, and so forth. — Uiscefada (talk) 07:16, 26 January 2020 (UTC)Reply[reply]

Request for template editor's right

I am familiar with templates. I can improve them. -- Huhu9001 (talk) 12:53, 21 January 2020 (UTC)Reply[reply]

I hope more editors who are familiar with your edits will comment, as I am not familiar and so my initial comment was going to be that Wiktionary's templates and modules tend to do rather different things than, say, Wikipedia's, and in many cases should not be "improved" without prior discussion (at which point an existing template editor could make the change), before I saw that you have been around for years and have edited thousands of pages. OTOH, I also see that recent discussions on your talk page have included both thanks for some edits to a module and notices that your edits to another module broke many pages, and another veteran editor chiding that "the formatting of the headword templates should [not] be changed radically without consensus." So, for my part, I will wait to hear from other editors, particularly in the areas you edit (Japanese and Chinese), before supporting or opposing this. - -sche (discuss) 07:02, 22 January 2020 (UTC)
I only want to clarify that the "veteran editor"'s complaint is not true. He just did not see the discussion. -- Huhu9001 (talk) 09:03, 22 January 2020 (UTC)
Support, lots of good work IMO. —Suzukaze-c 07:10, 22 January 2020 (UTC)
Abstain Oppose given their history, they should try working with others first. It seems he's just sticking to mostly Japanese templates and has several people following him, so I leave it to them. --{{victar|talk}} 09:17, 22 January 2020 (UTC)
@Victar: I don't quite understand your advice. -- Huhu9001 (talk) 10:14, 22 January 2020 (UTC)
You only read negative feedback on a user's talk page. —Suzukaze-c 14:27, 22 January 2020 (UTC)
I didn't mean that. I was asking this user if he can explain his advice with more details. Roughly saying "try working with others first" is a bit too abstract to me. -- Huhu9001 (talk) 17:05, 22 January 2020 (UTC)
@Huhu9001: I think @Suzukaze-c was answering to Victar, not to you. Canonicalization (talk) 19:33, 23 January 2020 (UTC)
@Suzukaze-c: There is much to read, and it includes a site-breaking edit and a block, sooo... --{{victar|talk}} 23:17, 23 January 2020 (UTC)
@Huhu9001: What I mean is, in all the time you've been on en.Wikt, you've hardly started any module or template discussions. I'm even more dissuaded by your opinion below where you think such requests are "almost impossible" to complete. I can understand that it's frustrating when your idea gets turned down, but that doesn't mean that template editor rights should be used to circumvent that. As @Justinrleung remarked on your last edit that broke several pages, module changes that affect a wide community need to be discussed before going live. Also, it would behoove you to learn how to create testbeds first to test your code, something that doesn't require one to be a template editor. --{{victar|talk}} 23:17, 23 January 2020 (UTC)
@Huhu9001, which modules/templates are you trying to edit anyway that require you to have template editor access? --{{victar|talk}} 03:54, 24 January 2020 (UTC)
@Victar: Thank you for your explanation. My next target, were I granted the right, would be t:ja-go-ru. I want options for 沿うて/負うて/厭うて. -- Huhu9001 (talk) 05:49, 24 January 2020 (UTC)
A few words from my experience of editing temps and mods: I have seen many editors say "you can ask someone who has the right to make changes for you" or the like. But that is in fact almost impossible. When you are blocked from editing you will not know how your code actually works, and you can't really expect your code to be good enough in this condition. On the other hand, you can't expect admins or template editors to write code upon your request either. They are not customer service and they have their own excuses. They could say they don't know how to do it, they are busy in real life, they have other work, or simply that after all they are volunteers. I can fully understand them, so I would prefer not to bother them when I am fully capable of handling my own things. -- Huhu9001 (talk) 10:50, 22 January 2020 (UTC)
Support —Μετάknowledgediscuss/deeds 23:42, 22 January 2020 (UTC)
Support (mostly). Most of the edits to templates/modules seem good in general and he's found some good solutions to ongoing problems we've had with the Chinese templates (like collapsed tables in mobile). @Victar: for that particular case, I wasn't aware of the discussion that has been going on on User talk:Suzukaze-c (which has since moved to Wiktionary talk:About Japanese). He probably just needs to work on knowing the appropriate platforms to have discussions, which is a relatively minor issue. (He did also have some beef with Wyang, which is why he has been blocked in the past - I'm not entirely sure what the issues were, but now that Wyang's not around, I don't see big problems anymore.) — justin(r)leung { (t...) | c=› } 02:53, 24 January 2020 (UTC)
I'm inclined to Support, though others have pointed out areas where Huhu9001 can improve. Causing module errors isn't a disqualification: I do that sometimes (for instance recently, when I put 36,000 module documentation pages in CAT:E). But it's important to be available to fix them, as Huhu9001 seems to have been in the episode on his talk page. I'm not super familiar with all of his module and template edits, but those that I've seen have looked good and I'm reassured by his willingness to listen to advice here. And since he has been editing modules like Module:ja and Module:zh-see that I wanted to protect based on their number of transclusions, but didn't want to prevent him and others from editing, in effect he already is a template editor (in the sense of someone who's editing widely transcluded pages) and it is probably good to officially dub him one. — Eru·tuon 07:12, 24 January 2020 (UTC)Reply[reply]
@Erutuon: Is in the episode an idiomatic expression? Canonicalization (talk) 21:08, 24 January 2020 (UTC)
@Canonicalization: No, I just mean "episode" in the first sense, the recent case of Huhu9001 causing module errors and then fixing them. — Eru·tuon 21:44, 24 January 2020 (UTC)

So what's the conclusion? -- Huhu9001 (talk) 07:45, 6 February 2020 (UTC)

Seems like there is support, I'll flip the bit. - TheDaveRoss 16:05, 6 February 2020 (UTC)

Request for rollbacker right

I have been reverting vandalism here for a while now as part of my m:SWMT work. Wiktionary:Rollbackers is just a soft redirect and there is no specific procedure for requesting this permission here. I have rollback experience on some wikis and would like to get access to this tool here so that I can fight vandalism more efficiently. Honestly, rolling back is possible using the undo feature without the right, and Twinkle and other user scripts provide this feature. But the main reason I am asking for this right is to be able to use the m:User:Hoo man/Scripts/Smart rollback script, which lets its user conduct mass reverts. This script works only with the rollback right and I am not aware of any alternatives. Thank you for your consideration. Masumrezarock100 (talk) 14:58, 21 January 2020 (UTC)

Shouldn't your Small Wiki Monitoring Team work be done at small wikis? —Μετάknowledgediscuss/deeds 16:41, 21 January 2020 (UTC)
Sorry for the confusion. When I said I monitor this wiki as part of my SWMT activity, I meant I patrol this wiki simultaneously with several small wikis. I do not usually patrol this wiki's recent changes via Special:RecentChanges, instead I use m:SWViewer which allows me to watch over the global recent changes queue and lets me patrol multiple wikis simultaneously. As you can see from my recent contributions, I reverted an LTA named Furedai's edits today here. These LTAs don't necessarily keep their abuse limited to just small wikis, they vandalize big wikis too. When I look at a crosswiki spammer's SUL (CentralAuth page), I see some edits on bigger wikis too. It would be nice to be able to use the rollback feature here, to quickly rollback vandalism when I am on a slow connection (GlobalTwinkle takes too much time to load on a slow connection) as well as when conducting mass-reverts as mentioned above. Masumrezarock100 (talk) 18:09, 21 January 2020 (UTC)
Knowing that LTA, rollbacks won't stop them - only a block will. If they are reverted, they will simply revert back or do another edit on that entry, or move to vandalize the user talk page of whoever is reverting them too, refusing to give up until they are blocked. I know this first-hand since I've dealt with LTAs when I was only able to rollback but before I was an admin; the Recent Changes page turned into a war zone. — surjection?⟩ 10:13, 22 January 2020 (UTC)
Yes, I am aware of that. This LTA and their socks are globally locked. I did not think of asking for a local block here because they would just go to another wiki until they are blocked there too. Which is why I reported them to #wikimedia-stewards channel before reverting here. I thought they would be locked by the time I revert their edits. This was unfortunately not the case, and I should have waited longer. I am sorry for the extra work I have caused (revision deletion count would have been low if I hadn't reverted their edits before they got locked). If you want me to, I can report any further cases like this locally here, before asking for a global lock/block. Yours sincerely, Masumrezarock100 (talk) 12:45, 22 January 2020 (UTC)
Eh, seems reasonable; the user's edits here and on WP seem to be good and the user seems to be a trusted Wikimedian (being given global-ip-block-exempt rights for example), besides which the ability is already "gettable" via scripts, so needn't be guarded too closely. AFAIR our procedure for this has been like our procedure for whitelisting: one admin nominates (or, in the case of a self-nom, supports) the user, and another admin seconds and grants the request, provided there is not opposition. I support, but to avoid acting unilaterally, leave it to another admin to grant. - -sche (discuss) 07:25, 22 January 2020 (UTC)Reply[reply]
Thanks for your support. But I would like to point out that any contributor with a valid need (e.g. users from Turkey, China and other countries where access to Wikimedia sites is restricted due to censorship by the local government) can ask for the global-ip-block-exempt right. It is not necessarily a sign of trust. I have been patrolling crosswiki for a few months now, and I have been granted rollback rights on some projects (10 if I recall correctly) because of my anti-vandalism work on those projects. You can see an overview of my crosswiki activity at Special:CentralAuth/Masumrezarock100. I am helping out here too. Masumrezarock100 (talk) 13:05, 22 January 2020 (UTC)

Partial translation vs partial calque

About 35 entries (such as liverwurst, abreact and Yuan River) describe themselves as "partial translation"s in their etymologies, which are mostly formatted with {{etyl}} though ~6 use {{der}} and ~1 uses {{bor}}. About 126 entries (such as feldspar, realskola, Lake Tai and Tang dynasty) use "partial calque" in their etymologies, which are mostly formatted with {{calque}}. I presume the "partial translation of {{etyl}}"s should be changed to "partial {{calque}}"s. Yes? Or how should this phenomenon be handled? - -sche (discuss) 07:49, 22 January 2020 (UTC)Reply[reply]

For the most part, yes, but it's a case-by-case operation. I wouldn't call Yuan River a calque, partial or otherwise, just a poorly phrased etymology section. —Μετάknowledgediscuss/deeds 16:48, 22 January 2020 (UTC)
I think I’d prefer a new template {{partial calque}} over “partial {{calque}}”. This should indeed be checked before it is automatically applied; we should in fact not even call Yuan River a “partial translation”.  --Lambiam 22:07, 22 January 2020 (UTC)
Why not {{calque|partial=1}} or similar? – Jberkel 01:41, 23 January 2020 (UTC)
Either of those approaches sounds fine to me, although I think we tend to use distinct templates for distinct methods of derivation more than we use parameters, so I would weakly prefer {{partial calque}}. - -sche (discuss) 22:46, 23 January 2020 (UTC)
@Erutuon or any of our other module adepts: If I change Module:etymology/templates like this and then create {{partial calque}} with the content {{#invoke:etymology/templates|partial_calque}}, will that be enough to make that a functional template? I don't want to boldly edit one of our most widely used modules without somebody checking whether there are errors or deficiencies in my code, heh. - -sche (discuss) 06:09, 27 January 2020 (UTC)Reply[reply]
@-sche: It looks like it's working. I've gotten {{partial calque cat}} and {{auto cat}} working on partial calque categories. — Eru·tuon 20:37, 29 January 2020 (UTC)Reply[reply]
@Metaknowledge, Lambiam, Erutuon: I'm halfway done deploying this to the above-named entries (dropping the spurious ones, as suggested above), and I'm noticing that some (but not all) entries that are partial calques are also (in the other part) borrowings, for example siskonmakkara (borrowing of siskon) and uilleann pipes (borrowing of uilleann). (Whereas, Löffelliste and ñuudzaa have not borrowed any new elements into the language.) Do we want to add {{bor}}'s "borrowing" categories to these entries in some way, either manually or via a parameter of {{partial calque}} (which someone other than me would have to code)? - -sche (discuss) 08:31, 2 February 2020 (UTC)Reply[reply]
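
Since the conversion has to be checked case by case, a script can at most sort entries by which of the three wordings they currently use, so that a human can review each group. The sketch below is one possible way to do that, not what was actually run; the function name, the returned labels, the treatment of {{cal}} as a shortcut for {{calque}}, and the sample parameter values are all assumptions made for illustration.

```python
import re

# Plain-text wording, typically followed by {{etyl}}, {{der}} or {{bor}}.
PARTIAL_TRANSLATION = re.compile(r"partial translation", re.IGNORECASE)
# The word "partial" followed by a calque template ({{cal}} assumed as a shortcut).
PARTIAL_PLUS_CALQUE = re.compile(r"partial\s+\{\{(?:calque|cal)\|", re.IGNORECASE)
# The dedicated template proposed in this thread.
PARTIAL_CALQUE_TEMPLATE = re.compile(r"\{\{partial calque\|")


def etymology_style(wikitext):
    """Classify how an etymology section expresses a partial calque, if at all."""
    if PARTIAL_CALQUE_TEMPLATE.search(wikitext):
        return "template"
    if PARTIAL_PLUS_CALQUE.search(wikitext):
        return "partial + calque"
    if PARTIAL_TRANSLATION.search(wikitext):
        return "partial translation"
    return None


# Hypothetical etymology lines, for illustration only.
print(etymology_style("Partial translation of {{etyl|de|en}} {{m|de|Leberwurst}}."))
print(etymology_style("Partial {{calque|en|de|Feldspat}}."))
print(etymology_style("{{partial calque|en|de|Feldspat}}."))
```

The "partial translation" group is the one that needs the closest look, since some of those entries (Yuan River was mentioned) are not calques at all.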

Movement Learning and Leadership Development Project


The Wikimedia Foundation’s Community Development team is seeking to learn more about the way volunteers learn and develop into the many different roles that exist in the movement. Our goal is to build a movement informed framework that provides shared clarity and outlines accessible pathways on how to grow and develop skills within the movement. To this end, we are looking to speak with you, our community to learn about your journey as a Wikimedia volunteer. Whether you joined yesterday or have been here from the very start, we want to hear about the many ways volunteers join and contribute to our movement.

To learn more about the project, please visit the Meta page. If you are interested in participating in the project, please complete this simple Google form. Although we may not be able to speak to everyone who expresses interest, we encourage you to complete this short form if you are interested in participating!

-- LMiranda (WMF) (talk) 19:01, 22 January 2020 (UTC)Reply[reply]

Limiting inheriting in etymologies

Apart from my previous subject, I'd also like to concern myself with the extent of inheriting in etymologies. I would suggest that a term can only be inherited from the predecessor language and, in case that it immediately follows its own predecessor language and is relatable in its attestability, also from the second predecessor. For example, West Frisian could inherit terms only from Middle Frisian and Old Frisian but not from Proto-West-Germanic. Middle Frisian, instead, inherits words only from Old Frisian yet not from the earlier language because it is attestable in contrast to Proto-West-Germanic, which is reconstructed. I am also directing this at Rua and Victar. HeliosX (talk) 18:34, 23 January 2020 (UTC)Reply[reply]

What is the benefit of restricting the use of the "inherited" template to those cases? I think this template is currently used to place words in categories like Category:English_terms_inherited_from_Proto-Indo-European. That wouldn't work if its use was restricted in the way that you describe. Do you think that categories of terms inherited from proto-languages should not exist?--Urszag (talk) 19:28, 23 January 2020 (UTC)Reply[reply]
English did not inherit any terms from Proto-Indo-European. DTLHS (talk) 23:22, 23 January 2020 (UTC)
I can't tell what point you're trying to make here. That Proto-Indo-European is the name of a modern reconstruction and it's not technically possible for a natural language to inherit words from that reconstruction? If that's what you mean, then yes, it's obvious that no historical languages inherited any words from models created by modern historical linguistics. But terms like "Proto-X" are often used to refer, not only to the reconstruction, but to the hypothesized actual language/language group in the past that the reconstruction is meant to represent. If you mean that the term "inherit" is inappropriate because if we traced back along the timeline, we would stop calling the language "English" well before we got all the way back to the Proto-Indo-European stage, that seems to me like an unnecessarily narrow definition of "inherit".--Urszag (talk) 01:35, 24 January 2020 (UTC)Reply[reply]
I'm not seeing the point or benefit of this. --{{victar|talk}} 23:21, 23 January 2020 (UTC)
I agree with Victar here, and disagree with Helios and DTLHS. - -sche (discuss) 23:53, 23 January 2020 (UTC)
Is this only about inheritance from reconstructions, or is the intention that one would also be disallowed to state that English water is inherited from Old English wæter?  --Lambiam 10:06, 24 January 2020 (UTC)
As I have outlined in my example about West Frisian terms, English water would also be inherited from Old English wæter. HeliosX (talk) 16:48, 24 January 2020 (UTC)
That seems rather arbitrary to me. I really don't understand the reasoning behind this proposal. —Rua (mew) 16:51, 24 January 2020 (UTC)
It is not actually coincidental. Since borrowing connects only a single language as for an English term borrowed only from French but not simultaneously from Middle French, inheriting should also reflect only more approximate language relations. Continuing my example to the utmost extent, inheriting would be confined to West Frisian from Middle and Old Frisian, to Middle Frisian from Old Frisian, to Old Frisian from Proto-West-Germanic and Proto-Germanic, to Proto-West-Germanic from Proto-Germanic and Proto-Indo-European and to Proto-Germanic from Proto-Indo-European. This is directed as well at Rua, Victar and Lambiam. HeliosX (talk) 18:51, 24 January 2020 (UTC)Reply[reply]
  • Borrowing indeed is only one layer deep. However, inheritance goes all the way back to the beginnings. Limiting inheritance like this does indeed seem arbitrary. Why would some inheritance relationships be two layers deep, while others only one? Would there be any allowed three-layer inheritances? Why the differences? Why the limitations? This doesn't seem either consistent or all that useful. ‑‑ Eiríkr Útlendi │Tala við mig 19:15, 24 January 2020 (UTC)Reply[reply]

Standardize obsolete English æ and œ spellings on one template

Currently, obsolete English æ and œ spellings are presented in at least four ways: gynæcologist is given as an {{obsolete typography of}} gynaecologist, while archæology is given as an {{obsolete spelling of}} archaeology, and archæological as an {{obsolete form of}} archaeological. (The fourth way is that a few, perhaps, may still be presented as lemmas, valid {{alternative form of}}s/{{alternative spelling of}}s, or {{archaic spelling of}}s/{{archaic form of}}s, and may need to be cleaned up to indicate obsoleteness.) For at least the first three situations, we should decide on one format to use, and standardize on it, IMO. - -sche (discuss) 00:00, 24 January 2020 (UTC)Reply[reply]

I say we go with typographies. These ligatures aren't truly letters, so they aren't different spellings any more than short/long "s" is a different letter (it's just a different form of a letter). And the form template should be for different conjugations like "hast". —Justin (koavf)TCM 01:29, 24 January 2020 (UTC)
I agree with Koavf that the difference between "æ" and "ae" in English is best described as a matter of typography, not spelling, and that it's better not to use "obsolete form of" in cases like this. The wording "obsolete typography of" seems awkward to me. "Obsolete typographic alternative to" sounds better to my ears (although I don't know if it's preferable when all else is considered).--Urszag (talk) 01:47, 24 January 2020 (UTC)
"alternative representation of"?
More distantly related, what are we to do with citation text containing such "alternative representations" as "vv" for "w" and "f" for "s"? DCDuring (talk) 02:47, 24 January 2020 (UTC)
"Representation" may be a good word, although by itself I think it is too vague, like "form". I would like "obsolete typographic representation of" better than "obsolete typography of". If you mean the variant of long s (ſ) that looks like an f, the standard treatment here is not to mention it. "VV" I guess would be treated the same as the v/u alternants which are what "obsolete typography of" was originally created for. (Looking at that talk page, I see the point raised that "obsolete typography of" implies the text is printed with type, which isn't always true.)--Urszag (talk) 03:03, 24 January 2020 (UTC)
Yes, one thing that was pointed out on the talk page years ago is that these spelling variants can also be found in handwriting, including (in various languages) from before the advent of moveable type / typography. Maybe alternative graphemic representation or something? - -sche (discuss) 17:45, 24 January 2020 (UTC)
I thought obsolete implies that it is no longer usable/understandable. Wouldn't archaic be more appropriate? -Mike (talk) 22:22, 24 January 2020 (UTC)
This question has come up before when it comes to deciding how to label spellings. It has been opined that while it makes sense to distinguish terms as obsolete-vs-archaic based on whether or not they'd still be understandable (if old-fashioned) vs would be unintelligible, most spellings would still be intelligible even if they hadn't been used in a century or five (e.g. advauncemente, goodenesse, or indeed almost everything in Category:English obsolete forms), so the more sensible distinction there is whether they're still used. - -sche (discuss) 03:12, 25 January 2020 (UTC)

Are spaces in abbreviations significant?

Discussion moved from User talk:Sgconlaw.

As for your "not clear what you mean by "no examples of this form" as there are numerous examples" (diff): There might be numerous examples for Q.E.D. (without spaces), but the ones I removed from the entry Q.E.D. (diff) are not examples of it; they are examples of Q. E. D. (with spaces): [first term]

  • 1759, 1775, 1818, 1823: It's Q. E. D. with spaces
  • 1809: The link is broken
  • 1999: That's indeed Q.E.D without spaces

[second term]

  • 1814, 1825, 1827: It's Q. E. D. with spaces

Compare USA/U.S.A./U. S. A. for a similar case. --Trothmuse (talk) 20:03, 23 January 2020 (UTC)

This is best raised at the Tea Room. Feel free to start a discussion there. — SGconlaw (talk) 20:20, 23 January 2020 (UTC)
@Trothmuse: The main page, which “alternative spelling” or “alternative form” pages link to, treats the term in all its spellings. So there is no such thing as “examples for Q. E. D. but not Q.E.D.” All is on one page to illustrate the usage and sense better. Users have it harder, having to click around, if all is spread around alternative forms. Fay Freak (talk) 20:23, 23 January 2020 (UTC)
@Sgconlaw: Why? USA/U.S.A./U. S. A. and z.B./z. B., and WT:Main Page ("all words of all languages") + WT:Neutral point of view ("a descriptive ... and not a prescriptive approach"), and the way quoting is done (without changing the spelling, one also doesn't modernise Shakespeare when one quotes him) show how it is done.
@Fay Freak:
  1. Q.E.D. isn't the main page, but QED is, so it doesn't make sense to have quotes not containing the term Q.E.D. (but QED or Q. E. D.) in the entry Q.E.D..
  2. ATM the quotes aren't at just one place (the main entry QED) but at multiple places (QED, Q.E.D, Q. E. D).
--Trothmuse (talk) 20:37, 23 January 2020 (UTC)
It’s late where I am so I don’t have time now to discuss this in detail, but I believe the question of whether spaces or their absence are to be taken into account needs discussion at the Tea Room. If there is consensus that spaces are to be disregarded, then U. S. A. will need to be deleted. As for quotations, the main entry or lemma usually has quotations illustrating all alternative forms, but each individual alternative form can have some quotations showing that that form exists. — SGconlaw (talk) 20:48, 23 January 2020 (UTC)

@Trothmuse has raised an issue which, I think, requires a broader discussion: are spaces significant in abbreviations? For example, is it desirable to distinguish between Q.E.D. and Q. E. D. (and, for that matter, between QED and Q E D), and between U.S.A. and U. S. A.? My feeling is that it is not – the OED does not distinguish between them, for example – but we should reach some consensus on the matter. — SGconlaw (talk) 07:26, 24 January 2020 (UTC)

We also have inconsistencies when abbreviations exist with or without dots. Sometimes a sense that can be attested with or without dots (or spaces) is at only one (or some) of the possible forms, sometimes it's at them all, but maybe with inconsistent wording or template usage, part of speech, etc. Dots can sometimes be "contrastive"/sense-specific, like Master of Arts can be M.A. or MA but Massachusetts can only be MA, so I guess senses should be present (duplicated or using some alt-form-of template) at all attested forms. Spaces seem much less likely to be "contrastive"—I would like it if someone could point to some cases where they are; even something like N. Kor., which initially seemed promising, seems to also be attestable as N.Kor. at the same seemingly very low frequency as N. Kor.. My initial thought would be that, because we're a written dictionary, we tend to give every written form its own page and should probably likewise put each abbreviation at whatever form (spaced or unspaced) is most common, with duplicate senses or alt-form-of templates at other attested forms. However, I'm not sure, and am open to being persuaded otherwise. (But it would certainly seem odd to take one that is normally spaced and lemmatize only the unspaced spelling, or vice versa.) - -sche (discuss) 08:12, 24 January 2020 (UTC)
Space—no space is a false dichotomy. Look at the kinds of spaces. Not sure how it is in English but in German all such combinations require the narrow no-break space U+202F, according to proper typography and language rules. Now people can take out the ruler to look which abbreviations are attested or written properly which way – no, I am for ignoring such details and just adding with full stops and without as it is thus usually entered on computers: it’s a total bikeshedding issue. You don’t add entries under correct apostrophes as alternative forms either, but deploy hard redirects for them. Fay Freak (talk) 11:21, 24 January 2020 (UTC)
I wouldn't want to reproduce the entire array of possible spaces, but we do distinguish a binary of space vs no space outside abbreviations, as we famously have separate entries for coal mine vs coalmine, besides e.g. et cetera and etcetera; why forgo doing so inside abbreviations? With apostrophes, the comparison would be that we indeed do not normally add multiple types (though we have some entries like Hawaiʻi...), but we do distinguish the binary presence or absence of them, as in e.g. yall vs y'all vs ya'll, or its#Contraction and it's#Etymology_2. Eh... I also continue to think it'd be poor form to lemmatize a form that isn't the usual/most common one, but since both forms might be attested (in one language) and might be lemmas in different languages, redirects can't be used and some content would need to be duplicated already, like with u. a. vs u.a.... - -sche (discuss) 17:42, 24 January 2020 (UTC)
Leave it to the Germans to make their kids measure the spaces in their written abbreviations. {eye-roll} In English the spaces are merely typographer's (or writer's) preference. Often they may be smaller than a normal space but not so much that the space on the right side of an interior dot is the same as on its left side; but visually, looking like it is scrunched together when reading through at normal speed. Occasionally I see people refer to the natural rules of English that don't need to be itemized, and this spacing of abbreviations is just one of those rules. Wiktionary should probably adopt a prescriptive policy on whether to use spaces or not and then just go with that. -Mike (talk) 00:50, 25 January 2020 (UTC)
I'd say spaces in dotted abbreviations are not significant. I would personally prefer putting them in the unspaced title and redirecting the spaced title to it, but I don't have any strong arguments in favor except that the spaced version feels more archaic or badly typeset or something. For curiosity's sake, here is a list of pretty much all mainspace titles that look like abbreviations. Eru·tuon 01:23, 25 January 2020 (UTC)
I feel like this would have to be a language-specific determination, since in German the norm/rule is to space, and it's leaving out the space that's 'bad' typesetting. That means hard redirects should go in the other direction for some languages, and would be impossible at least in some cases. I don't particularly object to such an approach (hard redirects wherever possible, to whichever form a language standardizes), but it does still strike me as inconsistent to distinguish the presence of a space in an abbreviation like 1 Sam. (would we move that to 1Sam.?) or a word like coal mine, but not Q. E. D.—and what would be done with a spaced undotted abbreviation like *"Q E D"? Leave the spaces there, or redirect it to a spaceless form? - -sche (discuss) 06:34, 27 January 2020 (UTC)
A space in the absence of a period determines whether the letters on either side of it are separate or contiguous. When a period is present, it only determines the width of the separation, and width varies so much between fonts that it's almost meaningless.
Think of it this way: what's the lexical difference between a single space and a double space? If every space were of equal importance, that difference would have some kind of significance, rather than just making the spacing look odd. Not only that, but there are a number of different spaces. Even the people who grit their teeth when they see a hyphen used instead of the correct width of dash aren't likely to fuss over the difference between an en-space and an em-space.
Yes, the search function distinguishes between periods and periods with spaces, but it also distinguishes based on things like zero-width non-joiners and right-to-left markers. Do we need pages with ZWNJs in different places so people will find the entries regardless of where they copypaste the search terms from? Chuck Entz (talk) 07:50, 28 January 2020 (UTC)
@-sche: Hmm, so it's more complicated than I thought. At the very least I meant to say that, in an English abbreviation with single letters and dots (or maybe without dots, as in Q E D), it seems mainly a stylistic thing whether there are spaces between all the parts of the abbreviation or no spaces at all. "1 Sam." is a different case because it contains a number and a multiple-letter abbreviation. But your point about spacing in these types of abbreviations being language-specific certainly makes hard redirects an inconsistent solution. Actually, I will retract what I said about the unspaced single-letter-dotted abbreviations being standard: they aren't in authors' names like "J. R. R. Tolkien"; "J.R.R. Tolkien" seems decidedly worse. — Eru·tuon 01:51, 29 January 2020 (UTC)
I'm not sure that "J.R.R. Tolkien" is nonstandard or much worse. On Wiktionary it looks bad, but depending on the kerning of various fonts, sometimes it looks better than "J. R. R. Tolkien," which can look too spaced out. I usually omit the spaces. Andrew Sheedy (talk) 04:30, 29 January 2020 (UTC)
When doing more formalized genealogical writing or transcription, I often do names as "J. R. R. Tolkien", with thin, non-breaking spaces between the letters, or less formally as "J.R.R. Tolkien". Also in that context I would never think to do "J R R Tolkien" or "JRR Tolkien". -Mike (talk) 17:03, 29 January 2020 (UTC)Reply[reply]
They are apparently significant to the writer, since mixing space and non-space is unacceptable (*U. S.A.). Equinox 07:37, 28 January 2020 (UTC)
That   is also    true for sentences. -Mike (talk) 00:41, 29 January 2020 (UTC)
I didn't   know    that you spoke     Shatner.  :) ‑‑ Eiríkr Útlendi │Tala við mig 01:19, 29 January 2020 (UTC)
Watched lots of original Star Trek! -Mike (talk) 17:03, 29 January 2020 (UTC)
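
For anyone who wants to check how many existing titles differ only in spacing, here is a minimal Python sketch. It assumes that "insignificant" means any Unicode space separator (category Zs, which covers the ordinary space, the no-break space, and the narrow no-break space U+202F) plus invisible format characters (category Cf, which covers ZWNJ and the right-to-left mark). The names `spacing_key` and `group_variants` are made up for illustration and are not part of any existing module or bot. Dots are deliberately kept, since the thread notes they can be contrastive (M.A. vs MA).

```python
import unicodedata
from collections import defaultdict

# Unicode general categories treated as insignificant in this sketch:
# "Zs" = space separators, "Cf" = invisible format characters.
IGNORED_CATEGORIES = ("Zs", "Cf")


def spacing_key(title):
    """Return the title with spaces and format characters removed.

    Dots and letters are kept, so "Q.E.D." and "QED" stay distinct.
    """
    return "".join(
        ch for ch in title
        if unicodedata.category(ch) not in IGNORED_CATEGORIES
    )


def group_variants(titles):
    """Group titles that differ only in spacing or invisible characters."""
    groups = defaultdict(list)
    for title in titles:
        groups[spacing_key(title)].append(title)
    return dict(groups)


if __name__ == "__main__":
    sample = ["Q.E.D.", "Q. E. D.", "QED", "z.B.", "z.\u202fB."]
    for key, variants in group_variants(sample).items():
        print(key, variants)
```

This is only suitable for titles that look like dotted abbreviations: applied blindly it would also merge pairs such as coal mine and coalmine, which the thread treats as distinct entries.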

Open call for Project Grants

Greetings! The Project Grants program is accepting proposals until February 20 to fund both experimental and proven projects such as research, offline outreach (including editathon series, workshops, etc), online organizing (including contests), or providing other support for community building for Wikimedia projects.

We offer the following resources to help you plan your project and complete a grant proposal:

With thanks, I JethroBT (WMF) (talk) 18:38, 24 January 2020 (UTC)

Mongolian free variation selector characters for Manchu

The titles of some of the Manchu entries here use Mongolian free variation selector (FVS) characters (e.g. ᡩ᠋ᡠ᠋ᡴᠠ (duka) instead of ᡩᡠᡴᠠ (duka), with an FVS1 right after the ᡩ), while it appears that FVS characters are not used in Manchu words online most of the time. Should we use FVS characters in the titles of Manchu entries? — RcAlex36 (talk) 19:05, 29 January 2020 (UTC)Reply[reply]

Pinging @LibCae as our only recently-active editor to list Mongolian-script proficiency, in case they have a comment; we have no Manchu speakers. If the characters have no effect on display, and are present only in some entries, they may have crept in unintentionally via someone copypasting from somewhere that uses them; that's how soft hyphens creep in (requiring me or others to periodically check the database dumps for them and remove them), and it's one reason some Persian entries used to have identical-looking Arabic letters and vice versa (for which I think we now have an edit filter, which we could also do for this, if we want). - -sche (discuss) 07:49, 30 January 2020 (UTC)
I'm seeing a difference in display, but don't know what the significance is. The one with FVSs (ᡩ᠋ᡠ᠋ᡴᠠ) has a notch above the first letter, whereas the one without FVSs (ᡩᡠᡴᠠ) doesn't have the notch and has a stroke to the right of the second letter. — Eru·tuon 11:42, 30 January 2020 (UTC)Reply[reply]
https://r12a.github.io/mongolian-variants/ shows a pretty complete list of the differences and how the fonts handle them. They should have a difference in display; I don't understand if it's more than Mongolian versions of stuff like the two versions of a (open-top or closed loop) or g (open-bottom or closed loop bottom), and I think Unicode abdicated their responsibility here by not properly encoding things.--Prosfilaes (talk) 07:24, 3 February 2020 (UTC)
Pinging also @Crom daba as someone who edits Mongolian and may know if these characters are desirable or not. (FWIW, I see no difference in display in Firefox or Chrome.) - -sche (discuss) 19:29, 30 January 2020 (UTC)
@-sche They are undesirable, the connecting line should extend above the round head of initial <d> before letters <e>, <u> and <u̅>, just as the word is rendered without selectors. See Gorelova (2002) or Zakharov's dictionary. Crom daba (talk) 17:25, 8 February 2020 (UTC)
In fact, fonts display different letter shapes, and none of the fonts I have follow the selection rules in the Unicode chart. But at least in Noto Sans Mongolian, FVSs must be added for the letter d to display in the correct form. I have tried to follow the Unicode standard, as I thought it was the only way to reconcile the variations in font display. LibCae (talk) 03:16, 10 February 2020 (UTC)
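
Whatever is decided, finding the affected titles is straightforward. The following is a minimal Python sketch, assuming the characters of interest are the three Mongolian free variation selectors U+180B to U+180D and the soft hyphen U+00AD mentioned above. The function names are hypothetical, and the sample strings are arbitrary test input (a Mongolian letter with a selector inserted), not real Manchu words. It only reports and strips; it takes no position on whether the selectors should be there.

```python
# Code points this sketch looks for (an assumption based on the thread):
# U+180B..U+180D are the Mongolian free variation selectors FVS1..FVS3,
# U+00AD is the soft hyphen.
SUSPECT_CHARS = {"\u180b", "\u180c", "\u180d", "\u00ad"}


def find_suspect_chars(title):
    """Return (index, 'U+XXXX') pairs for each suspect character in title."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(title)
        if ch in SUSPECT_CHARS
    ]


def strip_suspect_chars(title):
    """Return the title with all suspect characters removed."""
    return "".join(ch for ch in title if ch not in SUSPECT_CHARS)


if __name__ == "__main__":
    # Arbitrary test string: MONGOLIAN LETTER A, FVS1, MONGOLIAN LETTER A.
    sample = "\u1820\u180b\u1820"
    print(find_suspect_chars(sample))
    print(strip_suspect_chars(sample) == "\u1820\u1820")
```

Run over a list of page titles, `find_suspect_chars` would give the entries to review by hand before any are moved.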