Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!


August 2018

Dialogue tags

I was considering making a category for verbs that are often used as dialogue tags (which is when written dialogue is attributed with something like "he said"). What do you think? Should it be appendix-only? There's no real rules on it, AFAIK (there's no obvious reason I can think of that spoke and talked can't be used as dialogue tags, but they can't). Should it be marked to the correct sense for multi-sense verbs? And should it be at the present-tense lemma form say even though it's almost always the past-tense form said? GaylordFancypants (talk)

Is this the same as what grammarians call a verbum dicendi (plural verba dicendi)?  --Lambiam 14:04, 1 August 2018 (UTC)
Yes, I think so, I'd never heard of that term before. GaylordFancypants (talk)
See also reported speech. SemperBlotto (talk) 20:47, 1 August 2018 (UTC)
The verb (or “tag”) may equally take direct speech (a quotation) as a direct object (“I’m fine”, he said.), or introduce reported speech through a subordinate clause (He said that he was fine.).
I've taken a stab at starting the template as I envision it. I don't know how to do conditional fields so it's just a non-working prototype to get feedback for now, at Template:Verba dicendi. What do y'all think? GaylordFancypants (talk)
Where would the output of this template be displayed? DCDuring (talk) 02:49, 2 August 2018 (UTC)
Under a "usage notes" header. I'd say on both the lemma and past-tense pages (say and said), or if you'd rather just the lemma with a quick usage note on the past-tense form that points readers to the lemma page for the details. GaylordFancypants (talk)
It's somewhat large, and it's boilerplate; perhaps it could be in an Appendix: page? —Suzukaze-c 07:33, 2 August 2018 (UTC)
Perhaps a few well-chosen examples of how to actually use this and what would be displayed will help to clarify the intention. I do not understand the class RPO. Take the sentence “The time has come to talk of many things.” Is talk RPO there? Another question: why does “object=forbidden” list write? To me, “He wrote that everything was fine” is a perfectly acceptable sentence.  --Lambiam 09:25, 2 August 2018 (UTC)
Also, the plural is fine for a category (as in Category:English verba dicendi), but for a template the singular is more appropriate, and probably also a minuscule for its initial letter, depending on the intended use.  --Lambiam 14:43, 2 August 2018 (UTC)
  • Why not insert the desired content manually in sandbox copies of language sections, starting with one or two in English? Shouldn't we start with the displayed content and not the template? Also, how many expressions would merit application of the concept? DCDuring (talk) 11:39, 2 August 2018 (UTC)
Looks like there's more constructions for reported speech than I anticipated (especially taking into account obsolete, regional and rare constructions), so I'm considering the much simpler template at Template:verbum dicendi now (which just says the verb can be used as either a tag, reported speech or both). Then there would be an appendix page listed that describes all the rules in copious detail. GaylordFancypants (talk)
Yeah, I think this would apply to literally hundreds of verbs. A label with a link to the glossary would probably be more appropriate than a usage note in every entry. Andrew Sheedy (talk) 21:57, 2 August 2018 (UTC)
A lot of people use verbs for color rather than adjectives or adverbs, which has led to a wide variety of substitutes for the usual "he/she said". You also see a lot of Usenet posts that go out of their way to use extremely strange verbs of this type to introduce the text in the previous message that the poster is replying to. Chuck Entz (talk) 03:39, 3 August 2018 (UTC)

Edit request: Wiktionary:Entry layout

(Repeat of 16 June 2018). Please see Wiktionary talk:Entry layout#Indentation?.  --Lambiam 14:00, 1 August 2018 (UTC)

Template:antonyms, Template:synonyms, Template:hyponyms, Template:hypernyms

Currently, these templates throw module errors when the word parameter is left empty: {{ant|fr|}} produces "Lua error in Module:nyms at line 17: The parameter "2" is required."

Couldn't we change that and make them behave like the etymology templates (for example, {{bor|en|la|}} produces "Latin [Term?]"), and also make them put the entry in a request category such as CAT:Requests for antonyms in Russian entries, CAT:Requests for synonyms in French entries, etc.?

@Ungoliant MMDCCLXIV? Per utramque cavernam 14:48, 2 August 2018 (UTC)
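
To make the proposal concrete, here is a minimal Python sketch of the requested behaviour. The real templates are backed by Lua modules, so this is only an illustration: the function name `render_nym`, the sample language table, and the return shape are assumptions, not the actual Module:nyms code. The category wording follows the names suggested above.

```python
# Hypothetical illustration only; not the actual Module:nyms implementation.
LANG_NAMES = {"fr": "French", "ru": "Russian"}  # tiny sample table (assumption)


def render_nym(kind, lang_code, term):
    """Return (display_text, categories) for one synonym/antonym line.

    kind is a plural relation name such as "synonyms" or "antonyms".
    """
    lang_name = LANG_NAMES[lang_code]
    if not term or not term.strip():
        # Proposed behaviour: show a placeholder and file a request
        # instead of raising a "parameter is required" error.
        return "[Term?]", [f"Requests for {kind} in {lang_name} entries"]
    # Normal case: display the term and add no request category.
    return term.strip(), []
```

With this sketch, `render_nym("antonyms", "fr", "")` gives the placeholder and one request category, while a filled-in term passes through without any category.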

Oppose. There's no reason to use these if you don't have any content. Just add the category manually. DTLHS (talk) 15:56, 2 August 2018 (UTC)
Oppose. It would enforce finding synonyms and thus we would have pairings which are only fuzzy matches; and good matches are added anyway when someone sees them. You can use the Thesaurus namespace to request semantically related words – Thesaurus entries suggest by their mere existence to add semantically related words. Fay Freak (talk) 17:00, 2 August 2018 (UTC)
@Fay Freak: I think none of your points is valid. 1) I didn't say we should add that template everywhere; editor's judgment should be exercised. I'm not going to add an antonym request to giraffe. 2) It's not true that "good matches are added when someone sees them": an editor doesn't necessarily think about antonyms when he comes to an entry. To take an obvious example: вну́тренний (vnútrennij) has no antonym mentioned, while it's been visited by at least three people fluent in Russian. 3) How exactly do you suggest to proceed with that? Create Thesaurus:ru:внутренний to ask for antonyms? That doesn't make sense. Per utramque cavernam 17:15, 2 August 2018 (UTC)
So gently pushing the addition of semantical relations as a maintenance issue while editors actually wanted to do other things? Nice.
The semantical relations can be added with a slower pace too. Fay Freak (talk) 17:20, 2 August 2018 (UTC)
@Fay Freak: Well yes (isn't it what we do with {{rfinfl}}, for example?), but nobody is forced to do anything here. But I get your point; maybe it's not such a good idea to create an umpteenth maintenance category. Per utramque cavernam 17:25, 2 August 2018 (UTC)
Inflections are core information while semantical relations are supererogation. Also they are not placed at the same places the same ways with the same intentions, so they are not so commensurable for this. Fay Freak (talk) 17:37, 2 August 2018 (UTC)
You could say the same about etymologies and images, and there are request templates for both. — Ungoliant (falai) 18:42, 2 August 2018 (UTC)
No, IMO. If no-one has bothered to list synonyms, maybe there aren't any. (In contrast, if a word was borrowed from another language, that presupposes there is a term in the source language, which could usually be added. And almost anything could, theoretically, be represented in an image—although IMO over-requesting and over-adding images to e.g. noncorporeal concepts or complex verbs would generally be undesirable.) If you think there are synonyms, then you add them, or ask in the Tea Room. - -sche (discuss) 13:20, 5 August 2018 (UTC)
Otherwise, you have the issue that because it's hard to prove a negative, it's hard to know when, if ever, the request could be removed... - -sche (discuss) 13:52, 5 August 2018 (UTC)

New header "Alternative spellings" distinct from "Alternative forms"

I don't like having a single header both for alternative spellings and alternative forms. Couldn't we split them? Per utramque cavernam 08:51, 5 August 2018 (UTC)

I can see why you would want to distinguish them, given that we have different templates for their definition-lines, but it'd take up more space when both were present (and look redundant, IMO), and make even more salient the difficulty of distinguishing which strings are alt spellings and which are alt forms: e.g. should alt capitalizations be differences in spelling, or form? (Or should they have their own, third header?) And given that noobs and WF already misuse variants of the header (which DTLHS has helpfully tracked from time to time), any distinction we might pick would not be maintained, so we'd have two headers being used for the same things (e.g. in huevada "alternative spellings" is already used for strings that differ in pronunciation, perhaps enough to be considered not even just alt forms but distinct synonymous words, but at least now that's an error, findable by a simple search, and fixable). Whereas, all alt spellings are alt forms, so headering them as such seems tolerable to me. The existence of different templates seems like less of a problem, although the nominal distinction is certainly not cleanly maintained, because they almost never occur together in the same block of definitions. - -sche (discuss) 13:48, 5 August 2018 (UTC)
Wiktionary:Votes/pl-2010-07/Alternative forms header? —Suzukaze-c 02:08, 12 August 2018 (UTC)
I haven't been terribly consistent about this but I tend to favour alt forms for minor spelling differences not affecting the sound (haemo-, hemo-) and synonyms for anything else (magic, magical). I am cautious about introducing further headers because (as -sche suggests) we already take up too much space for simple entries and having a new header doesn't guarantee that users will use it properly anyway. Equinox 09:32, 12 August 2018 (UTC)

News from French Wiktionary


It's a pleasure to invite you to read the July issue of Wiktionary Actualités, translated into English!

The July Actualités contains three articles: the word populism and a comparison of its definition in several wiktionaries; a presentation of a dictionary about Breton and French in contact; a discussion about the actors involved in describing neologisms and the role of Wiktionary in this task. As usual, there are also new and encouraging stats, videos and nice pictures.

This issue was written by nine people and was translated for you by Dara. This translation can still be improved by readers (wiki-spirit). We hope you enjoy reading it, and we'll be happy to answer any question you may have about our publication or the articles in it. Noé 09:49, 7 August 2018 (UTC)

Adminship (reluctantly)

Even though I don't desire adminship, there are times when it would be useful. DonnanZ (talk) 10:45, 7 August 2018 (UTC)

RfV for love?

Should we have a policy to follow Merriam-Webster on this issue? I can see strong arguments either way. —Justin (koavf)TCM 00:22, 8 August 2018 (UTC)

Profoundly stupid. (To be clear: MW's decision, not your post.)
Shall we similarly remove the, ineffable, get?
As far as I'm concerned, Merriam-Webster just admitted that they can't do their jobs, and rather than strive for excellence, they're simply throwing up their hands and sweeping their incompleteness under the rug. I find it difficult to respect such an approach to lexicography. ‑‑ Eiríkr Útlendi │Tala við mig 16:35, 8 August 2018 (UTC)
That has got to be a joke. Equinox 17:47, 8 August 2018 (UTC)
Yes, isn't Clickhole a satire site related to The Onion? - -sche (discuss) 18:08, 8 August 2018 (UTC)
Indeed. I suspect it isn't citable, but many people online sometimes refer to falling for a transparent joke by The Onion or one of its sister sites as "eating the Onion". —Μετάknowledgediscuss/deeds 22:15, 8 August 2018 (UTC)
See w:ClickHole. DCDuring (talk) 03:35, 9 August 2018 (UTC)
I did, however, take this as inspiration to overhaul our entry, adding a few senses. - -sche (discuss) 08:05, 14 August 2018 (UTC)
While we're discussing the entry, aren't verb senses 3 and 5 really the same sense? To me it seems like they both mean "to enjoy, be greatly pleased by, derive delight from" (an emphatic word for like). —Granger (talk · contribs) 12:18, 20 August 2018 (UTC)

Linking to alternative forms

If anyone would like to help out in linking to English alt forms, alt spellings, obsolete forms, etc., to their main entries, there is now a list of unlinked forms that will take a bit of work to get through. There's also a Portuguese list if anyone would like to help with that. Ultimateria (talk) 01:42, 9 August 2018 (UTC)

A related issue is linking to non-lemma forms and alternative forms/spellings instead of to main-entry lemmas. I occasionally find myself clicking through two, three, or (rarely) more entries before getting to a lemma. It's bad enough having to run through an entire lemma entry for a polysemous word to find the relevant definition. DCDuring (talk) 19:28, 11 August 2018 (UTC)
I agree that these need to be corrected as well. If anyone is willing to create a list of these chaining soft redirects, I would appreciate it. Ultimateria (talk) 20:19, 14 August 2018 (UTC)
@Ultimateria User:DTLHS/cleanup/alt form chains DTLHS (talk) 01:02, 15 August 2018 (UTC)
Thank you! I have my work cut out for me now haha Ultimateria (talk) 02:08, 15 August 2018 (UTC)
What about the mester > mister > míster chain? It just so happens that there are two senses to the middle term, with the other two being alt forms of distinct etymologies. Is screening manually the only way to find them? Ultimateria (talk) 02:15, 15 August 2018 (UTC)
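
For anyone who wants to look for such chains offline, here is a minimal Python sketch. It assumes a hypothetical dictionary `alt_of` that maps each alternative-form entry to the entry it points at; building that mapping from a dump is not shown. The sketch cannot tell a real chain from a case like the one just mentioned, where the middle entry has several senses, so anything it reports still needs a manual check.

```python
def find_chains(alt_of):
    """Return chains of soft redirects that take more than one hop.

    alt_of: dict mapping an alt-form title to the title it points at
            (hypothetical input; how it is extracted is not shown).
    """
    chains = []
    for start in alt_of:
        path = [start]
        seen = {start}
        current = start
        # Follow the pointers until we reach a title that points nowhere.
        while current in alt_of:
            current = alt_of[current]
            if current in seen:
                # Guard against entries that point at each other.
                break
            path.append(current)
            seen.add(current)
        # A path of three or more titles means at least two hops.
        if len(path) > 2:
            chains.append(path)
    return chains
```

A direct link from one alternative form to its lemma gives a path of two titles and is not reported; only longer paths are.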


https://www.bruzz.be/mobiliteit/leukemiepatient-gratis-op-openbaar-vervoer-vlaanderen-maar-niet-brussel-2018-08-07 https://www.hln.be/regio/brussel/kankerpatient-met-leefloon-spaart-elke-mogelijke-cent-eten-uit-vuilnisbak-om-medicijn-te-kopen~a0b4bbda/ 12:36, 11 August 2018 (UTC)

@Chuck Entz I saw you rolled these links back, but they aren't spam links. They are Belgian media articles about long-time Wiktionarian Sven and his rare form of leukemia, posted by Sven himself. ←₰-→ Lingo Bingo Dingo (talk) 14:31, 18 August 2018 (UTC)

Differentiating citations by category of "durability"

do someone a frighten contains a typical RfV of a current internet meme.

It is characterized as cited, but only one of those citations is clearly in print. In order to cover such meme-idioms (which we probably should, to stay relevant), we have to accept some relaxation of our "durably archived" standard. But we need to acknowledge that there are differences in the types of citations that have evidentiary import. At the very least we need some way of noting entries that only meet attestation with relaxation of the "durably archived" condition. Perhaps we could at least characterize citations as in print, usenet, not provably in print, provably not in print. Usenet is already marked if one of the templates is used. The other two should also be marked as it is an unreasonable burden on a user to research the citation's archival status. There may be other appropriate status markings as well. DCDuring (talk) 19:22, 11 August 2018 (UTC)

As more and more text is limited to online news sources such as Huffington Post, Buzzfeed, etc, while print media, in general, is dying, we should probably revisit our durably archived rule. What criteria do we use, however, to identify legitimate sources of language use? Kiwima (talk) 21:21, 4 September 2018 (UTC)

Wiktionary Cognate Dashboard

Hello all,

A few months ago, we asked you for feedback about Cognate, the system allowing interwikilinks between Wiktionaries (on main namespace). Several community members gave some suggestions, one of them was to provide statistics about these interwikilinks.

The Wikidata team is pleased to present the Wiktionary Cognate Dashboard, a website presenting a lot of interesting information about how Wiktionaries are connected to each other. There you can find, for example:

  • the most interlinked Wiktionary entries not having a page on your Wiktionary
  • the number of interlinks between each possible pair of Wiktionaries
  • visualizations of the relationships between different Wiktionaries

To learn more about the tool, you can have a look at the documentation (please help us translate it into your language!). The interface of the tool itself can also be translated into other languages by using this page.

If you find a bug, please leave a comment on this Phabricator task or ping me onwiki. Thanks a lot, Lea Lacroix (WMDE) (talk) 13:00, 14 August 2018 (UTC)

@Lea Lacroix (WMDE): Hi. I've recently asked DTLHS for a list of French words found on el.wikt but not here: User:DTLHS/Greek French phrases. Would it be possible to do this myself with this tool? I'd be interested in:
  • which Belarusian entries we're missing that are on many other wiktionaries (an example: a few days ago I've created падбародак, quite a basic word which exists on seven other wiktionaries already);
  • which Belarusian entries we're missing that are on ru.wikt;
  • which French entries we're missing that are on nl.wikt;
  • a couple other lists.
Per utramque cavernam 12:28, 17 August 2018 (UTC)
Hello @Per utramque cavernam and thanks for your suggestion. We've been thinking about this; unfortunately, we're currently blocked because Wiktionaries use different ways to format their language titles inside the pages, and there is currently no way to browse all pages and check whether an entry has a section for a certain language. See details here. Lea Lacroix (WMDE) (talk) 13:17, 17 August 2018 (UTC)

Cyrillic numbers

Our entries for English numerals (e.g. ninety-nine) list Arabic and Roman numerals as synonyms. Do you think we should include the Cyrillic (and any other) ones as well? (I am not confident enough to do it myself.) SemperBlotto (talk) 20:23, 16 August 2018 (UTC)

Since it's an English entry, and as far as I know Cyrillic numerals are not used in any English context, no. DTLHS (talk) 20:24, 16 August 2018 (UTC)
@SemperBlotto: What exactly are "Cyrillic numerals"? If I type "99" using e.g. a Russian keyboard, it's the same "99" you would get using an English or any Roman based keyboard. --Anatoli T. (обсудить/вклад) 03:16, 17 August 2018 (UTC)
Sorry - that was down to ignorance. I had better not ask about Hebrew numerals. SemperBlotto (talk) 06:04, 17 August 2018 (UTC)
For what it's worth: Cyrillic numerals. - -sche (discuss) 06:53, 17 August 2018 (UTC)
@SemperBlotto: That's OK and it may be worth considering other numerals where appropriate. @-sche: Yes, but they are archaic. --Anatoli T. (обсудить/вклад) 09:19, 17 August 2018 (UTC)

Implementing "Restructure comparative and superlative categories"

In order to implement Wiktionary:Votes/2018-07/Restructure comparative and superlative categories, some wide-sweeping changes will have to be made. These notes below are copied from the vote talk page, where I had talked about ways to implement the change:

  1. Modify the categories under poscatboiler data or wherever it is to match the new spec, keeping adjective comparative forms etc. there until we have gotten rid of those categories
    1. Moving "comparative adjectives" etc. under adjective forms
    2. Adding a new category entry for "comparative adjective forms" etc.
  2. Move the entries:
    1. Category:English adjective comparative forms -> Category:English comparative adjectives (and respectively for all categories under Category:Adjective comparative forms by language).
    2. Category:English adjective superlative forms -> Category:English superlative adjectives (and respectively for all categories under Category:Adjective superlative forms by language).
    3. Category:English adverb comparative forms -> Category:English comparative adverbs (and respectively for all categories under Category:Adverb comparative forms by language).
    4. Category:English adverb superlative forms -> Category:English superlative adverbs (and respectively for all categories under Category:Adverb superlative forms by language).
    • (If the target categories already exist for some languages, just merge the two)
    • The moving process will probably entail the following:
      1. Modifying Template:comparative of and Template:superlative of
      2. Editing the head template transclusions in all articles
  3. Eliminating the old categories from poscatboiler data

Right now the step would be to modify Template:comparative of and Template:superlative of; I was thinking of getting rid of the is_lemma parameter and assuming it is always true, therefore categorizing all comparatives under LANGNAME comparative adjectives for instance. Are there any suggestions, or would this be a good solution? SURJECTION ·talk·contr·log· 14:46, 18 August 2018 (UTC)

(This change would also apply to language-specific comparative/superlative templates, like the English ones) SURJECTION ·talk·contr·log· 14:49, 18 August 2018 (UTC)
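
The renaming pattern in step 2 above can be sketched in Python as follows. The helper name `new_category_name` is hypothetical, and the regular expression covers only the four category patterns listed in that step; it illustrates the mapping and is not the bot or module code that would actually perform the moves.

```python
import re

# Matches e.g. "English adjective comparative forms" and captures the
# language name, the part of speech, and the degree.
_OLD_NAME = re.compile(
    r"^(?P<lang>.+) (?P<pos>adjective|adverb) "
    r"(?P<degree>comparative|superlative) forms$"
)


def new_category_name(old_name):
    """Map an old category name to its new name, or None if it is not
    one of the four patterns covered by the vote."""
    match = _OLD_NAME.match(old_name)
    if match is None:
        return None
    # "<lang> <pos> <degree> forms" becomes "<lang> <degree> <pos>s".
    return f"{match['lang']} {match['degree']} {match['pos']}s"
```

Language names with spaces are handled because the language part is matched greedily up to the part-of-speech word; names that do not fit the pattern are left alone by returning None.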

Wrapping entire entries in {{l|en}}

Apparently this is not transparently a dumb idea and we need to explicitly forbid it. Would anyone else like to weigh in? DTLHS (talk) 04:29, 20 August 2018 (UTC)

Why forbid it? It may be an easier way to create entries for some users. Does it hurt something? - Alumnum (talk) 04:42, 20 August 2018 (UTC)
Technical issues aside (see Chuck’s comment on your TP), this destroys the syntactic usefulness of {{l}}. When you tag a bunch of definitions, usexes, labels, quotations, {{syn}}-type templates and HWL with {{l|en}}, it indicates that the whole content is written in English, which is not the case for half of these things. This could throw off, for example, a parser or a screen reader that is not designed to take into account spans with a lang attribute that are inside another span with a lang attribute. — Ungoliant (falai) 05:30, 20 August 2018 (UTC)
To be nitpicky, in the resulting HTML code, it is only the first paragraph after the start of the {{l}} template that is tagged with lang="en": the headword line in this case. This is probably because {{l}} uses an inline element (<span>...</span>), which cannot surround block elements like paragraph tags (<p>...</p>), so it is only added to the contents of the first <p>...</p> tag. But the HTML is still bad: the headword doesn't need to be tagged as English, then as Portuguese. — Eru·tuon 17:30, 20 August 2018 (UTC)
Bottom line: it saves you a few keystrokes, but makes it harder for everyone else. If it were something that went away after the initial edit, it wouldn't be that much of a problem, but it remains behind to confuse other editors and to make the results of modifying basic entry infrastructure much more unpredictable.
I suspect this violates the rules in WT:EL, though I haven't read through it to verify. Chuck Entz (talk) 05:49, 20 August 2018 (UTC)
Other templates inside it evidently ignore it. Definitions should link to English, and since the def template was deleted, the linking template is now the only one that does it. It does not put anything into wrong categories or anything. I don't see how those scenarios you described apply here. I doubt that any of you are truly concerned about anything else other than aesthetical preferences. - Alumnum (talk) 08:24, 20 August 2018 (UTC)
Probably in violation of "Headword line". ←₰-→ Lingo Bingo Dingo (talk) 06:46, 20 August 2018 (UTC)
This is a very bad idea, and I don't even know where to put the warning not to do it. WT:NORM? Luckily, I have never seen anyone besides Alumnum attempt it. —Μετάknowledgediscuss/deeds 05:36, 20 August 2018 (UTC)
I think it violates several web standards and is therefore forbidden already. Plus I do not discern the use Alumnum thinks it has, or what would be “easier”. He has added the template in a way which has not changed anything except making the markup invalid. Fay Freak (talk) 09:27, 20 August 2018 (UTC)
It messes up the instances of {{l|mul}} and {{taxlink}}. See User:DCDuring/Sandbox. Translingual terms that are taxonomic names are used in definitions of English terms. DCDuring (talk) 15:36, 20 August 2018 (UTC)
  • I don't see how we can do without {{l}}, it's far too convenient. But there are times when it doesn't make sense to use it, like [[car]] [[driver]] for bilfører, which of course doesn't explain why there is no entry for car driver. But that's another story. DonnanZ (talk) 23:29, 23 August 2018 (UTC)

Survey of lexicographers' needs


I just discovered the Elexis survey of lexicographers' needs and I invite you to spend some time filling it out. It is aimed at [pro] lexicographers, but I consider that we are lexicographers too, even though Wiktionary is not our daily job. If you fill out the whole survey, you can see the previous answers, which are quite interesting to read. The deadline is August 27. Let me know if anyone spends some time on it. Noé 15:01, 21 August 2018 (UTC)

I filled it out. DCDuring (talk) 16:43, 25 August 2018 (UTC)

Word origins

I think word origins should also be part of wiktionary. —This unsigned comment was added by (talk).

"See the Top Slang Term from Every State"

Make sure we have these? Hyperbolick (talk) 12:55, 23 August 2018 (UTC)

Here are the ones we are missing. Please verify existence before adding, since the source site is the worst kind of clickbait (50 pages for 50 paragraphs!), evidently created to serve ads. Equinox 11:48, 25 August 2018 (UTC)
  • awful awful (a kind of local milkshake; Rhode Island)
  • Benny (s.o. from Bayonne, Elizabeth, Newark, and New York; used in New Jersey)
  • biffed (embarrassed, humiliated? Utah)
  • bop (a longish distance to travel; Maryland)
  • it's brick (weather is very cold; Massachusetts)
  • chitlans (one's children; Georgia)
    Not itself in DARE, but probably from chit, which (per DARE) has two meanings: "sprout, germinal part of a plant" and "child, young woman" DCDuring (talk) 16:39, 25 August 2018 (UTC)
    On Google Books I see one instance of "recorded speech" of chitlans where the book (Crook County by Nicole Gonzalez Van Cleve) says it is ambiguous whether it means "children" or "chitlins"; I see two books where it definitely means "chitlins". - -sche (discuss) 17:19, 25 August 2018 (UTC)
  • cowboy up ("man up", deal with it; Montana)
  • gaper (first-time skier; Colorado)
  • get 'er dun (go and do it [redneck phrase of encouragement]; Maine)
    I am familiar with this as a redneck Southern phrase, not a Maine phrase. - -sche (discuss) 17:19, 25 August 2018 (UTC)
  • hollin' (boldly insulting someone to their face; Ohio)
  • pass a good time (have a good time, have fun [SoP?]; Louisiana)
    It's plausible as a native English construction (one can pass time, or pass a splendid weekend in the Bahamas), but French could've reinforced it, yes, although I would've expected a direct calque to use moment. - -sche (discuss) 17:25, 25 August 2018 (UTC)
    Yes, that's what bothers me with this edit. Per utramque cavernam 17:53, 25 August 2018 (UTC)
    Reminds me of the beautiful French phrase laissez les bons temps rouler --XY3999 (talk) 18:02, 25 August 2018 (UTC)
  • pigeon (desperate gamble to recoup losses; Nevada)
  • potato drop (New Year "ball drop" ceremony with a potato; Idaho)
  • pork queen (winner of state beauty pageant; Iowa)
  • pre-funk ("pre-function", drinking before another event; Washington)
    google books:"pre-funking" and the following cite of "pre-funked" suggest the verb is real, although see my comment below about (not) trusting the regional assignment without references. - -sche (discuss) 17:29, 25 August 2018 (UTC)
    • 2014, Andrew K. Smith, The Adderall Empire: A Life With ADHD (→ISBN):
      They were friends of my brother and we pre-funked it up at Fields, which is right across the street from Qwest Field.
  • red beer (beer brewed with tomato juice; Nebraska)
  • roofer (idiot; Arkansas)
  • shark bait (pale-skinned tourist [taking part in sea sports?]; Hawaii)
  • snoopy/Snoopy? (s.o. who pushes their food around instead of eating; Pennsylvania)
  • Some of these are very frustrating. I know "it's brick" is real (having heard it in Connecticut, though, rather than Massachusetts), but finding a cite for it just seems impossible. @Kiwima, DCDuring might be interested in scouring the Web for this kind of thing. —Μετάknowledgediscuss/deeds 16:02, 25 August 2018 (UTC)

Here's one for brick. Equinox 16:09, 25 August 2018 (UTC)

  • 2005, Vibe (volume 12, number 14, page 102)
    And while the tropics are definitely the place to be when it's brick outside, rocking a snorkel on the beach only works when you're snorkeling.
Here's another:
  • 2014, Ray Mack, Underestimated: A Searcher's Story (→ISBN), page 89:
    He was always hanging tight with me and since he had access to a ride . . . it made traveling easier. I mean it was no biggie brain buster to take the train, but when it's brick outside . . . fuck the A train.
I have added brick, with two more citations. Not that hard to find, actually. Kiwima (talk) 03:40, 26 August 2018 (UTC)
Some books seem to be suggesting it's AAVE. Given that the list also seems to get "get 'er dun" wrong, I wouldn't trust its regional labels without cites or references. - -sche (discuss) 17:23, 25 August 2018 (UTC)

List of broken links from Wikipedia to Wiktionary

Wikipedia is the top referring site to Wiktionary, responsible for 64% of referrals.

On my user page, User:Uziel302, I have put a list of over 5000 broken links from Wikipedia to Wiktionary. Any help fixing those links or creating the entries will be much appreciated.

Thanks. Uziel302 (talk) 15:34, 24 August 2018 (UTC)

It seems to primarily be links to fix rather than entries to create; many simply have capitalization when they shouldn't (unlike in Wikipedia, capitalization is important [as in, it matters if you have it or if you don't] in Wiktionary) SURJECTION ·talk·contr·log· 15:44, 24 August 2018 (UTC)
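
A minimal Python sketch of that check, assuming a hypothetical set `existing_titles` of Wiktionary page titles (how the set is obtained is not shown): if the linked title does not exist but the same title with a lowercased first letter does, the link most likely just needs its capitalization fixed.

```python
def suggest_fix(link_title, existing_titles):
    """Return the title a broken link should probably point to, or None.

    existing_titles: set of page titles that exist on Wiktionary
                     (hypothetical input for this illustration).
    """
    if link_title in existing_titles:
        # The link is not broken at all.
        return link_title
    # Wiktionary titles are case-sensitive, including the first letter,
    # so try the variant with a lowercase first letter.
    lowered = link_title[:1].lower() + link_title[1:]
    if lowered in existing_titles:
        return lowered
    # No obvious fix; the entry may genuinely be missing.
    return None
```

Links for which the function returns None are the ones that may need a new entry rather than a corrected link.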

Category:Long English words

How is this category populated? Is it automatic? How long does a word have to be? SemperBlotto (talk) 11:00, 25 August 2018 (UTC)

Manually; at least 25 letters long. Equinox 11:27, 25 August 2018 (UTC)
No, it's automatic. DTLHS (talk) 17:08, 25 August 2018 (UTC)
Yeah, it's still at the bottom of the page even if you take the manual category out. (How much memory does checking for whether or not a lemma should be in that category add, I wonder...) - -sche (discuss) 17:10, 25 August 2018 (UTC)
None. DTLHS (talk) 17:11, 25 August 2018 (UTC)

Tarantino "Language"

There are some lemmas in "Tarantino", but Tarantino is actually not a language but one of the several dialects of the Neapolitan/South Italian language. The correct way to include Tarantino words in Wiktionary is to list them as variants of Neapolitan/South Italian lemmas. That's how it works for other languages such as English, where local variants (US, Canada, Australia, etc.) are often given beside the standard pronunciation. --Jamala (talk) 11:07, 25 August 2018 (UTC)

@GianWiki, -sche Per utramque cavernam 11:14, 25 August 2018 (UTC)
There is also another problem: Neapolitan lemmas are, in most cases, not general lemmas of the Neapolitan/South Italian language, but lemmas in the dialect of Naples, with IPA for the Naples accent, with Naples orthography, and so on. That is because of the misleading name of the language (Neapolitan), which coincides with the name of one of its dialects (Neapolitan as the dialect of the city of Naples). Just read the description on my user page to understand better. Anyway, that's a secondary problem for now. --Jamala (talk) 11:26, 25 August 2018 (UTC)
The Italian dialects are a bit of a grey area, hard to know when/where to separate vs when to lump together under shared codes. (We also separate Emilian and Romagnol from each other.) If there's no standard form of "South Italian", it might make sense to leave Neapolitan standardized on Naples, and add any other dialects that currently lack codes in under Neapolitan, with dialectal labels(?) (which should then also be added to Naples Neapolitan). Whether Tarantino should be merged, I don't know. I note there are separate Wikipedia editions in Neapolitan vs Tarantino. Pinging @SemperBlotto, Widsith, who have some knowledge of these languages, having added some of our Neapolitan and/or Tarantino words; what do you think? - -sche (discuss) 17:07, 25 August 2018 (UTC)
Neapolitan of Naples has its own peculiar features that make it quite difficult to use as a general standard. Anyway, Tarantino is part of South Italian/Neapolitan for sure; as far as I know, languages are not made by Wikipedia editions. --Jamala (talk) 21:34, 25 August 2018 (UTC)
Does a general standard exist? What resources would you use to find Neapolitan words and their spelling? Crom daba (talk) 21:46, 25 August 2018 (UTC)
A general standard does not exist, and for now I accept the fact that Naples is used as one, but it would be better to define a way to obtain general words that can summarize the features of all the dialects. For example, I find it strange to accept "veré" (to see) as the standard form, when it's clearly a phonetic variation (rhotacism) present only in some dialects (e.g. Naples); the general standard should be "vedé". That's just an example. --Jamala (talk) 21:57, 25 August 2018 (UTC)
Moreover, the orthography also has to be reviewed. Traditional Neapolitan orthography was only used in the Campania region; elsewhere (Abruzzo, Molise, Puglia, Basilicata) a different orthography is used (curtiello vs curtielle). These are just some points. --Jamala (talk) 22:03, 25 August 2018 (UTC)
@Jamala Being able to decide on a standard form can be a lot harder than it seems, often there will be small differences across dialects that cannot be accounted for by usual patterns of correspondence, how would you approach such cases? Crom daba (talk) 22:30, 25 August 2018 (UTC)
Yes, there are many cases of words that do not follow a precise pattern of standardization. In such cases I would consider only the varieties following the pattern, treating the others as local alternatives (as in English, where some words are just pronounced with another accent, while others in some dialects are not only pronounced with an accent but also change in structure, and then become a local alternative of the more common and in this case official word; the same goes for Spanish, etc.). Finally, if there are words that are quite different in each dialect, then it is possible to accept the form of the biggest dialect (Naples). --Jamala (talk) 06:52, 26 August 2018 (UTC)
To give an example: the masculine definite article is 'o only in Naples, but in all the other dialects it is "lu" ('u); I would consider 'o a local alternative of lu, not the opposite. --Jamala (talk) 06:52, 26 August 2018 (UTC)
I only add words that I've either seen used in print, or that are in a reputable dictionary. In the case of Neapolitan, for me that means either something archaic from Lo Cunto de li cunti, or something given in Iandolo's Dizionario Napoletano semantico-etimologico. I would consider Tarantino to be a dialect of "Neapolitan", but I don't have especially strong feelings on the matter. Ƿidsiþ 07:20, 26 August 2018 (UTC)
There's a problem: the Neapolitan language is not the dialect of Naples, but the name of a cluster of several dialects, with different orthographies and different phonologies. There are a lot of poems and books in each of these dialects, so how do we select words? --Jamala (talk) 07:37, 26 August 2018 (UTC)
P.S. I would also consider this http://www3.pd.istc.cnr.it/navigais-web/ as a good source, and also this https://www2.hu-berlin.de/vivaldi/?id=0001&lang=it --Jamala (talk) 07:38, 26 August 2018 (UTC)

Dialectal forms in languages with non-phonetic scripts

It often occurs that a dialectal form that is given in IPA transcription is important for purposes of etymology or reconstruction and needs to be cited in an etymology section or a descendant list, what should be the suggested way of doing this? My solution is to do it like this:

Speakers of Ordos (and other Mongols in Inner Mongolia) use a slightly updated version of Classical Mongol as their literary language, and ᠴᠢᠯᠠᠭᠤ (čilaɣu) is the place where this pronunciation should belong. Yet at the same time, spoken Ordos is quite different from Written Mongol and Mongol script cannot express its pronunciation fully and even if it could these would be eye dialect spellings, not valid instances of Written Mongol.

This must be a very common situation for languages with a venerable written tradition, how are we to handle them? Crom daba (talk) 23:45, 25 August 2018 (UTC)

@Crom daba: I don't think it's common at Wiktionary to have many examples of dialects without their written forms; for Chinese and Arabic dialects we attempt to find or make some written form, and some are based on transliterations found in other dictionaries. E.g. Moroccan Arabic doesn't have a very established written form but there are rules, so Moroccan Arabic can be written in the Arabic script based on those rules. Otherwise, it would be in Roman letters; either form would be unattestable. Min Nan POJ spellings appear in dictionaries but they are not always easy to attest. Some failed or will fail if properly tested with CFI.
I don't know if it's appropriate here but maybe you can provide the transliteration without the Mongolian spelling, something like {{m|mn||tr=čiluu}}: [script needed] (čiluu). Maybe also with ps to render the transcription. Perhaps, what you provided is the only way - just show the IPA - with no native script or romanisation, which can't be attested. --Anatoli T. (обсудить/вклад) 00:48, 26 August 2018 (UTC)
@Atitarev Attesting pronunciations seems like a tricky business. I don't like the {{m|mn||tr=čiluu}} solution, since script is not actually needed and transcription shouldn't be used for IPA. Using an ad-hoc romanization is also inadequate since these forms shouldn't be their own entries in the first place, and it would just mean another system for the user to learn while the data is perfectly representable with IPA and obvious to most users.
I guess my suggestion would be to make a template that's a hybrid of link templates and the accent template, what drawbacks would this have? Crom daba (talk) 01:13, 26 August 2018 (UTC)
@Crom daba: OK then. I see your point. Providing pronunciation in etymologies may be useful, even if the source language has the script and transliteration. Users might wonder why алфави́т (alfavít) is pronounced more like Modern Greek αλφάβητο (alfávito), rather than the Ancient Greek ἀλφάβητος (alphábētos), which is the source. Just the transliteration and the native script is not enough. Providing the late Ancient Greek pronunciation may be useful to understand the b/v and e/i discrepancy. --Anatoli T. (обсудить/вклад) 01:25, 26 August 2018 (UTC)

Proposal to delete Simple English Wikiquote and Wikibooks

There is now a proposal to delete Simple English Wikiquote and Wikibooks. Agusbou2015 (talk) 22:27, 26 August 2018 (UTC)

Proposal withdrawn, and the projects will not be deleted. StevenJ81 (talk) 14:49, 28 August 2018 (UTC)

Editing of sitewide CSS/JS is only possible for interface administrators from now

(Please help translate to your language)

Hi all,

as announced previously, permission handling for CSS/JS pages has changed: only members of the interface-admin (Interface administrators) group, and a few highly privileged global groups such as stewards, can edit CSS/JS pages that they do not own (that is, any page ending with .css or .js that is either in the MediaWiki: namespace or is another user's user subpage). This is done to improve the security of readers and editors of Wikimedia projects. More information is available at Creation of separate user group for editing sitewide CSS/JS. If you encounter any unexpected problems, please contact me or file a bug.

Tgr (talk) 12:39, 27 August 2018 (UTC) (via global message delivery)
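The rule in the announcement above can be read as a small predicate on the page title and the editor's user name. The sketch below is only an illustration of that reading: the function name is hypothetical, it works on plain title strings rather than real namespace data, and it ignores the highly privileged global groups (such as stewards) that the announcement also exempts.

```python
def needs_interface_admin(title, editor):
    """True if, under the rule described above, editing `title` requires
    interface-admin rights for the user named `editor`."""
    if not (title.endswith(".css") or title.endswith(".js")):
        return False
    if title.startswith("MediaWiki:"):
        return True
    if title.startswith("User:"):
        # "User:Name/sub.js" -> the owner is "Name"; only subpages are covered.
        owner, sep, _subpage = title[len("User:"):].partition("/")
        return bool(sep) and owner != editor
    return False


print(needs_interface_admin("MediaWiki:Common.css", "Example"))
print(needs_interface_admin("User:Example/common.js", "Example"))
print(needs_interface_admin("User:Someone else/common.js", "Example"))
```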

Well, we now have to give people back their rights if they want them. I propose that we notify all admins, and those that express a desire to edit sitewide CSS or JS in the future should be made into interface admins. @Chuck Entz, TheDaveRoss —Μετάknowledgediscuss/deeds 19:17, 27 August 2018 (UTC)
I think that seems like a good idea, although I am not sure that the discussion about how to handle this was truly resolved. Since the status quo (prior to the overlords imposing their will) was that all admins had this ability, it seems like our tacit policy is that any admin who wants to have this power may have it. If that policy is to change I think we should vote on it. - TheDaveRoss 12:58, 28 August 2018 (UTC)
I agree we can grant it on request (and only on request, not automatically en masse to all accounts that have admin rights, since one reason for the change seems to have been to keep inactive users' hackable accounts from having it). I'd like the ability, since I occasionally edit CSS in particular, though I don't mind waiting if we want to discuss things further first. Some other users who've recently edited site-wide CSS and JS and who I'll ping to comment here if they want the ability back are @AryamanA, Atitarev, Daniel Carrero, Dixtosa, JohnC5, Justinrleung. - -sche (discuss) 18:58, 28 August 2018 (UTC)
Yes please, I'll probably work on using better Indic fonts at some point. —AryamanA (मुझसे बात करेंयोगदान) 20:53, 28 August 2018 (UTC)
I would like it if more of our technically competent users had this power, but would use it with great care, both with respect to security and to stability of the user interface. DCDuring (talk) 17:43, 30 August 2018 (UTC)

How do we link to abbreviations from lemmas?

For example, ABV links to alcohol by volume through the {{initialism of}} template. But how should alcohol by volume link to ABV? Under the header of Alternative forms? Abbreviations? (Is that a standard header? Would acronyms and initialisms be included too?) WT:ELE doesn't mention linking to abbreviations, acronyms, and initialisms. Ultimateria (talk) 22:30, 27 August 2018 (UTC)

I prefer the Synonyms header. DTLHS (talk) 22:34, 27 August 2018 (UTC)
Me too. I agree with DTLHS. DCDuring (talk) 17:44, 28 August 2018 (UTC)
Yes, Synonyms is probably the best header for this. - -sche (discuss) 18:58, 28 August 2018 (UTC)
That does make sense. I'll start implementing it. Ultimateria (talk) 02:54, 30 August 2018 (UTC)

no Chinese index at Main Page?

Why isn't there a link to the Chinese index at the Main Page? Can someone add the link? ---> Tooironic (talk) 01:28, 29 August 2018 (UTC)

I don't know. I think that the indices are mostly pretty bad, though; I'd rather remove that whole section of the main page. —Μετάknowledgediscuss/deeds 14:33, 29 August 2018 (UTC)
I agree. The Esperanto index is really bad—I think the last significant update to it was in 2012. —Granger (talk · contribs) 00:31, 30 August 2018 (UTC)
I hate that they're so prominent on the main page when they're so woefully out of date. I agree that we need to remove or renovate them. Ultimateria (talk) 02:56, 30 August 2018 (UTC)
I'd put MB's pages like User:Matthias Buchmeier/es-en-q on the front page instead - much better and up to date. --XY3999 (talk) 14:16, 30 August 2018 (UTC)
I don't really understand what the benefit of the index pages was supposed to be in the first place. What does the Esperanto index offer that isn't already provided by categories like Category:Esperanto lemmas and Category:Esperanto adjectives? —Granger (talk · contribs) 15:20, 30 August 2018 (UTC)
Yeah. Let's remove them from the main page. We could replace them with links to the lemma categories (now that you mention those), but why? - -sche (discuss) 17:53, 30 August 2018 (UTC)

@-sche, Ultimateria, Mx. Granger, Metaknowledge Would one of you take it upon himself to remove them? I don't think anyone will take exception to it. Per utramque cavernam 12:12, 7 September 2018 (UTC)

Since I've already brought up this same topic here this year and nothing happened, I have gladly removed all references to the Index namespace. Ultimateria (talk) 17:16, 7 September 2018 (UTC)
@Ultimateria: Thanks. I think the link to Wiktionary:Topics ("Topical index") should also be removed, at least temporarily, as that page is currently in RFD. By my doing, admittedly, but that page has already been RFD'ed before, and was kept only because of one keep vote (against 3 delete)... Per utramque cavernam 17:32, 7 September 2018 (UTC)
Thanks for bringing this up. I've replaced Wiktionary:Topics with Category:List of topics, which is really the same thing but more complete and self-updating. Ultimateria (talk) 22:24, 7 September 2018 (UTC)

Sino-Vietnamese readings vs. Vietnamese readings of Chinese characters that are not used in Vietnamese

The information provided by the Nom Foundation is sourced from various references, one of which is the Giúp đọc Nôm và Hán Việt (Nôm and Sino-Vietnamese Pronunciation Guide, abbreviated as "gdhn"). While using the query service [1] provided by the Nom Foundation, I've noticed that the book seems to be both a Vietnamese/Chinese multilingual dictionary, as well as a dictionary for Nôm characters. For example, in the case of phấn, the following entries are listed: KevinUp (talk) 11:02, 29 August 2018 (UTC)

(1)   Phấn (fèn)
(2)  Phấn (fen)
(3)  * (Hv phấn)
(4)  Phấn (fén)

I'm not sure why is listed twice, but based on the context provided, it seems that the first entry is for Chinese usage (something you would find in a Chinese-Vietnamese multilingual dictionary) while the second entry is for Vietnamese usage (Hv=Hán Việt), for the Sino-Vietnamese word that has been absorbed into the Vietnamese language. KevinUp (talk) 11:02, 29 August 2018 (UTC)

In view of this, are there any plans to separate the two contexts? This entry seems to be well formatted: - The {{vi-readings}} template is used to differentiate between (1) Hán Việt readings derived from fanqie (phiên thiết) and (2) Nôm readings for vernacular readings used in Vietnamese.

Currently, most Han characters use {{vi-hantu}} (created in 2006). Should we be using {{vi-readings}} instead of {{vi-hantu}} for better readability? KevinUp (talk) 11:02, 29 August 2018 (UTC)

@Mxn. —Suzukaze-c 03:03, 30 August 2018 (UTC)
I found a beer parlour discussion five years ago regarding the usage of the two templates. (Wiktionary:Beer parlour/2013/December#Nom character). I think {{vi-hantu}} should be gradually phased out in favor of {{vi-readings}}. KevinUp (talk) 15:52, 30 August 2018 (UTC)
Would it be possible for someone to fix Module:vi so that {{vi-readings}} is able to display the Han character in a Vietnamese font before the Hán Việt and Nôm readings? KevinUp (talk) 15:52, 30 August 2018 (UTC)
I've recently updated Module:vi in this edit here: [2]. The {{vi-readings}} template is now an improved version of {{vi-hantu}} and the older {{vi-hantu}} template should be considered deprecated. KevinUp (talk) 18:15, 31 August 2018 (UTC)

Here is an example of a Vietnamese reading of a Chinese character which is not used in Vietnamese: 飪, which has two readings: nhẫm and nhẩm [3]. I'm not sure how the two readings are obtained. Based on Middle Chinese phiên thiết, the reconstructed reading is nhậm (nhưthậm切). A query of "nhẩm" using the service provided [4] indicates the meaning of 飪 to be identical to its usage in Chinese. KevinUp (talk) 11:07, 30 August 2018 (UTC)

@Wyang Any thoughts on this? My main concern is the distinction between actual readings that have been absorbed into the Vietnamese language and those that are listed in Giúp đọc Nôm và Hán Việt to help native Vietnamese readers understand Chinese characters. KevinUp (talk) 11:07, 30 August 2018 (UTC)
Giúp đọc Nôm và Hán Việt is a poorly composed resource in terms of Hán/Nôm distinction and Vietnamese etymology. It confuses many Sinitic (Hán) readings with non-Sinitic ones as it overconservatively assigns valid Hán readings and senses as Nôm. For phấn, it realises "Bột tán" is a Sino-Vietnamese meaning, but still erroneously treats the senses "Bột tán", "Có dạng bột tán" as Nôm, thus the split into two entries in the dictionary. Similar examples can be seen in bao (Lo liệu trước sau (như Bao Hv): Mọi việc cứ để tôi bao; Lớp vây bọc: Bao thư; Bao gạo; Bao lơn; Cụm từ: Bao tử (*thai ở bụng mẹ; *dạ dày)), cân, dẫn, thương, nhận, etc. Also many of its readings are outright wrong: for example Hán-Việt of 空 should be không and khống, whereas the dictionary just has không and conflated khống with không.
For 飪, the regular Hán-Việt reading is nhẫm. The character was a rising-tone character in MC, which is realised as the ngã tone in Hán-Việt when the character initial is a sonorant. Note that 甚 had two readings in MC: 常枕切 and 時鴆切. The 如甚切 fanqie for 飪 is based on the former reading of 甚, the Hán-Việt reading of which is actually thẫm. In Modern Vietnamese, however, the former reading has merged into the latter, leaving only thậm as the SV for 甚. Generally speaking fanqie is still a reliable and handy method to use, but there are a number of caveats. Apart from choosing the correct MC reading, it is important to also bear in mind that fanqie was meant for MC, not Modern SV readings, and quite a number of rules often have to be applied in order to convert the "raw" fanqie to the correct SV reading.
Wyang (talk) 10:37, 31 August 2018 (UTC)
@Wyang: Thanks for the reply. In view of this we should stop using readings provided by Giúp đọc Nôm và Hán Việt. There are indeed a significant number of erroneous readings in it. For Nôm readings, Tự Điển Chữ Nôm Dẫn Giải would be a better resource as it is very well cited ([5]). As for Hán Việt readings, we can use Vietnamese dictionaries such as Từ Điển Hán Việt (Trần Văn Chánh, 1999) where the phiên thiết readings are available. There is also WinVNKey [6], although it tends to be slightly outdated in terms of character support. KevinUp (talk) 18:15, 31 August 2018 (UTC)
@Bumm13 Hi there. You might want to take note of this: When using {{vi-readings}}, we need to separate Hán Việt readings from Nôm readings. The Nôm Foundation lists these readings as "Hán-Việt reading" and "Vietnamese" respectively. Also, take note of the references in brackets: (btcn), (vhn), (gdhn), (tdhv). Readings from (gdhn) should be used with slight caution. KevinUp (talk) 18:15, 31 August 2018 (UTC)
Yeah, I didn't fully understand that difference (at least regarding the Nom Foundation source information) when I first started adding Sino-Vietnamese readings to the single-character CJKV articles back in 2012, so some cleanup of Vietnamese sections in those articles is needed. That said, I do know the difference between Hán Việt readings and Vietnamese Nôm and am careful not to confuse the two. Thanks for the heads-up on the (gdhn) source readings from the Nom Foundation. Bumm13 (talk) 02:44, 1 September 2018 (UTC)
@Bumm13 The use of {{vi-readings}} is part of this entry layout proposal. I don't have nearly as much insight into Vietnamese etymology as Wyang, but in my opinion even GĐHN is better than how the Unihan database completely conflated Hán-Việt and Nôm readings. (Presumably Unihan was the source of most of the {{vi-hantu}} usage around here.) FYI, the Vietnamese Wiktionary imported readings from {{R:WinVNKey:Lê Sơn Thanh}} with permission. I haven't heard any criticisms of WinVNKey's Hán-Việt and Nôm tables, but there are a couple of things to note:
  • To KevinUp's question, there's a lot of intentional overlap between the Hán-Việt and Nôm tables. For example, in vi:nhậm, is listed once as Sino-Vietnamese for "nhậm, nhẫm" and another time as Nôm for "nhậm, nhẫm, nhẩm". My understanding is that a reading will appear in both tables if it is used both in Chinese transcription and in Vietnamese text written in chữ Nôm. WinVNKey is an input method editor, after all, so its primary focus is transcription and transliteration rather than etymology.
  • Interesting. Turns out the reconstructed reading of nhậm based on Middle Chinese (nhưthậm切) is relevant after all. Yes, a reading will appear in both tables if the original meaning from middle Chinese is also used in Vietnamese Nôm. KevinUp (talk) 14:38, 2 September 2018 (UTC)
  • WinVNKey relies on the HAN NOM A/B fonts (?), which predate the last few rounds of CJK additions to Unicode. Though the vast majority of characters are encoded in the CJK Unicode blocks, some rarer Nôm characters were encoded in the Private Use Area and need to be converted. (We haven't begun this conversion at the Vietnamese Wiktionary, and I'm unaware of a good resource for doing so systematically.)
 – Minh Nguyễn 💬 13:11, 2 September 2018 (UTC)
  • I do have a copy of the HAN NOM A/B fonts and they seem to contain many glyphs that are not yet encoded in Unicode 11.0. I'm not sure whether some of the glyphs are actual Nôm characters or not. It might take some time for all of these characters to be encoded. KevinUp (talk) 14:38, 2 September 2018 (UTC)
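As a first step toward the Private Use Area conversion mentioned above, text can at least be scanned for private-use code points. A minimal sketch using Python's standard unicodedata module, where general category "Co" marks private-use characters; the function name and sample string are hypothetical, and which PUA code points the HAN NOM fonts actually use is not stated in this thread.

```python
import unicodedata


def private_use_chars(text):
    """Return the characters in `text` whose general category is Co (private use)."""
    return [ch for ch in text if unicodedata.category(ch) == "Co"]


# Hypothetical sample: ordinary text plus two private-use code points.
sample = "nh\u1eadm \ue000 \U000F0000"
for ch in private_use_chars(sample):
    print(f"U+{ord(ch):04X}")
```

This only finds the characters; mapping each one to its standard code point would still need a conversion table, which the comment above notes is not readily available.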
@Mxn: I think it is well established now that Unihan contains a large number of mistakes, eg. stroke count, pronunciation, definitions and is the main reason why cleanup is still ongoing for single-character CJKV entries. As for GĐHN, it does seem to have confused some Sino-Vietnamese characters as Nôm characters. I even found some Nôm characters that have been encoded with simplified Chinese components, eg. 𬖷 (⿰, U+2C5B7), 𬅂 (⿰, U+2C142), 𦫘 (⿺, U+26AD8). These characters appeared in GĐHN and do not have corresponding traditional forms, which is why I am slightly doubtful about the usability of GĐHN. KevinUp (talk) 14:38, 2 September 2018 (UTC)
Fortunately, there is the well-cited Tự Điển Chữ Nôm Dẫn Giải (Nôm Characters with Quotations and Annotations) by Prof. Nguyễn Quang Hồng which lists Chữ Nôm characters (and their readings/meanings) based on the 214 radical system. I think we can use this book for Nôm readings and Từ Điển Hán Việt (Trần Văn Chánh, 1999) which provides phiên thiết readings for Sino-Vietnamese readings. KevinUp (talk) 14:38, 2 September 2018 (UTC)
I've also updated Module:vi. All the readings are now listed under Han character, similar to kanji/hanja used in Japanese and Korean entries. Would you mind taking a look at to see if it looks okay? In future, we will have sections like Etymology 1, Etymology 2, similar to what is now being done to Japanese entries such as , which has combined readings under kanji and more detailed information under each etymology section. KevinUp (talk) 14:38, 2 September 2018 (UTC)
@KevinUp: I'm not entirely clear on what has changed in Module:vi. Is it just that the Han character is displayed at the beginning of both reading lists? If so, that seems fine to me. As for the entry layout, I wanted a long time ago to use "Character" instead of "Han character" as the heading, since I was concerned that "Han character" would lead to confusion with "Hán-Việt" even when a character is only used for chữ Nôm. But I don't feel too strongly about it, and perhaps my concern is unfounded given the prominent link to chữ Nôm in the template. – Minh Nguyễn 💬 21:51, 2 September 2018 (UTC)
@Mxn: Yes. It's just a minor change in appearance to match that of the deprecated {{vi-hantu}} template. On the other hand, I think the term Hán Nôm is more appropriate compared to "Han character" to match that of kanji and hanja. However, other editors might disagree because Hán Nôm is a non-English header. The terms Hán Nôm/Hán tự (Chinese characters used in Vietnam), chữ Nôm (demotic characters used for vernacular writing) and chữ Nho (literary Chinese characters) don't mean the same thing, but are often confused. There's also chữ Hán, which is the generic term for "Han character" and includes simplified Chinese characters. KevinUp (talk) 11:51, 3 September 2018 (UTC)
@Wyang Any thoughts on Vietnamese entry layout? Han character on top followed by Etymology 1, Etymology 2 with {{vi-noun}}, {{vi-verb}} under their respective etymology. I've noticed Korean entries seem to follow the opposite, with "Hanja" under Etymology 1,2,3. See for example. KevinUp (talk) 11:51, 3 September 2018 (UTC)
Thanks for the ping. I have no strong opinions on this. I think I have only very rarely edited non-Chinese sections on character pages, and will probably focus on the non-Han script entries for Korean and Vietnamese for now. Wyang (talk) 11:56, 3 September 2018 (UTC)
No worries. Vietnamese and Korean entries using quốc ngữ and Hangeul respectively are of higher priority. KevinUp (talk) 12:39, 3 September 2018 (UTC)
(樂 is my experiment and is by no means settled. —Suzukaze-c 21:38, 3 September 2018 (UTC))
Also, I've recently created Wiktionary:About Vietnamese/references and {{vi-ref}} so we should begin delisting external web references in favor of more reliable publications. KevinUp (talk) 11:51, 3 September 2018 (UTC)

yok and var

There is a bit of an edit war going on with the Turkish word "yok". We describe one as a determiner and the other as an adjective. It seems that one of these might be incorrect. Does anyone know what we should do? SemperBlotto (talk) 08:37, 30 August 2018 (UTC)

(I pinged a couple of our Turkish-speaking editors to comment at Talk:yok. - -sche (discuss) 17:42, 30 August 2018 (UTC))
However they are described, they should be described the same way. Neither can be used attributively, which may argue against labelling them as adjectives. However, that is how the Turkish Vikisözlük classifies them both, and so does the German Wikiwörterbuch. Wikipedia calls them “descriptive adjectives”. I’ve also seen them described as (extremely) defective verbs, but that is a bit far-fetched, linguistically speaking. In any case, “determiner” seems completely off the mark. Note that these words can also be used in diverse ways as other parts of speech, like a noun or an interjection, just like English yes and no. [Off-topic. Turkish words like bazı and her are currently described as adjectives. Shouldn’t these be classified as determiners?]  --Lambiam 20:22, 30 August 2018 (UTC)

Adding Interface Admins

Having no one on the wiki with these rights was a serious problem, so I was bold and added a few.

Please don't interpret this to mean that I'm handing them out to everyone who asks.

What I did was go through the list of admins and make some no-brainer choices: those who have shown a particular interest in and knowledge of the site's JS and CSS infrastructure, and have worked with it in the past. I consider it a stopgap measure, to be modified or ratified by the consensus of the community.

Now that we have minimal capability in that area, I'm going to stop acting unilaterally and leave it up to everyone else to come up with procedures for adding and/or removal of these privileges. I will, of course, implement whatever the community decides.

Thanks! Chuck Entz (talk) 02:36, 31 August 2018 (UTC)

@Chuck Entz: I think you should give it to everyone who asks — everyone who is already an admin, that is. And people seem to have agreed on that point above at #Editing of sitewide CSS/JS is only possible for interface administrators from now, where you haven't commented. Why does Aryaman not have it, for example? —Μετάknowledgediscuss/deeds 03:00, 31 August 2018 (UTC)
We don't have a good means to assess technical competence and good judgment, let alone security awareness (security being the principal rationale offered by our MW overlords). If I were to ask for the powers, it would not be in enwikt's interest to give them to me. I won't ask, but there may be some admins who will ask and should not be given the powers. Maybe some select committee (other Interface admins + Checkusers + Bureaucrats + template editors?) should say yea or nay to requests for the powers. DCDuring (talk) 03:45, 31 August 2018 (UTC)
Note that interface admins can be assigned temporarily. You can consider a precautionary measure to assign temporarily to all who ask for it, say for a month, until there is an established procedure. --Vriullop (talk) 08:00, 31 August 2018 (UTC)
Can you name a single admin whom you would not trust with the powers that they used to have? I can't. —Μετάknowledgediscuss/deeds 04:56, 31 August 2018 (UTC)
It's not about "trusting" an admin to not personally abuse global javascript- you also have to trust that they are responsible enough to not let their account get compromised. We could do something like require Special:Two-factor_authentication to be enabled for interface admins. DTLHS (talk) 17:23, 31 August 2018 (UTC)

Mecayapan Nahuatl

This language used to be called "Isthmus-Mecayapan Nahuatl" but was changed to "Mecayapan Nahuatl" at some point. Why? I can't find any discussion. --Lvovmauro (talk) 09:26, 31 August 2018 (UTC)

diff, I also cannot find this discussion. @-sche DTLHS (talk) 18:25, 31 August 2018 (UTC)
I renamed it because "Isthmus-Mecayapan Nahuatl" is so rare as a name for the language that it apparently doesn't even meet CFI (at least, I haven't spotted any books using it, trawling around Google Books), and in any case "Mecayapan Nahuatl" is far more common. In the past, in cases like this where we copied an unused Ethnologue name but another name was overwhelmingly more common and there weren't many entries, I didn't always bother starting RFM discussions. If there's a reason why the name should be changed again or changed back, we can discuss it now. :) - -sche (discuss) 20:06, 31 August 2018 (UTC)

"Kurmanji" and "Sorani" to "Northern Kurdish" and "Central Kurdish"

I have had a request to update some language names across the site, and knowing that this sort of thing is often contentious and nuanced I would like a second opinion. Does anyone object to updating the names as requested? @Calak, I am not questioning your judgment, I just don't know enough about the languages to blindly make bold changes. - TheDaveRoss 21:27, 31 August 2018 (UTC)

Check it in Ethnologue.--Calak (talk) 21:31, 31 August 2018 (UTC)
Ethnologue often uses disambiguatory names which are not common in literature, as seems to be the case here, which suggests our current names are better: I can find plenty of literature about google books:intitle:"Kurmanji", not much about google books:intitle:"Northern Kurdish", and plenty on google books:intitle:"Sorani", not much about google books:intitle:"Central Kurdish". But I'll also ping User:Şêr (who wrote much of WT:AKU and was still active on other projects recently) and other recently-active Kurdish speakers User:Ferhengvan and User:Bikarhêner in case they have opinions. - -sche (discuss) 21:51, 31 August 2018 (UTC)
Kurmanji is an accent of Northern Kurdish (in Turkey) and Sorani is an accent of Central Kurdish (in Soran province of Iraq). For example in Iraq, people who speak Northern Kurdish, never use "Kurmanji", they call their accent "Badini". In all scientific books, common terms for these dialects are NK (Northern Kurdish) and CK (Central Kurdish), I don't speak about google books. For example you can check prof. David Neil MacKenzie works, you can find Northern Kurdish and Central Kurdish in his works everywhere. + User:Vahagn Petrosyan and User:Ghybu.--Calak (talk) 23:03, 31 August 2018 (UTC)
Currently this seems like a no-go, until or unless I hear from others. - TheDaveRoss 15:18, 10 September 2018 (UTC)

September 2018

News from French Wiktionary[edit]


It's a pleasure to invite you to read the August issue of Wiktionary Actualités translated in English!

August Actualités arrived with brand new stats! Not only about French Wiktionary but about all Wiktionaries thanks to a great new tool! There is also an article about dictionaries of law in French, some shorts news and nice pictures from a campaign about remote but beautiful lands.

This issue was written by eight people and was translated for you by Pamputt and me. This translation can still be improved by readers (wiki-spirit). We hope you'll enjoy reading it, and we'll be happy to discuss it if you have any questions. Noé 12:32, 1 September 2018 (UTC)

Cleaning up citations[edit]

I've been noticing a plethora of grossly incomplete, misformatted citations such as the following at [[lot]]:

  • C-3PO
"We seem to be made to suffer. It's our lot in life." in Star Wars Episode IV: A New Hope.

What's missing are the name of the writer (not the fictional speaker), the date, a link to the script or some other way of confirming that these words were part of the movie and showing the context, etc.

We don't have a good template for requesting cleanup of such things. A page like [[lot]] might have a dozen or more problems with citations. I've been using {{rfdate}} to attract attention to such citations, but "date" has been interpreted hundreds of times as the birth and death dates of the author. I've been adding a comment so that when someone wants to add the date what appears in the edit window is "rfdate|and other bibliographic particulars". I've started adding a longer comment: "rfdate|and other bibliographic particulars, eg, title of work, page, url, full name of author"

{{quote-book}} can also be useful, as inserting "| title= |" generates a visible request to supply the missing title.

But these tools don't address the general case of missing citation information. And it is not reasonable to expect all of the citation problems on a page to be corrected by the contributor who happens to notice them. Often it is efficient to correct instances of bad citations of a single author across multiple entries rather than clean up an entire entry.

The problem is not tiny. Already more than 1,400 pages use {{rfdate}} and an unknown additional number of entries should have it. And there are many entries that have badly formatted citations.

How should we organize cleanup of citations on such entries? A single category called something like Category:English pages with citations problems ("pages" to recognize inclusion of pages in Citations namespace), into which multiple templates categorize, might be sufficient. Perhaps a single additional template, to be inserted into a somehow wrong citation, that placed the entry into such a category would be useful. Its first unnamed parameter could be a comment on what seemed wrong. DCDuring (talk) 15:46, 2 September 2018 (UTC)

We can probably run a dump analysis to make a list of all dodgy citations and take it from there - one thing we are good at is making cleanup pages. --XY3999 (talk) 09:01, 9 September 2018 (UTC)
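A minimal sketch of what such a dump analysis could look like, assuming the wikitext of each page has already been extracted from a dump. The heuristic is purely illustrative and is not an existing tool: a quotation line (starting with "#*") is flagged when it contains neither a four-digit year nor a quote-* template such as {{quote-book}}. The function name and the sample text are hypothetical.

```python
import re

# Assumed heuristic, for illustration only: a citation header line starts
# with "#*" (but not "#*:" which carries the quoted passage itself).
QUOTE_LINE = re.compile(r"^#+\*(?![:*])\s*(.*)$")
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")
QUOTE_TEMPLATE = re.compile(r"\{\{\s*quote-[a-z]+", re.IGNORECASE)


def dodgy_citations(wikitext):
    """Return citation header lines lacking both a year and a quote-* template."""
    flagged = []
    for line in wikitext.splitlines():
        match = QUOTE_LINE.match(line)
        if not match:
            continue
        body = match.group(1)
        if QUOTE_TEMPLATE.search(body) or YEAR.search(body):
            continue
        flagged.append(line)
    return flagged


# Hypothetical sample modelled on the [[lot]] example above.
SAMPLE = "\n".join([
    "==English==",
    "# One's fate.",
    "#* C-3PO",
    "#*: \"We seem to be made to suffer. It's our lot in life.\"",
    "#* '''1977''', George Lucas, ''Star Wars'':",
    "#*: It's our lot in life.",
    "#* {{quote-book|en|title=|passage=...}}",
])

print(dodgy_citations(SAMPLE))
```

A list produced this way would only be a starting point for a cleanup page; a year-based check misses citations that have a date but no author or title.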

Malay as an ISO 639 macrolanguage[edit]

According to the IANA language subtag registry [7], various subtags such as the following have been listed as belonging to the macrolanguage "ms" (Malay language):

In addition, language subtags for several Malay trade and creole languages are available but not listed under the macrolanguage "ms" (Malay language):

In view of the fact that the "ms" language tag refers to the Malay language, which is a dialect continuum used in the Malay archipelago (including parts of Thailand, Singapore, Brunei and Indonesia) rather than the Malaysian language ("zsm" language tag) or the Indonesian language ("id" language tag), are there any plans to create a new section for the Malaysian language using the zsm language tag, as well as new sections for the other languages? See ISO 639 macrolanguage for further reading. KevinUp (talk) 10:59, 3 September 2018 (UTC)

Whenever the Ethnologue/SIL/ISO has assigned separate codes for "[Language]" and "Standard [Language]", we (always?) use only the former code and treat the latter as redundant; this is also the case with Malay. Based on prior discussions recorded at Wiktionary:Language treatment, the redundant codes zsm and zlm are subsumed under ms, as are jak, orn, ors and tmw. The status of the remaining ones is unclear. There has been discussion in the past of merging some or all of the dialects, although a vote that proposed merging even the standardized register of Indonesian with Malay/ms failed (although it had relatively limited participation and was years ago, so new discussion is fine). It seems likely that several more of the dialects listed above should be merged (one would need to evaluate their level of mutual intelligibility, of course). - -sche (discuss) 17:50, 3 September 2018 (UTC)
That's a bit unusual. jak, orn, ors, tmw should not be merged with ms. These are Proto-Malay languages spoken by the indigenous people (Orang Asli) in peninsular Malaysia. The four languages are mutually intelligible among the Orang Asli but differ significantly with standard Malay in terms of vocabulary. In my opinion, the various subtag languages identified by the Ethnologue/SIL/ISO for the Malay language are well defined. Each language has certain vocabulary words that are mutually exclusive from other regional variants of Malay. Only zsm and zlm are redundant language tags. Where mutual intelligibility is concerned, the differences between these languages are similar to that of Danish vs. Norwegian, or Pennsylvania German vs. standard German, and not as similar as the differences between British, American and Australian English. KevinUp (talk) 21:25, 3 September 2018 (UTC)
On the other hand, I don't think Indonesian and standard Malay (used in Malaysia/Singapore/Brunei) are unifiable. Both languages have considerable differences in terms of spelling, grammar, pronunciation and vocabulary. I would suggest having a unified Malay section with a unified etymology section followed by Indonesian, standard Malay, and other regional variants below it, in line with the ISO 639 macrolanguage definition. Are there any other languages using this format? KevinUp (talk) 21:25, 3 September 2018 (UTC)

See also:

Per utramque cavernam 12:25, 4 September 2018 (UTC)

I am of the opinion that the two languages (Indonesian and standard Malay) have diverged significantly due to political differences, similar to that of Danish and Norwegian. Nevertheless, they share the same etymology. Hence, they should be unified. I suggest having a unified Malay section (as an ISO 639 macrolanguage) with subheaders consisting of Indonesian, standard Malay, and other regional variants as listed above. I don't think it is a good idea to unify standard Malay and Indonesian because there are differences in terms of verb conjugation and derived terms between the two languages. Note that verb conjugation in regional dialects or informal speech is much simpler (or sometimes not used at all) compared to formal speech or writing, where it is frequently used for grammatical perfection. KevinUp (talk) 14:32, 4 September 2018 (UTC)
You (KevinUp) appear to state first that in your opinion Indonesian and standard Malay should be unified (because they share their etymology) and next that unifying them is not a good idea (because of differences). Did you change your mind halfway through the comment? If not, which should it be?  --Lambiam 10:16, 5 September 2018 (UTC)
Malay varieties don't have much inflection, and differences between standard varieties such as Malaysian Malay, Indonesian Malay and Bruneian Malay (the most important ones) are minimal. They definitely can be unified under one L2, but I don't see this happening for lack of dedicated Malay editors, especially native speakers. Malay would not have grammatical hurdles compared with Serbo-Croatian, which was successfully unified for the purpose of this dictionary. --Anatoli T. (обсудить/вклад) 10:33, 5 September 2018 (UTC)
@Lambiam: What I mean is that Indonesian and the existing Malay section (which seems to refer to standard Malay rather than the dialect continuum) should be placed in a unified Malay section. The header would still be the same, ie. Malay while the existing Malay section would be renamed as "standard Malay" and placed under it. This is because where etymology is concerned, the "ms" language tag which refers to the dialect continuum used in the Malay archipelago (including both standard Malay and Indonesian) is often confused with standard Malay ("zsm") used in Brunei, Singapore and Malaysia. For example, the term "Jepun" is standard Malay rather than Malay (the term is no longer used in Indonesia), while "Jepang" is Indonesian rather than standard Malay (the term is no longer used in standard Malay). However, "Jepang" is well expanded under the Malay section even though the word is more commonly used in Indonesian rather than standard Malay. As mentioned before, the existing Malay section is slightly confused with standard Malay and I hope this issue can be sorted out. KevinUp (talk) 14:36, 5 September 2018 (UTC)
Although the inflected forms in the various regional standards seem to be largely identical, some forms only exist in one of the languages and not the other. For example, dijepangkan, jepangkan, menjepangkan, penjepangan (listed as derived terms in the Malay section of "Jepang") are not used in standard Malay. I don't think there is an equivalent word for Japanification in standard Malay. Nouns are not inflected in standard Malay and something like "dijepunkan" would be considered ungrammatical. The correct term in standard Malay would be "dijadikan Jepun" (to be japanized). Another example would be inflected forms using memper- (used to convert adjectives into verbs in standard Malay and Indonesian), which has a similar but unrelated form, memper- + "noun/adjective" + -kan, used in certain words such as mempertingkatkan ("to improve; to elevate"), memperkasakan ("to empower"), mempertahankan ("to defend"), which seem to exist only in standard Malay and not Indonesian. If we were to use a unified Malay section (unified as in both languages combined into one with labels being used to differentiate between the two), I think users might become even more confused. On the other hand, if we were to redefine Malay as standard Malay and place it under a unified Malay section (Malay as a macrolanguage) along with Indonesian and other regional variants, we can still use citations and quotations to differentiate between the different languages while maintaining a single etymology section to make it easier for English readers to read up on the origins of certain words such as "Japan". KevinUp (talk) 14:36, 5 September 2018 (UTC)
What you have described so far looks like the Ancient Greek dialects Attic, Ionic, Doric etc. (also spread through clumped islands), or like Coptic is shown as Bohairic, Sahidic, Fayyumic, Akhmimic, Lycopolitan. One can put multiple tables under one header. Randomly picked example: θεάομαι (theáomai). Is this what you like? Fay Freak (talk) 02:31, 6 September 2018 (UTC)
Thanks for the reply. Very interesting to find out about the various conjugated forms of ancient Greek dialects. Fortunately, where inflection is concerned in the Malay language, both standard Malay and Indonesian have the same general structure. There are up to fifty possible combinations, including prefix-suffix combinations and active-passive forms, most of which can be found here: Template:ms-der. Unfortunately, it is not possible to create a standardized table or chart for all the conjugated forms. Some verbs only have three forms (eg. the recently created senam (to exercise)) while some verbs may have up to 12 different forms (eg. sakit (to be sick)). Some forms may be considered deprecated or non-existent in one of the languages but not the other. I wonder if there are other languages which create conjugated verbs only out of necessity? KevinUp (talk) 07:01, 6 September 2018 (UTC)

Malayan languages vs. Malay language[edit]

AFAIK Pattani Malay (of Thailand) at least is not written in Latin script; it regularly uses the Thai and localised Arabic scripts, so it cannot share the same entries. Pattani Malay and Malaysian Malay are also very different in reading and grammar, so their speakers cannot communicate with each other. --Octahedron80 (talk) 02:20, 6 September 2018 (UTC)

Yes well, even though Pattani Malay uses the Thai script, the IPA pronunciation of its vocabulary is the same as that of Kelantanese or Kelantan Malay (in northern Malaysia) written using the Latin script. Unfortunately, we can't merge Pattani Malay and Kelantan Malay due to script differences. Also, we do not yet have Kelantan Malay entries (but they can be created). Note that both varieties have the same article (Kelantan-Pattani Malay) on Wikipedia. Are there any other languages on Wiktionary that have this kind of situation (same IPA pronunciation, different script)? Also, we may need to discuss whether Malayan languages can be considered the same as the Malay language on Wiktionary. Both terms may have been confused because native speakers of Malayan languages (ie. non-standard varieties of Malay) are also taught the standardized version of the language (bahasa Melayu) in schools. KevinUp (talk) 07:01, 6 September 2018 (UTC)
FYI Pattani and Kelantan and that region were in the same country until Britain cut them in half, that's why. And Malaysia began to use Latin script since then. --Octahedron80 (talk) 16:34, 6 September 2018 (UTC)
Malay native speaker here. I think Malay and Indonesian should be separated. We could separate it by having Malay entries in Jawi (localised Arabic script) and Indonesian entries written in Roman script. Like Hindi and Urdu. --Tofeiku (talk) 11:17, 6 September 2018 (UTC)
Seems like there is also the issue of Malay entries with the same content but using different scripts. This also needs to be sorted out. Compare -nya and , bait and بيت for example. KevinUp (talk) 18:43, 6 September 2018 (UTC)
I already have a solution for (standard) Malay: making Latin primary and Jawi secondary -- by putting the ms-jawi template instead of copying them around -- because it is easier to search by A-Z. I have been doing this for some time. For Pattani Malay, I make Jawi primary because there is no Latin. By the way, Pattani's Jawi recently has a new orthography that differs from (standard) Malay's. --Octahedron80 (talk) 02:48, 7 September 2018 (UTC)
Any source about the new orthography that I could read online? ^^ And also, IIRC, Pattani Malay is kind of a dialect like any state dialects in Malaysia. The Malay community in Thailand uses Jawi script and Standard Malay since they also have their own Dewan Bahasa dan Pustaka. It's like Brunei I guess, they have a Bruneian dialect but for formality they use Standard Malay. --Tofeiku (talk) 14:54, 7 September 2018 (UTC)
@Tofeiku User Talk:Octahedron80#Etymologies of Pattani Malay Terms --Octahedron80 (talk) 02:25, 8 September 2018 (UTC)
@Octahedron80: Good solution. There are now 162 entries under Category:Malay terms with Jawi spelling. Entries using Jawi spelling can be linked back to equivalent entries using the Latin script via the {{ms-jawi}} template. On the other hand, although Pattani Malay and standard Malay are significantly different from one another, Pattani Malay (southern Thailand) and Kelantan Malay (northeastern state of Malaysia) are equivalent forms. The two regions were cut off by the British as mentioned before, resulting in Pattani Malay using the Thai script and Kelantan Malay using the Latin script (in informal writings, eg. diaries and letters). Currently, Pattani Malay entries using the Jawi script ([8]) seem to be equivalent to that of Kelantan Malay. Unfortunately, we do not have editors that are proficient in Kelantan Malay to create entries for Kelantan Malay using the Latin script. In future, Pattani Malay entries using Jawi script instead of Thai script may need to be renamed as Kelantan-Pattani Malay to avoid ambiguity. KevinUp (talk) 15:56, 7 September 2018 (UTC)

@Sgconlaw I think you might be interested in mbf (Baba Malay), also known as Peranakan Malay. The language is almost extinct among the younger generation, but plenty of resources can be found at the National Library of Singapore.

Are there examples of languages that concurrently use two different types of scripts due to the separation of geographical borders? How shall we deal with this type of situation? KevinUp (talk) 18:43, 6 September 2018 (UTC)
@KevinUp: Of course, there are a few. Malay itself, just have a look at Malay lemmas. E.g. اءور and aur are two forms of the same word. Serbo-Croatian is written in Cyrillic and Roman. --Anatoli T. (обсудить/вклад) 02:58, 7 September 2018 (UTC)
Romanised Malay itself has a lot of homographs unlike Jawi Malay. For example, bait: بيت (house) and باءيت (byte), bala: بلاء (disaster) and بالا (troop). I see that Traditional Chinese characters are used over Simplified, although most people use Simplified characters. Why not use Jawi Malay as well here? In Brunei and Pattani, Thailand, Jawi can be seen everywhere. Malaysia also still uses Jawi, but not as much as those two places. --Tofeiku (talk) 03:15, 7 September 2018 (UTC)
@Tofeiku We can just separate etymology sections like other languages do; no problem. --Octahedron80 (talk) 03:19, 7 September 2018 (UTC)
The terms بيت and باءيت (romanized as bait) as well as بلاء and بالا (romanized as bala) do not share the same etymology. When such situations are encountered, it is best to split the romanized Malay entry into separate etymology sections. The |gloss= parameter can then be added to {{ms-jawi}} to link the Jawi entries to their related meanings in romanized Malay. As for Chinese characters, one of the reasons why simplified Chinese characters are linked back to traditional Chinese characters (despite simplified Chinese being used more often) is because of the approach taken by various Chinese dictionaries published in mainland China such as Hanyu Da Zidian, Hanyu Da Cidian and Zhonghua Zihai which refer back to traditional characters for headwords involving simplified characters. This is done because some simplified Chinese characters such as 发 are derived from two different traditional characters: 發 (fā, to issue; to develop) and 髮 (fà, hair). KevinUp (talk) 15:56, 7 September 2018 (UTC)
On an unrelated note, I noticed that Serbo-Croatian more and море have almost identical layout and content. I'm not sure whether that was intended or not. KevinUp (talk) 15:56, 7 September 2018 (UTC)

Statistics for Malay/Indonesian entries[edit]

Hi. Would it be possible for someone to figure out the number of Malay and Indonesian entries that are currently available? In addition, I would like to know the number of entries which have both Malay and Indonesian sections in them. Thank you very much. KevinUp (talk) 14:36, 5 September 2018 (UTC)

@KevinUp: You can find entries with both a Malay and an Indonesian header by searching for insource:/==Malay==/ insource:/==Indonesian==/. Currently there are 1,033 of them. — Eru·tuon 02:11, 6 September 2018 (UTC)
@KevinUp: As for the other question, the numbers of entries (lemmas and non-lemmas) can be found in Category:Malay lemmas, Category:Indonesian lemmas, Category:Malay non-lemma forms and Category:Indonesian non-lemma forms. --Anatoli T. (обсудить/вклад) 02:16, 6 September 2018 (UTC)
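For anyone who would rather count offline than via the search box, the same check can be sketched against page text taken from a dump. This is only an illustration under stated assumptions: `pages` is a hypothetical iterable of (title, wikitext) pairs from some dump reader that is not shown, and the function names are made up for this example.

```python
import re

# A level-2 language header sits on its own line, e.g. "==Malay==".
# Deeper headers such as "===Noun===" are deliberately not matched.
L2_HEADER = re.compile(r"^==[ \t]*([^=\n].*?)[ \t]*==[ \t]*$", re.MULTILINE)


def languages_in(wikitext):
    """Return the set of level-2 (language) header names found in a page."""
    return set(L2_HEADER.findall(wikitext))


def count_shared(pages, first="Malay", second="Indonesian"):
    """Count pages that contain both language sections.

    `pages` is assumed to be an iterable of (title, wikitext) pairs.
    """
    return sum(
        1 for _title, text in pages if {first, second} <= languages_in(text)
    )


# Hypothetical miniature data set standing in for a real dump.
PAGES = [
    ("bait", "==Indonesian==\n===Noun===\nx\n\n==Malay==\n===Noun===\ny"),
    ("aur", "==Malay==\n===Noun===\nz"),
    ("kata", "==English==\n===Noun===\nw"),
]

print(count_shared(PAGES))
```

Matching whole header lines avoids counting pages where "==Malay==" merely appears inside a deeper header or running text, which a bare substring search could pick up.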

Thank you. The statistics for current entries of Malayan languages (as of Sept 2018) are as follows: KevinUp (talk) 07:01, 6 September 2018 (UTC)

Possible outcomes[edit]

A possible outcome of this entire discussion is to rename the current Malay section as "standard Malay" so as not to confuse the dialect continuum with that of the standard variety used in Brunei, Singapore and Malaysia. KevinUp (talk) 07:01, 6 September 2018 (UTC)

@KevinUp: It is an option but it's still a significant change and needs to go through a vote. --Anatoli T. (обсудить/вклад) 07:14, 6 September 2018 (UTC)
@Atitarev: Indeed. That is up to the community to decide. KevinUp (talk)
FWIW I doubt that will happen; we tend not to use "Standard X" names; "Malaysian" might work, though. A Chinese-style merger could work if there were enough editors knowledgeable of Malay interested in implementing and maintaining it, but if there is opposition to merging Malaysian and Indonesian, that'd seem to be a roadblock. FWIW script differences are not inherently an impediment to merging things under one L2; Serbo-Croatian is written in multiple scripts, even e.g. Afrikaans has some entries in Arabic script; for that matter, some varieties of Chinese are written in Arabic or Cyrillic. - -sche (discuss) 18:45, 6 September 2018 (UTC)
For naming “zsm” (Bahasa Malaysia) I think “Malaysian” works better than “standard Malay”. The family “ms” of Malayan languages can be named “Malayan” instead of “Malay“. The term “Malay”, without further qualifications, could then be reserved for the (many) cases in which Bahasa Malaysia and Bahasa Indonesia agree.  --Lambiam 19:17, 6 September 2018 (UTC)
I'm sticking to calling it (Standard) Malay. This language is not exclusively for Malaysia only. Brunei Darussalam and Singapore call their national language bahasa Melayu (Malay language). --Tofeiku (talk) 03:00, 7 September 2018 (UTC)
Standard Malay is the description found in the IANA language subtag registry [9] for the "zsm" language code. The use of the "zsm" language code has remained largely obsolete due to "ms" being the preferred code before it was redefined as a macrolanguage in ISO 639-3 to refer to the dialect continuum used in the Malay archipelago. Although "Malaysian" might work, native speakers in Brunei and Singapore as well as Malay speakers living overseas might object to it. KevinUp (talk) 19:32, 7 September 2018 (UTC)
As for naming the "ms" family of languages as Malayan rather than Malay, I think some users may confuse Malayan with that pertaining to Malaya, the region formerly ruled by the British that includes peninsular Malaysia (also known as west Malaysia) and Singapore. I suggest we treat Malay to be the same as "bahasa Melayu". Malay can be used as an umbrella term to refer to any of the regional variants used in the Malay archipelago. Note that regional forms such as Brunei Malay and Pattani Malay are translated as "bahasa Melayu Brunei" and "bahasa Melayu Pattani" in the Malay language. We can still use Malay to refer to the many cases where both standard Malay and Indonesian are similar, but I don't think we need to reserve the term for that purpose, because Malay or bahasa Melayu is just a general term that can refer to any of its varieties. For more examples, see Talk:bahasa Melayu. KevinUp (talk) 19:32, 7 September 2018 (UTC)
As for the possibility of unifying Indonesian and standard Malay, I would like to reiterate that the differences between the two languages are akin to that of Danish and Norwegian. Due to Dutch influence and the influence of the Javanese language, the Indonesian language has a wider source of loanwords and has evolved considerably compared to standard Malay ("bahasa Melayu piawai" or "bahasa Melayu baku") which is based on the language used during the time of the Riau-Lingga Sultanate that had a flourishing literary culture. KevinUp (talk) 19:32, 7 September 2018 (UTC)
Kindly peruse the following pages for further reading:
Support merger under a "Malay" heading, as per the comments in previous discussions. The differences can be handled by labels, lists of synonyms, etc. Wyang (talk) 08:15, 12 September 2018 (UTC)

Use cases[edit]

Would adding information about use cases of a word not be useful? As a kind of separate section with references and excerpt quotations. I'm thinking about old words which have interesting histories in early writing, or new words that just aren't as old as people say they are. -Inowen (talk) 07:42, 5 September 2018 (UTC)

Isn’t that the purpose of the example sentences (aka usexes) following a definition, to show the uses of a word by applying it in appropriately illustrative contexts? (Unfortunately, many current usexes are not particularly helpful, but that is another discussion.) If you have something else in mind, could you give an example of what you’d like to see?  --Lambiam 10:30, 5 September 2018 (UTC)

New language request: Milpa Alta Nahuatl[edit]

This language currently does not have an ISO code, and is not recognized as a dialect of any other variety of Nahuatl (at least according to Ethnologue). --Lvovmauro (talk) 03:40, 6 September 2018 (UTC)

Read-only mode for up to an hour on 12 September and 10 October[edit]

13:33, 6 September 2018 (UTC)

Arawakan tree[edit]

So, the tree for Arawakan is very underdeveloped (see {{#invoke:family tree|show|awd-pro}}), many languages have less common spellings and forms (ex. Pareci vs. Paresi), several dialects are marked as languages (ex. Ashéninka Pajonal), a few languages are still lacking codes (ex. Wiriná), etc. Since I appear to be the only one that has any interest in this family and the fact that most languages don't even have entries to them, instead of painstakingly indexing all the needed changes, I'd like to just have a whack at cleaning it up. Does anyone have any objections? @-sche, Metaknowledge? --Victar (talk) 15:42, 6 September 2018 (UTC)

I have no knowledge of those languages or the literature on them, although -sche might (so I'd wait for them to respond). I'm fine with letting you go at it, though. —Μετάknowledgediscuss/deeds 17:39, 6 September 2018 (UTC)
Yeah, a lot of our language families are pretty underdeveloped. (Meta has been working for some time to fix names of Bantoid languages...) Go for it; that (i.e. not bothering with discussions) is what I've done in the past for languages with no entries where I expected the changes wouldn't be controversial. If you merge any codes/languages, make a note in WT:LT (I guess link to this thread as the "discussion"), and try and keep the merged varieties' names as "otherNames" so it's clear how they're being included. If any of the changes seem questionable, I'll bring it up here, though I doubt that'll happen since you tend to know what you're talking about! :) We do have a tiny handful of users with specialist knowledge of Arawakan, like Emi-Ireland, though they haven't been active recently. - -sche (discuss) 19:06, 6 September 2018 (UTC)
@-sche, Metaknowledge, thanks, and I'll be sure to put in a merge request when I switch the lang codes to etym-only codes for the Asheninka dialects. --Victar (talk) 20:25, 6 September 2018 (UTC)

Categorize Babel categories by language family[edit]

The thread above this gave me an idea: wouldn't it be useful to categorize "user language" categories, e.g. Category:User wau which contains users who speak Wauja, into language-family categories, i.e. Category:User awd (or whatever naming format, like "User language family awd")? That way, if you are in need of someone who speaks Wauja, you go to Category:User wau like always, but if you're in search of people with knowledge of e.g. any Arawakan languages, you don't have to try out "Category:User xxx" for every different language to see which ones exist and have users, you just go to Category:User awd and use the "▷" buttons to find such users. (As for how to bring about the categorization, it seems like someone could write a bot to add the families based on the families that are set in Module:languages, where a respectable 6518 of 8054 languages have family info already; it's not an urgent or high priority, so it's fine if no-one has time at the moment. Special cases like "User sr" can be done by hand.) - -sche (discuss) 19:20, 6 September 2018 (UTC)

Which level of language families? "Indo-European" or "Germanic", or both? DTLHS (talk) 19:31, 6 September 2018 (UTC)
I suppose it should follow the same categories as the languages do in Module:languages, for simplicity / ease of implementation and maintenance. (So, "CAT:User de" in "CAT:User gmw" in "CAT:User gem" in "CAT:User ine".) - -sche (discuss) 20:45, 6 September 2018 (UTC)
Doesn’t work anyway with Metawiki. Fay Freak (talk) 19:38, 6 September 2018 (UTC)
Huh? Just add the categories... to the category pages... - -sche (discuss) 20:45, 6 September 2018 (UTC)
I mean any plan here does not catch users who keep their user pages on Metawiki. Fay Freak (talk) 23:15, 6 September 2018 (UTC)
As long as the user's page on this wiki is categorized into CAT:User wau by any means, we can put [[Category:User awd]] at the bottom of our page [[Category:User wau]], and effect this categorization on our wiki. Right? Whether metawiki wants to copy our system or not seems like a separate matter. - -sche (discuss) 23:22, 6 September 2018 (UTC)
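The nesting described in this thread ("CAT:User de" in "CAT:User gmw" in "CAT:User gem" in "CAT:User ine") could be computed by a bot roughly as follows. This is a sketch only: the `PARENT_FAMILY` table is a hand-written, hypothetical excerpt standing in for the family data in Module:languages, whose real structure is not reproduced here, and `category_chain` is an illustrative name.

```python
# Hypothetical excerpt of code -> parent-family data. The real data lives in
# Module:languages and related modules; this dict is only a stand-in.
PARENT_FAMILY = {
    "de": "gmw",   # German -> West Germanic
    "gmw": "gem",  # West Germanic -> Germanic
    "gem": "ine",  # Germanic -> Indo-European
    "wau": "awd",  # Wauja -> Arawakan
}


def category_chain(code, parents=PARENT_FAMILY):
    """Return the user category for `code` followed by its family categories."""
    chain = [f"Category:User {code}"]
    seen = {code}
    while code in parents:
        code = parents[code]
        if code in seen:  # guard against accidental cycles in the data
            break
        seen.add(code)
        chain.append(f"Category:User {code}")
    return chain


print(category_chain("de"))
print(category_chain("wau"))
```

Each category page would then only need to be placed in the next category of its chain, so the data for a language is consulted once per category page rather than once per user.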


Suggest adding Proto as a language itself for word and word particle definitions, particularly where lexemes are primary. -Inowen (talk) 17:55, 7 September 2018 (UTC)

Proto-Indo-European is a reconstructed language- it is nowhere attested in actual use. It doesn't belong in the dictionary itself, but we do have entries in the Reconstruction namespace. See WT:AINE for details. Chuck Entz (talk) 19:36, 7 September 2018 (UTC)
Reconstruction is sometimes the way we know about languages which in real life were spoken natural languages. It's what we've got, the best we can do, and since research into Proto is constantly being improved, it's the place where a lot of holistic knowledge about European-Eurasian language roots is gathered. Roots are the main idea - the simple words which are at the base of the tree of a lot of words. Proto is just the name for all of that research; it's not a "reconstructed language" as much as a dictionary of roots (nobody speaks Proto anymore, do they). -Inowen (talk) 18:56, 8 September 2018 (UTC)

Vowel Lengths in Ancient Greek[edit]

For the life of me I cannot seem to uncover the sources of some of the vowel lengths so confidently produced on Wiktionary's Ancient Greek pages. On what authority do we hear that Κικέρων had a short ι (Κῐκέρων)? As far as I am aware Cicero's name does not show up in any surviving Hellenistic poetry, only prose (Plutarch, Appian; Strabo?) - and I am not convinced simple transcription from Latin to Greek conserves vowel lengths robustly enough for us to claim this quantity as 'known'. (Interestingly, the Latin page for Cicero - https://en.wiktionary.org/wiki/Cicero#Latin - seems to lack the vowel lengths; where, however, they would be very much justified (Cĭcĕro, ōnis is well-attested). Why are there so few macrons on Latin words?)

Do not get me wrong. The presence here of macra and breves in our Greek pages is in my opinion desirable, even indispensable. The fact is that Wiktionary is one of few places where a student or scholar can find good information on vowel quantities, whether for prosodic purposes or phonetic (pronunciation); and we must retain this. But the approach taken has perhaps been over-zealous, if the editors feel no vowel can ever be left unmarked. Thus my question is really a) an open one - can someone inform me what the procedure is for deciding these vowel lengths, or from what source this information is reproduced? - maybe someone has found a lexicon that can corroborate these vowel lengths? - or, failing that, b) a suggestion, or plea, that unknown quantities be removed, and macra and breves used only where we actually do have them on good testimony. —This unsigned comment was added by 2A00:23C4:3988:C101:25C0:90F9:EB63:6D8A (talk) at 18:29, 7 September 2018 (UTC).

In general transcription from Greek to Latin preserves vowel length (for example, ̓́Ῑκᾰρος → Īcărus), so I expect transcription in the opposite direction to behave the same way.  --Lambiam 00:12, 8 September 2018 (UTC)
I don't know as much about policy for Latin, but for Ancient Greek, editors agreed on adding macrons and breves zealously (see Wiktionary:About Ancient Greek § Diacritics and accentuation). It's fairly common for a lexicon not to mark short vowels with breves, but we include breves because without them it's unclear whether an unmarked vowel is short or just hasn't had a macron added yet.
Occasionally breves in Wiktionary are not specifically supported by a lexicon or other evidence (perhaps most commonly with proper nouns), but are added because if LSJ or someone else doesn't indicate a vowel is long, it is probably short because lexicons are fairly diligent about marking long vowels, and because short vowels are more common. But I second what Lambiam says about Κικέρων (Kikérōn). — Eru·tuon 00:29, 8 September 2018 (UTC)

What categories labels generate[edit]

I have discovered that {{lb|en|uncommon}} generates Category:English rare forms, whereas {{lb|en|rare}} generates Category:English terms with rare senses. So you can use "uncommon" if a term has only one sense, but it is defined in the category as "English terms that serve as rarely used forms of other terms". I'm scratching my head. DonnanZ (talk) 23:28, 7 September 2018 (UTC)

We do have both Category:English terms with rare senses (>9100 members) and Category:English terms with uncommon senses (1 member, namely Category:English uncommon forms (20 members)). I don't see the point of categorizing "English uncommon forms" as "English terms with uncommon senses". "See also" links between the two would be convenient and less confusing.
It would also help if we had some criteria for applying "uncommon" and "rare" to definitions. "Rare" would seem to me to mean less common than "uncommon". It is not clear to me whether the frequency of the sense is uncommon or rare relative to the whole of the English language (but still meeting RfV) or to the total use of the word (lemma). I think one good use of these labels for English definitions is in part to discourage the use of both rare and uncommon terms in defining FL terms, so perhaps precision isn't required. DCDuring (talk) 20:13, 17 September 2018 (UTC)

Creation of Cajun French, Missouri French, and Louisiana Spanish tags[edit]

1. I have created many a definition for Louisiana and Cajun French entries, but we have nary a dedicated Cajun French tag linking to a category. There exists the tag "Cajun" that shows the word "Louisiana" and uses a blue link to direct to Cajun Peoples on Wikipedia, but it links only to the Louisiana French category.

2. I would like to begin adding Missouri French (Paw-paw French) definitions, but we have neither a category nor a tag for it.

3. I also ask that we make a Louisiana Spanish tag linked to the Louisiana Spanish category, as there are region-specific words available. Aearthrise (𓂀) 15:18, 9 September 2018 (UTC)

Also, can someone please create such tags for Travancore English, South America English, and Nicaragua Spanish? I've seen them used in entries but they don't result in categories. 22:56, 9 September 2018 (UTC)

French Dialects of America[edit]

@Per utramque cavernam We should make a CFI exception for more poorly attested French dialects, like Missouri (Paw-Paw), Louisiana (Louisiana Colonial), and Cajun (Acadian) French. Aearthrise (𓂀) 18:53, 9 September 2018 (UTC)

I second this. Minority forms of a language can't be treated with the same rigour as the main dialects, and these forms of French are pretty distinctive (and fairly localized). Andrew Sheedy (talk) 00:28, 10 September 2018 (UTC)
I agree (see the convo that prompted this suggestion). I suppose it's not limited to French, by the way. Per utramque cavernam 08:34, 12 September 2018 (UTC)

Replace {{t-needed|xx}} with {{t|xx}} (with no term)[edit]

All of our other linking templates already display a request when the term hasn't been provided, so why not do the same with translations? Module:translations would need to be modified so that it displays the way {{t-needed}} currently does, if there is no term, and of course add the category. The translation added script would also need to be modified. As an added bonus, you can differentiate between {{t}} and {{t+}} even if en.Wiktionary itself has no translation yet. —Rua (mew) 12:38, 10 September 2018 (UTC)

Support --Suzukaze-c 21:54, 11 September 2018 (UTC)



Next month, the German Wikicon, the French Wikiconvention, the annual meeting of Wikimedians from Central and Eastern Europe (CEE Meeting) and WikiConference North America will be four opportunities for contributors to meet. For the first, third and last ones, I haven't seen anything about Wiktionary, but I'll be at the French one in France, and I'll be part of a team that will run a workshop, "How to enjoy Wiktionary", along with some other meetings and talks about cool stuff in our project.

Well, do any English Wiktionary editors plan to go to any of those events, or have any of you taken part in an event of this kind in the past? Have you ever met another Wiktionarian IRL in the past 15 years? Noé 15:32, 11 September 2018 (UTC)

  • I met one Wiktionarian IRL. It was fun. --XY3999 (talk) 06:58, 17 September 2018 (UTC)
  • <thinks @Equinox should attend> I have met quite a number of Wiktionarians at Wikimanias, and a few other events. Mostly the meetings felt awkward. Perhaps if the events were less stressed… - Amgine/ t·e 18:27, 17 September 2018 (UTC)
I have no idea what goes on at these "Wikimanias". Why are they stressful? DTLHS (talk) 18:42, 17 September 2018 (UTC)
I find them stressful because of the quantity of different personalities attempting to meet together in order to coordinate sometimes disparate goals and efforts. But also because it is really only about Wikipedia. - Amgine/ t·e 18:50, 17 September 2018 (UTC)
  • Not really interested in the "official"/corporate-ish Wikimedia events. Equinox 18:53, 17 September 2018 (UTC)
Can we make a bylaw that official Wiktionary meetings/meet-ups can only take place in musty basement rooms in public libraries? - TheDaveRoss 19:32, 17 September 2018 (UTC)

Module errors in categories with invalid canonical names[edit]

Could an admin delete all the categories in CAT:E with "Kamviri" and "Taino" in their names? Those canonical names are no longer valid and the pages have had module errors for several days. — Eru·tuon 20:47, 11 September 2018 (UTC)

V111P and Latin headings[edit]

Just noticed V111P's contributions moving Latin heading to L5 from L3. This is not reflected in WT:ELE or WT:ALA#Part_of_speech_headers. Is this a boldness which should be reverted or left alone? - Amgine/ t·e 15:03, 12 September 2018 (UTC)

Nope, looks like vandalism. (and dang I should know better about BP) - Amgine/ t·e 15:08, 12 September 2018 (UTC)
@Amgine: I don't see what's wrong with their edits. See WT:ALA#More complex cases. — justin(r)leung { (t...) | c=› } 15:13, 12 September 2018 (UTC)
Mmm, perhaps I should revert myself. The contribution history seems to me to be inconsistent; about half the time enforcing ELE, at others doing the opposite, whimsically in one edit correcting to L3 and also demoting another heading to L4... - Amgine/ t·e 15:28, 12 September 2018 (UTC)
In principle, part-of-speech headers should never be L5. If you find that occurring, there's probably something else that needs changing. —Rua (mew) 16:07, 12 September 2018 (UTC)
There's no point in worrying about header levels at all since every editor and language has their own quirks that aren't written down in WT:ELE or anywhere other than someone's head. DTLHS (talk) 16:53, 12 September 2018 (UTC)
FWIW, I just went through a batch of their edits to JA entries, and it looked to me like they were trying to match ELE in good faith, getting confused in a couple places by complicated entry structures (such as accidentally moving deriveds up from under the relevant POS, minor and easily understandable goofs). ‑‑ Eiríkr Útlendi │Tala við mig 16:58, 12 September 2018 (UTC)
At the moment I'm mostly fixing POS sections that are not subsections of a language, etymology or pronunciation section. Maybe I don't understand the rules for the pages with the Chinese and Japanese characters, so I may have made some mistakes there. --V111P (talk) 22:51, 12 September 2018 (UTC)

Finnish pronunciation template[edit]

Today I wanted to add a pronunciation to a word, specifically huuhkaja.

I often use Wiktionary as a translation dictionary and as a way to compare cognates etc. in different languages, but when doing this I often find that entries don't have pronunciations, which to me are very important for understanding new words in languages I'm not familiar with. So whenever that happens, I'm thinking about just trying to find audio for those words and (using the Wikipedia IPA guide for the language) adding the IPA pronunciations myself.

I have a good idea what the pitfalls of doing that could be and when I should be careful, and know sometimes I'd probably want to do a "request for pronunciation" instead. (I don't know how those work. Is it just like I add the template and they're listed in the category, and that's all there is to it? If I were to take on the maybe-more-sensible task of adding pronunciations to words in the English rfp category, would I just let those words exit the category once I've added a pronunciation, or do some of them remain in there if they don't have an audio file?)

As I was looking at other entries for reference though, I discovered something I really wasn't expecting to find - that Finnish words have some kind of special templates that can automatically insert hyphenations ( {{fi-hyphenation}} ) and pronunciations ( {{fi-IPA|*}} ). (for example, on "huuhkaja" when I was going to go insert /ˈhuːkɑjɑ/ as what I thought the word sounded like and tried the template instead it suggested: /ˈhuːhkɑjɑˣ/, [ˈxʷuːxkɑ̝jɑ̝(ʔ)] )

How does this work? Does the template simply use an algorithm to convert the letters in the word to a pronunciation guess?? Is Finnish a case of a very predictably-pronounced language where it's almost always best practice to use this template because even somebody who knew it really well would rarely have to fix the template's work? Are there other languages where this is the case?

Valenoern (talk) 05:24, 15 September 2018 (UTC)

RFP: If you want IPA, add {{rfp}}, and if you want audio, add {{rfap}}. They are removed as the request is fulfilled using {{IPA}} (or whatever) or {{audio}}.
Finnish: See Template:fi-IPA and Module:fi-IPA. Similar templates can be found here. Someone familiar with Finnish needs to say how predictable the pronunciation is.
Suzukaze-c 05:32, 15 September 2018 (UTC)
Please don't add pronunciations in languages you don't know. The fi-IPA template gives the correct pronunciation in this case: /ˈhuːhkɑjɑ/, [ˈxʷuːxkɑ̝jɑ̝] - no ˣ or (ʔ) in the end, where did you get that? - but Finnish spelling is not completely phonetic, for example the glottal stop is never written down. The fi-hyphenation template may produce nonsense for combined words. The templates are an aid for editors who know what the result should be. Nobody wants to go through all the IPAs and hyphenations to find mistakes added by a non-speaker. --Makaokalani (talk) 10:43, 15 September 2018 (UTC)
I suspect the ˣ came from the editor trying to use {{fi-IPA|*}} (note the * parameter), which adds the glottal stop at the end. It's intended for words that actually have them; most words would get by without it, with just {{fi-IPA}}. But as said, it's not automatically correct, and editors not familiar with Finnish should probably not try to add pronunciation templates. For compound words, one needs to supply a parameter with hyphens at word boundaries, such as {{fi-IPA|valta-tie}} for valtatie. For fi-hyphenation, it also needs hyphens at word boundaries, but only under certain conditions. SURJECTION ·talk·contr·log· 18:06, 19 September 2018 (UTC)
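
As a rough illustration of the question raised in this thread (whether a pronunciation can be derived from spelling by rule), here is a toy sketch in Python. It is not the logic of Module:fi-IPA. The function name toy_fi_ipa and its small letter table are assumptions made for this example only. It treats doubled letters as long sounds, swaps a handful of letters for IPA symbols, puts primary stress on the first syllable, and, as a further assumption, puts secondary stress at a hyphen-marked compound boundary, in the spirit of the {{fi-IPA|valta-tie}} parameter mentioned above. It ignores the final glottal stop, diphthong marking and everything else a Finnish speaker would need to check.

```python
# Toy spelling-to-IPA sketch for Finnish.
# NOT the algorithm of Module:fi-IPA; the table and rules are illustrative assumptions.
LETTER_TO_IPA = {"a": "ɑ", "ä": "æ", "ö": "ø", "v": "ʋ"}


def _convert_part(part: str) -> str:
    """Convert one compound component, turning doubled letters into long sounds."""
    out = []
    i = 0
    while i < len(part):
        ch = part[i]
        sound = LETTER_TO_IPA.get(ch, ch)
        if i + 1 < len(part) and part[i + 1] == ch:
            # A doubled letter (vowel or consonant) is written as one long sound.
            out.append(sound + "ː")
            i += 2
        else:
            out.append(sound)
            i += 1
    return "".join(out)


def toy_fi_ipa(word: str) -> str:
    """Return a rough phonemic guess; hyphens mark compound boundaries."""
    parts = word.lower().split("-")
    converted = [_convert_part(p) for p in parts]
    # Primary stress on the first component, secondary stress on later ones.
    return "ˈ" + "ˌ".join(converted)


print(toy_fi_ipa("huuhkaja"))  # ˈhuːhkɑjɑ
```

For huuhkaja this happens to agree with the phonemic form quoted in the replies, but that only shows the spelling is regular for this word. It says nothing about how reliable such rules are in general, which is the point made by the Finnish-speaking editors above.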

Genitive case in the Scandinavian languages[edit]

According to w:Danish grammar#Grammatical case, the "genitive" of Danish is really a clitic like in English. w:Swedish grammar#Genitive says the "genitive" in that language behaves the same, and attaches to the last word in the phrase and not the head noun as would be expected of a case. w:Norwegian language#Genitive of nouns again says pretty much the same thing. Given all that, and the fact that we decided to exclude possessives of English nouns on the same grounds, I think we should do the same for these three languages. —Rua (mew) 19:00, 17 September 2018 (UTC)

On'yomi and katakana[edit]

According to w:Katakana#Usage, on'yomi readings of kanji are typically written in katakana, not hiragana, in kanji dictionaries, because they were historically borrowed from Chinese and katakana is used for foreign borrowings. I think we should be doing the same thing. For example, the kanji 音 currently lists おん (on) as a goon'yomi reading, いん (in) for kan'on'yomi, and おと (oto) and ね (ne) for kun'yomi. Under this change, the first two readings would become オン (on) and イン (in).

RoseOfVarda (talk) 18:02, 19 September 2018 (UTC)
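
The conversion proposed above is mechanical, which a short Python sketch can show. It relies on the fact that the Unicode hiragana block (U+3041 to U+3096) and the katakana block (U+30A1 to U+30F6) are laid out in parallel, 0x60 code points apart. The function name hiragana_to_katakana is hypothetical and is not an existing Wiktionary module function. How the change would actually be made in entries or modules is not covered here.

```python
# Hiragana U+3041..U+3096 and katakana U+30A1..U+30F6 are laid out in parallel,
# so adding 0x60 to a hiragana code point gives the matching katakana.
HIRAGANA_START = 0x3041
HIRAGANA_END = 0x3096
OFFSET = 0x60


def hiragana_to_katakana(text: str) -> str:
    """Convert hiragana characters to katakana; leave everything else unchanged."""
    return "".join(
        chr(ord(ch) + OFFSET) if HIRAGANA_START <= ord(ch) <= HIRAGANA_END else ch
        for ch in text
    )


print(hiragana_to_katakana("おん"))  # オン
print(hiragana_to_katakana("いん"))  # イン
```

Under the proposal, only the on'yomi readings would be passed through such a conversion; kun'yomi readings such as おと and ね would stay in hiragana.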

The GFDL license on Commons[edit]

18:11, 20 September 2018 (UTC)