Wiktionary:Beer parlour




Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!




June 2018

Labelling of bound morphemes

After a little talk with WF at the talk page of Spanish -dumbre, we've drawn the conclusion that our current "dated/archaic/obsolete" labelling scheme doesn't work too well for suffixes (and probably other affixes too). Why?

For that reason, I think it would be better to use the labels (productive/non-productive) when speaking of affixes. In fact I think we already do that in a few places, but it would be good to codify the practice.

That's not going to solve all our problems, however: there's even some disagreement on the productivity of -tion at Wiktionary:Tea room/2017/November § -tion.

Another thing: maybe the (obsolete) label doesn't have to be relinquished completely: if there are no words in current use using a certain suffix, then that suffix can be said to be both "non-productive" and "obsolete"/"extinct"? --Per utramque cavernam 15:58, 31 May 2018 (UTC)

  • I always label (no longer productive) in such cases. Ƿidsiþ 13:38, 1 June 2018 (UTC)
  • Yes, for an affix that is no longer productive but which is found on words which are still used, I think "no longer productive" is better than "obsolete". An affix could also be obsolete, like you say, or archaic (perhaps if it's an alternative spelling of another affix, and sounds archaic: -iren isn't the best example because it's apparently outright obsolete, but it's the vein of thing I'm thinking of ... a- -ing says it's an example). I don't know if it's necessary to specifically label productive affixes or if it's implied by the absence of a "not productive" label. - -sche (discuss) 16:22, 1 June 2018 (UTC)

Footnotes

  1. ^ By the way, would it be correct to speak of a Spanish -dumbre suffix if there had been no word formed with it in Spanish proper, i.e. if all words "using" it had been inherited from Latin?

Lemmatisation of valent adjectives - preposition in entry title?

I think it would be better to turn prepositionful titles into redirects, and work only with prepositionless titles. We can then use a label/template there; in fact, we already do that for verbs: the first sense of French succéder is tagged with {{indtr|fr|à}} → (transitive with à), a template similar to {{label}}.

At Template_talk:indtr#Adjectives, Ungoliant suggests we do that with a usex-level template instead. Though I don't know what it would look like, I think I'd be fine with that too.

But in the end, my main concern is consistency; let's either move them all to prepositionless titles or to prepositionful titles. And even more importantly, let's have a way of keeping track of those (Category:English transitive adjectives using on, etc.?).

Thoughts? @-sche, DCDuring? --Per utramque cavernam 21:03, 31 May 2018 (UTC)

@Equinox, Kiwima, Ungoliant MMDCCLXIV, -sche, DCDuring. --Per utramque cavernam 21:04, 31 May 2018 (UTC)
This issue also comes up with verbs: will someone who sees I foobared at/on/to/for/of the widgets really think to look for our entry "foobar at" if they find that we have an entry "foobar" (which simply fails to link to "foobar at" except possibly as a derived term)? So (momentarily speaking orthogonally to your point,) we should probably consistently use an {{only in}}-like template to link from every verb or adjective foobar to every foobar on/at/ etc that we keep a separate entry for. But many foobar at titles could just be turned into redirects, as you say, and I think I'd prefer that (as the probably more-intuitive-to-use and also easier-to-maintain system) to moving things to prepositionful entry titles. We do sometimes use labels to indicate that, in a particular sense, a word is used "with the" or "with on" or "with for", etc. - -sche (discuss) 21:22, 31 May 2018 (UTC)
Phrasal-verb definitions are more difficult than adjective definitions that have as complements phrases headed by certain prepositions.
I still bear bruises from having been beaten about the head and shoulders for saying that not all our entries for "phrasal verbs" were for authentic phrasal verbs and that not all of our definitions in our entries for authentic phrasal verbs were non-SoP.
It is not easy to decide whether a given verb + particle pair constitutes a phrasal verb: purveyors of dictionaries of English phrasal verbs and those who've built a career on them find them everywhere; others not so much. It seems to me that there are 'true' phrasal verbs, with definitions that are related to the bare verb principally etymologically. For example:
have at, have (someone) on, and have up
These have(!) almost no connection with any intuitively obvious definition of have. Moreover, phrasal verb definitions at [[have]] are very likely to be lost among all the other definitions we have there.
As we often have bastard English entries to serve as translation targets, part of a rationale for having phrasal verb entries could be that the phrasal verbs are often more common in speech than their Latin- or French-derived single-word synonyms. OTOH, as -sche writes, many language learners might not know enough about English to look up 'verb + particle' rather than 'verb'. We don't often make our decisions about inclusion etc on normal-user-behavior grounds rather than, say, syntactic grounds, but perhaps we should do so more often.
I doubt that we can formulate decision rules that would work in all cases. I don't doubt that we can come up with templates that will often be misapplied.
As a rule, it seems to me that we don't use hard redirects from common SoP phrases to the appropriate definition of the key noun, verb, or adjective in the SoP phrase. But, also as a rule, I am inclined to follow the 'lemming' heuristic: if other dictionaries and glossaries have a real entry (not a redirect) for a term, we should too. DCDuring (talk) 22:23, 31 May 2018 (UTC)
Thoughts on the adjectives: prone to, sweet on, keen on, etc. aren't adjectives, IMO, and can't be broken down into one POS; better entries (if these were to stay and not be moved to the bare adj.) would be be prone to, be sweet on, be keen on as transitive verbs. That being said, I think only be sweet on should be an entry, and prone and keen should just be senses with lil labels b/c they're not really idiomatic. Redirect prone to to prone (because AFAIK to is the only prep. used with prone), but in keen's case, b/c it can be used just as easily with to and (according to a few Google searches, maybe with), no redirect. Sidenote: is have on (to be wearing) really idiomatic? – Julia (talk• formerly Gormflaith • 15:52, 1 June 2018 (UTC)

Tibetan observations and questions

I recently discovered the area I now live in in Sydney has the largest Tibetan community in Australia, so I'm taking the opportunity to teach myself some Tibetan.

I've made hundreds (I think) of Tibetan entries and translation entries in the past few weeks, mostly from a scanned PDF of English-Tibetan Dictionary of Modern Tibetan compiled by Melvyn C. Goldstein + Ngawangthondup Narkyid. I've borrowed a copy of Colloquial Tibetan but I'm not very far into it yet.

So far I've only really learned the alphabet and Windows keyboard layout. Very little pronunciation, grammar, or vocabulary.

Most of my Tibetan neighbours I've spoken with have very little English, one has OK English and one has excellent English. Most of them have pre-teenage kids who are all bilingual. I've met two who told me they're from Lhasa or nearby and two or three told me they're from Amdo. At least one of the ones from Amdo also speaks Chinese but some others don't even seem to recognize my attempts to speak Mandarin. They all recognize my attempts to speak Tibetan.

My understanding from a friend from Amdo I knew while staying in Xiamen a year ago is that educated Tibetans in Amdo know Lhasa and Amdo dialects but might not know Mandarin. My friend was probably about 30 years old and told me he'd only recently taught himself Mandarin and was now teaching himself English. He did also seem to speak a peculiar variety of Chinese that sounded very Tibetan to me. He used it once when we ate at an eatery run by Han people from Qinghai.

It seems to me that our Tibetan pronunciation template doesn't cover Amdo Tibetan. Or maybe there's many dialects and I just don't know which dialect names I should be looking at?

Colloquial Tibetan uses a variant of Tibetan with two or three tones: high, low, and neutral (no tone / unstressed). How does this relate to the variants covered in our pronunciation tables/template?

If Stephen, Wyang, or any other contributors who have some skills in Tibetan would like to glance at a bunch of my entries and offer constructive feedback if I'm making any consistent mistakes etc, I would appreciate it. — hippietrail (talk) 02:28, 1 June 2018 (UTC)

Update: Tonight I went to the local Tibetan restaurant and the two staff members I spoke to are both from Kham. On my way home I met another Tibetan, also from Kham. So it seems at least those three major varieties are represented in my neighbourhood. — hippietrail (talk) 13:05, 1 June 2018 (UTC)
Thanks for creating those Tibetan entries. Only a formatting suggestion from me for now - the Tibetan links in the Etymology section should be in {{m}} rather than {{l}}: diff (it's a ridiculous formatting rule).
For {{bo-pron}}, the |zeku= parameter is used for the Zêkog dialect, and |labrang= is for the Xiahe dialect; both of these are Amdo dialects.
For Lhasa, there are different ways of analysing its tones. I'm assuming the high/low in Colloquial Tibetan to be following a two-tone analysis. In the four-tone model (which is what {{bo-pron}} uses) each of the high/low categories can be further split into two subcategories: high becomes high flat (f) and high falling (h), low becomes low flat (w) and low rising (v). There are some minimal pairs of words in the two kinds of high tones.
I think I know which area of Sydney you meant (DY?). The Tibetan diaspora in Sydney is quite diverse AFAIK; I know some from Amdo living in that area. The Lhasa dialect is the prestige dialect of Tibetan, and many educated Tibetans know Lhasa and can use it when communicating with Tibetans from other dialect regions. Wyang (talk) 03:45, 2 June 2018 (UTC)
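To make the two-tone versus four-tone relationship above concrete, here is a minimal sketch. The letter codes f/h/w/v are the ones listed in the post; the names `FOUR_TONE` and `to_two_tone` are hypothetical and this is not how {{bo-pron}} itself is implemented.

```python
# Four-tone letter codes and their (register, contour) labels, as listed in the post above.
# The dictionary and function names are illustrative assumptions only.
FOUR_TONE = {
    "f": ("high", "flat"),
    "h": ("high", "falling"),
    "w": ("low", "flat"),
    "v": ("low", "rising"),
}


def to_two_tone(code):
    """Collapse a four-tone letter code to the high/low register of a two-tone analysis."""
    register, _contour = FOUR_TONE[code]
    return register
```

Read this way, a two-tone description such as the one in Colloquial Tibetan keeps only the register and drops the contour distinction.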
Yep I now live in Dee Why (etymologically comes from the letters "DY" on a map that nobody knows the reason for). Colloquial Tibetan actually describes their MO right at the beginning of the book. They made certain choices for practicality so the learner can communicate with the maximum number of Tibetan speakers, rather than adhere exactly to Standard Lhasa Tibetan. So I believe they used the two-tone model as part of that.
I'll try to remember to stick with "m" in etym sections. I'm so used to seeing a random mix of "m" and "l" that I just stick with the one I knew best. I met another new Tibetan this afternoon, this time a guy with good English who was from Lhasa and taught me several ways to say "goodbye" or "see you" which I've unfortunately already forgotten.
Thanks for your feedback! — hippietrail (talk) 10:26, 2 June 2018 (UTC)

Citing Twitter

Tweets are an absolute goldmine for vernacular, and this is especially crucial for oral-only languages (including a couple I'm particularly interested in like Scots and Swiss German). I assume since they can be deleted that they aren't considered durably archived? Can we find any solution to this? Has this been discussed before anywhere? Ƿidsiþ 05:29, 2 June 2018 (UTC)

As you acknowledged, Twitter is very, very much not durably archived. We currently have no solution for this, unless you collect a bunch of tweets and self-publish them, I suppose. Ultimately, we could develop new criteria (say, three tweets count as one regular cite, and a photo of the tweets must be uploaded to Commons and the original tweets checked by an admin to ensure that it hasn't been doctored), but that would require a lot of work and need to be subject to a vote. —Μετάknowledgediscuss/deeds 14:08, 2 June 2018 (UTC)
The Library of Congress is (ostensibly) archiving all tweets, save the ones which get deleted. Some are archived by law (POTUS, etc.) and should be citable. - TheDaveRoss 13:37, 3 June 2018 (UTC)
Trump's tweets are rather beside the point, as I presume they will all find their way into published material. —Μετάknowledgediscuss/deeds 14:56, 3 June 2018 (UTC)
True, but it is an interesting point that because his (and other presidents') are known to be archived (presumably durably), by law, arguably they could be cited without waiting for them to get into print. Hmm... I don't know. Also, @ The Dave Ross, the Library of Congress stopped archiving all tweets at the end of 2017. How accessible is their pre-2017 archive? - -sche (discuss) 15:04, 3 June 2018 (UTC)
Ah, I missed the update about switching to "selective" archiving, whatever that means. And the last I heard (a few years ago) the archive was proving technically challenging for them, and public access was limited. They also say they are going to keep the archive of tweets they acquired up until they stopped, so those are citable. Internet Archive is saving and sharing the "Spritzer" Twitter stream (1% of all public Tweets) but since they are essentially random that isn't useful for our purposes. - TheDaveRoss 17:05, 3 June 2018 (UTC)
If this were to come to a vote, I'd definitely support it. IMO tweets are one of the best sources because they're often very close to spoken language. Even if it requires a lot of work, as MK acknowledged, I think it'd be worth it. Also, is "publishing to the internet" via Google Drive durably archived? Like this thing? – Julia (talk• formerly Gormflaith • 17:58, 3 June 2018 (UTC)
Google doesn't have a good track record of keeping its services alive. (It's a pity we now have to rely on Google Groups for a Usenet archive!) Also if we archive others' tweets then there might possibly be legal issues around privacy/copyright. Equinox 18:01, 3 June 2018 (UTC)
I agree with User:Julia and it's very long (years) overdue. Kaixinguo~enwiktionary (talk) 18:01, 3 June 2018 (UTC)
We're not a twitter archiving service. We're not going to become a twitter archiving service. There are currently no Twitter archiving services that we can use. That might change in the future, but right now we would just be doing something half-assed. What do you mean by "a lot of work"? That's meaningless. DTLHS (talk) 18:08, 3 June 2018 (UTC)
Why not just take a screenshot and upload it to Commons? They could be verified by one or possibly two admins at the time, and permission could be asked just as it is for some images. I think the expression 'a lot of work' is fairly common and self-explanatory. Kaixinguo~enwiktionary (talk) 19:35, 3 June 2018 (UTC)
In that case what do we do if the original author comes with a copyright claim, or claiming distress that we are persisting something they chose to delete? GDPR etc... -- Actually I suppose the same questions apply to Usenet! Hmm. Equinox 19:42, 3 June 2018 (UTC)
"a lot of work" as in more steps than just getting a cite from a book or something. Regarding copyright issues (which I don't know much about) what rights do the creators of the tweet have? And what rights would we be possibly violating? – Julia (talk• formerly Gormflaith • 20:06, 3 June 2018 (UTC)
That's a question for the people at Wikimedia Commons. DTLHS (talk) 04:54, 4 June 2018 (UTC)

Display text of Template:der3 and others

I have reverted Dan Polansky's latest attempts to change the table title, on the grounds that his version looks stupid and repeats "Derived terms" right after the heading that says the same thing. Furthermore his references to the "status quo" seem specious. Finally, the pattern "Terms derived from X" is easy to extend to "Terms derived from X (noun)" if disambiguation is necessary. Others' thoughts? DTLHS (talk) 06:36, 2 June 2018 (UTC)

I admit that repeating "Derived terms" in the table heading after the same section heading looks a little odd, but "Terms derived from X" is a needless repetition of X, and looks really bad to me. Of course they are derived from X; X is the entry. I admit that I did not raise this when this practice started to be introduced but only recently. There are so many practice changes being introduced without a discussion. I think the "status quo ante" is fundamentally correct, and was used by me in {{rel3}}; I dispute that diff from 2016 was based on consensus, and I have no evidence of that consensus other than silence. --Dan Polansky (talk) 14:31, 2 June 2018 (UTC)
I like DTLHS's approach for a default. If something else emerges from user-added (non-default) content, that might be considered for a replacement of the default. We (by which I mean DTLHS) could do a dump run to find any potentially desirable innovations. DCDuring (talk) 16:00, 2 June 2018 (UTC)
Derived terms used to have no collapsible tables; for short derived terms, that was much more user friendly and avoided cruft. Now, when I visit the parta#Czech, I find section "Derived terms", underneath collapsible "Terms derived from parta", and within mordparta, and parťák. It looks ugly and stupid, pardon my French. --Dan Polansky (talk) 09:39, 3 June 2018 (UTC)
One solution would be to provide no text in the collapsible table: that would remove all cruft and all repetition, even the odd-looking "Related terms" (section heading), "Related terms" (collapsible heading) repetition. --Dan Polansky (talk) 09:48, 3 June 2018 (UTC)
I consulted English entry party for an unrelated purpose. This is what I see, as a sequence:
  • "Hyponyms"
  • "Hyponyms of party"
  • "Derived terms"
  • "Derived terms of party (noun)"
  • "Related terms"
  • "Terms related to party"
"party" is in italics there. This is what we call in Czech "jak u blbejch na dvorku", like at a morons' yard.
Making all the collapsible headings empty would be a huge improvement.
--Dan Polansky (talk) 10:05, 3 June 2018 (UTC)
In your example of parta#Czech, I think it would be better for the 'derived terms' not to be collapsed. There are only two of them and it is important information. Kaixinguo~enwiktionary (talk) 10:14, 3 June 2018 (UTC)
Our entries don't make good use of space, it's true. There's a lot of whitespace (other online dictionaries usually put all their pronunciation information compactly on the same line as 'pronunciation:', for example, not on three or more short lines with lots of empty space to the right of them). When there is text, a fair amount is redundant. But removing all text from the collapsible headings would be bad because it'd make it too easy to miss that there was information there. The "show" text is in small font all the way at the other side of the screen from where most text starts; the collapsible box itself is a light grey which, when my screen is tilted at some angles, isn't even distinct from the background, so it's possible to miss the entire existence of it, and even when it is seen, it's possible to miss the "show" text (as mentioned) and, if no other text is present, to think it's an empty box i.e. that there are no derived terms. I know, because I stumbled onto an entry where someone had manually suppressed the text. Perhaps the redundant "derived terms" text should be replaced with "list" (which would also discourage use of the template when there's only a single derived term), or replaced with floating the "show" link to the left. - -sche (discuss) 15:05, 3 June 2018 (UTC)

Pakistani surnames

There are Pakistani cricketers named Misbah-ul-Haq, Inzamam-ul-Haq, Imam-ul-Haq and probably others. All spelled as a single term with two hyphens. Is "ul-Haq" a surname? If not, can anyone explain the format of the names please? SemperBlotto (talk) 13:38, 2 June 2018 (UTC)

It is a family name. I think it's "the truth" (حق); see Al-Haqq. Not certain. Equinox 13:46, 2 June 2018 (UTC)
@SemperBlotto, Equinox: الْحَقّ (al-ḥaqq) is the definite form of حَقّ (ḥaqq, truth). In formal Arabic the definite article الْ (al-, equivalent of "the") is pronounced with an "a-" at the beginning of an utterance; in other positions the initial vowel of "al-" is dropped (elided) and the article follows the desinential inflection (iʿrāb) ending of the previous word. E.g. "Misbah-ul-Haq" would be مِصْبَاحُ الْحَقِّ (miṣbāḥu l-ḥaqqi), "luminary of the truth" in the nominative case. So "u" belongs to the previous word but this vowel is usually not written, and the initial "ا" is silent. Languages borrowing from Arabic often follow these conventions. "al-" is more common, though. "el-" or "il-" is from dialectal/informal Arabic. --Anatoli T. (обсудить/вклад) 15:02, 2 June 2018 (UTC)
Thanks. So, would his father, brothers and sons also be xxxx-ul-haq? (and females??) SemperBlotto (talk) 05:56, 3 June 2018 (UTC)

Idiomatic names.

I'd like to add a type of category for names with idiomatic/sarcastic usage by language. An example in English: You can say 'You don't say, Sherlock' and even people who have not read Sherlock Holmes will understand that the word 'Sherlock' encodes the information that they're being obtuse and have just stated something obvious. This does not work with other fictional detectives such as 'you don't say, Hercule' or 'you don't say, Continental Op', whose names are therefore not idiomatic. I propose a label like 'langname names with idiomatic usage' or 'with sarcastic usage' since I can't recall names like Sherlock/Gandhi/Einstein etc. being used as actual praise. Korn [kʰũːɘ̃n] (talk) 11:59, 3 June 2018 (UTC)

@Korn: Are there many such names? I think the "sarcastic" part would make the category too narrow; besides, isn't sarcasm a contextual/pragmatic phenomenon more than a lexical one?
However, I definitely agree that we should gather all those genericised names (which run parallel to "genericised trademarks", imo) in a category; I even suggested as much last year. By the way, I think the genericisation process is called antonomasia (sense 2). --Per utramque cavernam 15:26, 3 June 2018 (UTC)
"Idiomatic" is broader than "sarcastic", of course, since Einstein#Noun "intelligent person", Joe#Noun "a guy", and arguably "mein Name ist Hase" "I know nothing" all seem like idiomatic but not sarcastic uses of names. I do think it'd be useful to have a category for idiomatically-used names (others: John Doe, Bubba, and arguably Johnny Reb and Johnny Foreigner, etc). I'm not sure a category for sarcasm would be as maintainable, since most names and other words are used sarcastically sometimes and so it might largely duplicate the "idiomatic" category (though a vote excludes separate senses for sarcasm except when terms are "seldom or never used literally"). Perhaps we should avoid the very opaque antonomasia, though (I suspect DCDuring will agree with me on this part?). - -sche (discuss) 15:33, 3 June 2018 (UTC)
The reason I'm thinking about using the 'sarcastic' tag is that it might be that non-sarcastic usages are less idiomatic rather than plain references to the actual person, but the idiomatic label seems preferable to me too. Korn [kʰũːɘ̃n] (talk) 16:50, 3 June 2018 (UTC)

Adapting Wikipedia template to warn about NSFW/sexual images for Persian

For some reason the Persian Wikipedia has more extreme images than others; for example, there is a gif at 'ejaculate'. When I was checking the translations for 'pearl necklace' and 'footjob' I saw images of those as well. What if the {{wikipedia}} template were adapted to show a short warning? Would anyone mind? Kaixinguo~enwiktionary (talk) 17:55, 3 June 2018 (UTC)

w:Pearl necklace (sexuality) has the same image as w:fa:گردنبند مروارید. Wikipedia links are basically off-site links, and have the implied warning that we don't control the content there. Trying to track the current NSFW status of every Wikipedia page we link to, even if there were a clear agreement about what NSFW means, is hopeless.--Prosfilaes (talk) 19:06, 3 June 2018 (UTC)
No, there is no implied warning at all. Kaixinguo~enwiktionary (talk) 19:30, 3 June 2018 (UTC)
It says it's going to Wikipedia. Wikipedia is not censored, and a link to Wikipedia could potentially lead to anything, and a stable Wikipedia page for a sexual term may have various types of illustrations. That should give users plenty of warning.--Prosfilaes (talk) 04:54, 4 June 2018 (UTC)
No, Wikipedia isn't censored (much) and sometimes has sexual pictures. That's how it is. You should install censorware on your own computer if you need to stop this. Equinox 19:33, 3 June 2018 (UTC)
I didn't request to censor Wikipedia. It's not unreasonable to warn of, not censor, a gif of ejaculation.
By the way, I'm not thinking of myself, although I had never seen those images before. Kaixinguo~enwiktionary (talk) 19:39, 3 June 2018 (UTC)
I understand the notion and agree that there's too great a laxity with preventable exposure in the Wiki community (We have an entry with a picture of a corpse, which I abhor.), but I too think this is an issue to be fixed at Wikipedia and that the very fact that you're moving to another site, to read an article about sexual practices, implies that you might get exposed to the act in question. —This comment was unsigned. My bad. Korn (talk • contribs)
I don't think it is at all reasonable to expect that there will be a video demonstrating a sex act on every page which describes a sex act. While I don't think we should censor content, I don't see any reason why labeling content that is likely to be offensive or otherwise problematic to a large segment of the population would be a bad thing. The fact that we can't be comprehensive is not an argument against such labeling when we are aware. Personally, I often use Wiktionary and Wikipedia while at work, and would be annoyed if a video of someone ejaculating showed up on my screen if I wanted to find out what some term in a random song or comedy act I was listening to meant. I don't think this is a prudish or censorial viewpoint. - TheDaveRoss 00:00, 4 June 2018 (UTC)
Then don't click on Wikipedia links. The discussion of what to display on Wiktionary is entirely separate from whether we should try to mark up certain Wikipedia links as potentially NSFW.--Prosfilaes (talk) 04:54, 4 June 2018 (UTC)
The idea that one should never click on Wikipedia links because some of them will show graphic videos is patently ridiculous. - TheDaveRoss 11:22, 4 June 2018 (UTC)
The idea that Wiktionary should keep track of and put warnings on links to pages on Wikipedia that might contain "NSFW"/"explicit" images, when Wikipedia itself doesn't put warnings on, is also ridiculous, especially because the pages for which the presence of explicit images is least likely to change and invalidate any such warning-labelling, namely pages about sex or body parts, are ones readers can expect an uncensored encyclopedia to illustrate.
"Explicit"/"NSFW" are nebulous, anyway: is a link to a page with an image of a nipple going to be tagged initially? (I'm sure it'll be tagged in the end; censorship creeps.) Is a link to a page that might or might not contain an illustration of breastfeeding to be tagged, initially? Is a Wikipedia "List of Foo slang" that documents swear words or words for sex acts "explicit", given that we do have at least one user who has a filter that deletes swear words? There are entire communities of bigots who would prefer not to see images of gay people, or of any women. Maybe some people only want to tag the most explicit animations, but the censorship will inevitably creep "to err on the side of caution".
- -sche (discuss) 14:12, 4 June 2018 (UTC)
There are user-scripts users can individually enable to block images / videos from displaying, if they wish to merely read about ejaculation at work. - -sche (discuss) 14:18, 4 June 2018 (UTC)
Labeling is not censorship, that is a red herring. As is the fact that we cannot be comprehensive, that is an argument against the project as a whole. The subject of this discussion is not to limit what can or cannot be shown on Wikipedia (or Wiktionary); it is merely suggesting that, in cases where an editor knows that the target of a link contains graphic imagery, the editor can let readers know. This does not impede readers from seeing or following links, it just lets them self-select out of certain content if they wish to, instead of forcing them to play roulette. Graphic content lowers the utility of Wikipedia for many, labeling such content so that users can actively avoid it as they choose mitigates this problem. - TheDaveRoss 15:11, 4 June 2018 (UTC)
They are labeled; every word that starts this discussion has a definition that clearly warns anyone of what might be in Wikipedia. I'm not even sure where Kaixinguo~enwiktionary expects us to put the warning; he talks about the Persian Wikipedia and says "When I was checking the translations"; are we supposed to warn on every translation? If you can't handle what's on Wikipedia, don't go there, or at least install a filter that should try and protect you.
We are pretty comprehensive in English at least. Our usefulness drops drastically in other languages where we don't have a decent coverage. What good does tagging a handful of Wikipedia links as NSFW do if there's ten times as many NSFW links that aren't so tagged? It gives you no reason to think you can ever safely click on a Wikipedia link, exactly where you started.--Prosfilaes (talk) 21:29, 4 June 2018 (UTC)

Eliminating undocumented withtext= param in {{borrowed}}

I plan to use a bot to eliminate the remaining places where withtext= is used in {{borrowed}}. The plan is to use a bot to replace "{{bor|...|withtext=1}}" with "Borrowed from {{bor|...}}" whenever the template occurs at the beginning of a line or sentence, and to handle the remaining cases by hand. I've spot-checked a dozen or so cases so far and all of them have {{bor}} at the beginning of a line or sentence, and all of them read fine when using the "Borrowed from" text instead of the auto-generated "Borrowing from" text (and in many cases, "Borrowed" reads better than "Borrowing"). Benwing2 (talk) 18:19, 3 June 2018 (UTC)

I think you're good to go, this was part of the plan anyway. See Wiktionary:Beer_parlour/2017/November#Template:bor:_Replace_notext=1_with_withtext=1 --Per utramque cavernam 18:27, 3 June 2018 (UTC)
OK, I wrote the script and it's ready to go. With some special-case hacking, there are only around 135 lemmas (out of 10,200+) that need to be handled manually; most of these are erroneous uses of withtext=1 of various sorts. I'll wait a bit longer to make sure no one objects. Benwing2 (talk) 20:19, 3 June 2018 (UTC)
I fixed all the manual cases and am running the script to fix the automatic cases. Benwing2 (talk) 03:04, 4 June 2018 (UTC)
Finished. Benwing2 (talk) 07:21, 4 June 2018 (UTC)
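
For illustration only, the replacement rule described at the top of this thread could be sketched roughly as follows. This is not Benwing2's actual script: the regex, the function name convert_withtext, and the test for "beginning of a line or sentence" are all assumptions, and template calls that contain nested templates are deliberately not matched. The sample language codes and words in the usage lines are made up.

```python
import re

# Assumption: the template call contains no nested templates (no inner braces).
BOR_RE = re.compile(r"\{\{(bor|borrowed)\|([^{}]*)\}\}")

def convert_withtext(wikitext):
    """Rewrite {{bor|...|withtext=1}} as 'Borrowed from {{bor|...}}'.

    Returns (new_text, manual), where manual counts the uses that were left
    untouched because they are not at the start of a line or sentence.
    """
    manual = 0

    def repl(match):
        nonlocal manual
        name = match.group(1)
        params = match.group(2).split("|")
        if "withtext=1" not in params:
            return match.group(0)  # nothing to do for this call
        # Assumed heuristic: "start of a line or sentence" means the previous
        # non-space character is a newline, a full stop, or list/indent markup.
        before = wikitext[:match.start()].rstrip(" ")
        at_start = before == "" or before[-1] in "\n.:*#"
        if not at_start:
            manual += 1  # leave for hand editing
            return match.group(0)
        kept = [p for p in params if p != "withtext=1"]
        return "Borrowed from {{" + name + "|" + "|".join(kept) + "}}"

    return BOR_RE.sub(repl, wikitext), manual


# Hypothetical sample input
new_text, manual = convert_withtext("===Etymology===\n{{bor|en|fr|café|withtext=1}}.")
print(new_text)
print(manual)
```

Anything counted in manual corresponds to the cases the thread says were handled by hand.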

Parents of foo-mid languages, foo-old languages

The parent of both bn-mid (Middle Bengali) and bn-old (Old Bengali) is given as bn (Bengali), which seems totally wrong. The same goes for or-mid, or-old, kok-mid, kok-old, etc. Is this correct? Maybe so, because these are etym-only languages, but it seems weird. Benwing2 (talk) 03:03, 4 June 2018 (UTC)

If the Old/Middle forms of the language aren't considered sufficiently distinct to treat as different languages from the modern language, then I guess it makes sense; after all, Biblical Hebrew is an etymology-only language with (modern) Hebrew as its "parent". - -sche (discuss) 14:06, 4 June 2018 (UTC)
No, that isn't right. I don't think Old and Middle Konkani have enough attestation to deserve full codes, but Old/Middle Bengali and Odia (or Oriya, whatever we call it here) should be upgraded to real codes. —AryamanA (मुझसे बात करेंयोगदान) 01:47, 5 June 2018 (UTC)

Soft redirection template for Japanese

Hi everyone. What do you think about the soft redirection format on 貴方?

For pronunciation and definitions of 貴方 – see あなた.
(This term, 貴方, is a kanji spelling of あなた.)

The soft-redirection template is meant to serve the same function as {{zh-see}} for Chinese. Although this is not currently implemented, it should be able to display glosses and copy categories from the lemma entry in the future. If the idea of having a Japanese soft-redirection template is accepted, we can create alternative forms (mainly of pairs like まっとう / 全う) faster by doing away with the need to copy POS headers and to provide glosses manually, which can become out of date if the lemma entry changes.

(Notifying Eirikr, Wyang, TAKASUGI Shinji, Nibiko, Atitarev, Dine2016, Poketalker, Cnilep, Britannic124, Fumiko Take): --Dine2016 (talk) 11:31, 4 June 2018 (UTC)

I support centralisation of Japanese entries. The ongoing trouble is to decide what IS the Japanese lemma. This can't be decided easily. There are good arguments in favour of kana entries and in favour of kanji entries (if a term has both) and it very much depends on:
  1. What is the most frequent spelling?
  2. Are there multiple etymologies and what is their distribution? Unrelated homophones with different etymologies are better off having kanji entries as lemmas, if the kanji spelling is more common than kana.
  3. Verbs or adjectives with the same reading/pronunciation might be better off lemmatised at kana entries with only one inflection table. They are mostly native Japanese words.
  4. Sino-Japanese entries (or more broadly words with 100% on'yomi readings) are better off lemmatised at kanji with the most frequent spelling, only if it happens to be kanji.
It's roughly my position before we jump into making redirects for Japanese entries. --Anatoli T. (обсудить/вклад) 12:02, 4 June 2018 (UTC)
I was under the impression that the discussion two months ago reached a preliminary consensus for native words. Korn [kʰũːɘ̃n] (talk) 12:28, 4 June 2018 (UTC)
Thanks for your replies. I think the choice of lemma forms is a separate issue, and existing entries can be left as is before we settle on an approach of centralization. There are lots of words for which most editors would agree on the kanji spelling as the lemma form. At the current stage, the soft redirection template could facilitate the creation of their hiragana forms. --Dine2016 (talk) 12:47, 4 June 2018 (UTC)
Another term would be ふるさと (furusato, hometown, amongst other meanings). It has (at least) three known attested kanji spellings: 古里, 故里, and 故郷. Better than calling an alternate spelling of one kanji compound (as in the third)? ~ POKéTalker) 13:26, 4 June 2018 (UTC)
ふるさと (furusato) is a good candidate to be a lemma. It is a native Japanese word, and the Chinese characters for it are only a visual help. Possibly none of them is more common than the kana form, but if one of them is more common, then that spelling should be the lemma. If it's decided that native words are lemmatised at kana, then so be it. --Anatoli T. (обсудить/вклад) 13:37, 4 June 2018 (UTC)
I support this proposal for {{ja-see}}.
Re: which spelling to choose as lemma, I'll reiterate my preference, as discussed and developed in the earlier thread: native-Japanese terms (i.e. 和語 (wago)) and non-Chinese foreign-derived terms (外来語 (gairaigo)) would go under the kana spellings, while Chinese-derived terms (漢語 (kango)) would go under the kanji spellings. The rationale for this is that wago and gairaigo may have multiple possible kanji spellings (where such exist), whereas kango generally only have one (rarely two) spellings. Kana entries for kango would be soft redirects to the kanji entries (the current status quo), while kanji entries for wago and gairaigo would be soft redirects to the kana entries (this would require changes from our current state). The wago and gairaigo entries could specify spelling frequencies with usage notes or labels.
Example: とる (toru) has a basic meaning of to take. However, which kanji spelling is used most frequently depends on which sense is intended: 撮る for photography and video, 採る for samples to be used for something, 捕る for capturing or trapping a pest or catching a pop-fly, 獲る for capturing or trapping prey, 盗る for taking something illicitly, etc. etc. Personally, I think it makes the most sense to consolidate all of these under the とる (toru) kana spelling, and in fact, this is what monolingual JA-JA dictionaries essentially do. See this entry in Daijirin for the 採る spelling, grouped with the other spellings of the same etymology. ‑‑ Eiríkr Útlendi │Tala við mig 16:59, 4 June 2018 (UTC)
@Eirikr: Hi. OK, 繰り返し (kurikaeshi) is a wago. Could you explain why 繰り返し should be a redirect to くりかえし if 繰り返し is the most common spelling? --Anatoli T. (обсудить/вклад) 23:01, 4 June 2018 (UTC)
Two reasons: 1) technical constraints inherent in the MediaWiki platform, and 2) consistency.
  • The technical constraint has to do with how we have redirects. Electronic JA-JA dictionary apps that I've been able to try out generally have pretty slick redirection, where the user can input kanji or kana and still get to the desired entry either directly or with one additional input. If the user clicked the wrong entry, the list is still there on the screen, so no need to go back: just click another entry in the list. We could do that with hard redirects, but since a single spelling might overlap with terms in other languages, we cannot do so across all JA entries.
  • The consistency consideration is in part to match other JA-JA dictionaries (lemming-wise, including our cohorts at the JA WT), in part so all wago are lemmatized similarly, and in part for usability.
Wago readings tend to be unambiguous, with one reading matching one term. This is consistent with the history, where wago derive from the verbal language. Spellings were an afterthought as literacy was imported into Japan from a completely unrelated donor language. Even today, usage conventions for okurigana (the kana after the kanji) can be somewhat loose -- the KDJ lists kurikaeshi under the spelling 繰返, for instance. But if a user knows the pronunciation, they can always spell out the kana.
Kango, meanwhile, were borrowed from written Chinese, with a focus on the meaning inherent in the characters and without much regard for what they sound like. In extreme instances, a single kango reading might have tens of spellings. せいしん (seishin), for instance, generates 21 distinct hits in my local electronic Daijirin, each with distinct derivations and senses. せいせい (seisei) generates 28 distinct spellings, belonging to 25 different terms. とうし (tōshi) generates 24 distinct spellings for 20 terms.
This difference in history and verbal / visual distinctiveness carries over into how terms are used: wago are used more in spoken and informal speech, where auditory disambiguation is key, while kango are used more in written and formal texts, where the written text allows authors to visually specify meaning that might be lost in a spoken medium.
→ Broadly speaking, wago are phonemically distinct (kana spellings), while kango are graphemically distinct (kanji spellings).
Conversely, one could turn your question around: for any given wago, not just 繰り返し (kurikaeshi), why would we use the kanji for the lemma? There's more overhead for editors in having to identify which spellings are more common (clear for kurikaeshi, but not always so simple for other terms, and not always provided even in modern dictionaries), duplication of data at multiple spellings and/or frustrating arbitrariness where we have to just choose one among multiple current variants of roughly equal frequency, and more potential for confusion among users (which spelling? which okurigana? why is Wago A under a kanji spelling, but Wago B under a kana spelling?). Using kanji spellings for wago can also obscure otherwise-clear relationships, as observable at あばく (abaku). For instance, if we split each sense of とる (toru) out to its kanji spelling, we would fracture the entry and make it harder for users to see that all the spellings of toru are just shades of meaning of the same verb. Imagine if English get were similarly split up, where each sense had a distinct spelling and separate entry, despite all senses having the same reading, same derivation, same underlying meaning.
‑‑ Eiríkr Útlendi │Tala við mig 00:09, 5 June 2018 (UTC)
@Eirikr: OK, thanks for the detailed answer, almost convinced. Well, the Chinese handling is not perfect either - simplified characters act as redirects, even if their usage is much higher than that of the traditional. --Anatoli T. (обсудить/вклад) 07:22, 5 June 2018 (UTC)

Transclusion

@Eirikr: I know it's a bold idea, but what if we make the template transclude the appropriate sections from the lemma entry, like this?

==Japanese==
{{ja-see|繰り返し}}
For pronunciation and definitions of くりかえし – see 繰り返し.
(This term, くりかえし, is a kana spelling of 繰り返し.)
Pronunciation
Noun

繰り返し (hiragana くりかえし, rōmaji kurikaeshi)

  1. repetition

--Dine2016 (talk) 01:06, 5 June 2018 (UTC)

Very nice. Wyang (talk) 03:33, 5 June 2018 (UTC)
I'm more than okay with that. I've long thought about this kind of transclusion as a means of providing users the relevant info while avoiding flat-out manual duplication. ‑‑ Eiríkr Útlendi │Tala við mig 03:37, 5 June 2018 (UTC)
@Dine2016, just expanded 繰り返し (kurikaeshi). How many sections (etymology, kanjitab, derived terms, etc.) can be omitted in the appropriate entry with ja-see? ~ POKéTalker) 04:18, 5 June 2018 (UTC)
@Poketalker: I think every relevant section should be transcluded, so that whether the reader searched for くりかえし or 繰り返し, they will be able to get the same information on the word 繰り返し (kurikaeshi, repetition). Different spellings, same word, same information. I believe this is how the electronic dictionaries Eirikr mentioned above work, except that we don't require an extra click if we take this approach. An exception may be made for {{ja-kanjitab}}, which is usually spelling-specific. --Dine2016 (talk) 06:59, 5 June 2018 (UTC)
@Poketalker: There are technical restrictions on the amount of transclusion. In addition, homograph entries like 上下 (agarisagari, ageoroshi, agekudashi, agesage, ueshita, kamishimo, shōka, jōka, jōge, noboriori) can be long and hard for readers to find the entry they're looking for, canceling out the advantage of not requiring an extra click for the full entry. Reconsidering the issue now, I think it's ok to just display the POS headers and definitions (and perhaps pronunciation and usage notes) and direct the reader to the lemma entry for full information. Alternatively, we can include all the information but make the templates collapsed by default (which is probably technically inferior due to page load time). @Wyang, Eirikr, any thoughts? --Dine2016 (talk) 11:24, 6 June 2018 (UTC)
@Dine2016: You have actually made the kana entry a redirect to kanji in your example but Eirikr wanted the other way around, no? Good job, anyway. --Anatoli T. (обсудить/вклад) 07:22, 5 June 2018 (UTC)
@Atitarev: Yes, but {{ja-see}} is expected to support both kanji-to-kana and kana-to-kanji, so converting it to the other way should be easy. --Dine2016 (talk) 07:48, 5 June 2018 (UTC)
The main potential problem I can see would be when the template is on a page with lots of other content- such as huge lists of Chinese compounds on kanji pages. There are limits to the amount of memory and transcluded content allowed on a single page. Going over the memory limit causes highly visible module errors. Going over the transclusion limit, on the other hand, means that every template that hasn't already been transcluded becomes a hyperlink to the template, so {{l|en|example}} is replaced by Template:l (sometimes it just has the invoke statement, but either way, it's useless). Worse, these unexpanded duds are almost always at the bottom of the page, where editors are less likely to spot them- unless you happen to check Category:Pages where template include size is exceeded or notice the category at the bottom of the page, you may not realize anything is wrong. See User:Hermitd/Greek wordlist for an example. That said, the limit is 2 MB of transclusion, so it may not happen much. Chuck Entz (talk) 04:56, 6 June 2018 (UTC)
@Chuck Entz: Thanks for the heads-up. What contributes to reaching the transclusion limit: fetching the wikicode of the lemma entry with getContent(), expanding the relevant sections with preprocess(), or only returning the expanded wikicode from module to page? --Dine2016 (talk) 07:12, 6 June 2018 (UTC)
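
As a rough offline illustration of the reduced option discussed above (showing only the POS headers and definitions from the lemma entry), here is a minimal Python sketch. It is not how {{ja-see}} is or would be implemented on-wiki, and it does not use getContent() or preprocess(); the function names, the sample wikitext, and the assumption that definition lines start with "# " are all hypothetical.

```python
import re

def language_section(wikitext, language):
    """Return the body of the level-2 section ==language==, or '' if absent."""
    body, inside = [], False
    for line in wikitext.splitlines():
        # A level-2 header looks like ==Japanese== (exactly two equals signs).
        header = re.fullmatch(r"==([^=].*?)==\s*", line)
        if header:
            inside = header.group(1).strip() == language
            continue
        if inside:
            body.append(line)
    return "\n".join(body)

def headers_and_definitions(section):
    """Keep sub-header names and top-level definition lines only.

    Assumption: definitions start with '# '; example lines ('#:') and
    quotations ('#*') are dropped, as is everything else.
    """
    kept = []
    for line in section.splitlines():
        if re.fullmatch(r"={3,}[^=]+={3,}\s*", line):
            kept.append(line.strip("= "))
        elif line.startswith("# "):
            kept.append(line)
    return kept


# Hypothetical lemma-entry wikitext
sample = (
    "==Chinese==\n===Noun===\n# something else\n"
    "==Japanese==\n===Pronunciation===\n* (IPA here)\n"
    "===Noun===\n{{ja-noun}}\n\n# [[repetition]]\n#: an example line\n"
)
print(headers_and_definitions(language_section(sample, "Japanese")))
```

Keeping only headers and definitions keeps the amount of copied text small, which is the concern raised about long homograph entries and the transclusion limit.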

Wiktionary:Summer Competition 2018

Hey. I'm probably gonna start the new Wiktionary:Summer Competition 2018 soon. It's Wiktionary:Christmas Competition 2013 repeated, but with a less able Gamesmaster. I should probably fix a few things first. I'll keep y'all posted about publication dates etc. when I can be bothered to. --Genecioso (talk) 21:09, 4 June 2018 (UTC)

How come...

Wiktionary:Translation requests is such a popular page? --Genecioso (talk) 22:04, 4 June 2018 (UTC)

WMF doesn't keep track of (or at least doesn't publish) referrer statistics for individual pages. If someone here was so inclined they could make a tool to database referrers and add some javascript on this end to populate the table, then we could find out where all of the traffic is coming from. We don't seem to rank highly on Google for many seemingly obvious queries. - TheDaveRoss 14:11, 5 June 2018 (UTC)

Important: No editing between 06:00 and 06:30 UTC on 13 June

This is just to tell you that your wiki will be read-only between 06:00 UTC and 06:30 UTC on 13 June. This means that everyone will be able to read it, but you can’t edit. This is because of a server problem that needs to be fixed. You can see the list of affected wikis on Phabricator.

If you have any questions, feel free to write on my talk page on Meta. /Johan (WMF) (talk) 12:37, 5 June 2018 (UTC)

For those who don't know what to do during that time, take up the harmonica - I have one to donate - send me an email with your address, and I'll send it off to you. --Genecioso (talk) 13:43, 5 June 2018 (UTC)
Sounds great! User:Equinox, c/o Wikimedia Foundation, 1 NOTPAPER Way, Banville, CA 95966. Equinox 16:41, 5 June 2018 (UTC)
In the post. Let me know when it arrives. --Genecioso (talk) 20:13, 6 June 2018 (UTC)
Oh no! We can't edit Wiktionary for 30 whole years minutes! What are we gonna do?! It's not like we can read a book or play a video game or watch a movie or anything in that time! I mean, it's 30 minutes! We are truly doomed as a species. PseudoSkull (talk) 03:21, 10 June 2018 (UTC)

Buryat IPA transcription

I don't know if this is appropriate, but can anyone check if the IPA transcriptions for the Buryat lyrics of the Buryatia regional anthem here are accurate? There might be some errors. If there are any errors, would anyone (who is familiar with Buryat phonology and/or phonologies of Mongolic languages) provide a better transcription than the current one? Thanks. 213.183.63.189 04:18, 6 June 2018 (UTC)

I didn't go through all of it, only the first three stanzas. I think the IPA transcriptions are very good. I only noticed one thing that I take issue with: the letter ө should, in my opinion, be transcribed as IPA œ. However, the transcription was added by w:User talk:Lucarubis, and he or she may have had a good reason for transcribing it as a simple o. We should ask him or her about it. —Stephen (Talk) 11:49, 10 June 2018 (UTC)

On the placement of constructed languages, and on the attestation of appendix-only languages

Wiktionary:Votes/pl-2018-04/Disallowing appendix-only languages has failed. I think Gamren has taken things a bit backwards, and it made it look like he wanted to delete the mainspace-like content (i.e. the entries) currently hosted in appendices altogether. If I've understood the issue correctly, that wasn't his intent at all. I think his view is that our current separation between main space and appendix-only languages

  • 1) is artificial;
  • 2) leads to our hosting unchecked content.

I'll address the second issue first. In my view, he's made a valid point: it would seem that, at present, Appendix-only languages are not subject to any attestation criteria. Is that really what we want, and what we wanted when we relegated Lojban to the Appendix namespace (Wiktionary:Votes/2018-02/Moving Lojban entries to the Appendix)?

I don't think so, or at least I hope not; in my opinion, all words, wherever they are, should be subjected to some kind of attestation criteria. Does everyone agree on that, or does it need to be put to the vote?

If that's agreed upon, the question would then be: what kind of attestation criteria do we want for Appendix-only languages?

  • the stringent ones of WDL?
  • the more lenient ones of LDL?
  • a middle ground (two quotes?)?
  • a mix of both (i.e. submitting some languages to the WDL criteria, others to the LDL criteria)?
  • something else, but something?

Whatever the answer, I see one problem: given that the distinction between main space languages and appendix-only languages would be:

  • neither one of attestation (since we'd have agreed that all of them need some kind of attestation);
  • nor one of strength of attestation (since we already make a distinction in the main space between WDL and LDL – it thus seems difficult to find a third one which would be completely specific to appendix-only);
  • nor one of "naturalness" (since there are both natural languages and constructed languages in the main space);

what exactly would be the criterion? This brings us back to problem number one: is there a meaningful distinction to be made between main space languages and appendix-only languages?

  • If we want to make strength of attestation the criterion, we move all LDL-subjected languages – natural or constructed – to the appendix namespace, and all WDL-subjected languages – natural or constructed – to the main space (that's a shit idea, if you ask me);
  • If we want to make "naturalness" the criterion, we move all constructed languages to the Appendix: all natural languages belong in the main space, and all constructed languages (that we've agreed to keep on Wiktionary) belong in the Appendix space, regardless of the attestation criteria we'll choose for each. That would be my preference, but I think many people will be opposed to that.

Given that neither of those solutions is particularly appealing, we have to look further. Fact one: there are only constructed languages in the appendix. Fact two: constructed languages kept in the main space are subject to stringent (WDL) attestation criteria. Fact three: ...

But if, in the end, there really isn't any meaningful distinction to be made (which, again, I'm not convinced of), we go back to Gamren's solution: disallowing appendix-only languages, which means:

  • 1) moving everything to the main space;
  • 2) working from there: what do we want to keep, under which criteria, and what do we want to see deleted for good?

I might have taken a shortcut or two, but I think that's the gist of it. I hope I haven't misrepresented the facts. --Per utramque cavernam 17:52, 6 June 2018 (UTC)

With LDL languages the presumption is that we're referencing a dictionary that recorded use, even if we don't have primary access to that use. With constructed languages that may not be the case. I could imagine doing something where words in a particular constructed language would be disallowed in mainspace unless they had three actual examples of use (*not* based on RFV, but required examples before they are in mainspace at all), moved to the appendix if all they had was a dictionary reference, and deleted otherwise. DTLHS (talk) 19:03, 6 June 2018 (UTC)
Indeed, that would be another solution. Per utramque cavernam 19:08, 6 June 2018 (UTC)
@DTLHS: By the way, I've edited my message a bit (and a little bit more) since you replied. Per utramque cavernam 19:19, 6 June 2018 (UTC)
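
The three-way rule DTLHS floats above can be written out as a small decision function, just to make the proposed order of checks explicit. This is only a sketch of that one suggestion, not adopted policy; the function name, argument names, and return labels are hypothetical.

```python
def placement(use_citations, has_dictionary_reference):
    """Where an entry in a given constructed language would go under the suggestion.

    use_citations: number of actual examples of use (required up front,
                   not gathered through RFV).
    has_dictionary_reference: True if the word is only backed by a dictionary.
    """
    if use_citations >= 3:
        return "mainspace"
    if has_dictionary_reference:
        return "appendix"
    return "delete"


print(placement(3, False))
print(placement(1, True))
print(placement(0, False))
```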

I would like to have a bit of discussion regarding the bigger topic of votes and resolution of Lojbannic issues. I think it reasonable to discuss whether better criteria for inclusion of Lojban entries can be made. Since Lojban, like other languages, involves putting together words from pieces or affixes, how do natural languages like German, which do things like this, decide which words to include on Wiktionary? Jawitkien (talk) 02:14, 7 June 2018 (UTC)

Per's presentation of my position is accurate. @DTLHS As I pointed out to Meta, LDL doesn't mean that we have to allow dictionaries, if no descriptive and reliable ones exist, as would be the case for most conlangs. I would argue that an unreliable mention should be worth nothing, just like for natural languages. Generally, I don't understand treating conlangs and natural languages differently; if people have used a word in their writings, what does it matter whether we knew who invented it, or it developed through centuries?__Gamren (talk) 07:45, 7 June 2018 (UTC)
Except if very few people have written in it.__Gamren (talk) 12:10, 7 June 2018 (UTC)
The communities of editors for each language decide on what counts for LDL attestation. Unfortunately, the editing community for Lojban has been very antagonistic toward CFI, and was outright ignoring it for some time, so I don't think they are likely to exclude dictionaries that include newly coined words (which is all Lojban dictionaries that I know of). —Μετάknowledgediscuss/deeds 13:08, 7 June 2018 (UTC)
  • Having entries in languages that can't be attested by any reasonable means but are correct is inherently valuable. That's why we have reconstructed protolanguages (note that reconstructed languages are constructed languages — it's easy to forget that since they now have their own namespace). I want fish to be able to link to a Proto-Germanic etymon just as I want bat'leth to be able to link to a Klingon one. But that doesn't mean that Klingon belongs in mainspace, or that much of its lexicon would pass CFI (although potentially more than Lojban!). I think there is a meaningful distinction to be made there, and that we don't want Proto-Germanic in mainspace either. But we can definitely close the loophole by establishing explicit attestation standards for the appendix, perhaps one durably archived use. —Μετάknowledgediscuss/deeds 13:08, 7 June 2018 (UTC)
I don't think constructed language is commonly used to include reconstructed languages, and we don't have such a sense at constructed language.__Gamren (talk) 17:10, 7 June 2018 (UTC)
I'm talking about the concepts, not the words. —Μετάknowledgediscuss/deeds 12:00, 8 June 2018 (UTC)
I don't understand what you mean by that.__Gamren (talk) 14:27, 8 June 2018 (UTC)
@Per utramque cavernam, Dan Polansky, Metaknowledge, Jawitkien, DTLHS I've made a quick vote draft for inclusion criteria. Edit/discuss/{add more options} if you want. After that, I think we should have a vote to move the language fully or partially back into mainspace (because the decision to move it in the first place was influenced by inclusion concerns, which don't have to be related). Then we can decide what to do about the other languages. Sound good?__Gamren (talk) 10:51, 23 June 2018 (UTC)
I don't like any of those options. Option 1 is the same as before we moved everything to an appendix, so we would just be reverting that vote. Options 2 and 3 are too loose. DTLHS (talk) 16:00, 23 June 2018 (UTC)
Several people voted for moving to appendix because they just didn't want it in mainspace, not because they wanted looser criteria, so it's not just a reversal, and it might fall differently than before. Above, it seems like you want to include everything that has either three citations or a dictionary reference, is that correct? I've added that, but we need to specify what dictionaries are usable. For now, I've just put jbovlaste.__Gamren (talk) 16:46, 23 June 2018 (UTC)
The vote is very problematic. By bringing up a non-durable dictionary (jbovlaste), it muddies the waters considerably. What we should actually do is have a vote about attestation of constructed languages in the Appendix in general, rather than making a vote with such poor options that I can easily imagine all of them failing. —Μετάknowledgediscuss/deeds 20:20, 23 June 2018 (UTC)
Sigh. So, what do you want? You contributed to making this mess, why not give us your idea for cleaning it up, instead of complaining each time I show some initiative?__Gamren (talk) 07:46, 24 June 2018 (UTC)
I just told you what I want: a general vote concerning CFI for appendix-only languages. We could create new criteria for them (which I've suggested elsewhere), or just borrow the LDL criteria, but it should be consistent with how we normally approach attestation (e.g. everything must be durably archived, no matter how lax the attestational requirements). I'd be happy to help, but you'll have to take a step back from blaming me and recognise that your last vote failed and unless you craft it better, the next will as well. —Μετάknowledgediscuss/deeds 08:25, 24 June 2018 (UTC)
Sorry for the tone. I made a vote specifically for this language because I think people feel differently about it than the others. If a vote to introduce criteria for all of them could pass, a vote to introduce it to one language should also pass, since the result of the latter is a subset of the result of the former. When I asked you what you wanted, I meant exactly what options did you want to have a vote about?__Gamren (talk) 09:47, 24 June 2018 (UTC)
@Gamren Sorry for weighing in late, work is not kind to my wiktionary/Lojban efforts. Gamren, thank you for creating a sample vote form. I looked over your options and feel that constructed languages such as Lojban will tend to be more dynamic than many other legacy languages. Generally, the community of usage will determine the words that are "official" with perhaps a starter set created by the initial language creator. In legacy languages, we have a language corpus which shows usage as durably published. For a language still under construction, you are more likely to find online usages. They aren't as stable as durably published records, and they will tend to be less grammatically correct. I know for Lojban there are websites where new words may be proposed and then voted upon. Perhaps that can be part of the criteria for inclusion in Wiktionary. The more folks who like a word + meaning pair, the more likely they will use it, and thus the more likely the need to include it. I wonder if there is any tool that examines wiktionary access logs and tells over a time period how often the page for that word has been delivered to an end user? It might be very instructive. Of course, when the link is red, we will never know how many folks would have clicked on it if the page had existed. Jawitkien (talk) 12:32, 29 June 2018 (UTC)
If you go to an entry and click "Page information" in the sidebar to the left, you can see how many views it has had in the last 30 days. Not sure what you can do with that information, but, I mean, we could just make empty pages if the data really was useful (not that I think we should). Also: if Lojban has such a "starter pack", we could add some option(s) permitting those words. I looked at The Complete Lojban Language, and while it contains a glossary, it calls it "brief and unofficial".__Gamren (talk) 12:48, 29 June 2018 (UTC)
As I'm relatively new to the language, I would see the phrase "brief and unofficial" as a warning that the language has well-defined construction rules, and that the listing there is not complete because the writer did not apply the construction rules to each of the entries in the glossary to even approach a listing that would be considered more than "brief". Jawitkien (talk) 23:36, 12 July 2018 (UTC)

Finnish category redirects

The Finnish categories in Category:Category redirects which are not empty are populated by the {{suffix}} template. Looks like Rua moved the categories but did not update the template to point entries to the new categories. Can someone who knows Finnish confirm that the new categories are the correct ones and, if so, update the template so that it categorizes correctly? There is a German one too. - TheDaveRoss 13:16, 7 June 2018 (UTC)

All of the categories are for the front-vowel variants of the suffixes. It seems the ones we use for lemmas are the back-vowel variants. For example, -yys is the front-vowel variant of -uus. The practice I have mostly seen, and have been following myself, is to use the back-vowel variant with an |alt2= to create the appearance of the front-vowel suffix (so that the only category would be the one for the back-vowel variant that is being used as a lemma). The best approach would probably be to change them via a bot, AWB or similar. (The conversion process itself is easy: ä -> a, ö -> o, y -> u to convert a front-vowel variant into a back-vowel variant.) SURJECTION ·talk·contr·log· 22:41, 7 June 2018 (UTC)
I could probably take care of the smaller categories by hand, but -jä, -tön, -ys and -yys definitely need automation of some kind. SURJECTION ·talk·contr·log· 22:46, 7 June 2018 (UTC)
I am going to also leave -tä or -y to someone who has a bot or AWB, but I have now emptied some of the categories. SURJECTION ·talk·contr·log· 23:11, 7 June 2018 (UTC)
Thanks. If nobody else gets around to it before I do I can make a script to update the remaining entries. I might ping you again with some examples to make sure I am not messing everything up. - TheDaveRoss 23:19, 7 June 2018 (UTC)
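
The front-to-back conversion quoted above (ä -> a, ö -> o, y -> u) is simple enough to sketch in a few lines of Python. This is only an illustration of the character mapping, not the script anyone in this thread ran; the function name is hypothetical, and the sketch assumes the suffixes use precomposed characters (it normalises to NFC first).

```python
import unicodedata

# Mapping taken from the discussion: ä -> a, ö -> o, y -> u (plus capitals).
FRONT_TO_BACK = str.maketrans("äöyÄÖY", "aouAOU")

def to_back_vowel(suffix):
    """Convert a front-vowel suffix variant to its back-vowel counterpart."""
    # Normalise so that 'a' + combining diaeresis becomes the single letter 'ä'.
    return unicodedata.normalize("NFC", suffix).translate(FRONT_TO_BACK)


for front in ["-yys", "-jä", "-tön", "-ys", "-tä"]:
    print(front, "->", to_back_vowel(front))
```

A real run would still need the |alt2= handling described above, and a human check that each resulting category is the intended lemma category.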

"Linguistic phenomenon of the week/month"?[edit]

I think it could be interesting to create a new section on the main page, similar to WOTD and FWOTD, to feature funny linguistic phenomena (such as rebracketings: a napron → an apron, etc.), or groups of funny lexical items (such as exocentric compounds: cutthroat, rotgut, spitfire, etc.), or words that maybe aren't that interesting in themselves, but denote an interesting lexical concept (libfix would be an example), etc.

I sometimes try to do that in my WOTD and FWOTD nominations, and hope that readers notice the categories at the bottom of the page, or pay attention to the etymology section, etc., but it's not really suited to that purpose. Plus I'm not too keen on featuring libfix or exocentric because 1) the WOTD waiting list is already several months long and 2) as I said, those words aren't that interesting in themselves.

What do you think? Would that be feasible?

@-sche, DCDuring, Metaknowledge, Sgconlaw and whoever is interested. Per utramque cavernam 18:31, 7 June 2018 (UTC)

  • I wouldn't have thought that there were very many instances of such things. SemperBlotto (talk) 18:52, 7 June 2018 (UTC)
Such things could be of interest to some. I doubt that it could be featured daily rather than, say, weekly or monthly. How many categories do we now have that might contain examples of "interesting" linguistic phenomena? I assume that we are talking about English, though I could imagine something similar for cross-language phenomena, though the audience might be small. DCDuring (talk) 22:47, 7 June 2018 (UTC)
Sure, if you’re happy to maintain it. I’d say make it less frequent (at least in the beginning) so you don’t kill yourself thinking of themes or looking for words. Also, it needs a catchier name ... — SGconlaw (talk) 01:24, 8 June 2018 (UTC)
We should call it "Word Focus", "Weekly Fun", "Wondrous Formant" or "Wordishly featurized". Funnily enough, these would all be acronymed as WF. --Genecioso (talk) 11:44, 8 June 2018 (UTC)
@DCDuring: I'd try to feature things relevant to English, but the ideas I've had tend to be cross-linguistic anyway.
@Sgconlaw: Yes, you're right, I'll have to come up with a better name. What about "linguistic thingamajig of the week"? It almost rhymes :p
@Sgconlaw, DCDuring: Yes, daily or even weekly seems unreasonable. I think monthly would be manageable (I've already gathered a dozen ideas on my userpage, so we would be set for one year), but maybe a bit slow? Per utramque cavernam 16:58, 8 June 2018 (UTC)
I don't know how we'd handle it, exactly, but obligatorification is another interesting phenomenon. (On which note, I notice we have an entry for wreak havoc but also an entry for wreak which would seem to make clear that it is SOP even if it is the most common collocation these days.) - -sche (discuss) 19:13, 8 June 2018 (UTC)
  • I abstain on this. It's interesting, but we're not an encyclopaedia of linguistics (Wikipedia covers that); we're a dictionary. It's appropriate to feature words and categorise them according to the phenomena that have affected them, but featuring those phenomena is going out on a limb a bit. —Μετάknowledgediscuss/deeds 12:01, 8 June 2018 (UTC)
  • I would be concerned about maintainability, in terms of the maintainer(s) burning out (several times over the years WOTD has gone unupdated because maintaining it was too much work) and things to feature running out. Setting it monthly would make it manageable. I don't know how much interest there would be in it, and it might be boring that the featured thing was the same for a month at a time, but perhaps we could make it into an invitation (announce it monthly in the BP), a la fr.Wikt's LexiSessions, for new and old editors to edit in the featured area, e.g. identifying/adding libfixes, cases of phonetic erosion, etc. - -sche (discuss) 17:37, 8 June 2018 (UTC)
  • This might be off-topic/redundant, but we could do joint focus weeks for WOTD and FWOTD that feature the same phenomenon, which I personally would find interesting. And have a box above the two WOTD boxes explaining the phenomenon a bit, with a wikipedia link too probably. This way cool phenomena could be featured more visibly but it doesn't have to be every week. – Julia • formerly Gormflaith • 17:46, 8 June 2018 (UTC)
    I like this, just because more dedicated focus weeks reduce my workload in maintaining FWOTD. —Μετάknowledgediscuss/deeds 01:04, 9 June 2018 (UTC)
    I second that. I think it would be more manageable and might make (F)WOTD more interesting. Andrew Sheedy (talk) 17:04, 15 June 2018 (UTC)

Persian as ancestor of Tajik

For some reason, Persian has not been set as an ancestor language of Tajik. I'm sure this happened only by accident, but let me nevertheless explain the case. "Persian" is Modern Persian, which begins in the 8th century AD. So in order to claim that Persian is not an ancestor of Tajik, one must claim that for the past 1200 years there has existed a Tajik language independently derived from Middle Persian. In fact, the area was probably only Persified by the Samanids in the 9th and 10th centuries. Moreover, Tajik is still often considered a mere dialect of Persian and otherwise its independent history begins only with the Russian rule of the 19th century. So may I ask you to please make the appropriate changes. Thank you! (PS: I'm neither Tajik nor Iranian nor Afghan, so no nationalism involved.) —This unsigned comment was added by 84.188.181.78 (talk) at 22:28, 7 June 2018 (UTC).

The current arrangement is in fact intentional and is the result of Wiktionary talk:About Persian#Tajik. I have no special knowledge of the matter, so I have only edited the modules in the ways the editors there have requested. The situation is complicated; if it needs to be changed again, we'll need to be sure our Persian editors / the people who participated in that discussion are on board. - -sche (discuss) 00:03, 8 June 2018 (UTC)
Tajik is descended from Classical Persian, but since we treat Classical Persian and Modern Persian as the same thing on en.wikt, it is a bit confusing. Tajik is not descended from Middle Persian, though. --Victar (talk) 19:25, 28 June 2018 (UTC)

fake news!

It's been bugging me for months how people keep saying "fake news" about things which are not any kind of news at all. Usually just as a way of disagreeing with something another person said.

I was surprised to see we don't currently have an English entry for fake news (just a Danish one) but though this misuse bugs the hell out of me, as a hobbyist lexicographer I also realize this is a non-sum-of-parts and hence unexpected phenomenon that we should be documenting, being descriptivist and all.

Sometimes it seems to be used as an interjection: Person A says something; person B yells "FAKE NEWS!"

But other times it's used as a noun in a sentence such as "Well that's fake news".

I'm not sure I've seen it in print but I can definitely provide some YouTube links to people saying it in videos, and I bet it won't be difficult to find used in the same way in forums, comment sections, etc. online.

Comments anyone? — hippietrail (talk) 07:18, 9 June 2018 (UTC)

Aren't those people just either deluded or deliberately misapplying the term, like when people accuse any dissenter of being a "shill" for the other side? Equinox 14:06, 9 June 2018 (UTC)
FWIW I use it humorously as
a) a way of disagreeing with a popular opinion — "Salted caramel ice cream is fake news. It's not even that good." (= It's overhyped, therefore it's "fake". idk)
b) synonymous to "No way"; doesn't imply you don't believe it, but that you don't want to — "The Cavs lost again last night." / "That's fake news. How'd they lose again?"
It's just a funny way to make fun of Trump when it's not about actual news. Like pronouncing China as /ˈdʒaɪnə/ or bigly or, recently, "Thank you X, very cool!". — Julia • formerly Gormflaith • 15:59, 9 June 2018 (UTC)
That could be when it's used by non Trump supporters. But I've also seen it used by Trump supporters when not about news. It wouldn't surprise me to see people totally ambivalent about Trump using it this way too. — hippietrail (talk) 05:07, 11 June 2018 (UTC)
Could be that "fake news" is a synonym of "liberal lies/propaganda"? Anyways, I've been trolling around Twitter (not a source! I know) and found a few interesting instances of it either being used as an adjective or not being about news: [1][2][3][4][5][6][7][8][9][10][11] Julia • formerly Gormflaith • 14:52, 11 June 2018 (UTC)

This is a really interesting phenomenon that I have been personally eyeing for a while now. "Fake news" was a term used in normal parlance (the mainstream media, casual conversation, etc.) to refer specifically to nonsense semi-news outlets that peddle wild conspiracy theories and which serve as fronts for selling herbal supplements (e.g. InfoWars). Politically, they are all very right-wing and sometimes "libertarian" as well. Then, the word got perverted by the alt-right to mean "actual news and facts" and the way it is used is as a shorthand for saying, "[x reasonable opinion and set of facts] is fake news/obviously biased and meant just to impede our movement, since normal human beings believe it." It's a fascinating and scary phenomenon. —Justin (koavf)TCM 00:21, 12 June 2018 (UTC)

I think that underlying the peculiarities of the use of fake news is a meaning shift in news. If the meaning is learned inductively, then the content of media that label themselves as offering news becomes the meaning of news. To the extent that news is aimed at increasing viewership/readership, as it usually is when the media are advertiser supported, then 'news' comes to include 'entertainment news', 'sports news', 'weather news', 'human interest news', 'lifestyle news', online video content, etc., principally delivered in 30-90-second bits, much of it managed by publicists. If that constitutes real news, what defines fake news? DCDuring (talk) 11:50, 12 June 2018 (UTC)
Maybe we should just wait a decade or so for the meaning to settle down before we attempt to define this (ha ha that's not going to happen). DTLHS (talk) 18:21, 15 June 2018 (UTC)

module:ine-nominals

The module ine-nominals generates Proto-Indo-European declension tables for nouns, pronouns, and adjectives. I'm changing the accusative plural desinence "*-ns" to "*-ms" for the reasons outlined on its discussion page. I've already pinged some IE editors and no one has opposed; if anyone does oppose this change, please let me know. Greetings -Tom 144 (𒄩𒇻𒅗𒀸) 00:57, 10 June 2018 (UTC)

A category for latinx, lxs, etc

Would it be useful to gather words like these, where x is used to gender-neutralize a word, into a category? As lxs shows, the phenomenon seems to go a bit beyond / be a bit different from cases where -x is an ending, which could otherwise arguably be handled by the existing suffix categories. What should the category be called? "Langname terms spelled with gender-neutral X"? (See also Xicana, an apparently distinct phenomenon.) - -sche (discuss) 03:48, 14 June 2018 (UTC)

Is this a thing in languages other than Spanish? I've only encountered -x and -e and -@ as replacements with Spanish but not (e.g.) Portuguese. Also, I'm not sure of the best way to document this, but these -x constructions actually come from English and Anglo/Hispanic sources in the United States far more commonly than from non-U.S. Hispanics. It's been adopted increasingly by Hispanics with the mother tongue of Spanish (i.e. not bilingual American Hispanics) but its origin is as a kind of Anglo hypercorrection. Justin (koavf)TCM 04:27, 14 June 2018 (UTC)
I've seen it in Spanish, English (including with some words not of Spanish origin, like alumnx, womxn and hxstory, the latter of which are good examples of non-suffixal use of x) and German (although I don't know how many of the examples meet CFI...), and apparently it also happens in Portuguese. (Someone should check Italian, Catalan and other such languages for examples, too) - -sche (discuss) 04:54, 14 June 2018 (UTC)
@Koavf All three suffixes you mentioned are also used in Portuguese. — Ungoliant (falai) 15:18, 14 June 2018 (UTC)
@-sche, do you think it is necessary to separate words with x from other gender-neutral replacements/coinages? If not, something like Category:English gender-neutral terms is sufficient. — Ungoliant (falai) 15:18, 14 June 2018 (UTC)
I do see some benefit to separately categorizing xs vs @s (including e.g. Pin@y where again it isn't a suffix), etc, but I suppose lumping them all together would also work. My concern with calling the category "gender-neutral terms" is that it could attract terms that are, well, (merely) gender-neutral, like person or scientist. I note that for example sex worker is in that category even though it was coined to replace also-technically-gender-neutral terms like prostitute not to gender-neutralize them, but to recognize the work aspect of sex work. And also, that none of the x or @ terms are currently in that category. Still, that's a fallback if we don't want to make a more specific (sub)category for these. - -sche (discuss) 17:12, 14 June 2018 (UTC)

Templatize "a native or resident of"

The definitions of our words for "person who is from or in [place]" are a haphazard mix of "resident of X", "native of X", etc. The differences suggest the scope of the words differs: for example, "Asturian" is defined as "a native of Asturias", while "Minorcan" is "an inhabitant of Minorca", and "Madagascan" as "a native or inhabitant of Madagascar". But in fact you can still call a native who no longer lives in Minorca a "Minorcan" and, in the same situations as you can call a non-native resident of Madagascar a Madagascan, you can call a non-native resident of Asturias an Asturian. I suggest we create a simple template for these definitions so the wording can be harmonized (on something to the effect of "A native or resident of {{{1}}}") whenever applicable. (Compare {{place}}, but this template could be much simpler.) Is this a good idea? Obviously, in the rare case where a word does refer exclusively to a native or to a resident, that could be spelled out manually like all the definitions are at present. - -sche (discuss) 04:15, 14 June 2018 (UTC)

  • Agree Boilerplate definitions (or glosses) should be templatized and ultimately stored at d:. —Justin (koavf)TCM 04:26, 14 June 2018 (UTC)
That seems reasonable.__Gamren (talk) 18:09, 15 June 2018 (UTC)
It would be good to have a text that is equally appropriate for places (towns, cities, ...), regions (islands, provinces, ...), countries, and even continents. It is not unusual to call a person of descent from Zimbabwe (for example) a Zimbabwean, even if they have never been to Zimbabwe. So what about this?
  • {{demonym|Asturias}} → A person from, or of descent from, Asturias
  • {{demonym|Minorca}} → A person from, or of descent from, Minorca
  • {{demonym|Zimbabwe}} → A person from, or of descent from, Zimbabwe
Assuming that the pagename is also the related-to adjective, the output could also be made to read, "A person from Zimbabwe or of Zimbabwean descent". In case that is not appropriate, a second parameter could be used for an alternative descent adjective, for example, for Brit:
  • {{demonym|Britain|British}} → A person from Britain or of British descent
And then the value - for that parameter could signify that the descent clause is to be omitted in its entirety.
An unresolved issue is that English grammar may require the definite article while it should not be linked-in: "... from the Dominican Republic, ...". Perhaps an initial part of the first parameter up to the first word with a capital letter could be copied to the output but kept outside the link. And if a parameter already contains a link, it should just be copied verbatim, so the Dominican Republic and the [[Dominican Republic]] will have the same effect.  --Lambiam 23:51, 24 June 2018 (UTC)

'Male-and-female' vs 'unisex' given names

Currently, some unisex given names use {{given name|female|or=male}} (or male|or=female), which displays as "female or male given name" and categorizes them exclusively as "male given names" and "female given names" but not as "unisex given names". Others use {{given name|unisex}}, which displays "unisex given name" and categorizes them exclusively as unisex but not as male or female.
IMO, all the preceding inputs should categorize the names as male, female, and unisex, but even if the categorization is fixed, the inconsistent definitions seem undesirable.
I'd like to switch them all to {{given name|unisex}}. Thoughts? Note that I'm only talking about editing entries like Dakota, River and probably Indiana, where the name has the same origin when applied to men as when applied to women; I'm not talking about entries where the male and female names have separate origins or contexts, like Dana, Jocelyn, and Jess. - -sche (discuss) 04:46, 14 June 2018 (UTC)

English names rarely stay unisex. It's often necessary to define that one sex is more common, or that US and British usage is different, or that the gender has changed at some point. Defining George or Shirley as "unisex given names" would be confusing. In some languages unisex names may be the norm (Chichewa, Hawaiian) while in others they are forbidden (Finnish). It would be fine if someone could make a bot adding unisex category to any name with male and female definitions in the same language.--Makaokalani (talk) 15:43, 14 June 2018 (UTC) To avoid confusion, I would rather remove the new unisex parameter from Template:given name and have a bot replace it by "male|or=female". Though I've always thought "male|and=female" would sound more definite. --Makaokalani (talk) 12:11, 15 June 2018 (UTC)
In fact, I've noticed that the template/module accepts (and adds a category for!) anything that is added in the first parameter, so e.g. {{given name|dumb|lang=en}} gets put in "English dumb given names". Probably the module should be updated so that setting the parameter to anything other than "male", "female" or "unisex" puts the entry into a cleanup category we can monitor. We shouldn't do away with "unisex", precisely because for many languages it's the best term, and even in English there are names it's applicable to. Whereas, names like George shouldn't have their male and female uses combined onto one line, anyway, because they need context/commonness labels. I agree that the wording should be changed from "or" to "and" (I think I see how to make that change to the module) for any single-definition-line names where "male [or/and] female" is kept rather than switched to "unisex". - -sche (discuss) 18:37, 15 June 2018 (UTC)
@-sche: The module now adds a tracking template for unrecognized genders. — Eru·tuon 19:55, 15 June 2018 (UTC)
Thanks! One thing that turned up was this, with separate "female given name" and "male given name" on the same line, which only happened to be caught because of a stray space, but should still be combined even if not typoed. I'll mention that on WT:T:TODO. - -sche (discuss) 22:03, 15 June 2018 (UTC)
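For anyone following along, the categorisation logic being discussed could look roughly like the sketch below. It is written in Python for illustration only and is not the actual module code; the function name and the cleanup category name are invented, and the rule that unisex and male-plus-female inputs land in all three categories is the proposal made earlier in this thread, not current behaviour.

```python
RECOGNIZED = ("male", "female", "unisex")

# Hypothetical name for the cleanup/tracking category.
CLEANUP_CATEGORY = "Given names with unrecognized gender"


def given_name_categories(gender, or_gender=None, langname="English"):
    """Return the categories for a given-name definition line (sketch only)."""
    genders = [g for g in (gender, or_gender) if g is not None]

    # Anything other than male/female/unisex goes to a cleanup category
    # instead of silently creating e.g. "English dumb given names".
    if any(g not in RECOGNIZED for g in genders):
        return [CLEANUP_CATEGORY]

    # Proposal: "unisex" and "male or female" are treated the same way and
    # populate all three categories.
    if "unisex" in genders or {"male", "female"} <= set(genders):
        kinds = ["male", "female", "unisex"]
    else:
        kinds = list(dict.fromkeys(genders))  # de-duplicate, keep order

    return [f"{langname} {kind} given names" for kind in kinds]


if __name__ == "__main__":
    print(given_name_categories("unisex"))
    print(given_name_categories("female", "male"))
    print(given_name_categories("dumb"))
```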

Clarification to Wiktionary:Entry layout

I felt that some of the text in the section List of headings was particularly unclear, in the sense that a reader would not get the intended meaning unless they already knew it. So I devised a replacement text. I cannot apply it myself; the page is locked to prevent editing. "View source" calls up a text that suggests recommending any additions or changes to the page on its talkpage. Which I duly did, here: Wiktionary talk:Entry layout#Indentation?. That was three-and-a-half months ago, but nothing happened.  --Lambiam 21:03, 16 June 2018 (UTC)

...and that one editor is back

See Wiktionary:Beer_parlour/2018/May#Possible_IP_range_blocks_required. The ranges are the same and the edits not much different at all. SURJECTION ·talk·contr·log· 14:13, 18 June 2018 (UTC)

I'm still not sure how best to handle this; there are lots of seemingly unrelated editors in those ranges, some with accounts and some without. Anyone have any thoughts about creative methods for reducing this person's edits without blocking a bunch of good contributors? - TheDaveRoss 14:48, 18 June 2018 (UTC)
I would probably set an IP range block but allow account creation and only block anon edits. SURJECTION ·talk·contr·log· 15:11, 18 June 2018 (UTC)

Still an issue. Found a new broadband IP: Special:Contributions/82.203.184.19. Actively used alongside the mobile IP ranges reported earlier. SURJECTION ·talk·contr·log· 12:25, 24 June 2018 (UTC)

Onomatopoeic PIE *a

Do some *a words in PIE exist simply because they're onomatopoeic? Case in point, *kan- ("to sing") and *al-al- (to shout) (cf. աղաղակ (ałałak, shouting) and ἀλαλαγή (alalagḗ, shouting)). --Victar (talk) 16:49, 19 June 2018 (UTC)

Old Frisian /j/

Hey, user @Leornendeealdenglisc and I were discussing the usage of j in Old Frisian orthography. As j didn't really come onto the scene until much later, /j/ typically appears to have been rendered in contemporary texts as i; however, some scholars have chosen to transcribe it as j. Case in point, ieva ~ jeva (see references).

Should we normalize to j, and if so, just word-initial, or everywhere before vowels, e.g. tohakia > tohakja? I have no problem with the former, but the latter seems a bit hyper-corrective to me. @Metaknowledge, -sche, Leasnam, Korn --Victar (talk) 20:11, 22 June 2018 (UTC)

I'm not familiar with Old Frisian to an extent where I could contribute to this discussion. Would that remove any ambiguity? Thinking about it made me notice btw. that we're not consistent with j in Old Saxon where we write it ⟨i⟩ in words like hebbian. Korn [kʰũːɘ̃n] (talk) 21:28, 22 June 2018 (UTC)
@Korn, I can't think of any situations where it would disambiguate. I believe we transcribe OS and OHG primarily with j only word-initial. --Victar (talk) 22:14, 22 June 2018 (UTC)
@Mnemosientje, Isomorphyc --Victar (talk) 06:48, 23 June 2018 (UTC)
Reminds me of how we sorta treat i and j in Latin entries (cf. iaceō vs. jaceō). We show that as i unless actually attested as j if I'm not mistaken. Would this same reasoning also work for Old Frisian ? Leasnam (talk) 15:30, 24 June 2018 (UTC)
As for "unless actually attested": For some languages (including Old and Middle Germanic languages) a so-called normalised spelling is used in Wiktionary. For example, ⟨u⟩ and ⟨v⟩ are often normalised by their (assumed) sound. E.g. it's silvir albeit only attested as "Siluir" (capital S as beginning of a sentence). Compared with WT:About Old Saxon#Normalisation ("Any other attested spellings may be listed under an ===Alternative forms==="), there should be both an actually attested "siluir" and a fictional "silvir". It would also be nice to mark fictional - not properly attested normalised - entries by a note like "This spelling is not attested, but normalised", maybe with an addition giving some sources and actual spellings. -84.161.6.98 01:10, 3 July 2018 (UTC)
Only information specific to the entries belongs in the entries, for ease of work and readability. Normalisation of e.g. Old Saxon is something that applies to every single entry and hence is noted in About: Old Saxon, so that our entries don't become spammed with annotations. Korn [kʰũːɘ̃n] (talk) 11:09, 3 July 2018 (UTC)
Not all normalised spellings are unattested - sometimes the normalised spelling does indeed occur somewhere (well, maybe without diacritics similar to Latin and macra, and with alterations of letters like turning ſ into s if there is only ſ, ı into i if there is only ı, similar to Latin and old ALL CAPS style). On the other hand, some normalised spellings could be unattested, and some are unattested, when not accepting later editions (19th till 21st ct.). Thus the above does not apply to all entries. And it might be useful information to know that silvir (< siluir, text differing between u/v in another way than u=vowel, v=consonant) is fictional, while mīn (< mın, text without i), des (< deſ, text without s), daȥ (< daz, ȥ treated like z+diacritic) do exist. -84.161.48.172 02:38, 5 July 2018 (UTC)
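The two normalisation options raised at the start of this thread (j word-initially only, or j everywhere before a vowel) can be compared with a small sketch. This is a deliberately naive rule written for illustration: the function name `normalise_j` and the vowel list are assumptions, it only handles lower-case input, and the "everywhere" mode will rewrite any i + vowel sequence, whether or not the i actually stands for /j/.

```python
import re

# Assumed vowel inventory for the illustration (short and long vowels).
VOWELS = "aeiouāēīōū"


def normalise_j(word: str, everywhere: bool = False) -> str:
    """Replace i with j before a vowel, word-initially or everywhere."""
    if everywhere:
        pattern = rf"i(?=[{VOWELS}])"
    else:
        pattern = rf"^i(?=[{VOWELS}])"
    return re.sub(pattern, "j", word)


if __name__ == "__main__":
    for form in ["ieva", "tohakia"]:
        print(form, normalise_j(form), normalise_j(form, everywhere=True))
```

With the word-initial rule, ieva becomes jeva and tohakia is left alone; with the broader rule, tohakia becomes tohakja as in the example above.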

Can a thesaurus page have multiple senses?

Should a thesaurus page have a single sense or are multiple senses in a single page OK? I'm not sure how to proceed at the moment.

When I started creating thesaurus pages, I kept to a single sense for each page. Mostly, this was for no better reason than the existing pages appeared to do this. That is why there are separate Thesaurus:workaround (noun) and Thesaurus:kludge (verb) pages and why there are separate pages for Thesaurus:smell, Thesaurus:olfact, Thesaurus:olfaction, etc (a lot of Category:Thesaurus:Smell could come under the headword "smell"). However, there are some thesaurus pages with multiple senses and multiple parts of speech, such as Thesaurus:death, Thesaurus:surprise and Thesaurus:worsen.

The guidelines at Wiktionary:Thesaurus and Wiktionary:Thesaurus/Format are not explicit either way. I haven't found anything else yet to clear this up. Multiple senses in a single page would fit with the mainspace. Single senses are clearer and easier to interlink, both for semantic relationships and for other languages (because thesaurus pages are single semantic concepts rather than words). Either could seem more intuitive to different people. - AdamBMorgan (talk) 13:10, 25 June 2018 (UTC)

Requested move: Μεσόγειος θάλασσα → Μεσόγειος Θάλασσα

At some time in the past, the page Μεσόγειος Θάλασσα (Greek for Mediterranean Sea) was moved to Μεσόγειος θάλασσα. I think this was a bad move; the common practice is to capitalize both words. See, e.g., the Greek Wiktionary at Μεσόγειος Θάλασσα and the Greek Wikipedia at Μεσόγειος Θάλασσα. In English we also write Mediterranean Sea and not *Mediterranean sea. I'd move the page back if I could; unfortunately, the redirect page does not have a trivial edit history because interwiki links were added (and then removed when we got rid of interwiki links in general), so I cannot perform the move ("You do not have permission to move this page").  --Lambiam 04:40, 29 June 2018 (UTC)

Seems reasonable. Done. - -sche (discuss) 21:15, 2 July 2018 (UTC)

naming audio pronunciation files

This is about both English and non-English words. Could someone kindly help by expanding/clarifying the audio Help page? I left a message at Help audio Talk. Is it Xx(lang) - cc(country) - word.ogg, or could this also be allowed: Xx(lang) - word - cc/dialect - accent? Thank you. sarri.greek (talk) 09:52, 30 June 2018 (UTC)

Distinguishing between "Derived terms" and "Derived compound verbs"

This topic is primarily for Azerbaijani and possibly other Turkic languages. In görmək (to see), I attempted a way of distinguishing between derived terms, under which I put deverbal nouns and simple deverbal verbs (verbs that are derived from another verb by suffixation) on the one hand, and derived compound verbs, where I put light verb constructions and the like (complex predicates consisting of at least two separate words) on the other. Relatively few such terms are derived from görmək, but there are verbs which are used to derive many more compound verbs, and in such cases I find it very useful to distinguish between the two categories instead of mixing derived nouns and simple verbs with compound verbs. What do you think? @Anylai, Crom daba etc. Allahverdi Verdizade (talk) 12:18, 30 June 2018 (UTC)

@Allahverdi Verdizade. For what it's worth, hold court is listed as a derived term under hold, and take advantage is listed as a derived term under take, even though you might call them derived compound verbs. So this refinement of the classification may not be needed. Probably, some of these many cases may also be classified as hyponyms, as you can see for Turkish yapmak.  --Lambiam 23:15, 4 July 2018 (UTC)
@Allahverdi Verdizade A similar issue comes up in Russian verbs, where there are many prefixed derivative verbs of simplex verbs as well as derived terms of other parts of speech; cf. вари́ть (varítʹ) for an example. I think you probably shouldn't create a new heading within discussion, but you could create a subheading, maybe like this: Benwing2 (talk) 01:17, 5 July 2018 (UTC)

Derived terms

compound verbs:

Subheading is an excellent idea. Allahverdi Verdizade (talk) 11:27, 6 July 2018 (UTC)

WOTD: April Fools' Day 2019

Proposals for a theme for the April Fools' Day period next year (1–6 April 2019) for Word of the Day are welcome. Words should preferably be chosen from the list of existing nominations, which is already rather long. — SGconlaw (talk) 19:05, 30 June 2018 (UTC)

Use "laurel" with an audio clip of "yanny"? "Groom of the stool"? - -sche (discuss) 19:09, 30 June 2018 (UTC)
We've been featuring a series of interesting words which have a common theme, rather than just gags. The theme should preferably be more intriguing than something like "nouns". This year's was words about unusual concepts. We can have up to six words in the series. — SGconlaw (talk) 19:15, 30 June 2018 (UTC)
How about "animal paradoxes/contradictions"? Examples: black swan, buffalo wing, butterfly effect, chicken-or-egg question, Cockney, flying fish, hen's teeth, horsefeathers, I'll be a monkey's uncle, infinite monkey theorem, lipstick on a pig, w:Man bites dog (journalism), neither fish nor fowl, raining cats and dogs, Schrödinger's cat, the straw that broke the camel's back, turtles all the way down, walking catfish, when pigs fly. If that's too broad or vague, I think I see at least a couple of themes within the theme. Chuck Entz (talk) 22:33, 30 June 2018 (UTC)
@Chuck Entz: that sounds cute. In what way are they paradoxes or contradictions, though? At the moment they just look like animal-related terms to me. — SGconlaw (talk) 10:47, 2 July 2018 (UTC)
Not all the examples I gave are perfect reflections of the theme, but there are enough so you can select the best. The apparent paradoxes or contradictions, in order: Until they went to the Southern hemisphere, Europeans thought swans could only be white. Buffaloes don't have wings. How can a butterfly change the weather? Which came first, the chicken or the egg? Cockney is from cock's egg- only hens have eggs. Fish normally swim rather than fly. Hens don't have teeth. Horses don't have feathers. Humans don't have nephews that are animals. How can monkeys on keyboards produce anything meaningful? Pigs are ugly/lipstick is pretty. Dogs bite men. Is it a fish or a fowl? Dogs and cats don't fall from the sky. How can a cat be both alive and dead? Straws are light/camels carry extremely heavy things. What's under the bottom turtle? Fish normally swim rather than walk. Pigs don't fly. Chuck Entz (talk) 12:47, 2 July 2018 (UTC)

July 2018

Entries for hyphenated attributive forms?

We have entries such as transitive-verb, at-sign, open-book, criminal-law, shoulder-blade, sea-urchin (see a more complete list here). The hyphen being a mere spelling device, I think these are pointless, and I would like to see them deleted.

However, people have argued that using a hyphen turns something into a single word automatically; I disagree with that, and I'm not aware of any policy to that effect. Has there been a vote, or might we need one?

My point is that we should restrict ourselves to creating lexicalised attributive-form entries, such as cookie-cutter (idiomatic meaning, adjectivisation). Per utramque cavernam 15:16, 1 July 2018 (UTC)

I have suggested a vote to the effect that hyphens that are added when a phrase is used attributively would be treated as spaces for the purposes of determining whether the phrase is SOP. It was in the middle of a long discussion last month that you may not have read, and it was marginally off-topic to that discussion. The universe of possible attributive phrases is just too unlimited for us to cover: "6-inch bolts" a "27-foot boat", "Reform-Jewish-rabbi-officiated weddings", etc. Chuck Entz (talk) 16:06, 1 July 2018 (UTC)
@Chuck Entz: I don't remember reading that discussion, no. Where was it?
Found it. Per utramque cavernam 13:59, 2 July 2018 (UTC)
Yes. The actual (attested) attributive forms will only be a tiny subset of all possible combinations, but even then it might be a huge set. Attestation is a necessary condition for having an entry, but I don't think it should be a sufficient one. Per utramque cavernam 17:20, 1 July 2018 (UTC)
OT: I have enough trouble with hyphens appearing in 'vernacular' organism names. Is it blue moor grass, blue moorgrass, or blue moor-grass (all attestable at Google Books) (just to mention one I just ran across)? At least I'm fairly sure that blue-moor grass, bluemoor grass, blue-moorgrass, blue-moor-grass, bluemoor grass, and bluemoorgrass can be ignored. DCDuring (talk) 17:48, 1 July 2018 (UTC)
You've got my vote. DCDuring (talk) 17:48, 1 July 2018 (UTC)
For reference, this was discussed at Talk:transitive-verb#RFDE: All English attributive forms (with hyphens) of noun phrases. Note that treating hyphen as space for the SOP determination is a separate issue; here, transitive verb is kept, yet someone wants to delete transitive-verb, where the sum is not transitive + "-" + verb but rather transitive verb + hyphenation-operator, or the like. --Dan Polansky (talk) 18:54, 1 July 2018 (UTC)
In general I think we should delete entries that are purely for attributive-noun uses, like transitive-verb, but keep entries that can function as non-modifying nouns themselves, e.g. at-sign and probably shoulder-blade (which might be a legitimate British spelling of shoulder blade, as suggested by being listed as an alternative form under shoulder blade). In such entries, I'm undecided about whether to list the attributive use as a possible definition (as it is done under at-sign), and also undecided about cases like open-book, which has both a definition as a non-SOP adjective and a definition as an attributive noun. The logic here is that the hyphen in attributive-noun uses is purely a typographic convention and shouldn't be treated differently from a space. It should be similar to German compounds, where words that function only as SOP compounds aren't included even though written as a single word. Benwing2 (talk) 15:08, 2 July 2018 (UTC)
If "The logic here is that the hyphen in attributive-noun uses is purely a typographic convention and shouldn't be treated differently from a space", then transitive-verb should be kept no less than transitive verb, since, again, "hyphen ... shouldn't be treated differently from a space". --Dan Polansky (talk) 08:40, 3 July 2018 (UTC)
"It should be similar to German compounds, where words that function only as SOP compounds aren't included even though written as a single word": That is not our practice, as per e.g. Talk:Zirkusschule. --Dan Polansky (talk) 08:42, 3 July 2018 (UTC)
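The proposal above (treat an attributive hyphen like a space when judging SOP) can be illustrated with a minimal sketch. The function names and the small set of entries are hypothetical, chosen only for illustration; this is not an existing Wiktionary tool.

```python
def spaced_form(title: str) -> str:
    """Return the title with every hyphen replaced by a space."""
    return title.replace("-", " ")


def is_hyphenated_twin(title: str, existing_entries: set) -> bool:
    """True if the title contains a hyphen and its spaced form already has an entry."""
    return "-" in title and spaced_form(title) in existing_entries


# Hypothetical sample of existing spaced entries (assumption, for illustration only).
existing = {"transitive verb", "shoulder blade", "criminal law"}

print(is_hyphenated_twin("transitive-verb", existing))
print(is_hyphenated_twin("cookie-cutter", existing))
```

A check like this only flags candidates. Whether a flagged form is lexicalised in its own right (as argued for cookie-cutter) or is a legitimate spelling variant (as suggested for shoulder-blade) would still need human judgment.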

Slovenian Pleteršnik orthography

@Atitarev, Guldrelokk, Dan Polansky Slovenian orthography is very confusing, as there are at least two incompatible diacritic systems (see Appendix:Slovene pronunciation). On top of this, Pleteršnik's dictionary uses yet another system that I don't understand; see [12] for an example. Apparently this system encodes a lot of additional dialectal information, but I haven't been able to find a description of this system and I can't read Slovenian. Can anyone help discover what the Pleteršnik symbols mean? Thanks! Benwing2 (talk) 17:53, 1 July 2018 (UTC)

BTW, see [13] for a somewhat blurry image of the page that explains the symbols. It's in Slovenian; maybe someone can read it? Benwing2 (talk) 17:58, 1 July 2018 (UTC)
The description is here. It seems to say:
ẹ and ọ signify close vowels, ę and ǫ signify diphthongs /ie/ and /uo/, which are always long and only occur in stressed syllables. e and o stand for open vowels.
ɐ is [ə].
ł is [u̯].
Three kinds of accent exist: two for long vowels, falling, marked with circumflex, and rising, marked with acute, and one on short vowels, marked by grave. Guldrelokk (talk) 18:12, 1 July 2018 (UTC)
(with e/c) I know very little about Slovenian. I am not sure what you are trying to do. Your link[14] shows "zdẹ̀"; are you trying to figure out how to render that in IPA? I suspect that "zdẹ̀" is not an actual attested form but rather a dictionary-only form adorned to show pronunciation, and that the attested usual form is "zde"; but I don't really know. For example, tukaj is shown in en wikt as "túkaj" and is shown in Pleteršnik as "tȗkaj" per Fran[15]. If I am right, we are not talking orthography but rather forms adorned to show pronunciation. --Dan Polansky (talk) 18:20, 1 July 2018 (UTC)
@Benwing2: It also says that macron in loanwords only signifies length and that these vowels are pronounced as ‘pure’. In the dictionary it seems to work like another kind of accent (there is only one per word), apparently it was pronounced as a long vowel with flat tone. Guldrelokk (talk) 18:52, 1 July 2018 (UTC)
From their description, can you figure out how "ȗ" is to be pronounced, used in "tȗkaj"? --Dan Polansky (talk) 18:56, 1 July 2018 (UTC)
@Dan Polansky: [ûː], i.e. [ú͜u]. Written identically in the tonal orthography from the Appendix, it’s also in the entry: Tonal orthography: tȗkaj (why are the lemmas in the ‘stress orthography’?). Guldrelokk (talk) 19:05, 1 July 2018 (UTC)
@Dan Polansky Perhaps "orthography" is the wrong word; maybe "notation" is better. In this case, the etymology for the Russian entry for здесь mentions Slovenian zde. I would rather cite Slovenian words in etymologies in the tonal orthography if possible, as it conveys more etymological information. However, some words (like this one) are available in Fran only in the Pleteršnik notation, and in that case my choices are either to cite it directly in that notation along with a note indicating that it's Pleteršnik's notation (which links to a page explaining that notation), or to try to convert it to normal tonal orthography. Cf. templates like Template:l/sl-tonal, which is used to cite Slovenian words in the tonal orthography and adds a note indicating that the word is in the tonal orthography, with a link to Appendix:Slovene pronunciation, the page that explains the diacritics. This is necessary because, unlike with Serbo-Croatian, there are (at least) two different possible notations, which are incompatible with each other, so without the note, it would be unclear which notation is being used. Benwing2 (talk) 19:06, 1 July 2018 (UTC)
@Guldrelokk IMO we should always be using the tonal orthography, but I've heard that nowadays most Slovenians pronounce words non-tonally, so they may be more familiar with the non-tonal notation. Benwing2 (talk) 19:08, 1 July 2018 (UTC)
@Benwing2: It seems that the tonal orthography is basically Pleteršnik’s notation, except that the pronunciation somewhat changed: there is (apparently) no more short or unstressed ẹ/ọ, no diphthongs and no ‘flat tone’. No idea what happened to them. Guldrelokk (talk) 19:18, 1 July 2018 (UTC)
@Benwing2, Guldrelokk, Dan Polansky: Late response. I'm not too familiar with the Slovene tonal notation either and when adding Slovene terms in translations, etymologies or reconstructions, I mostly just use the plain spelling, unless it's already defined here by native/advanced speakers. Rather than making/copying a mistake, I prefer to use what can be confirmed. A good Slovene dictionary is [16] - no tonal notations. You can also use monolingual [17] with some tonal notations. --Anatoli T. (обсудить/вклад) 07:12, 2 July 2018 (UTC)
@Benwing2, Atitarev I wonder what monolingual dictionaries use stress notation? Guldrelokk (talk) 13:13, 2 July 2018 (UTC)
@Guldrelokk: If you learn the notation and the phonology a bit, it will give you the stress as well. The notation à la Ali govoríte slovénsko? is probably enough to know how to pronounce Slovene correctly, if you're already familiar with basic phonetic rules. And [18] I mentioned above is probably your best bet online. --Anatoli T. (обсудить/вклад) 13:35, 2 July 2018 (UTC)
@Atitarev: Yes, exactly, the ‘stress notation’ is redundant once you have the ‘tonal’, and the only reason it’s there is its alleged use by natives. However, I see that the monolingual dictionary you pointed out ([19]) uses the ‘tonal’ notation. What do other monolingual dictionaries use? Guldrelokk (talk) 13:45, 2 July 2018 (UTC)
@Guldrelokk: SSKJ2 appears to use both; headwords are in the stress notation but then they put the tonal notation in parens after. See for example [20], which has a whole bunch of dictionaries including SSKJ2 and Pleteršnik. Benwing2 (talk) 14:49, 2 July 2018 (UTC)

Default title for column templates

Views are sought on what the default title for column templates such as {{der2}}, {{der3}}, {{der4}}, {{rel2}}, {{rel3}}, and {{rel4}} should be. @Dan Polansky feels it should be the same as the relevant section heading (e.g., "Derived terms"), whereas I am of the view that it is more useful for the title to be "Terms derived from xyz", "Terms related to xyz", for two reasons:

  • It is (marginally) more useful for the template title to display the root term rather than simply repeat the section heading.
  • Where it is necessary to manually add a title, the practice is to put it in the form "Terms derived from xyz (noun)", "Terms related to xyz (verb)", and so on. Thus, for consistency, the default title should be in the same format.

SGconlaw (talk) 10:33, 2 July 2018 (UTC)

A discussion is at Wiktionary:Beer parlour/2018/June#Display text of Template:der3 and others. In that discussion, there is a post by -sche that makes a lot of sense. I acknowledge that repeating "Related terms" after "Related terms" is not so nice, so I propose other possibilities, like leaving the collapsible bar empty, or saying "Items:", or "List:", or coming up with other options that are user-friendly and non-repetitive. In that discussion, I give an example of how entry party looks; not so nice. --Dan Polansky (talk) 08:48, 3 July 2018 (UTC)
As for "Terms derived from xyz (noun)", that is another cruft that should ideally be reduced. The derived term section in question is in the noun section, so this does not need to be repeated. By my estimate, the practice originates from some people's taking pleasure in expressly stating obvious things and things of marginal relevance. These are two very different aesthetics. --Dan Polansky (talk) 09:38, 3 July 2018 (UTC)
I have no strong views on the latter point, but would just like to highlight that having a phrase like "Terms derived from xyz (noun)" does make the section easier to locate in a long entry. — SGconlaw (talk) 10:39, 3 July 2018 (UTC)
Input needed
This discussion needs further input in order to be successfully closed. Please take a look!

News from French Wiktionary

Hello!

Sorry that we skipped two months; we lacked people available to translate our publication into English. But this edition is ready, and it's a great pleasure to invite you to read the June issue of Wiktionary Actualités!

A lot of content this time! Articles are about examples, a new tool to record pronunciations, the integration of a specialized lexicon offered by its authors, a dictionary of popular words in the 19th century, the linking with Wikipedia and a story about fishermen and fishes. As usual, there are also plenty of metrics, a short summary of some newspaper articles and unidentified pictures!

This issue was written by nine people and was translated for you by Dara. This translation can still be improved by readers (wiki-spirit). I hope you will find it interesting to know what's up in your neighborhood! Noé 09:11, 3 July 2018 (UTC)

@Noé: Merci! Was wondering what happened. I'm always willing to help out with translation work if needed. – Jberkel 09:39, 3 July 2018 (UTC)
The April and May issues are now translated too! @Jberkel: too bad I only saw your message now; you can check these issues for mistakes, and maybe we could call on you for the next one! DaraDaraDara (talk) 11:12, 4 July 2018 (UTC)

July LexiSession: sauces

This month's suggested topic is sauces! Because of Caesar sauce, maybe.

On French Wiktionary, we have just started by creating a thesaurus.

LexiSession in short: a collaborative transwiktionary experiment. Several wikis, one topic, learning by looking at what our colleagues do. You're invited to participate however you like and to suggest next month's topic. The idea is to look at other communities' improvements on the same topic to improve our own pages and learn foreign ways of contributing. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession. If you can spread the word to other Wiktionaries, you are welcome to do so. Also, sorry, I was very busy these last two months and I forgot to notify you. Noé 09:16, 3 July 2018 (UTC)

Improving sorting of items in categories via Mediawiki customization

Currently, in categories, items starting in č are sorted after items starting in z. This is very unconventional. For instance, in Category:cs:Amphibians, čolek is after salamandr. The example is Czech, but a similar problem is there probably for other languages as well.

As a remedy, it seems we could customize en wikt Mediawiki instance to use uca-default for sorting of items in categories. Czech Wiktionary has this, created via cs:Wikislovník:Hlasování/Změna abecedního řazení v kategoriích. An example Czech category is cs:Kategorie:Česká substantiva; an example Russian category is cs:Kategorie:Ruská substantiva.

One consequence would be that, for instance, instead of č being after z, it would be collated together with c. That is still not the conventional Czech order, in which č is after c rather than being on equal footing, but still seems to be an improvement.

A relevant page is https://www.mediawiki.org/wiki/Manual:$wgCategoryCollation.

Maybe someone knows more and can explain impact across languages.

--Dan Polansky (talk) 10:14, 3 July 2018 (UTC)

There's no perfect solution: ä, for example, is sorted with a in German and after z (and å) in Swedish. But that would get it closer to how an English speaker would expect it.--Prosfilaes (talk) 03:39, 7 July 2018 (UTC)
Ideally what we need is a mechanism to enable per-category collation: that is, Czech collation for Czech categories, German collation for German categories, and so on; multiple sortkeys per page, for Japanese; and a way to write our own collation algorithms for languages that do not have collation algorithms available, such as Ancient Greek, Egyptian, and Coptic. (See Module:egy-utilities for a makeshift collation algorithm used for Egyptian words in Module:columns, Module:cop-sortkey for one that is used in Coptic categories. Module:zh-sortkey provides a sortkey for Chinese categories, but MediaWiki might have an equivalent collation algorithm that would be available if we had per-category collation.) At least per-category collation has been proposed (see phab:T30397), but I don't know what's happened with it since 2012.
Besides changing the default collation, a workaround is to add some sort_key replacements to the language table. It would be ugly, but I think you could impose the order c < č < d by replacing c with cc and č with . — Eru·tuon 00:06, 8 July 2018 (UTC)
Surely the "natural kludge" would be to use cˇ instead of č, o´ instead of ó, etc. (though of course this doesn't really work for cases like ł). --Tropylium (talk) 10:10, 14 July 2018 (UTC)
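The sorting behaviour discussed in this thread can be reproduced outside MediaWiki with a minimal Python sketch. It compares plain code-point order, a sort-key replacement table, and the decomposition kludge. The replacement strings in `REPLACEMENTS` are an assumption made for illustration, and `cvrček` and `dub` are sample words added here; none of this is the actual language-table syntax.

```python
import unicodedata

# Sample words; "\u010d" is č (LATIN SMALL LETTER C WITH CARON).
words = ["salamandr", "\u010dolek", "cvr\u010dek", "dub"]

# Plain code-point order: č (U+010D) lands after every ASCII letter, including z.
codepoint_order = sorted(words)

# Hypothetical replacement table imposing c < č < d (the strings are an assumption).
REPLACEMENTS = {"c": "cc", "\u010d": "cd"}


def replaced_key(word: str) -> str:
    """Build a sort key by substituting each character that has a replacement."""
    return "".join(REPLACEMENTS.get(ch, ch) for ch in word)


def decomposed_key(word: str) -> str:
    """Sort key using canonical decomposition: č becomes c + combining caron."""
    return unicodedata.normalize("NFD", word)


print(codepoint_order)
print(sorted(words, key=replaced_key))
print(sorted(words, key=decomposed_key))

# ł (U+0142) has no canonical decomposition, so the decomposition kludge leaves it alone.
print(unicodedata.normalize("NFD", "\u0142") == "\u0142")
```

Both keys place čolek between the c-words and dub for this sample. Neither is a real collation algorithm, and neither handles languages that order the same letters differently.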

Most searched-for entries

Do we have a list of the most searched-for – or even viewed – pages here? I couldn't see anything under the Special Pages list. It would be nice to make a concerted effort to work on things that most people are looking up. Ƿidsiþ 11:43, 4 July 2018 (UTC)

[21]. DTLHS (talk) 16:45, 4 July 2018 (UTC)
Thanks for both the question and the answer. DCDuring (talk) 17:58, 4 July 2018 (UTC)
Nice! Thanks. Ƿidsiþ 06:42, 5 July 2018 (UTC)
Well, it was a good thought, but I think that I'd rather not waste your efforts on improving our pornographic content. —Μετάknowledgediscuss/deeds 06:54, 5 July 2018 (UTC)
:-D — SGconlaw (talk) 06:48, 6 July 2018 (UTC)
I think this is part of the same phenomenon as the rash of bogus "xx" content being added: as far as I can figure out, Wiktionary is being bundled with mobile operating systems in Africa and Asia, and there are lots of users who don't speak English well enough to realize that it's a dictionary and not part of the user interface. They apparently think they're searching the web for porn sites, but they're actually searching Wiktionary. Chuck Entz (talk) 07:31, 6 July 2018 (UTC)
What's odd is that people are searching for Roman numerals like XXXIX and XXIX. Must be typos, I guess! — SGconlaw (talk) 08:05, 6 July 2018 (UTC)
Not odd at all, considering the search engine's auto-suggestion feature. Chuck Entz (talk) 18:28, 6 July 2018 (UTC)
These are views, not searches. So if someone searches for "XXXIX definition" in google (because it's the number of the current Superbowl or something) they may end up here. DTLHS (talk) 18:38, 6 July 2018 (UTC)
In context, this is clearly not about SuperBowls... —Μετάknowledgediscuss/deeds 18:39, 6 July 2018 (UTC)
The February 2018 Super Bowl was LII. DCDuring (talk) 18:54, 6 July 2018 (UTC)
Look at the logs for Abuse Filters 54, 70, and 74 (to start with). Obviously people (probably horny teenagers) from the areas I mentioned are entering a lot of "x"es in the search engine, and when the auto-suggest doesn't land them in actual entries, they're going to the "not found" page, which gives them plenty of buttons for creating entries. Chuck Entz (talk) 19:50, 6 July 2018 (UTC)

Livonian alphabet

After some discussion it seems Livonian (e.g. the word Lețmō) should use ţ with cedilla (like Latvian), and not Romanian ț (T with comma below). Latvian-Livonian-English Phrase Book (gramata.pdf[22]) uses cedilla, but Tartu University Estonia-Lețkēļ dictionary[23] uses Romanian Ț in its entries for some reason. --Mikko Paananen (talk) 13:20, 4 July 2018 (UTC)

I think the usage of the Romanian ț is due to technical reasons. I think that Latvian ţ should be used here though, since there are no restrictions like that here. SURJECTION ·talk·contr·log· 23:00, 5 July 2018 (UTC)
I can't imagine what technical reasons would require the use of ț instead of ţ. Unicode-wise, ț was added in a later edition (3.0), whereas ţ was there from the start. Though I'm confused about Latvian; w:Latvian alphabet doesn't show any modified t's. w:ţ says it's only used in a Turkic language.--Prosfilaes (talk) 03:30, 7 July 2018 (UTC)
This is probably an artifact related to how k g n l r with cedilla as also used in Livonian (and Latvian) are rendered with a comma-like diacritic in most fonts; it seems clear that a single palatalization diacritic is what's aimed for here. --Tropylium (talk) 10:16, 14 July 2018 (UTC)
See w:T-comma#Software_support. It was only added in a later Unicode version, and as a result, many fonts did not support it initially, replacing all instances with T-cedilla instead. That is still done for many Romanian texts (at least according to the article). SURJECTION ·talk·contr·log· 18:52, 14 July 2018 (UTC)
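The two letters under discussion look almost identical in many fonts but are distinct code points. The sketch below uses Python's standard `unicodedata` module to show the difference. The helper `to_cedilla` is hypothetical and shows only one possible normalisation, in the direction proposed at the top of this thread.

```python
import unicodedata

T_CEDILLA = "\u0163"  # ţ, t with cedilla
T_COMMA = "\u021b"    # ț, t with comma below (the Romanian letter)

for ch in (T_CEDILLA, T_COMMA):
    # Print the code point, the official character name, and the decomposed form.
    decomposed = unicodedata.normalize("NFD", ch)
    parts = " ".join(f"U+{ord(c):04X}" for c in decomposed)
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->", parts)


def to_cedilla(text: str) -> str:
    """Hypothetical helper: replace t-comma (both cases) with t-cedilla."""
    return text.replace("\u021b", "\u0163").replace("\u021a", "\u0162")


print(to_cedilla("Le\u021bm\u014d") == "Le\u0163m\u014d")
```

Because the two letters decompose to different combining marks (comma below versus cedilla), Unicode normalisation alone never merges them; any unification has to be an explicit replacement like the one sketched here.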

Discord server

Hi. I just want to announce again that the English Wiktionary has a Discord server. If you are a Discord user and a Wiktionary editor, we would very much appreciate if you join in via this permanent invite. We would love to have you there. Cheers, and happy editing! PseudoSkull (talk) 05:44, 6 July 2018 (UTC)

Multi-stage borrowing

When a word is borrowed from language A into B, and then from B into C, would you say that C has borrowed the word from both A and B, or just from B? So for example, at kakao (Nahuatl > Spanish > Danish), one could say

From {{bor|da|es|cacao}}, from {{der|da|nah|cacahuatl||cocoa}}.

as I've put, or

From {{bor|da|es|cacao}}, from {{bor|da|nah|cacahuatl||cocoa}}.

depending on whether one thinks Danish can be said to have borrowed from Nahuatl.__Gamren (talk) 14:48, 6 July 2018 (UTC)

The first one. Whether Spanish borrowed from Nahuatl or not doesn't really matter. —AryamanA (मुझसे बात करेंयोगदान) 15:21, 6 July 2018 (UTC)
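The convention chosen above (the immediate source gets {{bor}}, every earlier stage gets {{der}}) can be written down as a small sketch. The function `etymology_chain` is hypothetical and only assembles wikitext strings; the template names and parameter order are taken from the two examples in the question.

```python
def etymology_chain(target: str, steps: list) -> str:
    """Build an etymology line for language `target`.

    `steps` is ordered from the immediate source to the most distant one;
    each step is (language_code, term, gloss_or_None).
    """
    parts = []
    for index, (lang, term, gloss) in enumerate(steps):
        # Only the first step is a direct borrowing; later ones are derivations.
        template = "bor" if index == 0 else "der"
        fields = [template, target, lang, term]
        if gloss:
            # An empty parameter precedes the gloss, as in the example above.
            fields += ["", gloss]
        parts.append("{{" + "|".join(fields) + "}}")
    return "From " + ", from ".join(parts) + "."


print(etymology_chain("da", [("es", "cacao", None), ("nah", "cacahuatl", "cocoa")]))
```

For the kakao example this yields the first of the two variants quoted in the question.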

Entries descending from themselves

On reconstruction pages the different Persian varieties are now normally grouped as descendants of Classical Persian (see, for example, Wiktionary talk:About Persian#Tajiki_Persian_is_not_descended_from_Iranian_Persian.). An example of such a layout is at *wŕ̥kah.

However, Classical Persian has (understandably) not been given separate headings and entries; instead Classical and Modern Persian words are united under the ‘Persian’ heading. See, for example, گرگ. The only difference between links to fa and fa-cls is the language name before the lemma.

Now, a page آهو listed itself as its own descendant. On the talk page @Victar argued that it is correct; it represents the same inheritance of the Modern Persian word from the identical Classical Persian word. However, I don’t think that the layout of reconstruction pages should allow entries to list themselves as their descendants, for the following reasons:

  1. It is confusing; the heading says Persian and one of the descendants is Persian as well, being no different. Moreover, Dari and Tajik words are first listed as regional variants, and then again as descendants; I understand that it’s supposed to represent different instances of the word, one as a modern Persian, one as a Classical, but it is still confusing.
  2. Nor am I aware of any other language that does so. As for Persian, @Victar says it is normal, but I haven’t been able to find any other entries that name themselves their descendants, not even among ones that are descended from Classical Persian on the reconstruction pages, like گرگ‎.
  3. @Victar argued that such descendants are there to be included into Reconstruction pages with {{desctree}}. However, in reality they break {{desctree}}, as @Chuck Entz pointed out on the talk page. Even if this can be fixed, I fear that such an unusual layout may cause other technical issues.

In my opinion, either Modern and Classical Persian should be separated, with one inheriting from another, or Persian entries shouldn’t list the same Persian entries as their descendants. Continuity with Classical Persian can be implied whenever a word is inherited; a modern word descending from Middle Persian or further must have gone through the Classical Persian stage. Likewise, English entries don’t list themselves as descendants meaning that they come from Early Modern English; it is sufficient to provide a further etymology, or label words that have come out of use since EMnE obsolete. In either case the current layout of (Indo-)Iranian reconstruction entries can be kept – they would simply be the only place to distinguish consistently Modern and Classical Persian in the latter case, as they have reasons for that.

Alternatively, if a آهو-like layout be the accepted one, then I think a bot can be made to automatically list every inherited Persian entry as its own descendant, together with useful notes like @Chuck Entz has added.

Guldrelokk (talk) 21:48, 6 July 2018 (UTC)

Continuing off the original discussion linked above, here is an example, building off what was discussed there. On the Old Persian entry 𐎠𐎰𐎥 (a-θ-g) we have the descendant tree constructed like so:
* Middle Persian: (/sang/)
*: Book Pahlavi: 𐮽𐮵𐯋𐮲 (sng), [script needed] (KYPA)
** Bakhtiari: سنگ (sang)
** {{desctree|fa-cls|سنگ|tr=sang}}
Now on the Classical/Modern Persian entry سنگ, we have the descendants like this:
* Iranian Persian: سنگ (sang)
* Tajik: санг (sang)
* Coptic: ⲃⲁⲥⲛϭ (basnc)
* → Hindustani:
** Hindi: संग (sang)
** Urdu: سنگ (sang)
* Ottoman Turkish: سنگ (seng)
  1. I've added a fa-ira etymology code, per @Calak's example in the previous discussion. If that isn't clear enough, I'm not vehemently opposed to having some text in Persian descendant sections that reads (Descendants listed reflect that of Classical Persian). Most borrowings from Persian are from the Classical period, which means that virtually all Persian descendants sections would have that note above, so I do find it a tad excessive.
  2. The example of this, which is the basis of the Persian model we currently use, can be seen on Latin entries where we find descendants in the form of Medieval Latin, Late Latin, etc.
  3. There is nothing mechanically "broken" about this method, if that's what you mean, as you can see it working just fine on both pages.
--Victar (talk) 00:28, 7 July 2018 (UTC)
I see 𐎠𐎰𐎥 (a-θ-g) works fine, but *HaHĉúkah doesn’t. Guldrelokk (talk) 00:36, 7 July 2018 (UTC)
@Guldrelokk:, it's broken because an {{rfc}} tag got thrown in there. --Victar (talk) 00:39, 7 July 2018 (UTC)
Good. Even if the technical issues are resolved, others are not. Latin entries do not list ML and LL forms as their descendants when they are identical, so I don’t see how it’s comparable. Guldrelokk (talk) 00:40, 7 July 2018 (UTC)
And yet we do do that, especially on reconstructed entries, like *blavus. --Victar (talk) 00:53, 7 July 2018 (UTC)
What about non-reconstructed entries? These are very different cases: *blāvus and blāvus are different entries, Classical Persian: سنگ and Iranian Persian: سنگ are one and the same. Guldrelokk (talk) 01:00, 7 July 2018 (UTC)
Another point: your new solution makes {{fa-regional}} redundant. The ‘descendants’ will always be the same; either of these will have to go. Guldrelokk (talk) 01:00, 7 July 2018 (UTC)
Both Latin headers, both using the same language code la, both identical. There are non-reconstructed examples, as I know I've made some, but hard to sift through thousands of entries; I'll look though. It makes a difference, for example, in Latin descendants from a -w- form, and those from a -v- form.
Not really. It would generally only be used on pages with Classical Persian borrowings. --Victar (talk) 01:12, 7 July 2018 (UTC)
If the younger descendants are only for pages with Classical Persian borrowings and for others {{fa-regional}} will suffice, then I don’t see what additional info do they provide by duplicating {{fa-regional}} and introducing confusion by descending from themselves. If it’s important to show that the borrowings are from Classical Persian (if it indeed is important in general in Persian entries), then a note can be added that ‘the following words were borrowed at the Classical Persian stage’. Similar entries having distinct layouts are bad for consistency: it can confuse readers as well as less experienced editors, who may list New New Persian descendants in entries without borrowings and omit them in entries with borrowings. Guldrelokk (talk) 01:21, 7 July 2018 (UTC)
And no, *blāvus and blāvus are not identical: mind the link colours. Guldrelokk (talk) 01:26, 7 July 2018 (UTC)
You've already given your opinion, and I mine. Let others chime in so we're not just discussing this in a circle. --Victar (talk) 01:29, 7 July 2018 (UTC)
I don't think that showing descendants that are all the same as the headword and have no descendants themselves is at all useful- it doesn't explain anything, and the same information is included in the regional template- it feels like a tautology. In cases where dialectal variation at the Classical Persian stage is reflected in differences among the regional forms, or where there's borrowing from one of the descendants into another language, that's something you would want to show. Chuck Entz (talk) 02:44, 7 July 2018 (UTC)
I agree, @Chuck Entz. I'm certainly not advocating adding Iranian Persian, Tajik, and Dari to the descendants section of all Persian entries, as that would be needlessly redundant. This only becomes an issue when we have borrowings from Classical Persian or from Dari and Tajik, i.e. whenever a Persian entry merits a descendants section. --Victar (talk) 03:57, 7 July 2018 (UTC)
My main concern was with the state of the entry when I saw it and the complete lack of explanation in it of what was descending from what. Guldrelokk's initial reaction is basically what I would expect from any of our readers who don't know the finer points of Persian's historical stages and dialectology. Changing the language name in the descendants to "Iranian Persian" was helpful, and better than the qualifier method I used, but I'm still concerned that it's a bit opaque to the average reader. My edits were just a quick mock-up to show what I was talking about- the usage note, especially, is probably overkill.
As for my comment that "{{desctree}} can't use such things": it was based on a quick (mis)reading of the code and the comment about substituting to prevent template loop errors. When I looked at it again it was obvious that it was just substituting a module invocation for a template invocation that did the same thing. I never said anything about it being broken, just that (I thought) the code had a safety mechanism that prevented it from working. I still don't see how it avoids infinite recursion, but I'm not all that good with Lua. Chuck Entz (talk) 02:31, 7 July 2018 (UTC)
@Chuck Entz: The constraint that the module is avoiding is in the parser: a template isn't permitted to contain another instance of itself (for instance, if you put {{sandbox}} on its own page), and a module that's invoked on a page can't expand another instance of that page. But apparently you can have an invocation of a module function print the result of preprocessing another invocation of the same function. (I tested this with Module:doublet table. It just made the tables in Appendix:English doublets disappear; the preprocessing generated the empty string once for each invocation of the function. No recursion. Maybe I did something wrong, or the developers made the loop terminate somehow.) — Eru·tuon 05:49, 7 July 2018 (UTC)
It seems to me like there is a technical gap. {{alter}} could perhaps accept additional parameters which would mark alternative forms as being parents or children or siblings. Forms created by {{fa-regional}} are probably too loosely connected and should go into the {{fa-noun}} template so they can be picked up appropriately. Perhaps this would be realized in a way generalized for pluricentric languages by saving siblings into language data (being effective aside from Persian for Hindustani, Serbo-Croatian, perhaps Aramaic …).
Without wise technical solutions discord will stay real. What is wanted at the end is a “semantic web” where the logic as being imagined by the dictionary editor can also be picked up to be displayed in a different fashion (as in descendant trees) by machines. (A drawback would be that correct wikitext would become increasingly byzantine for new editors).
In any case of course nobody wants to read duplicated entries. Fay Freak (talk) 02:34, 7 July 2018 (UTC)
I'll sit on the fence for now - interested in the outcome, though. Notifying (Notifying ZxxZxxZ, Dijan, Irman, Kaixinguo~enwiktionary): , @Vahagn Petrosyan. --06:49, 7 July 2018 (UTC) —This unsigned comment was added by Atitarev (talkcontribs).
Personally I think listing Modern Persian descendants on Modern Persian entries is somewhat redundant. However, if a term is only used in Classical Persian and has a different Modern Persian descendant, then the Classical Persian entry should have the Modern Persian listed as a descendant. But yeah, IMO having "entries descending from themselves" is unnecessary. —AryamanA (मुझसे बात करेंयोगदान) 16:08, 7 July 2018 (UTC)
@AryamanA: The true redundancy is having all the borrowings from Classical Persian manually duplicated on the Persian and OP/PIr entries. Imagine if we had to do that for Sanskrit. Also, when we have borrowings from CP, Dari, Tajik, Tati, etc., we run into the same problem as in the original discussion, in having it look as though Tajik and the CP borrowings descend from Modern Persian. I think it is better to just have the descendants section on Persian entries represent CP, so we can be consistent and clear. And again, this only applies when we have borrowings from chiefly CP and Tajik, otherwise there would be no descendants section. --Victar (talk) 18:30, 7 July 2018 (UTC)
Repeating my stuffed-up call to Persian speakers and Vahagn, people who might be interested: (Notifying ZxxZxxZ, Dijan, Irman, Kaixinguo~enwiktionary): , @Vahagn Petrosyan. --Anatoli T. (обсудить/вклад) 02:08, 10 July 2018 (UTC)
As noted by others, Latin is not a good model in this matter for other languages; this new system causes redundancy for Persian as borrowings from language variants other than Classical Persian are rare (the same goes for Hebrew and Arabic). In those rare cases, we can simply use something like {{qual|via Iranian Persian}} in the descendants section.
Entries having the header "Persian" correspond to Classical Persian, Iranian Persian, Dari, and other regional forms of Persian that use the Perso-Arabic script. {{fa-regional}} should probably be deprecated in favor of {{alter}}. This model has already been used for other languages as well, e.g. Ancient Greek. The descendants section would cover the descendants from all of these variants of Persian; in practice, it reflects mostly descendants of Classical Persian. Rare cases from other forms can be indicated using {{qual}}, as I noted earlier, or by other similar means. --Z 08:50, 10 July 2018 (UTC)
@ZxxZxxZ We are currently using the Latin model, which is treating Classical Persian and Modern Persian as the same language. Did you mean something else?
You haven't addressed the problem of CP borrowings appearing as from Modern Persian and the original discussion of Tajik appearing as a descendant of Modern Persian when it has borrowings and we add it to the descendants section, which is surprisingly quite common. You also haven't addressed the content duplication of borrowings on PIr/OP entries and Persian ones without the use of {{desctree}}. Do you have any thoughts on those points? --Victar (talk) 16:39, 10 July 2018 (UTC)
Let me clarify my previous comment: "Persian" in wiktionary should refer to all forms of Persian (including Classical, Iranian, and Dari Persian), except Tajik. This has been our practice for a long time. This includes the descendants section, so seeing "Persian" in this section does NOT refer to for example modern Iranian Persian unless otherwise stated (a rare situation). So in my view the problem you mentioned does not actually exist.
Latin is good as a model in that particular area you mentioned right above, I meant it is a bad model in the descendant section, because, as I understood, it's not uncommon to have many descendants from each form of Latin for a single lemma. This is not the case for most other languages, and causes redundancy.
I'm not exactly familiar with the functionality of {{desctree}} and how and why it is causing a problem here, so I can't comment on this. --Z 17:27, 10 July 2018 (UTC)
@ZxxZxxZ, OK, so let's use a real world example of Persian خرما (xormâ).
  1. As you can see, all the root borrowings are actually from CP, but to the untrained reader, they appear to be borrowed from Modern Persian.
  2. Tajik is also listed because of its borrowing into Uzbek, which again, makes it appear as descended from Mod. Persian. Now you could argue that the Uzbek borrowing should just be on the Tajik page, but that would a) hide the borrowing away from readers, and b) be contrary to the premise that Persian reflects both Mod. Persian and CP.
The fact is, readers are always going to assume Persian means Mod. Persian because we give them little to no indication otherwise. Personally, I think the only solutions are to
a) treat all Persian descendants lists as CP,
b) treat all Persian descendants lists as Mod. Persian,
c) have a note at the top of all Persian descendants lists specifying that it reflects one or the other, or
d) split Mod. Persian entries from CP, aka the Armenian method.
For the functionality of {{desctree}}, see 𐎠𐎰𐎥 (a-θ-g) and سنگ. Hopefully, that is enough for you to understand its use and the problem at hand. --Victar (talk) 17:52, 10 July 2018 (UTC)
On the other hand, that untrained reader may also be confused by seeing the names "Dari" and "Iranian Persian". Indeed, many times we follow this practice of mentioning the language variants simply as a poor replacement for more accurate information regarding the time of the borrowings. Instead, I suggest adding a new feature to {{desc}} to add this information (year or century, e.g. "before 14th century"). We ultimately should be adding such information in Wiktionary. Doing that would automatically eliminate such problems with Persian and other languages. --Z 18:27, 10 July 2018 (UTC)
@ZxxZxxZ, I'm not understanding. Could you give us an example of your suggestion in the context of the Persian خرما (xormâ) and {{desctree}} problems mentioned? --Victar (talk) 19:40, 10 July 2018 (UTC)
See my last edit there. This way it becomes clear to all readers that it's not a modern borrowing. --Z 11:52, 11 July 2018 (UTC)
Thanks, @ZxxZxxZ. So basically you're recommending that we should add the date prefix [11-15th century] to every CP borrowing in descendants lists. Although I do think adding the date of the earliest attestation of a borrowing is a good idea (I do that for Frankish borrowings), I don't think it's a very elegant solution, nor does it address the Tajik or {{desctree}} issues. --Victar (talk) 14:53, 11 July 2018 (UTC)
Why is it even important to indicate that the borrowing is from Classical and not Modern Persian? If we don't make this distinction explicit in our Persian entries why make it explicit in descendant trees? Crom daba (talk)
For the same reasons made here. --Victar (talk) 17:54, 11 July 2018 (UTC)
Political correctness? That's a drag. Dating prefixes don't sound so bad, if we have the necessary data I'd say go for it. Crom daba (talk) 18:19, 11 July 2018 (UTC)
Accuracy, clarity to readers, etc. As I pointed out, date prefixes don't address the various other issues listed above. --Victar (talk) 18:54, 11 July 2018 (UTC)

More entries than English Wikipedia

As of now, and for the first time ever, we have more mainspace entries than Wikipedia. That makes us better than them. Now who can delete their main page to show them who's really in charge around here? DTLHS (talk) 02:51, 7 July 2018 (UTC)

THIS. --Victar (talk) 04:11, 7 July 2018 (UTC)
They say the main page is undeletable. They say it can't be done. But *handing out briefing materials, playing suspenseful music* we're assembling a team to do it.
  • SemperBlotto: the lookout. Ever-watchfully patrolling RecentChanges here, he has the skills to keep a lookout for any admins over on 'pedia who might get in our way.
  • Wonderfool: the demolition specialist. He knows how to delete pages that shouldn't be deleted. Pages that "can't" be deleted. He can evade any blocks and get us inside, especially with the help of...
  • BD2412: the inside man. He's been an admin on WP since 2005. Studying them. Gaining their trust (and sometimes their ire, like any rouge admin). He can edit and unprotect protected pages.
  • Equinox: the getaway driver. I don't know how we're gonna incorporate a getaway car into this, but most of the movies I've seen about this kind of thing have one, so we're bringing one. :)
  • Other spaces are still available: volunteer below.
The devs have made it impossible to use the "delete" function on Wikipedia's Main Page, but our haxx0rs have found a backdoor: replace the text of the page with the text of MediaWiki:Noarticletext.
</joke> don't ban me WMF...
- -sche (discuss) 05:49, 7 July 2018 (UTC)
Wow! It's slightly unfair though because we have one or two non-English entries and they have none. Equinox 12:47, 7 July 2018 (UTC)
No it is very unfair because we have a lot of bot-created entries that only contain non lemma forms. Dixtosa (talk) 13:20, 7 July 2018 (UTC)
Hmm, I wonder if there should be an entry for rouge admin? Imaginatorium (talk) 14:40, 7 July 2018 (UTC)
If it's attested outside WP. Otherwise, w:Wikipedia:Rouge admin covers it. - -sche (discuss) 17:28, 7 July 2018 (UTC)
Conversely, Wikipedia has lots of entries like w:List of Indian states and union territories by literacy rate, w:List of Indian states and union territories by GDP, w:List of Indian states and union territories by access to safe drinking water, w:List of Indian states and territories by highest point and w:List of Indian states and territories by Human Development Index, in addition to its entries on the actual states themselves. - -sche (discuss) 17:28, 7 July 2018 (UTC)
Wikt entries that have made me laugh today: National Teacher Appreciation Week. Equinox 17:29, 7 July 2018 (UTC)
  • -sche...you forgot to add <joke>. BTW, apparently WF has already deleted the WP Main Page, if this article is to be believed. --Harmonicaplayer (talk) 15:14, 9 July 2018 (UTC)
  • Also, I hope we have told the whole world about this feat - on our Twitter page, Facebook page, Instagram feed, in the Online Club of Dictionaries, on Wikicommons, and the Wikipedia Signpost itself. I'll do my bit and try to have it published in El Pais. --Harmonicaplayer (talk) 15:18, 9 July 2018 (UTC)
  • Yes, I could literally delete the Wikipedia main page. It is highly unlikely that I would do that, as I have zero familiarity with Equinox's getaway driving skills. By the way, Dixtosa, Wikipedia has millions of bot-created entries that have nothing but, i.e., census data for obscure localities. bd2412 T 19:42, 8 July 2018 (UTC)

Are comparative or superlative forms lemmas?

There seems to be some inconsistency when it comes to entries: in Finnish alone, there are ones marked as lemmas: katalin, and ones that are not: kallein. (The exceptions would naturally be if the forms are themselves used in some idiomatic way) SURJECTION ·talk·contr·log· 10:20, 7 July 2018 (UTC)

I don't think they should be. Big is the lemma; bigger/est are inflections. Equinox 12:46, 7 July 2018 (UTC)
It is probably better to unify either way for most languages. One could argue that because they can be inflected (in Finnish at least), they could be classified as lemmata, although I also think that they shouldn't be classified as such. Anyone got a bot lying around? SURJECTION ·talk·contr·log· 12:47, 7 July 2018 (UTC)
It seems Finnish and Spanish are primarily affected; I tried to check the comparative and superlative categories of other languages, and they do not seem to be set as lemmas. SURJECTION ·talk·contr·log· 13:59, 7 July 2018 (UTC)
Update: Also affects the adverbs. Russian adverb comparatives are also set to be lemmas, when they should not be. SURJECTION ·talk·contr·log· 14:03, 7 July 2018 (UTC)
I have started working on the Finnish entries - Russian and Spanish seem more numerous, so it is probably better to automate the conversion there. The head templates are what needs to be changed. SURJECTION ·talk·contr·log· 16:06, 7 July 2018 (UTC)
In Ancient Greek, many comparative and superlative adjectives are treated as lemmas. I think this makes more sense than it does in English, because they have inflected forms of their own, and a few adjectives have more than one comparative associated with them, sometimes with a different range of meaning. For an extreme example, see the bottom of the declension table for ἀγαθός (agathós), which currently lists six comparatives and five superlatives. I agree that consistency is a good idea, but would request that you get agreement from the editors who've worked hardest on a language before making any changes. For the record, I prefer treating Ancient Greek comparative and superlative adjectives as lemmas. — Eru·tuon 18:26, 7 July 2018 (UTC)
I did actually point this out a bit earlier: "One could argue that because they can be inflected -- , they could be classified as lemmata". The reason why I do not believe so, though, is that the way the comparative and superlative forms are derived is usually predictable and very much resembles how inflection works, making them inflected forms instead. SURJECTION ·talk·contr·log· 18:32, 7 July 2018 (UTC)
@Surjection: I guess comparatives and superlatives are usually predictable (in English and Ancient Greek at least), but I'm not sure if that is a feature that is used to distinguish inflected forms from derived forms. — Eru·tuon 20:02, 7 July 2018 (UTC)
The difference is made based on the words you can logically do it to. Most adjectives have comparatives and superlatives, with uncomparable adjectives being the exception. Being comparable is the status quo. That is the opposite for derived forms, where not being able to derive from a specific word is the status quo. The existing categories too say that comparatives are "adjectives that are inflected to display relative degrees of given qualities between nouns". SURJECTION ·talk·contr·log· 20:06, 7 July 2018 (UTC)
Okay, that makes more sense. I am not sure how to verify "most adjectives are comparable" though. I'm guessing that, for English at least, that would have to include phrasal comparatives like more fun, most fun (as opposed to the silly-sounding funner, funnest). — Eru·tuon 20:21, 7 July 2018 (UTC)
Based on the English entry for fun listing those as the comparative and superlative, I would assume they are included. SURJECTION ·talk·contr·log· 20:24, 7 July 2018 (UTC)
Well, despite that, I think funner and funnest usually sound silly, as if they are almost ungrammatical. I have no idea why, because short adjectives usually can have totally normal-sounding comparatives. But longer adjectives such as intelligent usually don't have synthetic comparatives. (Intelligenter, intelligentest sound even sillier than funner and funnest. That is, they are felt as more ungrammatical.) So, while I do think synthetic comparatives and superlatives in English can be considered inflectional forms, or at least that it is traditional to do so, and would be most practical to categorize them as such on Wiktionary, I'm not sure about the generalization that adjectives are comparable by default. — Eru·tuon 20:36, 7 July 2018 (UTC)
The fact that Category:English comparable adjectives is not a thing but Category:English uncomparable adjectives is should be sufficient evidence to say that comparable adjectives are the default. (This also applies to other languages) SURJECTION ·talk·contr·log· 20:41, 7 July 2018 (UTC)
There are various factors involved: I'd say, roughly, -er, -est are likelier to "sound right" on words that are older, Germanic, and have fewer syllables. Comparability is certainly not the default for long, modern, Latinate scientific words as found in biology/chemistry. Equinox 20:45, 7 July 2018 (UTC)
Scientific words tend to be uncomparable due to their rigorous definition, as well as the fact that many describe a "set" and you cannot really compare the degree something is included in a black-and-white set like that. SURJECTION ·talk·contr·log· 20:48, 7 July 2018 (UTC)
I don't agree: if something can be "rounder", why not "*subovater"? If "smaller", why not "*microscopicer"? Equinox 20:50, 7 July 2018 (UTC)
"more subovate", "more microscopic". I did not say all scientific words are uncomparable, but that they tend to be. SURJECTION ·talk·contr·log· 20:51, 7 July 2018 (UTC)
Well, category structure is based on a variety of concerns besides the linguistic concern of which state is the default. If the number of entries is any guide, uncomparable is the default in English because there are somewhat more adjectives in the uncomparable category (63,451) than outside it (116,594 - 63,451 = 53,143). — Eru·tuon 21:15, 7 July 2018 (UTC)
The large number of uncomparable adjectives is due to two distinct reasons: 1. the large number of uncomparable scientific terms and 2. nationality and language terms (which are naturally not comparable). For other languages, there are more comparable than uncomparable adjectives. Beyond that, most basic adjectives in everyday use are comparable. SURJECTION ·talk·contr·log· 21:22, 7 July 2018 (UTC)
Okay, I guess that makes sense. (Though some nationality adjectives are given as comparable, like English and Russian: after all, one can display more of the typical characteristics of a nationality.) — Eru·tuon 22:24, 7 July 2018 (UTC)
It actually would seem Ancient Greek is different - no "adjective comparative form" categories, but "comparative adjective" categories. Whether that is done should be decided on a language-by-language basis, and if we are going to do that, this is probably the time to decide for some languages. SURJECTION ·talk·contr·log· 18:38, 7 July 2018 (UTC)
No, there actually are adjective comparative forms and adjective superlative forms categories for Ancient Greek: see Ancient Greek adjective comparative forms and Ancient Greek adjective superlative forms. Remember to look under adjectives for comparative adjectives and superlative adjectives and under adjective forms for adjective comparative forms and adjective superlative forms. — Eru·tuon 19:37, 7 July 2018 (UTC)
I did actually find the former category later, and it only has a single entry, which is not an adjective comparative form but a comparative adjective form; it's an inflected form of a comparative adjective. As to the latter category, it seems inconsistent; I cannot find a rule to differentiate between the entries at Category:Ancient Greek adjective superlative forms and ones under Category:Ancient Greek superlative adjectives. SURJECTION ·talk·contr·log· 19:43, 7 July 2018 (UTC)
@Surjection: Oh, you're right. μεῖζον (meîzon) is the only entry in Ancient Greek adjective comparative forms, and it is the neuter form of μείζων (meízōn), the comparative of μέγας (mégas). That reminds me of another concern: if comparatives and superlative adjectives are categorized as adjective comparative forms and adjective superlative forms, what will we name the category for their inflected forms? (And are there practical difficulties with having a three-link chain: inflected forms of comparative or superlative forms of adjectives? Not sure.) — Eru·tuon 19:53, 7 July 2018 (UTC)
All of that will depend on whether we will consider comparatives or superlatives lemmas or not. If we do, comparative adjectives > comparative adjective forms, while if we don't, leaving only comparative adjective forms, we will probably have to rename to something else, like adjective comparatives. SURJECTION ·talk·contr·log· 19:56, 7 July 2018 (UTC)
And then there's words like northernmost, which can be turned around to read "most northern". DonnanZ (talk) 21:09, 7 July 2018 (UTC)
I meant, if comparatives and superlatives are not categorized as lemmas, then inflected forms of comparatives are a non-lemma form of a non-lemma form of a lemma. That is confusing. I think it is less confusing to treat Ancient Greek comparatives and superlatives (not English ones though) as lemmas. A similar case is participles, which are inflected forms of verbs, but in some languages have their own inflected forms. But actually participles are categorized as non-lemma forms that have their own non-lemma forms; there is no lemma–nonlemma split for participles. — Eru·tuon 21:15, 7 July 2018 (UTC)
I do not really find it that confusing - polysynthetic languages could go even further than that. Drawing the line between lemmas and non-lemmas based on whether they can be inflected comes across as somewhat disingenuous, as English adjectives are an exception here - in many languages, comparative and superlative forms at least have plural forms. I would be completely okay with having comparatives and superlatives be adjective forms, while those categories would have their own subcategories for inflected forms of those. SURJECTION ·talk·contr·log· 21:30, 7 July 2018 (UTC)
Yeah, well, I can't comment on how to treat polysynthetic languages, because I haven't really studied any. — Eru·tuon 21:47, 7 July 2018 (UTC)
I've studied a couple, but not in the depth to help much here (and not recently- I'm fuzzy on the details)- @Stephen G. Brown could give you chapter and verse. Basically you may have one undisputed lemma, and then you have concentric layers of affixes that, depending how you look at it, could be derivation, inflection, or even parts of complete sentences- in lots and lots of different combinations. I remember Dr. Bright pronouncing for our class many years ago a string of 13 consonants, which he said was a single Bella Coola "word" for "I saw those two women come this way out of the water". Suffice it to say, you don't want to even try a binary distinction like this for polysynthetic languages- that way lies madness! Chuck Entz (talk) 23:28, 7 July 2018 (UTC)
As for participles, weelllllll... SURJECTION ·talk·contr·log· 21:33, 7 July 2018 (UTC)
Based on this, my proposal is: Category:LANGUAGE comparatives and Category:LANGUAGE superlatives, both of which are under Category:LANGUAGE adjective forms (and therefore not lemmata), with both having their respective Category:LANGUAGE comparative forms and Category:LANGUAGE superlative forms subcategories for inflected forms of such. SURJECTION ·talk·contr·log· 21:37, 7 July 2018 (UTC)
Hmm, but then where do you put comparative and superlative adverbs? — Eru·tuon 21:42, 7 July 2018 (UTC)
That is a good point, maybe the categories need the actual part-of-speech after the language to get Category:LANGUAGE adjective comparatives and Category:LANGUAGE adverb comparatives. I will admit that is a bit of a mouthful (especially the forms subcategories), but it is still better than the status quo or classifying comparatives or superlatives as lemmata. SURJECTION ·talk·contr·log· 21:44, 7 July 2018 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Well, comparative adverb and comparative adjective sound better. The reverse order seems quite awkward; I doubt it's very often used, if at all. — Eru·tuon 21:55, 7 July 2018 (UTC)

That is also good. Based on a quick Google search, it is actually used somewhat often, albeit comparative adjective is not as common as adjective comparative. SURJECTION ·talk·contr·log· 22:01, 7 July 2018 (UTC)
Wait no, that was all wrong. comparative adjective is more common and is the better option here. So LANGUAGE comparative adjectives and LANGUAGE comparative adjective forms. SURJECTION ·talk·contr·log· 22:02, 7 July 2018 (UTC)
Since this would be quite a major change, it is probably a good idea to create a vote. SURJECTION ·talk·contr·log· 22:10, 7 July 2018 (UTC)
Created: Wiktionary:Votes/2018-07/Restructure comparative and superlative categories. SURJECTION ·talk·contr·log· 22:28, 7 July 2018 (UTC)

Isotopes

Do we want systematic names of isotopes, like uranium-235 and oxygen-18? Including theoretical ones, a great deal of these can be attested, but I don't see them as being of lexical interest. (Some isotopes, like deuterium, have a special name that should obviously be kept.) —Μετάknowledgediscuss/deeds 04:00, 10 July 2018 (UTC)

  • I think that the hyphen makes them a "word in a language" so we should keep them. I went through a phase of creating lots of them some years ago but got bored. SemperBlotto (talk) 04:04, 10 July 2018 (UTC)
The contents of the ones that are formed systematically as element - number seem to be predictable from the entry name, so they seem lexically uninteresting; even pronunciation information is coverable by the entries for the element and the number. They seem as (mostly) useless as 58-degree (angle or day), 59-degree, etc (other hyphenated strings), so I am inclined towards deleting them. They also seem (mostly) harmless, so I don't feel too strongly about deleting them. (But it would be absurd, IMO, to keep these but delete attributive-hyphen forms.) - -sche (discuss) 05:20, 11 July 2018 (UTC)
Yes, they are predictable - so are most plurals. All words should be treated alike. Either keep them as well as attributive-hyphen forms (if they could pass RfV) or delete both. SemperBlotto (talk) 05:27, 11 July 2018 (UTC)
I'd favor not including them. They are predictable and uninteresting. It is also hard to imagine a human looking them up on Wiktionary. Almost any compound of a word and any of a range of numbers would seem unworthy of inclusion, though there could conceivably be exceptions. Obviously an expression like cloud 9 would be different, but 9 is, I think, the only number that can occupy that slot to create an expression with a distinctive meaning. DCDuring (talk) 14:20, 11 July 2018 (UTC)

Global preferences are available

19:19, 10 July 2018 (UTC)

Live vlogging fr.WT

Lyokoï has been occasionally live-editing fr.WT on video as an introduction and contributor recruiting project. The next event is scheduled for 12 July at 20:30 (not sure if that is UTC, url has a countdown) on YouTube. Commentary and editing in French, of course. - Amgine/ t·e 16:52, 11 July 2018 (UTC)

Do German participles get their own inflection tables?

I tried looking it up in the archives, but I couldn't find a clear answer. I'm talking about regular Attributive verbs, where the adjectival form has the same meaning as the verb.

  1. Do German participles get their own (non-comparative) inflection tables?
  2. If so, does the inflection table go in the existing Verb section or a new Adjective section?

Mofvanes (talk) 20:05, 11 July 2018 (UTC)

I tried to check what Finnish does, and it seems inconsistent... some participles have no declension tables, others do under the Verb section, others have their own adjective section and a declension table there, some have a "Participle" section... SURJECTION ·talk·contr·log· 15:50, 12 July 2018 (UTC)
Examples of all four: juotu, ajettu, keitetty, hakkeroitu. SURJECTION ·talk·contr·log· 15:54, 12 July 2018 (UTC)

Moving all Volapük entries to the appendix

A few months ago all Lojban entries were moved to the appendix. I think this should also happen to all Volapük entries. In the category Category:Volapük lemmas there are 2643 entries, but only one of them has any citations. There are currently 27 entries on the page Wiktionary:Requests for verification/Non-English and I doubt any of them will pass. Maybe it would be better if everything were moved to the appendix instead. Robin van der Vliet (talk) (contribs) 15:46, 12 July 2018 (UTC)

My small contribution to this is that you should do a great job of communicating this if you do so. I felt hurt that the Lojban words were moved when I was busy with other non-Wiktionary business. I was actively editing, came back after a few months, and couldn't find out what had happened to my hard work. I understand the desire to not spend energy and time on a language that you don't know and don't use, I just would like to register that rare languages mean the number of people actively working on them is small, and there needs to be a LOT of communication to make sure the ones who care can find out about changes. Jawitkien (talk) 17:51, 12 July 2018 (UTC)
@DtheZombie, Lingo Bingo Dingo, Lunaris filia, Malafaya, Nielsheur, Pereru, Raekmannen: I would like to invite you all to this discussion, as you all have indicated Volapük in your Babel box. Robin van der Vliet (talk) (contribs) 18:02, 12 July 2018 (UTC)
The fact that a lot of Volapük words have had to be sent to RFV is not a reflection of the corpus so much as the fact that they were all created by a single problematic editor who seems to have made many of them up on the spot. Volapük does indeed have a corpus on Google Books that shows that a lot of vocabulary is indeed attestable, as was pointed out to me by User:Mx. Granger. The constructed languages that really need moving to the appendix are, in my opinion, Interlingua, Interlingue (Occidental), Novial, and potentially Ido. —Μετάknowledgediscuss/deeds 19:39, 12 July 2018 (UTC)
I agree that there is a literature in Volapük which simply does not exist in (e.g.) Lojban where all of the "literature" is purely used in experimental contexts amongst the dozen or so speakers exclusively to extend the language and test its underlying philosophy. There are several thousand lemmas in Volapük that can be attested from literature and that is not true for most constructed languages. —Justin (koavf)TCM 21:27, 12 July 2018 (UTC)

Before we do such a thing, I think we should solve the issues that have been raised at Wiktionary:Beer parlour/2018/June § On the placement of constructed languages, and on the attestation of appendix-only languages. Per utramque cavernam 21:31, 12 July 2018 (UTC)

  • Thanks for the ping. I agree with Metaknowledge and Koavf that the Volapük corpus seems to be large enough for us to cover a good number of words under CFI (unlike Lojban). It's true, though, that I've been RFVing a lot of Volapük words, mainly because one prolific user has been adding a huge number of unattestable Volapük words. It's tedious getting rid of all these entries through RFV, though, and it might be better to do it faster. Here's one suggestion: temporarily give me (or some other administrator) permission to delete on sight any entry for a Volapük noun or adjective that gets no hits on Google Books or Wikisource. Or something along those lines. That would at least put a dent in the mountain of unattestable Volapük entries, and there wouldn't be much risk of losing good entries, because a Volapük word that has no hits on Google Books or Wikisource would be unlikely to pass RFV if nominated. —Granger (talk · contribs) 01:13, 13 July 2018 (UTC)
    • Yes, the same editor of which we speak also has done similar things to other languages, including Esperanto. Anytime someone in an RFD discussion says "it's not like someone is going to create entries for all possible permutations", I think of him and say, honestly: "if you open that door, we have people who will try to bring the entire universe through it". Chuck Entz (talk) 03:34, 13 July 2018 (UTC)
      • I support the idea of giving you permission to delete under those criteria, for that editor's entries only. It's not unlike mass-deleting a vandal's contribs. —Μετάknowledgediscuss/deeds 04:17, 13 July 2018 (UTC)
        • Does anyone else support this idea? —Granger (talk · contribs) 14:02, 16 July 2018 (UTC)
          • @Mx. Granger: I do, but I'm not an admin nor do I work with Volapük. Per utramque cavernam 19:09, 17 July 2018 (UTC)
            Thanks. I'm happy to undertake the task myself. I just want to make sure I have the support of the community before I start. —Granger (talk · contribs) 00:11, 18 July 2018 (UTC)
            I think that if nobody airs a problem with it in another day or so, you should take that as good enough to run with it. We could ping people who have already commented here if you want, but it's not like they don't have this page watchlisted. —Μετάknowledgediscuss/deeds 00:24, 18 July 2018 (UTC)
        • I think it's a reasonable idea, for the reasons Metaknowledge said. The user is known to have created a lot of unattested terms. - -sche (discuss) 01:26, 18 July 2018 (UTC)
          Okay, I'll get started deleting entries created by the user in question that meet the criteria I gave above. —Granger (talk · contribs) 01:58, 19 July 2018 (UTC)

See Also vs Related Terms

Could someone tell me if it is better to have a sub-header of "See Also" or "Related Terms" ? I'm seeing both used in the Lojban entries, and I'd like to standardize if it has already been decided. Jawitkien (talk) 17:51, 12 July 2018 (UTC)

Related terms are for terms that are somehow etymologically related. "See also" is not really defined, you can put whatever you want in it. DTLHS (talk) 17:55, 12 July 2018 (UTC)
@DTLHS I see entries using "Derived terms" also.
My current usage will be:
if it is syntactically derived, I use "===Derived terms==="
if it is etymologically derived, I use "===Related terms==="
if it is related but not derived, I use "===See Also==="
Does this sound reasonable ? Jawitkien (talk) 23:30, 12 July 2018 (UTC)
Sure. DTLHS (talk) 04:18, 13 July 2018 (UTC)
What I do is put all derived terms in the Derived terms section, terms that are etymologically related but not derived (like etymological sisters, cousins, aunts, or nieces) in the Related terms section, and random odds and ends in See also (though I probably haven't used See also as much as the other two). I'm not sure what syntactically derived means. If it means phrases that contain the term in the current entry, then I put those in Derived terms. I think it's misleading to put etymologically related terms in See also rather than Related terms! But sometimes people put terms that are really derived in Related terms, or do other odd things. — Eru·tuon 04:41, 13 July 2018 (UTC)
One example of See also is in the Spanish entry gallo, meaning "rooster", where "pollo" (chicken meat) is listed. The words themselves aren't related and they aren't synonyms, but there is a clear connection between the two words. Andrew Sheedy (talk) 21:48, 15 July 2018 (UTC)
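The convention described in this thread can be sketched as a small decision rule. This is only an illustration of the replies above, not an official policy; the function name `section_header` and its two boolean parameters are hypothetical.

```python
def section_header(is_derived: bool, is_etymologically_related: bool) -> str:
    """Pick a sub-header for a linked term, following the thread's convention.

    Both parameters are hypothetical labels for judgments an editor makes.
    """
    if is_derived:
        # Terms derived from the current entry, including phrases containing it.
        return "Derived terms"
    if is_etymologically_related:
        # Etymological relatives that are not derived from the current entry.
        return "Related terms"
    # Anything else with a clear connection (e.g. gallo and pollo).
    return "See also"


print(section_header(is_derived=False, is_etymologically_related=False))
```

Derivation is checked first because a derived term is also etymologically related, and the thread treats "Derived terms" as the more specific section.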

Eau, blast!

Is there a way to nominate pages with only unattestable entries for deletion? I am thinking of eaublast. See also Wiktionary:Requests for verification/English#eaublast.  --Lambiam 09:38, 13 July 2018 (UTC)

{{speedy}}, if you're sure the page is too bad to merit discussion through normal channels. Equinox 12:34, 13 July 2018 (UTC)
I feel it was sufficiently discussed at WT:RFVE.  --Lambiam 14:20, 13 July 2018 (UTC)

Eye-dialect phrase alternative form entries

In 2017, the deletion discussion for thank ya so much happened. The result of this discussion was to delete the page, along with others like thank u so much, etc. Also in 2017, there was a deletion discussion for fer cryin' out loud. The result of this discussion was different; the page was not deleted, but was instead redirected to the entry for crying out loud.

These two discussions had different results. This is inconsistent; we need a consistent way to deal with these entries, a clear community consensus on it. The problem with entries like these is that many of them have overabundant possibilities; see the comments by User:Mihia in the discussions. By current rules, technically, as I summarize some of Chuck Entz's statements in Talk:thank ya so much, these entries are not sums of their own parts, since you're inserting an eye-dialect variable (or more than one) into a phrase that is already not a sum of its own parts. Thus, I've brought up this discussion to propose that we modify WT:Criteria for inclusion to make a brief statement about these eye-dialect phrase entries, based on the consensus reached by this discussion.

Consensus from both deletion discussions clearly is that trivial eye-dialect forms of phrases should not have dictionary entries. However, there's also a similar discussion for for cryin' out loud. The result was to keep it as a dictionary entry because of how common this alternative form actually is. So the exception to the CFI policy I propose would presumably be if a phrase was particularly common in its eye dialect form (e.g. see ya < see you).

However, we can go one of two ways with this. 1.) Entries such as let's get dis party started should hard-redirect to let's get this party started. 2.) Entries such as let's get dis party started should be deleted completely.

This might be a tough one to figure out. So, before starting a policy vote, I'm gonna need help forming such a vote, as I'm not even sure which direction this should necessarily go. For instance, how should we treat entries that are particularly common as eye dialect forms (such as see ya)? Should another exception be that phrases with only two words in them (X Y) or three (X Y Z) should allow as many eye dialects as possible for entries, or redirects for the second proposal? Please help me out here, much appreciated. Thanks for any input.

I'll go ahead and make some subsections here for some pre-support votes for either side of this debate. (As usual, if there was already a similar discussion to this, I don't recall it and wasn't able to find it, so don't pounce on me if there was.) PseudoSkull (talk) 22:35, 13 July 2018 (UTC)

Trivial eye dialect forms should redirect

Put support votes here if this is your opinion. PseudoSkull (talk) 22:35, 13 July 2018 (UTC)

  • Support. If someone goes to the trouble of typing an attested variation of a CFI-worthy phrase into the search bar, it should take them someplace useful. I do not trust the search function to produce useful results. I would require citations first (and shoot on sight the uncited), and use the citations page to gather citations for all incoming variations. bd2412 T 01:46, 14 July 2018 (UTC)
  • Support hard redirects, with exceptions for the eye dialect form being equally or more common than standard spellings. If someone wants to go through the effort of creating these, that's fine with me. Andrew Sheedy (talk) 02:13, 14 July 2018 (UTC)
    [thank|fank] [you|ya|yer|ye] [very|verra] much would be 2*4*2 = 16 combinations for just one phrase (and I'm sure each word has many spellings I've not thought of). The issue here is not disk space... Equinox 02:37, 14 July 2018 (UTC)
    Are each of these variations, as phrases, attestable? How many of them will ever be created if we require attestation in advance? bd2412 T 14:38, 14 July 2018 (UTC)
  • I doubt they are all attestable but I disagree with creating any of them. Eye dialect/nonstandardness IMO should be dealt with at word level and not phrase level, since one word with n variants will otherwise (potentially, depending on attestation) multiply the number of derived phrases by n. I bet there are reasons other than "space on paper" why professional dictionaries wouldn't countenance this. Equinox 17:28, 14 July 2018 (UTC)
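The multiplication in the comment above can be checked by enumerating the combinations directly. This is a minimal sketch: the spelling lists are copied from that comment and are illustrative only, and nothing here says which of the resulting phrases are actually attested.

```python
from itertools import product

# Variant spellings per word slot, taken from the comment above.
# These lists are illustrative; real attested spellings may differ.
slots = [
    ["thank", "fank"],
    ["you", "ya", "yer", "ye"],
    ["very", "verra"],
    ["much"],
]

# Every phrase is one choice from each slot, so the count is the
# product of the slot sizes: 2 * 4 * 2 * 1.
phrases = [" ".join(words) for words in product(*slots)]

print(len(phrases))
for phrase in phrases:
    print(phrase)
```

Adding one more variant to any slot multiplies the total again, which is the growth the comment is pointing at.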

Trivial eye dialect forms should be deleted

Put support votes here if this is your opinion. PseudoSkull (talk) 22:35, 13 July 2018 (UTC)

  • Delete entries like "let's get dis party started" and "thank ya so much" (obvious bullshit, nobody would ever search for them), keep "for cryin' out loud" since that's how the expression is usually written and said. It's important to be as restrictive as possible here since someone will inevitably add thousands of these. DTLHS (talk) 01:53, 14 July 2018 (UTC)
Support or somebody's going to go cray-cray and slippery-slope a billion stupid (but citeable) phrases in here. Oh, I just glanced upward and DTLHS said exactly what I am saying. Equinox 01:56, 14 July 2018 (UTC)
Support based on the examples, but what exactly is a “trivial” eye dialect? Something like cah or masta could be described as trivial. — Ungoliant (falai) 02:21, 14 July 2018 (UTC)
I understand the Skull to be talking about phrases, not individual words. Equinox 02:27, 14 July 2018 (UTC)

(edit conflict)

Stuff like let's get dis party started would be trivial, and that's just assuming it happens to be attested at all. By trivial I meant the phrases, not the words themselves. for cryin' out loud is a particularly common one, and it's even more often said that way than "for crying out loud". Also see ya is a very common collocation of this nature, so it should be kept as is too.
But that's part of the problem with this proposal; we need a way to measure by consensus how useful any particular one of these phrases is, but obvious trivial ones should be deleted/redirected according to either of these two proposals. Perhaps we should make the policy with these similar to how we treat misspellings (as in, if "desaire" is not a particularly common misspelling of "desire" it is not kept, regardless of if it has 3 citations as would normally be accepted). PseudoSkull (talk) 02:34, 14 July 2018 (UTC)

What questions concerning the strategy process do you have?

Hi!

I'm Tar Lócesilion, a Polish Wikipedia admin and a member of Wikimedia Polska. Last year, I worked for Wikimedia Foundation as a liaison between communities and the Movement Strategy core team. My task was to ensure that all online communities were aware of the movement-wide strategy discussion. This year, my task is similar. Phase II of the strategy process was launched in April. Currently, future Working Groups members are being selected, and related pages on Meta-Wiki are being designed.

I’d like to learn what questions concerning the strategy process you would like to have answered on the FAQ page. Please answer here, on my talk page, or on a dedicated talk page on Meta-Wiki. Thanks!

If you have any questions or concerns, please, do ask!

Thanks, SGrabarczuk (WMF) (talk) 18:29, 14 July 2018 (UTC)

I'm live streaming my editing!

I'm live streaming my Wiktionary activity on YouTube right now, if anyone wants to watch. https://www.youtube.com/watch?v=r3-rNoIA7cU PseudoSkull (talk) 22:06, 14 July 2018 (UTC)

The stream is over but I might do it again sometime perhaps. However, you can still see the contents of the stream. I timed out at 56 minutes. PseudoSkull (talk) 23:04, 14 July 2018 (UTC)
Just remember that being an admin shows you things that shouldn't be visible to the public. Be careful to limit the kinds of things you do while streaming, and make sure what you're working with is clear of any vandalism so you won't be giving it undue attention. Chuck Entz (talk) 20:48, 15 July 2018 (UTC)
Thanks for the video. It is interesting to see how other people contribute and especially on another Wiktionary (I contribute mainly on the French Wiktionary). Pamputt (talk) 06:12, 16 July 2018 (UTC)
I disagree with Chuck. Next time, PS, do all the craziest and most hardcore admin stuff possible. I'm thinking Unblocking Vandals, Showing Deleted Edits, Mass Deletion, Hiding Edit Summaries, Editing Protected Pages and, my favourite as I've never done it before...Whitelisting. --Harmonicaplayer (talk) 17:03, 18 July 2018 (UTC)
Perhaps a Core War-style multiplayer game on Twitch, with one admin against a team of three vandals. Equinox 19:42, 18 July 2018 (UTC)

Wiktionary:Foreign Word of the Day/Nominations

What's with the new layout? Are we in Europe?? Wyang (talk) 22:05, 16 July 2018 (UTC)

The layout is by User:Per utramque cavernam (see his talk page for recent discussion of it). It reflects the approximate division of FWOTDs, which is in turn based on our strengths at Wiktionary. Hopefully more non-European languages can be featured in the future, but that also means I'll need more such words to be nominated. —Μετάknowledgediscuss/deeds 22:31, 16 July 2018 (UTC)
This ‘strength’ at Wiktionary is something to be ashamed about. < 10% of the world’s population is in Europe, yet we still pride ourselves on this Eurocentrism. All words in all languages (in Europe)... with a smattering of words elsewhere? Wyang (talk) 23:20, 16 July 2018 (UTC)
Remember this is en.wikt and has a user base that somewhat reflects that. There is no automatic way to pull in all the content from other Wiktionaries. Equinox 23:23, 16 July 2018 (UTC)
The layout of the nominations page has nothing to do with the proportion of words from different regions that are featured. DTLHS (talk) 23:31, 16 July 2018 (UTC)
Then what's the point? Don't forget that the project's main page reads "Welcome to the English-language Wiktionary, a collaborative project to produce a free-content multilingual dictionary. It aims to describe all words of all languages using definitions and descriptions in English." NOT all words of all European languages. Some editors have been working very hard to increase the coverage of the world's major languages, such as Chinese ― the language with the most native speakers in the world, outnumbering the rest by a wide margin. Yet there are some who view Europe as the centre of the world and actively try to suppress the rest: National European language vs Minor or extinct European language vs Non-European Language. Are you kidding me??? Might as well split it into Wiktionary:European Word of the Day and Wiktionary:Non-European Word of the Day. Wyang (talk) 03:34, 17 July 2018 (UTC)
I have no idea what the fuck you're talking about. Again, how does the layout of the nominations page affect what words are chosen? Are you volunteering to run the FWOTD project? Are you actually complaining about the distribution of words that are actually featured, in which case why are you talking about the nomination page? DTLHS (talk) 03:47, 17 July 2018 (UTC)
I have no fucking interest in editing in this system either. Wyang (talk) 03:49, 17 July 2018 (UTC)
During the 60s in the US South, I'm sure there were white southerners who were asking "what's the big deal about separate lunch counters? The colored folks get served the same food as everyone else?" Chuck Entz (talk) 14:05, 17 July 2018 (UTC)
When I took linguistics at UCLA, we were required to take at least one year of a non-Indo-European language in order to graduate (I chose Mandarin). The fact is that European languages were so dominant that it was hard to find courses outside of major universities in other languages, so even linguistics students tended to have no exposure to other language families before they came to UCLA, and it was too easy to stick with what was already familiar (things have improved since then, but it's still true to some extent). In that case, it was necessary to address the bias explicitly in order to do something about it. Chuck Entz (talk) 14:05, 17 July 2018 (UTC)
The current layout is definitely wrong, not only because it really is Eurocentric but because it doesn't reflect the huge contributions in some non-European languages, such as Chinese or Japanese, or the current or any future true distribution of lemmas, and it shouldn't. I don't approve of Wyang's slamming the doors, though. It doesn't achieve anything.
The layout has to change back to what it was. --Anatoli T. (обсудить/вклад) 13:49, 17 July 2018 (UTC)
What about convenience to the FWOTD caretaker (i.e. Metaknowledge)? Unless he says otherwise, I think it might help him run the thing.
However, I agree with -sche below that it shouldn't send an undesirable message either, and if it does it's a problem (Maybe I should have named the headers "Type 1", "Type 2" and "Type 3" :p). Per utramque cavernam 15:45, 17 July 2018 (UTC)
Yes, the split into European and non-European, while probably well-intentioned, is sending an undesirable message and should be undone... the current layout with all the continents seems like an improvement...? What do you think? And though we're constrained by what words people enter in enough detail to feature, a la Equinox's and other people's point, maybe we could try to explicitly counter the preponderance of Indo-European a la Chuck's point by featuring one word from each continent per week? (So people might realize they could copy the formatting of when adding more words from that language?) With two days left over for constructed languages and repeats of continents? Or at least we could try to feature, say, at least four different continents per week? - -sche (discuss) 15:09, 17 July 2018 (UTC)
We really don't have the ability to do one word per continent per week. You ran WOTD, so you know how hard it is already to avoid burnout. If anyone volunteers to help with these issues, I'd be happy, but I haven't seen any volunteering yet in this thread. —Μετάknowledgediscuss/deeds 16:14, 17 July 2018 (UTC)

What a lame discussion. And what a twisted accusation! Obviously, the layout has only reflected what had already amassed for long, not to segregate, just to sort, bringing what the mildest system of order has to comprise. It could even help to get away from Eurocentrism, but that progressive dogma whereby disparities disappear when they aren’t exposed is apparently too attractive. No, @Atitarev, that page, as a medium, cannot just simply reflect contributions across the Wiktionary, people are still invited to post them thither, and if the managers don’t have a secret agenda, then apparent unevennesses are just. And they are also expected, a priori, for a Wiktionary of a European language attracts users of European ties and the economic and even individual probabilities (who gets educated in which languages, becomes computer-literate and has the spare leisure to come hither) play an innegligible role too. Fay Freak (talk) 01:04, 18 July 2018 (UTC)

@Fay Freak, what is it that you want to add here? I see name-calling and pooh-poohing. Was that your intent?
I may not share Wyang's tetchiness, but I understand his concerns and I do share them, albeit perhaps to a lesser extent. Please also see Chuck's comment above about lunch counters. ‑‑ Eiríkr Útlendi │Tala við mig 17:15, 18 July 2018 (UTC)
@Eirikr I don’t see any of it. The point is that there is nothing surprising in the appearance of “Eurocentrism” and people see only things that aren’t there instead of the things that are there, in which latter case nobody would move an eyebrow. Fay Freak (talk) 17:20, 18 July 2018 (UTC)
If you don't see any of it, why comment? It seems you're trying to make the case for Wyang being wrong, and for Eurocentrism being right. I cannot agree with either proposition.
Again, see Chuck's comments above. ‑‑ Eiríkr Útlendi │Tala við mig 17:41, 18 July 2018 (UTC)
@Eirikr No it doesn’t seem like that. The whole point is that in the depth there is no Eurocentrism there. It’s just a certain distribution of edits to Wiktionary, to that page, and what the managers find, streamlined. It is a mapping to that “Eurocentrism” of the editors seen together. Which isn’t “Eurocentrism” either, of course because one cannot see the editors together but everyone has different motivations, but a natural result of economic and individual probabilities. Thus I conclude that there is nothing to complain about. Fay Freak (talk) 18:10, 18 July 2018 (UTC)
I'm not saying whether there's Eurocentrism in content. My point is that it was unnecessarily giving the appearance of Eurocentrism. It doesn't matter whether the appearance is a reflection of the reality or not: if people are put off by the appearance, they're not going to stay around long enough to find out about the reality. "Other than that, Mrs. Lincoln, how was the play?" Chuck Entz (talk) 05:05, 19 July 2018 (UTC)

Replace {{unreferenced}} with {{rfr}}

Hey, could we replace {{unreferenced}} with {{rfr}}? It would fit to the scheme we use for {{rfe}} and {{rfv}}. --Victar (talk) 01:28, 17 July 2018 (UTC)

I agree the templates should be merged. If you can make {{rfr}}'s parameter 1 default to en or und when not specified (so existing uses of {{unreferenced}} don't break), we could just redirect {{unreferenced}} to {{rfr}}. (We should keep the redirect, of course, because why not? Some people might be used to typing it.) - -sche (discuss) 01:33, 18 July 2018 (UTC)

Has there been a vote on deleting bagua

Does anyone know if there has been a vote on deleting bagua (the components of an I Ching hexagram)? Jawitkien (talk) 15:39, 18 July 2018 (UTC)