Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!


August 2017

travel game

Would travel game (a board game or card game that was modified to be playable by passengers during a trip) be considered a SoP? W3ird N3rd (talk) 05:34, 1 August 2017 (UTC)

Looks good to me. I'd class I spy and the number plate game as my favourite travel games from when I was a kid in a car. --WF on Holiday (talk) 23:04, 1 August 2017 (UTC)
That's actually not even the definition I meant. I meant games like chess or Ludo that have been modified (e.g. with magnetic game pieces) to be played in a car or on a train. Amazon link to clarify. W3ird N3rd (talk) 00:10, 2 August 2017 (UTC)
I thought about it and the definition I was originally thinking of (and yours as well) is SoP after all because there are other "travel" things. But travel didn't have an adjective section yet. It does now.
  1. (in a compound) An object or activity that has been designed or reworked for use while travelling.
    (object) I've packed the chess travel game in my travel bag and I've got my travel cup in the cupholder, I'm ready to go!
    (activity) Let's play a travel game. I spy with my little eye..

W3ird N3rd (talk) 04:26, 2 August 2017 (UTC)

Aaaaand it's gone. @SemperBlotto, is there a reason you just chucked the whole thing instead of moving it into an additional definition for the noun? I had looked at running (like "running man") and noticed it had an adjective section, but on closer inspection that is used for other meanings of running.. I think. I'm not even sure. I can't entirely explain why running is an adjective in all meanings mentioned but travel isn't. W3ird N3rd (talk) 06:28, 2 August 2017 (UTC)
(@SemperBlotto. —suzukaze (tc) 06:45, 2 August 2017 (UTC))
Thanks, I'm still learning how these things work. I looked it up: https://en.wikipedia.org/wiki/Attributive_verb. So it appears travel acts as a deverbal adjective. So it seems either SemperBlotto is wrong or somebody needs to remove the adjective section from exciting or I may be losing my marbles. W3ird N3rd (talk) 06:57, 2 August 2017 (UTC)
I dunno, travel in travel game sounds like the noun travel to me: a game used during travel. It's weird to try to think of it as a verb. — Eru·tuon 07:07, 2 August 2017 (UTC)
Right, so it's https://en.wikipedia.org/wiki/Noun_adjunct. So "travel game" can't be added because I suspect it's SoP yet the information in travel and game don't really allow one to figure out what a "travel game" would be. And this information can't be added to travel either. Okay, my marbles are definitively gone. W3ird N3rd (talk) 07:28, 2 August 2017 (UTC)
Were I looking at this naively, it would be ambiguous to me whether this meant "a game suited for travel" or "a game related to travel" or "the travel industry" or "one of a genre of games somehow related to some definition of travel", or ..... I don't think that dictionaries should act as if they are well suited to hold users' hands as they try to figure out what a phrase or sentence or larger unit of language means, unless there is true novelty or obscurity worse than what I have advanced as my own naive view of alternative meanings. DCDuring (talk) 19:41, 2 August 2017 (UTC)
Well, it's not a phrase, it's a compound noun. I think from what you're saying, it's (for a naive reader) not transparent. — Eru·tuon 19:57, 2 August 2017 (UTC)
Gee, lots of people would call it a noun phrase or NP. Do we have a policy about which school of labels we follow? DCDuring (talk) 21:35, 2 August 2017 (UTC)
Not that I'm aware of. The criterion of spacing bothers me because it means that if you happen to add spaces between the parts of a compound, then it suddenly changes to a phrase. So honeybee is a compound, while honey bee is a phrase. Utterly arbitrary. There has to be a more solid criterion than spelling. — Eru·tuon 22:02, 2 August 2017 (UTC)
If a multi-word expression is attestably spelled solid, we have decided that is sufficient evidence to say that phrase, usually a bare NP, is includable. That criterion is intended to shortcut our repetitive, amateurish arguments about including such terms. DCDuring (talk) 22:29, 2 August 2017 (UTC)
Huh. I was talking about criteria for whether something is a compound noun, not CFI. — Eru·tuon 22:36, 2 August 2017 (UTC)
I think that, in practice, we try to avoid academic discussions with only indirect application to Wiktionary. It seems to me a good practice. DCDuring (talk) 23:46, 2 August 2017 (UTC)
You're probably right. I'm quite annoyed by compounds being called phrases, but it isn't particularly useful to discuss. Back to the content of your post, you recognize potential ambiguity with travel game but still don't think it should be included. I find that baffling, given that English Wiktionary is used by lots of people who don't speak English well. I would imagine that at least some of them would misunderstand travel game in the ways you mention. — Eru·tuon 01:59, 3 August 2017 (UTC)
@Erutuon: To me those ambiguities are typical of those that arise in interpreting any NP/compound noun that one hasn't heard before. In normal speech, the context shows one definition to be the most relevant of all of the ones that are possible from the definitions of the component terms. I consider the situation to be illustrative of why we focus on transparency of meaning in the context in which a term is used, given the definitions of the component terms. DCDuring (talk) 07:48, 3 August 2017 (UTC)
It really, really, REALLY wouldn't be the first time I turn to Wiktionary (or any other dictionary) to look up a word that I have no proper context for. For example when something like this happens in a TV show:
So what do you hate most?
-Any travel game.
Why do you hate that so much?
And then the show continues. Maybe it's a running gag. Maybe it's a reference to something in a previous episode that I missed. Maybe it refers to some character trait. Maybe it refers to some event or tradition that I'm not aware of, like some scandal in the country where the show was made. Maybe it's just plain random.
Alternatively, some word will pop up in my head randomly but I can't remember the context I heard it in. Out of curiosity I try looking it up. What it comes down to is as simple as this: Wiktionary is useless to look up any ambiguous SoP so I'll be forced to go elsewhere. If that's your goal I'd say mission accomplished. W3ird N3rd (talk) 23:47, 3 August 2017 (UTC)
Why wouldn't it be arbitrary? Why should you be able to draw a clean line between a compound and a noun phrase? There's no clear line between a "canoe truck", a "turnip truck" and a "fire truck". Certainly, though, English words spelled without spaces are more likely to be organic unions with a unique meaning, whereas noun phrases are more likely to be spelled with spaces and have meanings obvious from the individual words.--Prosfilaes (talk) 23:24, 2 August 2017 (UTC)
I dunno, it seems axiomatic that syntactic categories (word, phrase) should be based on something other than spelling, such as syntactic behavior. If they coincide with spelling, great. Honey bee behaves no differently from honeybee, so it is in the same syntactic category. Maybe there are spaced-out compounds that could with more justification be called phrases. I agree, though, that there is something determining whether a compound can be written with spaces: if it would be too long as a single word, or its meaning is obvious from its constituent parts. At some point on the continuum of each characteristic, it's acceptable to write a word either way. But I don't think either characteristic has anything to do with syntactic category (word or phrase) either. — Eru·tuon 01:59, 3 August 2017 (UTC)
Given my background in computer science, it seems axiomatic that you lex before you parse, and that you have to figure out what a word is before you start figuring out what stuff means. That's sometimes not possible in computer or human languages, and pauses in audio would be more reliable than spaces in text, but things should be broken into words ideally before we get into syntax.--Prosfilaes (talk) 03:20, 3 August 2017 (UTC)
@Prosfilaes: I don't know anything about computer science or quite what lex and parse mean, but what I mean by syntactic category is word, phrase, clause, or noun, verb, adjective, etc. So which things are words is connected to syntax. Anyway, from what programming I've done (mostly on Wiktionary), programming languages are far more tightly constrained and more straightforward to analyze (if not figure out what their actual purpose is) than human languages, so I don't know how much of the process is similar to analyzing the lexical or syntactic categories of human words. — Eru·tuon 21:22, 4 August 2017 (UTC)
The basic ideas of lexing and parsing used in computer languages were designed by Chomsky for use in human linguistics. The point is, we can't talk about nouns and adjectives before we figure out what words are. In both human and computer languages, you lex (split text into words and specific punctuation marks) and then you parse, and occasionally you're forced to go back and relex the text in light of the parsing. But in both cases, you do the vast majority of breaking stuff into words before you start trying to figure out the meaning. There's a reason why spaces and verbal pauses exist in languages; it's to make it easy to clearly split things into words.--Prosfilaes (talk) 23:04, 4 August 2017 (UTC)
Well, it seems my use of the name syntactic category for word and phrase got you on the tangent of lexing before parsing. I don't know, maybe syntactic category isn't the right term. I have no idea. And I don't see how lexing before parsing relates to whether compounds are words or phrases. — Eru·tuon 23:22, 4 August 2017 (UTC)
I would imagine (but I can't speak for Prosfilaes) that if "travel game" is a word in your dictionary, you can just look it up and you know what it means. If it's not in your dictionary, you will assume it's just two words and you look up travel and game. From that, some systems (like Google Translate, Babel Fish, etc.) could probably end up being fooled into assuming this is roadkill or a really annoying basketball game. W3ird N3rd (talk) 00:01, 5 August 2017 (UTC)
DCDuring, I hadn't even thought of those interpretations yet. Thinking about that, I realized game also means wild animals hunted for food. It would depend heavily on context whether a non-native speaker could actually make that mistake, but I think it would be funny as hell. In the text "We were very hungry because we didn't pack enough food. But at least while on this trip, we enjoyed some travel game." the "travel game" could actually be interpreted as roadkill. Bon appétit! I'm hoping Wiktionary:Beer_parlour/2017/August#Allow_more_SoP_compounds.2C_similar_to_Dutch_and_German. or another rule change based on that will fix this in the future, but I don't think I'm going to hold my breath. W3ird N3rd (talk) 02:12, 3 August 2017 (UTC)
The MWE is also a synonym of away game. DCDuring (talk) 01:08, 5 August 2017 (UTC)
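
The lex-then-look-up idea raised earlier in this thread can be sketched in Python. This is only an illustration under stated assumptions: the `ENTRIES` table, its glosses and the `segment` function are hypothetical, and nothing here describes how Wiktionary or any translation service actually works. It shows that a two-word term is found as a unit only when the dictionary lists it as one; otherwise the lookup falls back to the separate words.

```python
# Hypothetical mini-dictionary; the glosses are illustrative, not Wiktionary entries.
ENTRIES = {
    "travel": "the act of going from one place to another",
    "game": "an activity played for amusement",
    "travel game": "a game adapted for play during a journey",
}


def segment(text, entries=ENTRIES, max_words=3):
    """Split text on spaces, then greedily match the longest known entry."""
    words = text.lower().split()
    result = []
    i = 0
    while i < len(words):
        # Try the longest candidate first, shrinking down to a single word.
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            # A single word is always accepted, even if it has no entry.
            if candidate in entries or n == 1:
                result.append(candidate)
                i += n
                break
    return result


# With the multi-word entry present, the compound is looked up as one unit.
print(segment("travel game"))  # ['travel game']

# Without it, the lookup falls back to the separate words.
without = {k: v for k, v in ENTRIES.items() if k != "travel game"}
print(segment("travel game", entries=without))  # ['travel', 'game']
```

In the second case a reader, or a program, still has to choose among the senses of travel and game, which is where the ambiguity described above comes in.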

order Arabic disambiguating entries orthographically, not by verbal forms

Currently Arabic disambiguating entries are ordered by verbal forms instead of orthographically, which is not the optimal lexicographical approach. Thus, for ease of reference, يُوجدُ should appear just once in the page for يوجد, specifying it could belong to either verbal form I or verbal form IV. --Backinstadiums (talk) 08:47, 1 August 2017 (UTC)

It should be just as easy as modifying a line of code --Backinstadiums (talk) 12:39, 4 August 2017 (UTC)

@Backinstadiums: Huh? What line of code? — Eru·tuon 18:14, 4 August 2017 (UTC)
@Erutuon: I mean it cannot be that much of a fuss, just a different grouping in a specific case. If anything should be clarified further, please let me know. --Backinstadiums (talk) 20:57, 4 August 2017 (UTC)
@Backinstadiums: To do this, the template {{ar-verb-form}} would have to no longer display the form number and many entries would have to be edited (there are 35,188 entries in Arabic verb forms, some of which will contain homophonous verbs with different Form numbers). The editing part would be a lot of work, and would probably have to be done by bot, as the entries were in large part created by bot. I'm agnostic on whether the change would be helpful or consistent with Wiktionary organizational principles, and no one else has responded: @Atitarev, Wikitiki89, Benwing2? — Eru·tuon 22:22, 4 August 2017 (UTC)

Just like any issue in life, no matter how much is already done, if it's not in accordance with the optimal lexicographical approach, which enables ease of reference and improves usability, action must be taken on it as soon as possible so as not to worsen resources even more --Backinstadiums (talk) 15:39, 10 August 2017 (UTC)
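
The regrouping requested in this thread can be sketched in a few lines of Python. This is a hypothetical illustration only: `RECORDS` and `group_by_spelling` are made-up names, and the sketch says nothing about how {{ar-verb-form}} or the bot-created entries are actually implemented. The only data point taken from the discussion is that يُوجدُ can belong to Form I or Form IV.

```python
from collections import defaultdict

# Illustrative records as (vocalized spelling, verb form). Only this one pairing
# comes from the discussion; a real data set would be far larger.
RECORDS = [
    ("يُوجدُ", "I"),
    ("يُوجدُ", "IV"),
]


def group_by_spelling(records):
    """Collect every verb form that shares one vocalized spelling."""
    grouped = defaultdict(list)
    for spelling, form in records:
        # Keep each form once, in the order it was first seen.
        if form not in grouped[spelling]:
            grouped[spelling].append(form)
    return dict(grouped)


# One line per spelling, listing all the forms it could belong to.
for spelling, forms in group_by_spelling(RECORDS).items():
    print(spelling, "Form " + " or ".join(forms))
```

The grouping itself is simple; as noted above, the real cost lies in changing the template and re-editing the existing entries.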

August LexiSession: circus

Let's go to the circus!

The monthly suggested collective task is to collect words about the circus. I've noticed that Wikisaurus:circus does not exist, and auguste is a kind of clown, so this is a great opportunity to look around this topic together!

Let's stop clowning around and juggle some ideas together!

By the way, Lexisession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you do something this month, please let us know here or on Meta, to let people know that English Wiktionarians are doing something on this topic. I hope there will be some people interested in making some contributions! Noé 13:43, 1 August 2017 (UTC)

Here's a good start - to be added to Category:en:Circus if appropriate

Circus and sideshow attractions

Maybe this is me being slightly grumpy because some people in another discussion I started don't seem to entirely grasp what I was suggesting, but aren't some of these SoP?
I personally don't have a problem with any of these and luckily I'm not a SoP nazi, but after a few RfDs this project could end up with more red links than it had when it started. W3ird N3rd (talk) 03:30, 4 August 2017 (UTC)
  • Some of these seem lame and/or SoP. On the other hand sources such as Carny Lingo show that there is a large vocabulary of great charm and linguistic interest. I doubt that we will get very far into that highly desirable content this month, but extracting a list of terms from that and similar sources would be useful for Wiktionary, IMO. I'm not at all sure that the terms fit well into the categories suggested, many better assigned to a category based on usage context, eg, Category:English circus slang or similar. Examples, barnstorm, blow a tip, blow one's pipes, build a tip, burn the lot, carry the banner, clean the Midway, cool out, bail the counter, bat away. Unfortunately, I don't know that we have a good system of such categories, instead duplicating encyclopedic-type "topical" categories. DCDuring (talk) 08:29, 4 August 2017 (UTC)
    I suppose much of this would fit in Category:English circus slang. I hope that {{lb|en|circus slang}} or {{lb|en|circus|slang}} would work. DCDuring (talk) 08:36, 4 August 2017 (UTC)
    My hopes are in vain. I hope someone can rectify the operation of {{lb}} so anyone interested can help us play along with this cross-project effort. Note that there is a considerable overlap between criminal slang and circus slang. DCDuring (talk) 08:48, 4 August 2017 (UTC)

Next steps for Wikidata access

Hello all,

Thanks to @Daniel Carrero there's now a page to centralize all the discussions and information related to accessing Wikidata data from English Wiktionary. I hope we can improve it soon with examples and documentation :)

We also suggest an enabling date for the arbitrary access: September 7th. If you have any question or concern, feel free to ask. Thanks to the people who worked on this! Lea Lacroix (WMDE) (talk) 13:52, 1 August 2017 (UTC)

Thank you. September 7th looks good to me. --Daniel Carrero (talk) 03:18, 2 August 2017 (UTC)

Best practices for Oxford -ise/-ize variants

I just made Birminghamize. What should be put at Birminghamise? —Justin (koavf)TCM 00:58, 2 August 2017 (UTC)

It is ridiculous that Wiktionary lacks a basic policy on how to handle these variant English spellings: afraid of offending others / nobody willing to take charge / deeming status quo as good enough / etc., ... so there we go - both color and colour can evolve in parallel. Wyang (talk) 06:05, 2 August 2017 (UTC)
The problem is that someone who dares to set a standard will likely get into an edit war. So nobody touches it with a pole. —CodeCat 19:49, 2 August 2017 (UTC)
This would be a perfect application of Wikidata. —Justin (koavf)TCM 23:56, 2 August 2017 (UTC)
I have never heard of Birminghamise, so unless you can find it being used there is no point in making an entry. But normally -ise verbs are labelled "British spelling" so they can appear in Category:British English forms. DonnanZ (talk) 23:45, 5 August 2017 (UTC)

Etymology giving me problems

Can someone review:

for the etymologies that I've added? All of these words are directly taken from Spanish but I've clearly not made them all correctly formatted. Also, I'm not sure if there's a different way of noting a language which is a creole based on [x] versus a language which simply adopts one word from [x]. (E.g. the difference between a Haitian Kreyol word derived from French versus using "facade" in contemporary English). Thanks. —Justin (koavf)TCM 02:24, 2 August 2017 (UTC)

The relation between a creole word and its etymon from the lexifier doesn’t fit very well with the inherited/borrowed dichotomy. We should consider adding templates for other special kinds of derivation like this and substrate “borrowings”, semi-learned borrowings, etc. — Ungoliant (falai) 02:45, 2 August 2017 (UTC)
It's good to see someone else basically ratify that. For creoles/pidgins, it's really a different matter than to go from stages of a language (Old English → Middle English → Modern Englishes) or inheritance in a family (Proto-Germanic → English). —Justin (koavf)TCM 04:43, 2 August 2017 (UTC)

French Wiktionary monthly news - Actualités


I am happy to inform you that the 28th issue of Wiktionary Actualités just came out in English!

As usual, Actualités is in English but talks about French Wiktionary and lexicography in general.

In this edition, the main articles are: a presentation of the Lingua Libre project to record words, a summary of a strange dictionary and a thought about lemmas and grammatical categories. And more: shorts, statistics (including new ones like the number of pages that include a link to a thesaurus) and an explanation about the Linter.

As usual, it is translated into English by non-native speakers, so it is not perfect, but it can be improved by readers (wiki-spirit as usual). Please note that we do not receive any money for this publication and we are not supported by any user group or chapter. It is only written by the community. Feel free to leave us comments! Noé 09:09, 2 August 2017 (UTC)

Allow more SoP compounds, similar to Dutch and German.

So there was a discussion last month about deleting SoP compounds in German and Dutch. Now triggered by "travel game", perhaps we could explore pros and cons for the opposite. That is, allowing English SoP compounds in ways similar to the way they would be allowed in German and Dutch.
So exactly what does that mean? Put simply, if some SoP would pass an RfV and is not using any common/universal word (like "brown" or "fan") it would be allowed. This means you still can't create brown leaf or large box, but you could create burger joint and sheep farmer. Also computer chip and lab rat, those already exist but I'm not sure how they could be justified by the current rules. Optionally you could exclude any SoP with a space that is unambiguous. (like sheep farmer)

  • Pro: while it may be possible to figure out the meaning of a SoP by looking up the parts, it's not always easy. The parts may have more than one possible meaning so you need to figure out the correct meaning for all the parts.
  • Pro: in the case of "travel game", travel and game don't really make it clear what a travel game is. Travel game is probably SoP and adding the attributive noun use to travel just results in an instant revert. So basically it's impossible to describe a "travel game" on Wiktionary.. in English. I could, however, describe the Dutch word reisspel.
  • Pro: fietshater (bike hater) would pass RfV and should be allowed by the current rules. It wouldn't be allowed with these new rules because hater is universal and can apply to thousands of nouns and verbs.
  • (added August 4) Pro: translations. How would you translate juice extractor (SoP) to Dutch? Juice is sap, but how to translate extractor? The correct answer is sapcentrifuge, but would you have ever guessed it? Wiktionary is useless in this case, and this example wasn't even that ambiguous.
  • (added August 5) Pro: We can delete ex-pilot. (Wiktionary:Requests_for_deletion#ex-pilot).
  • Con: there will be more entries on Wiktionary.

I'm not taking a stance on this myself yet, I just think it's worth thinking about. I may not be seeing the whole picture. I haven't made up my mind yet and I think it's a good idea. I wonder what you think. W3ird N3rd (talk) 09:18, 2 August 2017 (UTC)

Sorry W3ird N3rd, strong oppose on that. As I mentioned in the discussion you're referring to, we need to have some sort of quality control around here – having more entries on Wiktionary doesn't necessarily boost our credibility if said entries are redundant. --Robbie SWE (talk) 09:57, 2 August 2017 (UTC)
There would still be a form of quality control. RfV requirements still apply and common words are not allowed. Optionally you could add that if the SoP is fully transparent (like sheep farmer) it is still not allowed, while allowing burger joint and travel game. You talk about quality control, but you actually don't have that control right now as I could create fietshater (bike hater) and you probably couldn't do a thing about it. W3ird N3rd (talk) 10:47, 2 August 2017 (UTC)
I support allowing all attestable English compound words, and no longer making it spelling-dependent. Consequently, WT:COALMINE would be superfluous, as coal mine would no longer depend on the attestability of coalmine for inclusion. —CodeCat 10:53, 2 August 2017 (UTC)

Perhaps we could deal with SOP compounds differently than with other lemmas, effectively soft redirecting them to their constituents while keeping them for consistency, maybe like
(literally) A mine from which coal is dug
(literally) An exhibition (tentoonstelling) of Khoekhoe (Hottentot) tents (tent)
While being subject to usual attestation rules and linked from translation tables as hottentottententententoonstelling f where applicable (of course we won't have a page for Khoekhoe tent exhibition to link translations from).
It could get messy with languages with transliteration though. Crom daba (talk) 13:04, 2 August 2017 (UTC)
Why don't we include any rubbish as terms and forget about CFI? Who cares about this dictionary and its reputation, anyway? --Anatoli T. (обсудить/вклад) 13:33, 2 August 2017 (UTC)
We already host all sorts of rubbish, my approach would make it more manageable and invisible in most use cases. Crom daba (talk) 14:00, 2 August 2017 (UTC)
I don't see the connection between being more inclusive of compounds and quality. The more useful lexicographical content we can provide, the better. RFV provides good quality control, alongside making sure our entries are clean and properly formatted. —CodeCat 14:33, 2 August 2017 (UTC)
@Atitarev and @Crom daba, please be aware that hottentottententententoonstelling is a terrible example that isn't even related to this discussion because it is a joke word and tongue-twister. This word is not, will not and has never been used to refer to any kind of actual exposition. I used it in the other discussion to demonstrate how hard it can be to break down Dutch compound words, but hottentottententententoonstelling isn't SoP. W3ird N3rd (talk) 14:42, 2 August 2017 (UTC)
Yes, I've read the entry, I'm merely using it to show how non-idiomatic words could be formatted. Crom daba (talk) 14:49, 2 August 2017 (UTC)
You may know that, but I think Atitarev possibly doesn't and now thinks Wiktionary will be filled with thousands of rubbish words like that. W3ird N3rd (talk) 14:54, 2 August 2017 (UTC)
oppose. There's no need for most multi-word English terms in English, and nobody will look them up.--Prosfilaes (talk) 23:02, 2 August 2017 (UTC)
Wanna bet? How can you be so sure? Nobody has any idea what passive users search for. DonnanZ (talk) 14:16, 3 August 2017 (UTC)
Also, people do look them up. Pageviews for lab rat are similar to minibar. In addition, how can you say nobody will look something up when the thing in question doesn't (or isn't allowed to) exist? And another thing: translations. We can't have a juice extractor because it's SoP. So now translate juice extractor into Dutch. Good luck with that. You will correctly find sap for juice but how are you supposed to translate extractor? Here's the answer: a juice extractor in Dutch is a sapcentrifuge. Which you could have found if you had looked up juicer (it just so happens a single-word synonym exists here, this is not always the case), but you won't find that if you're looking for a juice extractor, which is the term I'm most familiar with. The very fact is that I had to look up this example on Wikipedia: w:Juice extractor which helped me find juicer. And it's just sheer luck that a juice extractor happens to be encyclopedia-worthy. W3ird N3rd (talk) 01:10, 4 August 2017 (UTC)
This is true. Professional translators seldom need ordinary dictionaries (such as collegiate dictionaries), we want dictionaries that are mainly multi-word, such as the French-English Dictionary of Petroleum Technology. Multi-word dictionaries are the gold-standard and they command high prices. My Dictionary of Petroleum Technology cost me $115 in 1980. In my translating company, we virtually never used any of the ordinary dictionaries (such as Websters, OED, Random House, American Heritage), we only purchased and used the very expensive multi-word dictionaries. Even now that I'm retired, I never use the simple word dictionaries. Almost all the terms I ever have to look up are multi-word terms, and Wiktionary does not handle those. Translators have to equip themselves with a pile of very expensive dictionaries, and all of them multi-word. —Stephen (Talk) 03:46, 4 August 2017 (UTC)
  • I find the juice extractor argument convincing. So far our CFI mainly cover "does anyone want to look it up?" and less "might anyone want to translate it?" Been on a treasure hunt within Wiktionary for translations myself in the past. Korn [kʰũːɘ̃n] (talk) 14:33, 4 August 2017 (UTC)

General question: what does SoP compound even mean?

Compounds have a continuum of transparency of meaning, but they generally do not have a single possible meaning. If they are formed from two nouns (as for instance travel game), there are several possibilities. I'm somewhat rusty on them, but I gather that travel game is a tatpurusha, where travel is added to game to signify a particular type of game, and travel has the meaning of a particular prepositional phrase (or in Sanskrit a grammatical case). Putting aside the other forms of compound, the relationship of travel to game is unknown when you're newly encountering the word. The actual relationship, in terms of grammatical cases, is locative: "a game played during travel". But there are other possible interpretations, such as "a game consisting of travel" (like, I dunno, a long-range treasure hunt?). Meh, it's not a very good example, or I'm not very good at brainstorming about possible meanings.

There are compounds that might be clearer: for instance, bike-hater. A noun combined with hater is often the object of hater (the thing that is hated). But even there, theoretically it could mean "a hater who is on a bike".

So I don't think compounds can be SoP in the same way that regular phrases or sentences are, like "some people hate bikers". There isn't one predictable semantic relationship between the elements of a compound the way there is with phrases. In the previous case, some people is the subject of hate, and bikers is the direct object of hate: that's the only way it can go down.

So what is a SoP compound? I have no idea. I think it should be well-defined for it to serve as a CFI. — Eru·tuon 21:15, 4 August 2017 (UTC)

  • "We were very hungry because we didn't pack enough food. But at least while on this trip, we enjoyed some travel game." roadkill!
  • travel game: A game to play on a journey (like I spy or punch buggy)
  • a physical game, like chess, designed for use on a journey (magnetic pieces etc)
  • geocaching / geohashing
  • travel business (Doug Parker is a big name in the travel game)
  • Something that resembles a game with rules, despite not being designed: in the travel game, being held up for security checks is becoming less of a drag and more of a routine nowadays
  • The ability to seduce someone, usually by strategy:
Watch him. He's got a great travel game.
-He's got a WHAT?
Travel game. Basically he just takes any chick he picks up to Paris. Guaranteed success.
  • The travel game that is used by airlines where they offer cheap tickets but charge extra for additional luggage, meals, toilet visits and use of the oxygen mask is really disgusting.
  • (basketball) I'm getting tired of these travel games. They just travel for most of the game time. It's not funny anymore.
  • (childbirth) There's nothing fun about the travel game, but all that is forgotten when the mother is holding her newborn baby.
Will this suffice? ;-) W3ird N3rd (talk) 23:45, 4 August 2017 (UTC)
Then again, all of these sound valid to me, which could be an argument that compounds really are a sum of parts, or rather a product. Crom daba (talk) 16:03, 5 August 2017 (UTC)
That's the trick. They may sound valid, but most of them are completely invalid.
Unlikely to pass RfV:
  • roadkill
  • a game of basketball with lots of travelling
  • the ability to seduce someone
  • game that involves travelling
  • a questionable or unethical practice
  • childbirth
Using a universal part:
  • travel business (possibly won't pass RfV either)
  • something that resembles a game with rules, despite not being designed (possibly won't pass RfV either)
  • a game to play on a journey
  • a physical game, like chess, designed for use on a journey
So by the proposed guideline, only the last two definitions would be included. But that's not final, you could argue about exactly what should and should not be included. For example, you could argue that if a valid entry for travel game already exists, it's acceptable to add travel business (if that would pass RfV) while at the same time not allowing an entry to be created solely for travel business. W3ird N3rd (talk) 16:56, 5 August 2017 (UTC)

What about if I would word it like this:

  • SoP compounds with an irregular translation in another language are allowed. (or at least their translation section would be) This will allow juice extractor because of sapcentrifuge.
  • SoP compounds with a space or hyphen that have no irregular translations in another language, have only one meaning and this meaning can be reasonably obtained by looking up the first definition of the separate words are not allowed. This would possibly cover sheep farmer assuming there are no irregular translations in another language.
  • SoP compounds using parts that can be universally applied (the parts are not related in any way) are not allowed, unless they are idiomatic. This excludes "brown leaf", "large box", "luxury boat" and ex-pilot but allows more cowbell. (w:More Cowbell)
  • (added August 6) Compounds without a space or hyphen that have only one non-universal part (like sockless) are only allowed if their usage is vast - far beyond the current three-independent-durably-backed-up-sources rule. Common words like hopeless or pointless should be kept, but exactly how much sense does an entry for boatless make?
  • Any entry still needs to be able to pass an RfV.

Maybe this is more clear? W3ird N3rd (talk) 16:56, 5 August 2017 (UTC)

I believe we should have some rules similar to WT:COALMINE in order to avoid unproductive RFD discussions and give contributors a chance to predict whether their new SOP entries will pass RFD. I guess many editors don't feel like spending time on creating entries that might later get deleted. This will slow down the growth of compound-term entries, which IMHO are necessary for a usable multilingual dictionary. I don't believe that a vote to allow all attested terms currently has any chance of passing. In the past some such rules have been proposed:
  • Including all terms with less common single-word synonyms.
  • The lemmings principle, which would grant inclusion if a term is covered by a list of trusted dictionaries (which still have to be specified).
  • We already have some translations-only entries (Category:English non-idiomatic translation targets), but there is as yet no rule to prevent their deletion. We probably want to keep them if they have idiomatic translations for a number of languages.

Matthias Buchmeier (talk) 23:32, 5 August 2017 (UTC)

From what I understand, I would have to create a vote at Wiktionary:Votes. I would probably need some help to have any chance of getting that right. You say such a vote would have no chance to pass, but if there's anything I learned from politics it's this:
  • If you want something to pass, bring it up for voting when everybody who is against it is on vacation.
  • If you want something to pass, just attach it to another bill that is being voted on that will pass. (not possible on Wiktionary)
  • If neither of those is feasible but at least a third of eligible voters are in favor, just bring it up for voting again and again and again and again. Sooner or later it'll pass because either those who are against it missed the vote, those who are against it don't vote because they figure it'll never pass anyway (that's one of the reasons Trump was able to win) or some current event or hype changes what people think and the vote passes.
There are more strategies, but these are the big ones. Once it has passed, it'll be virtually impossible to take it off the books again. W3ird N3rd (talk) 03:38, 6 August 2017 (UTC)

I just came across the following: towelless, fishless, bikeless, streetless, boxless, fireless, woodless, barless, magazineless, goldless, bronzeless, schoolless, cardless, mapless, pantless, sockless, appleless, watchless, morningless, kingless, bossless, condomless, monitorless... (this goes on endlessly) Next time you see somebody saying Dutch or German needs to be treated differently on Wiktionary, slap them in the face with this list. W3ird N3rd (talk) 07:00, 6 August 2017 (UTC)

The problem is that a lot of contributors have the justified fear that allowing all attested multiword compounds would flood the database with low quality entries. I believe that the best way to overcome this problem would be some set of well-designed inclusion rules. Matthias Buchmeier (talk) 17:48, 6 August 2017 (UTC)
I think the load of -less variants is low quality. A bunch of these wouldn't even pass RfV. So be it German, multiword compounds or just plain English -less variants of words: we need better inclusion rules. The current inclusion rules allow rubbish like boatless while prohibiting travel game. They will also allow fietshater and perhaps even bike-hater while nothing prevents lab rat from being deleted. I think the five-bullet point list I made above is at the very least a good start. But if there's no chance of any change ever becoming policy, I might as well give up. In that case a completely new wiktionary needs to be started, which would be a downright shame. W3ird N3rd (talk) 06:30, 7 August 2017 (UTC)
boatless would easily pass RfV, and is a translation of an Egyptian term (iww, with a hook above the i) that would pass your translation terms argument. I recall an old dictionary has a page of un- compounds without definitions; it hardly hurts us to give stuff like that boilerplate entries.--Prosfilaes (talk) 09:23, 7 August 2017 (UTC)
Looks like boatless is a bit of an odd duck. It's not used a lot on websites (which is what I initially checked for), but quite a few books use the word. As for the translation, I wasn't aware of that and was only referring to the RfV. I don't terribly mind having such entries around, but it just feels like insanity to have those while not allowing entries that are not nearly as obvious "because SoP". W3ird N3rd (talk) 13:30, 7 August 2017 (UTC)

Kajkavian – language, dialect or something inbetween?

Recent changes to Kajkavian prove that there is a dispute in the linguistic community as to the classification of this dialect/language. I hate to see this entry be turned into a political battlefield, so let's decide once and for all – is it a dialect or language, and should this page be protected to avoid any future disputes? --Robbie SWE (talk) 10:12, 2 August 2017 (UTC)

I'd stick with the conservative option and call it a dialect, at least until it gets an army and a navy. Crom daba (talk) 10:52, 2 August 2017 (UTC)
It's still not settled. Its status has been disputed for a long time, but it has been classified as a dialect of Serbo-Croatian since about 1950 or so. Many of the Yugoslavs get very worked up about it, one way or another. I agree with Crom daba, I think we should keep it as a dialect until there is something closer to a consensus that it's a separate language. I thought about getting an opinion from User:Ivan Štambuk, but Ivan seems to be absent. I think it's been over a year since Ivan's last serious edit. —Stephen (Talk) 11:17, 2 August 2017 (UTC)
If you're interested in opinion of other Yugos, @Vorziblix, Biblbroks might respond. Crom daba (talk) 11:31, 2 August 2017 (UTC)
Opinions from other Yugos might be helpful, but only if they are linguists and are philosophically moderate. The last time we asked for Yugoslav opinions, everybody from the Serbian, Croatian, and Bosnian Wikipedias came here and we almost had a shooting war. With User:Ivan Štambuk, we knew his education and philosophy, so he was very helpful in things such as this. Ethnologue does not recognize it yet. SIL mentions it only as a literary language. I don't know what to make of that. —Stephen (Talk) 13:35, 2 August 2017 (UTC)
I don’t have a strong opinion either way, as I’m not knowledgeable enough about Kajkavian to say whether it would be more convenient to keep it merged or split. For reference, however, here’s an old discussion of this same subject with Ivan Štambuk. — Vorziblix (talk · contribs) 21:33, 2 August 2017 (UTC)
"at least until it gets an army and a navy" Wait, all I need to have my own language is an army and a navy? Why has no one told me this before! **starts gathering troops**
On a more serious note, you may want to look at and compare the West Frisian language, a dialect that relatively recently became recognized as a language. W3ird N3rd (talk) 15:18, 2 August 2017 (UTC)
We are completely indifferent to official "recognition". We consider things separate languages (and give them separate codes) based on linguistic considerations, though admittedly our results are not always consistent: we treat all Serbo-Croatian and Chinese varieties as a single language (each), but we treat Bokmaal and Nynorsk as separate languages. —Aɴɢʀ (talk) 16:23, 2 August 2017 (UTC)
In considering these types of questions, I would like us to put more emphasis on lexicographic convenience and less on "linguistic considerations"- that is, will splitting or merging these languages make it easier to maintain the dictionary? Will it make it easier for users to find information that they want? DTLHS (talk) 16:58, 2 August 2017 (UTC)
I think the Frisian case is still interesting to look at. People who only speak Dutch can barely if at all understand Frisian, but for a long time they were (for example) not allowed to use evidence in Frisian in court. It was not until 1980 that Frisian got the status of a required subject in primary schools. I think it also took a while before they got their own Wikipedia. And they are very, very, very proud of their language and it sounds like that is a factor with Kajkavian as well. If you are curious how different it really is, try https://www.youtube.com/watch?v=m1WTTX_ITIE. The narrator is speaking Frisian, the man who appears after 14 seconds into the video is speaking regular Dutch. For written text, try https://nl.wikipedia.org/ versus https://fy.wikipedia.org/. For a long time this wasn't recognized as a separate language. W3ird N3rd (talk) 17:08, 2 August 2017 (UTC)
We're already led by convenience, Serbo-Croatian wouldn't have won out were it not massively inconvenient to quadruple our work here. Crom daba (talk) 18:28, 2 August 2017 (UTC)
Only in some cases. We have both Scots and English and two different varieties of Norwegian (as well as just "Norwegian"). DTLHS (talk) 18:36, 2 August 2017 (UTC)

New competition

Hello. If anyone wants to play Emoji-Pictionary, I set up a game at User:WF on Holiday/Comp. As with most games I started in Wiktionary, there are probably loads of mistakes, loopholes, spellos, bad grammars and confusing instructions. But once we've got used to them, we can play happily. On a side note, I'm sure some of our Previous games could be modified by some tech-savvy folks in such a way as to allow normal people to play them. --WF on Holiday (talk) 23:18, 2 August 2017 (UTC)

Arbitrary behavior of certain administrators

There is an administrator being completely arbitrary on certain entries, as you might see here for example: [1] where he eliminates a translation of a word on the basis that he does not feel that it is a good translation, and yet the example he leaves in place about an LGBT film festival contradicts his assertion. This is, sadly, a consistent pattern and not merely one example; originally I had added "queer" as a translation while citing a specific example of it being translated that way in the name of an Israeli organization, and he eliminated it on the basis that he personally felt it did not fit and was offensive. His behavior is despotic; instead of requesting verifications he just acts as an absolute authority and is nothing but combative when I ask for simple things like justifications for his actions.

It's bad for the project because there are processes. He does not seem to be holding himself to the standards that other wiktionary users are held to, but acting as if it's his personal dictionary. He disagrees with a translation so instead of putting a RFV template on it, he just deletes it and locks the page.

Furthermore he's projecting a considerable amount in his responses, acting as if I am trying to impose my personal views when I am citing specific examples and he is citing no examples other than "I speak Hebrew," which I don't think is the way things normally go on Wiktionary? Like I speak Esperanto but I still have to justify my work on Esperanto terms, as 99% of Wikimedia users have to do.

I don't think Wiktionary was created so that certain people could impose their opinions without justifying them, and people who justify their edits by giving specific examples are treated as if they are troublemakers. I think it was created for the opposite reason and that fairness and transparency are still supposed to be important. Ligata (talk) 14:04, 3 August 2017 (UTC)

I recommend people engaging in a discussion over this take a look at the respective admin's talk page and think of the fact that Wiki-projects are known to prevent new users from joining by stubborn aggressive culture of long-term users. I also strongly advocate that the discussion here not get derailed by a smokescreen (talking about Hebrew definitions) but instead stay on topic (proper conduct and bureaucracy). Korn [kʰũːɘ̃n] (talk) 14:48, 4 August 2017 (UTC)
I had a similar issue with this travel edit. It may have been the wrong place, but I think those were some good examples. Instead of correcting it or requesting a fix/cleanup he just chucked it. In most cases that would be the end of it, but I mentioned him on this page asking to explain this. I don't expect most new users to be that assertive or to even notice their edit has been undone. He still hasn't shown up here and I thought he ignored it, but only just now do I see he did do something in response to that (or so the timelines would suggest): https://en.wiktionary.org/w/index.php?title=travel&diff=47159186&oldid=47158420 which is nice, but I think that would still benefit from the examples I had written. But I can't risk putting something back in that was removed by an administrator. I can understand his time is limited and he can't properly fix every mistake he finds. I get that. But isn't that what Wiktionary:Requests_for_cleanup would be for? W3ird N3rd (talk) 20:30, 4 August 2017 (UTC)
We don't even have time to resolve everything at WT:RFC as it is now (see all the archived unresolved requests). --WikiTiki89 20:38, 4 August 2017 (UTC)
Is that a valid argument for deleting/reverting edits that aren't perfect? The idea behind a wiki is that a valuable contribution doesn't have to be complete or perfect. But by reverting edits that are not perfect, you can quickly discourage any new users from hanging around. In the long term, you will indeed not have enough manpower to verify and clean edits. The cleanup request page isn't very well advertised, that may also contribute to this. W3ird N3rd (talk) 21:16, 4 August 2017 (UTC)
Some badly formatted entries are found many years after they are created. Thus, dealing with them as soon as they are noted is essential. —CodeCat 21:18, 4 August 2017 (UTC)
Some - so you just delete everything before anyone could even have a chance to fix it. If mice keep getting into your house, the solution is not to burn down your house. W3ird N3rd (talk) 01:40, 5 August 2017 (UTC)
If your house could do with some new furniture, but you can't afford any, the solution is not to fill it with mice... Equinox 10:25, 5 August 2017 (UTC)
But if many of your friends are carpenters, you might fill it with not-quite-perfect furniture and put a post-it on it to remind you something needs to be done about it, instead of sitting around in an empty house. And possibly chuck the nonperfect furniture anyway if it's still not fixed after a month. The very least IMHO is that the user who made the edit is (could possibly be partially automated) informed about what was wrong and what needs to be changed before putting that content back. Right now it's just "POOF, it's gone, and if you put it back you risk a ban". Like my examples for travel, I think they would now fit perfectly below the usage note, but I feel like it's a risk to put them back in because SemperBlotto is an administrator. I would have already done it had SemperBlotto been a regular user.
Obviously edits that you would consider mice (vandalism) are not what I'm talking about here. W3ird N3rd (talk) 13:05, 5 August 2017 (UTC)
I do sort of take your point. It's bad that we automatically revert every mess when some (10%? who knows?) messes contain something good. But the entries are public-facing. It suggests that maybe we need some kind of "limbo" or intermediate edit-o-space that allows stuff to exist before it's shown to every random visitor. I can't be the first wikidork to think of this. For now, although it's annoying, I think our approach is as good as it gets. Equinox 00:29, 7 August 2017 (UTC)
Wikipedia uses "Wikipedia:Pending changes" on controversial pages so that edits don't go live until they have been reviewed. We could perhaps apply it to all pages here, and patrol the log of pending changes needing review, instead of our current system of "patrolling" Special:RecentChanges, which some changes slip through. But the actual result might be an extremely large backlog of pending changes awaiting review. This was discussed at least once before; I don't recall many people having strong opinions, but enough opposed it that it wasn't implemented. - -sche (discuss) 06:01, 15 August 2017 (UTC)
  • Since the actions complained of are not administrative in nature, perhaps it would be better to title this section "Arbitrary behavior of certain editors". Cheers! bd2412 T 14:58, 5 August 2017 (UTC)
@BD2412 But there's a difference. If an administrator removes something, you can't put it back. Even if you slightly alter it and believe that is sufficient to fix it, you can't put it back because the user who removed it happens to be an administrator. If you do it anyway you risk a ban. This wouldn't trouble me nearly as much if a regular user had deleted it, I would just fix it and put it back without having to worry about it. W3ird N3rd (talk) 03:03, 6 August 2017 (UTC)
I don't think that's true at all. It would be a substantial misuse of administrative authority to use that authority in connection with one's own editing dispute. bd2412 T 03:06, 6 August 2017 (UTC)
This.__Gamren (talk) 08:38, 6 August 2017 (UTC)
@BD2412 User_talk:Stephen_G._Brown#Abuse_of_blocking_and_page-deleting_powers_by_SemperBlotto.3B_de-cratting_and_de-sysopping_required feels too much like that for me to risk it. While the user in question was wrong (and making silly demands), it makes it clear to me that putting back any content deleted by an administrator is risky. W3ird N3rd (talk) 09:18, 6 August 2017 (UTC)

Extinct species

Are there any categories for extinct species, or do they go in other categories? I just unearthed Kangaroo Island emu. DonnanZ (talk) 16:02, 4 August 2017 (UTC)

A taxonomic approach would just put them in existing categories (where they exist) alongside extant species. A language-centered approach would favour putting them somewhere else, and not mixing them with extant species. —CodeCat 16:12, 4 August 2017 (UTC)
There is a convention in taxonomic names to place the symbol "†" before the name unless such a symbol is not necessary due to context. (See practice on Wikispecies.) We have begun implementing the practice of putting the "†" on the inflection line for entries of extinct taxa and elsewhere if the word extinct is not already in a label.
English vernacular names do not use the symbol, so it is arguable that a categorical distinction might be useful for some purposes. For many purposes, however, the presence or absence of the word extinct together with the capabilities of search would be sufficient. DCDuring (talk) 22:12, 4 August 2017 (UTC)
There is no lexical value in saying if a particular species is extinct or not, any more than if a particular institution is defunct, or a person is deceased. All that matters is if the term still has some kind of usage or currency. —Justin (koavf)TCM 00:07, 5 August 2017 (UTC)
By what definition of lexical? Does lexical exclude definitions, ie, semantics? We have the word extinct in so many definitions. DCDuring (talk) 01:04, 5 August 2017 (UTC)

Can anyone get through to User:Jeff Weskamp?

They are adding Cherokee entries with manual transliterations, even though automatic transliterations work perfectly for Cherokee. This isn't really a big issue, but it's silly and so I left a message on their talk page. They don't seem to have noticed it at all, though, even after I sent another message. Is anyone able to get through to them? A user that ignores their talk is bad, even if they aren't currently causing trouble. —CodeCat 17:10, 4 August 2017 (UTC)

Jeff also edits other Native American languages, including Navajo. I have not checked all of his edits, but quite a few of them. Those I've checked always seem good, even if he adds transliterations unnecessarily. I have attempted to talk with him a time or two, but I don't believe he has ever replied to anyone. I've known other editors who try to avoid interpersonal communication, so it does not seem all that odd. Jeff just takes it to an extreme level. —Stephen (Talk) 22:11, 5 August 2017 (UTC)
Jeff is now adding improper categories to entries, so I hope they will start listening. —CodeCat 19:10, 19 August 2017 (UTC)

Languages distinguishing dotted and undotted i

Recently I added some code to distinguish dotted and undotted i (Iı, İi) in Turkish and Azeri sortkeys. Until now, they were merged by being converted to lowercase (→ iı, ii) and then uppercase (→ II, II) using English rules (mw.ustring.upper). Thus, words beginning with both i and ı were sorted under I when they were categorized using templates.
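The merge described above can be reproduced outside MediaWiki. The following is only a rough sketch in Python, not the actual Lua module code: the names `naive_sortkey`, `sortkey`, `TURKIC_I_LANGS` and `TURKIC_UPPER` are made up for illustration, and Python's case rules stand in for mw.ustring.upper, so details for capital İ may differ from what the module sees.

```python
# Hypothetical set of language codes whose sortkeys should keep
# dotted and dotless i apart (an assumption for this sketch).
TURKIC_I_LANGS = {"tr", "az"}

# Pair each lowercase letter with its own capital before the
# generic uppercasing step: i -> İ (U+0130), ı (U+0131) -> I.
TURKIC_UPPER = str.maketrans({"i": "İ", "ı": "I"})


def naive_sortkey(word: str) -> str:
    """Lowercase, then uppercase, using the default (English-style) rules."""
    return word.lower().upper()


def sortkey(word: str, lang: str) -> str:
    """Return a sortkey that distinguishes i and ı for the listed languages."""
    if lang in TURKIC_I_LANGS:
        # No lower() here: in Python it would turn İ into i plus a
        # combining dot, which is not wanted in a sortkey.
        return word.translate(TURKIC_UPPER).upper()
    return naive_sortkey(word)


if __name__ == "__main__":
    for w in ("ılık", "ilik"):
        print(w, naive_sortkey(w), sortkey(w, "tr"))
```

With the default rules both ılık and ilik come out under the same key, which is the merge described above; with the language-aware mapping they end up under I and İ respectively.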

Currently the fix only applies to Turkish and Azeri. Are there any other languages currently on Wiktionary that distinguish dotted and undotted i? — Eru·tuon 20:42, 4 August 2017 (UTC)

The following languages have entries with both dotted and undotted i's: Azeri, Crimean Tatar, Egyptian, English, Gagauz, German, Italian, Karakalpak, Tatar, Translingual, Turkish, Zazaki. DTLHS (talk) 21:10, 4 August 2017 (UTC)
Egyptian, German, Italian? And even English? Really? —CodeCat 21:14, 4 August 2017 (UTC)
Italian: dımlı, German: homurdanmayı, Egyptian: ḥtrı͗, English: Category:English terms spelled with ı. DTLHS (talk) 21:26, 4 August 2017 (UTC)
The Italian and German look like errors. The Egyptian is used with a combining diacritic, and it should just use a regular i. As for the English, most of them are probably better attested with a regular i and therefore should probably moved to those spellings. Regardless, English speakers would not treat i and ı as different letters, so sorting them together is correct. —CodeCat 21:31, 4 August 2017 (UTC)
@DTLHS I don't think the German entry you fixed is correct, still. In the lemma entry, the inflection table says it's the definite accusative form. —CodeCat 21:54, 4 August 2017 (UTC)
I guess, then, what I'm really asking is for which languages would we actually want the sortkeys to distinguish the two? — Eru·tuon 21:30, 4 August 2017 (UTC)

I'm going to guess that all the Turkic (and Turkic-influenced) languages in the list should have dotted and dotless i distinguished: in addition to Turkish and Azeri, Crimean Tatar, Gagauz, Karakalpak, Tatar, Zazaki. — Eru·tuon 21:51, 4 August 2017 (UTC)

(edit conflict) Judging by w:Dotted and dotless i, there's the potential in Turkic languages that use the Latin script, even as an alternative, but nowhere else except for ad-hoc use in romanization. Our entry for ı lists only Azeri, Crimean Tatar, Gagauz, and Turkish. Of course, texts in other languages can have names attested in their original spelling, but such cases are so rare that I doubt there are many (if any, at all) with dotting determining their order in any confusing way. Chuck Entz (talk) 21:57, 4 August 2017 (UTC)
I've put the languages that I listed above in a table in Module:languages. I should verify that each one actually has a regular orthographic system that uses the letters, though. — Eru·tuon 00:05, 5 August 2017 (UTC)
Okay, I looked at Wikipedia articles and Wiktionary categories, and Crimean Tatar, Gagauz, Karakalpak, Tatar, and Zazaki all seem to either regularly use dotted and dotless i, or have entries that use them. — Eru·tuon 00:31, 5 August 2017 (UTC)

Category name: "words pseudosuffixed with" or "words ending in"

Which naming convention should be used for suffixlike endings: "words pseudosuffixed with" or "words ending in"? Examples for both: Category:Esperanto words pseudosuffixed with -acio; Category:Esperanto words pseudosuffixed with -enco; Category:Hungarian words ending in -ikus. --Panda10 (talk) 23:58, 4 August 2017 (UTC)

I prefer "ending with", because I haven't heard "pseudosuffix" before, but I wonder how we could prevent the creation of ridiculous categories for every sequence of letters at the end of the word: like for naming, ending with -g, ending with -ng, ending with -ing (though that's a suffix), ending with -ming, ending with -aming. That is, what counts as a "pseudosuffix" or ending such that it gets to have a category? — Eru·tuon 00:03, 5 August 2017 (UTC)
I think these things (pseudosuffixes) are called formatives. Crom daba (talk) 00:26, 5 August 2017 (UTC)
Also desinence. --Vriullop (talk) 07:41, 5 August 2017 (UTC)
See also: previous discussion in July at Etymology Scriptorium.
"Desinence" typically means "inflectional" rather than derivational. With some overlap with "formative", there's also "formant", used to refer to endings that are not known to be certainly segmentable at all (so e.g. ölyv would have a "formant" -v). "Ending in" is probably good enough as a starting point, provided that we craft descriptions for these that clarify that they are not pseudo-rhyme categories (e.g. we would not want sing in a category "English words ending in -ing").
Something that specifies the etymological origin, such as "ending in Latinate -ikus" might work. This also prevents the risk of bloat through people starting to add "ending in -X" as useless "wrapper" categories for every "suffixed with -X" category.
I'm not sure how these categories should be meshed with the pre-existing suffix categories, though. Do we put them in parallel, or as a parent category for the corresponding proper suffix category? I would lean towards the former, with crosslinks from the category description, but I'm open to arguments in other directions. --Tropylium (talk) 07:48, 6 August 2017 (UTC)

Sanskrit vs. Old Indo-Aryan

Currently, Module:languages lists only Sauraseni Prakrit as a direct descendant of Sanskrit. This is IMO completely misleading because there is nothing to prove that Sauraseni is any more a descendant of the Vedic dialect of Old Indo-Aryan than any other Prakrit. A simple example is Sanskrit क्षेत्र (kṣetra, region), from Proto-Indo-Iranian *ĉšáytram. The regular outcome of *ĉš in Middle Indo-Aryan is "ch". This is found in all of the Dramatic Prakrits as "chetta" (alongside a "kh" form, that likely came later as part of artificial alignment with Sanskrit), including Sauraseni. Indeed, where Sanskrit simplifies Proto-Indo-Iranian clusters to क्ष (kṣa), the Middle Indo-Aryan languages preserve the original cluster. If Sauraseni were a direct descendant of Vedic Sanskrit we would see only "khetta", no "chetta". So, that being said, we have two options.

  1. Remove Sauraseni as a Sanskrit descendant – Note that CAT:Terms inherited from Sanskrit has been cleared out with Wyang's help, so no module errors will occur. This is keeping in line with our treatment of Sanskrit as only Vedic Sanskrit (+Classical Sanskrit), not all Old Indo-Aryan.
  2. List all of the Dramatic Prakrits (Sauraseni, Maharastri, Ardhamagadhi) as direct Sanskrit descendants – This was suggested at Category talk:Hindi Tadbhava, and would involve treating Sanskrit as a dialect continuum of all Old Indo-Aryan + Classical Sanskrit. WT:ASA would have to be modified accordingly.

Personally, I think either option is better than the status quo. —Aryaman (मुझसे बात करो) 04:00, 6 August 2017 (UTC)

Pinging @JohnC5, माधवपंडित, DerekWinters. —Aryaman (मुझसे बात करो) 04:01, 6 August 2017 (UTC)
I would prefer option #2, ie, considering Sanskrit to be the entire group of mutually intelligible dialects, for the sake of convenience. Wiktionary treats Avestan, Old Norse & Serbo-Croatian as one language while in reality they're all two or more dialects. We can do the same for Sanskrit. ɱɑɗɦɑѵ (talk) 04:46, 6 August 2017 (UTC)
Not to mention none of the non-Vedic dialects are (well-)attested. And we could always have a reconstructed entry *च्शेत्र/*च्षेत्र (cśetra/cṣetra) if it is needed. —Aryaman (मुझसे बात करो) 06:34, 6 August 2017 (UTC)
There is already dialectal diversity within "Sanskrit". Strictly speaking even Classical Sanskrit does not descend from Vedic Sanskrit precisely, but from a parallel dialect that was not written down until later. This in mind, we could probably treat all Middle Indo-Aryan (and most of New Indo-Aryan) as descendants of "Sanskrit". Where MIA diverges from Classical Sanskrit, it would be possible to create reconstructed Sanskrit forms (similar to Category:Latin reconstructed terms). Perhaps we could outright consider merging "Proto-Indo-Aryan" into Sanskrit? Same deal as how we already equate Latin with Proto-Romance. --Tropylium (talk) 07:58, 6 August 2017 (UTC)
I agree that Sanskrit should be the collection of OIA dialects put together. However we cannot merge it with PIA because we need PIA for the Mitanni language. DerekWinters (talk) 15:11, 6 August 2017 (UTC)

making Tagalog an LDL

This was supported in WT:RFVN#hagok by @Metaknowledge, Mar vin kaiser, Atitarev, Stephen G. Brown (I think). @Rgt2002, TagaSanPedroAko may also have opinions. Please discuss.__Gamren (talk) 08:19, 6 August 2017 (UTC)

I agree that Tagalog is an LDL. —Stephen (Talk) 08:42, 6 August 2017 (UTC)
I also agree that Tagalog is an LDL. --Mar vin kaiser (talk) 09:22, 6 August 2017 (UTC)
Do we have any quotations of Tagalog in use in Wiktionary? Do we know of any online corpora that we can use? Is http://sealang.net/tagalog/corpus.htm a usable corpus to find quotations in use? Can Tagalog texts be found in Google books? What methods can a third party use to verify that Tagalog is so poorly documented that we should allow single mentions for it? --Dan Polansky (talk) 09:55, 6 August 2017 (UTC)
Yes, Tagalog is very poorly documented here in Wiktionary, but thanks for me being a native speaker of Tagalog, I am making efforts to make Tagalog a largely documented language here, from being a least documented language, or a LDL. I agree that Tagalog is still a LDL, and yes, there will be efforts to add quotations showing sample use of Tagalog words for a certain sense. Maybe finding interesting quotes in Tagalog by notable persons, if not by Tagalog-language publications, may help. -TagaSanPedroAko (talk) 11:23, 6 August 2017 (UTC)
@TagaSanPedroAko: The discussion is not about whether Tagalog is well documented in the English Wiktionary but rather whether it is well enough documented on the Internet, by which the users of the phrase mean, whether there are enough quotations of Tagalog in use (not dictionaries) to be found on the Internet, since these quotations of Tagalog in use are what the English Wiktionary uses for verification, per WT:ATTEST. And there is a proposal to allow single mentions in dictionaries to suffice for verification of Tagalog; single mentions do not suffice for English, Spanish, German, and multiple other languages. --Dan Polansky (talk) 11:50, 6 August 2017 (UTC)
There are very few mainstream Internet sources for use in quotes that use Tagalog. The vast majority of Tagalog sources on the Internet will mostly be self-published, but if you can find one reliable one, like a book in Google Books or a Tagalog news website, then, here we go. I'm aware that there are reliable Tagalog (or Filipino) sources on the Net that attest use of certain words, but that will be difficult since the majority of Philippine Internet media use English. If I can dig up a reliable source, then, good. -TagaSanPedroAko (talk) 11:58, 6 August 2017 (UTC)
@Mar vin kaiser Thankfully, after I added Quiet Quentin to my user gadgets here in Wiktionary, I'll drop my position to make Tagalog an LDL. There are a lot of attestations of many Tagalog words in Google Books, so the language is, yes, well documented. -TagaSanPedroAko (talk) 12:32, 5 September 2017 (UTC)
@TagaSanPedroAko Can you attest hagok?__Gamren (talk) 06:49, 12 September 2017 (UTC)
@Gamren I found attestations for it, through Google Books (via the Quiet Quentin gadget). Looks like Tagalog is still a WDL, thanks that I used QQ for attestations. -TagaSanPedroAko (talk) 06:58, 12 September 2017 (UTC)

Unsolicited Babel requests


"Could you please add {{Babel}} to your user page? I'd appreciate it. --Dan Polansky (talk) 08:41, 5 August 2017 (UTC)"

I suppose @Dan Polansky means well, but in my book this is spam. The biggest problem I have with this is that he makes it look like it's a personal message. He says he re-types it every time he posts it, but it's still the same message every time. I wouldn't mind if he wrote a personal message for every request and explained why it would be so valuable to him to see that user getting a Babel, or if he would make it clear in the message that it's not really personal.

I personally don't appreciate these messages, but maybe it's just me. W3ird N3rd (talk) 10:17, 6 August 2017 (UTC)

  • The primary purpose of user pages is to give other editors an idea of an editor's competence in a particular language. Babel boxes are the best way of achieving this. Please add a babel box to your own user page (if and when you create one). SemperBlotto (talk) 10:22, 6 August 2017 (UTC)
I have seen plenty of users with a babel box, I thought about it and decided not to create a user page at this moment. If and when I do, I don't think I'll add a babel box. I don't really like them. W3ird N3rd (talk) 10:34, 6 August 2017 (UTC)
  • Funny how you're complaining about Dan "spamming" talk pages with something useful to the project... by spamming this forum page. —Μετάknowledgediscuss/deeds 00:44, 7 August 2017 (UTC)
  • I think you don't know what spam is. Spam means unsolicited bulk electronic messages. Being useful or not doesn't matter, although useful spam is less likely to be frowned upon. If you are getting e-mail that you didn't ask for from your local supermarket with various offers that you actually like, it's still spam. I have only brought this up here, nowhere else. I don't have any intention of posting this anywhere else either. I'm also not asking anyone to do or buy anything. You may find this pointless and you are entitled to your opinion, but that does not make this forum post spam. In my opinion the babelbox is getting enough exposure as it is. If such messages are accepted, it might lead to a slippery slope. I just wanted the community to be aware of this phenomenon, if the community thinks it's fine I'll say no more. W3ird N3rd (talk) 05:49, 7 August 2017 (UTC)
The Beer Parlour is the place to discuss these things. This discussion is not spam. That said, personally I'm OK with Dan requesting people to use babel boxes. Sometimes we need to know who speaks a certain language, and the boxes make that job easier. --Daniel Carrero (talk) 05:53, 7 August 2017 (UTC)
I think Babel boxes are a good thing, requesting them is a good thing, and not responding constructively to such a request is a bad thing. DCDuring (talk) 06:06, 7 August 2017 (UTC)
I also think that it pays for such a request to have some explanation of the purposes served. DCDuring (talk) 06:08, 7 August 2017 (UTC)
Adding a Babel table should be our standard policy, if it's not already. A standard {{welcome}} message includes that request. If users refuse to tell other users what languages they know or don't know, they should go somewhere else. Not knowing a language doesn't mean that you can't edit in that language, but other editors can check your edits accordingly or monitor edits. --Anatoli T. (обсудить/вклад) 06:21, 7 August 2017 (UTC)
Technically you've fulfilled the request. You've added {{Babel}} to your userpage. Wyang (talk) 06:29, 7 August 2017 (UTC)
And now that we know that you can't speak any languages, any of your contributions will be ignored. SemperBlotto (talk) 06:42, 7 August 2017 (UTC)
[2]suzukaze (tc) 06:48, 7 August 2017 (UTC)
But, at an earlier, saner time: [3]. DCDuring (talk) 11:16, 7 August 2017 (UTC)
I don't exactly like that W3ird N3rd doesn't have a Babel box, but if the user doesn't want one don't make them feel forced to have one. Some very contributing members of Wiktionary don't have user pages at all. That said, W3ird N3rd isn't exactly spamming this forum, but I just don't feel like this discussion is appropriate for the beer parlour, especially since it's targeted at one user alone (Dan). PseudoSkull (talk) 01:49, 8 August 2017 (UTC)
It feels a bit out of place indeed, but I've looked around and Wiktionary:Information desk, Wiktionary:Tea room and Wiktionary:Grease pit were clearly the wrong places. Although this post is indeed about one user, my comment was about the phenomenon. I don't know if any other users are doing this, but what I said would apply to them all the same. My biggest issue is probably this line: "I'd appreciate it." which was repeated for all users. Maybe it's because I'm Dutch (the Dutch are known for being direct), but I just can't stand it when someone pretends to care.
Just one more thing. I mentioned the possibility of a slippery slope. One of the reasons I don't want a babel box is because (depending on how many languages you know) it looks like a unicorn just barfed a rainbow. We all know the average Wikipedia user page looks like a Christmas tree and while it won't happen overnight, it must have started somewhere and the road to hell is paved with good intentions. It may not happen at all - but if users start pushing a template, even if this one now is a useful one, it might. I believe it would be wiser not to allow any users to promote templates this way and if it is believed the babel box isn't getting enough exposure, have the administrators decide on a way to inform users. But clearly, I'm standing alone on this one. W3ird N3rd (talk) 03:26, 8 August 2017 (UTC)
If the issue you have with the babel box is too much unicorn barf on user pages, then you could use a different method to give information on what your native language is and what your levels of proficiency are in other languages. — Eru·tuon 03:53, 8 August 2017 (UTC)
There's no slippery slope here: Wikipedia-style user boxes aren't allowed, with the exception of Babel, time zone, and maybe one or two others that provide useful information. That's the way it's been since long before I started here 5 years ago, and I doubt it will change. Chuck Entz (talk) 04:40, 8 August 2017 (UTC)
  • I also see no slippery slope. The Wiktionary community has been very careful to avoid unicorn barf.
And I also see no disingenuousness on Dan's part. I, too, appreciate it when users add Babel boxes to their user pages -- at least, when those Babel boxes are at least vaguely accurate, as they provide the community with useful and usable information on who understands which languages, and roughly to what degree. For a multilingual dictionary project, this kind of user metadata is very useful.
FWIW, W3ird N3rd's behavior comes across as immature, and willfully disrespectful of Wiktionary norms, albeit on a minor scale that's more of a slight annoyance than anything actionable. I suspect some of his (her?) reticence comes from the Wikipedia culture and a lack of familiarity with the Wiktionary project. On Dan's part, I see no spam, and nothing inappropriate in asking for a Babel box.
I hope W3ird N3rd can learn more about how Wiktionary functions, and grow to be a comfortable and productive member of the community. ‑‑ Eiríkr Útlendi │Tala við mig 06:09, 8 August 2017 (UTC)
If my contributions in the main dictionary space are not productive, I might as well stop contributing. It's not going to be all that much better in the future. I thought I was being productive, but thanks for pointing out to me that I'm not. I know you think this is immature, but why should I care? Either I really am not productive, in which case you should just think "good riddance" or I am but you insult me (at the very least that's how this comes across), in which case why should I stay? W3ird N3rd (talk) 14:21, 8 August 2017 (UTC)
  • My perspective: 1. Yes, distributing the same message electronically to a larger number of people is spam. Textbook definition. 2. I see no harm in every user receiving this spam message once as it is merely a request for a useful addendum. 3. This is a Wiki-project, not Lord of the Flies, Jante or a Catholic School in a Celtic country. Wiki itself is based on and centered around voluntary contributions. Of course the community can come together and regulate things to prevent harmful additions to the project, but demanding any user share any information on himself or add a specific thing, that is: Forcing involuntary contributions, is the fucking opposite of what this project is supposed to be and everyone who entertains that trail of thought is indeed about to open Pandora's Box and pervert Wiktionary (an open project where everyone can partake) into a generic online dictionary run by a junta of seniors. Korn [kʰũːɘ̃n] (talk) 10:11, 8 August 2017 (UTC)
Just the request isn't even what bothers me most. Had it been worded like "Could you please add {{Babel}} to your user page? The Wiktionary community would appreciate it." I wouldn't have been even close to as annoyed as I was now. I know what many here will say: "what am I complaining about, that's hardly any different at all, what sort of moron are you, yadda yadda yadda". To me this would make all the difference. It would make it clear Dan isn't personally asking me to do this, he is asking on behalf of the Wiktionary community. Which also means that if I decide to ignore it, I'm not letting Dan down personally. To me, that's a big difference. Again, I don't expect anyone to side with me. It's just my opinion. Yes it is a stupid opinion. I'm a stupid person and there's no need to further comment on that, I admit it, move on. W3ird N3rd (talk) 14:21, 8 August 2017 (UTC)
Instead of "Could you place Babel to your user page? I'd appreciate it," you wanted "Could you please add Babel to your user page? The Wiktionary community would appreciate it"? I can't see the difference and English is my native language. Dan is Czech and he does not have a perfect command of English. Most of our editors have a different language as their first language. It has never occurred to me to be offended by English comments that are not just so. I think most people write the best they can and they don't mean to offend or confuse. The reader should bear some of the load of communication by showing a more tolerance and understanding. It improves the atmosphere. —Stephen (Talk) 16:03, 8 August 2017 (UTC)
I tried to explain it, I'll do it again knowing full well it won't make a difference. If you say "Please do X, I'd appreciate it." I feel like I'm letting you down when I don't do it. (and the community may or may not care about X) If you say "Please do X, the community would appreciate it." it tells me the community in general would prefer this, I'm not letting you down personally if I don't. I wouldn't even think this difference, or at least what I perceive as a difference, would be language-dependent. I suppose not every individual would recognize this difference though. And maybe somehow I'm the only one. In which case I'm wrong and my faulty interpretation led to a long and useless argument of misunderstanding and contempt. Well, if my understanding of the English language is that shitty I probably shouldn't be here anyway. Which was another reason I wouldn't want to add a babel box: I can't judge to what degree I master any language. W3ird N3rd (talk) 16:38, 8 August 2017 (UTC)
@W3ird N3rd: I don't think that I would feel like I'm letting anybody down by not adding a Babelbox, no matter how the message asking for it was worded. It's really not that important to discuss this imo. —Aryaman (मुझसे बात करो) 04:55, 11 August 2017 (UTC)
It seems to me your English is just fine. Personally, I disagree that Dan's phrasing was due to him being Czech. I suspect he prefers in general not to speak on behalf of "the community". But I could be wrong. — Eru·tuon 17:23, 8 August 2017 (UTC)
Indeed, I don't like to speak on behalf of community. The Babel practice is common but the appreciation is mine. --Dan Polansky (talk) 10:47, 19 August 2017 (UTC)
For example this sentence: "The reader should bear some of the load of communication by showing a more tolerance and understanding.". To me, this seems wrong. (the most obvious fix to me would seem to be to change "a more" to "a little more") It could be a joke (writing a broken sentence to prove your point), a genuine error (even a native could make mistakes) or (which would seem more likely as English is not my native language) this is correct but I just don't understand it. I also think I don't write text the way most people do today: I don't use any kind of spell checker or autocomplete. That may also result in me looking at language in a different way. W3ird N3rd (talk) 17:01, 8 August 2017 (UTC)
(An academic discussion on what is spam) "distributing the same message electronically to a larger number of people is spam": Not really. In my job, I receive job-related emails from management that are distributed to a larger number of people, and they are obviously not spam; spam filters are not designed to remove these kinds of messages. A message related to Wiktionary purpose posted in multiple instances on Wiktionary is not necessarily a spam. The definition of spam is not so simple as some people think; I don't think I have a good comprehensive definition. Being posted to a larger number of people is a component of being a spam, but that alone does not suffice. By the way, our welcome messages are much more of a spam than these requests for Babel given how long they take to read. --Dan Polansky (talk) 10:47, 19 August 2017 (UTC)

Weird arrow next to uses of {{taxlink}}?

@Erutuon, DCDuring, Sgconlaw There is a weird arrow that sometimes appears next to the name of species and such that are formatted using {{taxlink}}. What's its purpose? It looks wrong, and is mentioned nowhere in the documentation. Can we get rid of it? For an example, see пога́нка (pogánka). Thanks! Benwing2 (talk) 20:44, 6 August 2017 (UTC)

I have categories to detect the conditions that cause them, which I consulted as soon as I saw "weird arrow" in the alerts. I found поганка in one of the categories and eliminated it. If they occur when you use taxlink, that means we already have an entry for the taxon involved and the template should be removed. Besides the situation of a new use of the templates there can be "many" entries that are affected by adding a new taxon or vernacular name. When I add either type of entry I try to eliminate any uses of the template in linked entries that would generate the "weird arrow". I will add something about this in the documentation for the two templates, though I don't expect it will be consulted, this being the first time it has come up, though I might be wrong. DCDuring (talk) 21:06, 6 August 2017 (UTC)
Also, I watch the category (as well as most other taxon-related categories) and would have detected the entry the next time I checked my watchlist. DCDuring (talk) 21:09, 6 August 2017 (UTC)
I remember seeing the "weird arrow" before. DCDuring, wouldn't it be sufficient for the template to place entries that require your intervention in the category, without the arrow also appearing? — SGconlaw (talk) 21:23, 6 August 2017 (UTC)
We already have such categories, which I aggressively police to keep empty.
The trouble is that it takes me quite a while to find the instances of redundant templates without using ctrl-f on the displayed text to find "=>". It is always at least a bit faster with the "=>". The problem is worst in entries with unusually large Hyponyms or Derived terms sections, with multiple L2 sections, with the use of {{taxlink}} or {{vern}} in the middle of definitions for polysemous terms or in unexpected locations.
If someone knew a way so that something displayed in the entry that optionally only an anointed few (me included) could see, we could eliminate the need for anyone to consult and grasp the documentation to eliminate the offending "=>". DCDuring (talk) 21:40, 6 August 2017 (UTC)
No idea how to do that. Maybe it could be made more understandable by replacing it with some reduced-size text like "needs attention" (compare the "Invalid ISBN" warning generated by {{ISBN}}), but I don't know whether you think that would make the warning too prominent. — SGconlaw (talk) 21:49, 6 August 2017 (UTC)
The offending "=>" can be eliminated with CSS. If we enclose this symbol in a HTML tag with a unique class name (say class="taxlink-redundant"), and create a CSS style rule that vanishes it (display: none;), which can either be placed in the HTML tag or in MediaWiki:Common.css, then the symbol can be un-vanished at will. Putting the style rule in MediaWiki:Common.css requires the help of an admin. Let me know which option you would prefer and I can give further help. — Eru·tuon 21:56, 6 August 2017 (UTC)
@Erutuon:'s solution seems great. I'm an admin. I would just need to be instructed as to what to put where so that I could still see the "=>" (which has the advantage of being easy to type and rarely used except for this purpose). The name for the style could be something like "redundant template finding aid" or a comprehensible abbreviation of that. DCDuring (talk) 22:16, 6 August 2017 (UTC)
I guess its value as a recruitment tool for proper (non-redundant) use of {{taxlink}} and {{vern}} is not much of a consideration. DCDuring (talk) 22:18, 6 August 2017 (UTC)
Why shouldn't I be taking the approach of having some red text telling folks that they should remove the offending template? I think there is precedent for that. It might even be in continuing use. DCDuring (talk) 22:21, 6 August 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── (edit conflict) Well, I suggested the class name taxlink-redundant, but you can choose a different one. Another idea: redundant-taxlink-mark? Whatever you choose, it should be made up of basic Latin and hyphens. Spaces will be misinterpreted. The code to add to MediaWiki:Common.css (the period . indicates that what follows is a class name):

.taxlink-redundant {
    display: none;
}

And the code to add to your Special:MyPage/common.css:

.taxlink-redundant {
    display: inline;
}

And then, in the template code for {{taxlink}}, replace <sup>=></sup> with <sup class="taxlink-redundant">=></sup>.

If you want to use a different class name, just replace taxlink-redundant in each of the three code snippets with whatever class name you choose. — Eru·tuon 22:33, 6 August 2017 (UTC)

If you want to keep the mark, how about changing it to a message with instructions that only displays in preview mode? For example, <sup class="error previewonly"><small>(Replace {{temp|taxlink}} with a regular link.)</small></sup>. Admittedly, that will be even more annoying than the little arrow thingy. — Eru·tuon 22:37, 6 August 2017 (UTC)

I was hoping to use the same class for both {{vern}} and {{taxlink}}. It might be useful for other similar applications though I don't know of any.
We have plenty of instances of the much more annoying technique used to enforce correct use of templates by displaying 80 or more characters of red text, sometimes with incomprehensible messages buried in them.
I will sleep on this before implementing and give others a chance to weigh in, but thanks for the implementation suggestion. It seems to fit the bill perfectly. I take it that CSS is not much more burdensome on server resources than HTML and doesn't raise the risk of latency problems like JS. DCDuring (talk) 23:15, 6 August 2017 (UTC)
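A minimal sketch of how a single class could serve both {{taxlink}} and {{vern}}, as hoped for above. The class name redundant-template-mark is hypothetical (any name made of basic Latin letters and hyphens would do), and the sketch assumes both templates wrap their mark in a tag carrying that class. It also relies on personal CSS being applied after the site stylesheet, as the snippets above assume.

```css
/* MediaWiki:Common.css (site-wide, needs an admin):
   hide the mark from every reader by default.
   "redundant-template-mark" is an illustrative name, not an existing class. */
.redundant-template-mark {
    display: none;
}

/* Special:MyPage/common.css (personal):
   show the mark again for yourself only.
   Same selector, so the later personal rule is expected to win. */
.redundant-template-mark {
    display: inline;
}
```

With one shared class, any further template that needs the same kind of finding aid could reuse it without more CSS.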
@Erutuon: You had mentioned above that we could accomplish the optional display of default-hidden text if we "create a CSS style rule that vanishes it (display: none;), which can either be placed in the HTML tag or [] ". Where exactly would the HTML tag reside? DCDuring (talk) 18:47, 8 August 2017 (UTC)
The HTML tag that I mean is the <sup>=></sup> that appears in the template source code. — Eru·tuon 18:51, 8 August 2017 (UTC)
That seems like a better implementation, since evidently I am the only one using and virtually the only one aware of this. I could include a reference to the decloaking technique in the documentation for {{taxlink}} and {{vern}}. No adminship required either. DCDuring (talk) 20:01, 8 August 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Well, if you choose that option, you will have to use the following code in your common.css:

.taxlink-redundant {
    display: inline !important;
}

The !important makes the style rule overrule the CSS in the HTML tag; otherwise, the CSS in the tag will win out. — Eru·tuon 20:37, 8 August 2017 (UTC)

That is what I will do. Thanks for the help. If I have problems, I will see you on your talk page. DCDuring (talk) 20:53, 8 August 2017 (UTC)
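For reference, the option chosen above (hiding the mark with CSS placed in the HTML tag itself) could be sketched as follows. The exact markup in the comment is an assumption about how the template's tag would be written; only the class name and the !important rule come from the discussion.

```css
/* Assumed template markup, with the hiding rule inline in the tag:
     <sup class="taxlink-redundant" style="display: none;">=></sup>
   Nothing is needed in MediaWiki:Common.css for this variant. */

/* Personal common.css: an inline style normally takes precedence over
   a stylesheet rule, so !important is needed to make the mark visible. */
.taxlink-redundant {
    display: inline !important;
}
```

The trade-off is that no admin edit to the site stylesheet is required, at the cost of each interested user needing the !important rule in their own CSS.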

reading out "-"

In results, such as 2-0, how is the hyphen spelled out? I think such a pronunciation should be added to its entry --Backinstadiums (talk) 21:16, 6 August 2017 (UTC)

Isn't it silent in many cases? The score would just be read as "two nil". I suppose on occasion it would be read "two to nil". (Also, it should really be an en dash.) — SGconlaw (talk) 21:24, 6 August 2017 (UTC)
@Sgconlaw: The singer Tee Grizzley, in his song First Day Out, says "two and o" for 2-0 at min. 3:48 --Backinstadiums (talk) 22:10, 6 August 2017 (UTC)
I agree with SGconlaw - two nil. It can be different in broadcast results, if the home team loses it would be "Team xx nil, Team yy two". DonnanZ (talk) 23:31, 6 August 2017 (UTC)
If we're talking about sports scores and the like, no one would ever say "two nil" or "two to nil" in Canada. It would be one of the following (in rough order of frequency): "two nothing", "two to nothing", "two to zero", or possibly "two zero". Andrew Sheedy (talk) 18:40, 7 August 2017 (UTC)
I just realized this is irrelevant, as the discussion is about the hyphen, not the 0, but oh well. Andrew Sheedy (talk) 19:03, 7 August 2017 (UTC)
:-D — SGconlaw (talk) 03:46, 8 August 2017 (UTC)
It could be either read as "to" or as nothing. When counting game wins rather than the in-game score, it's often read as "and" (in the US at least), as in "My friend and I are five-and-three", although this is more often done for wins-vs-losses of a single party (this is the case in Backinstadiums's song reference above, even though those are trials and not literal "games"). --WikiTiki89 18:55, 7 August 2017 (UTC)

Missing category?

I can't find any category for Washington, D.C. or District of Columbia, only for the state of Washington. I guess there should be one, but what should the name be? DonnanZ (talk) 23:17, 6 August 2017 (UTC)

  • The main form is at "Washington, D.C." so I made a category for that. There are plenty of words that refer to the District. Good call. —Justin (koavf)TCM 23:29, 6 August 2017 (UTC)
Brilliant, thanks. DonnanZ (talk) 23:37, 6 August 2017 (UTC)

Share your thoughts on the draft strategy direction

At the beginning of this year, we initiated a broad discussion to form a strategic direction that will unite and inspire people across the entire movement. This direction will be the foundation on which we will build clear plans and set priorities. More than 80 communities and groups have discussed and given feedback on-wiki, in person, virtually, and through private surveys[strategy 1][strategy 2]. We researched readers and consulted more than 150 experts[strategy 3]. We looked at future trends that will affect our mission, and gathered feedback from partners and donors.

In July, a group of community volunteers and representatives from the strategy team took on a task of synthesizing this feedback into an early version of the strategic direction that the broader movement can review and discuss.

The first draft is ready. Please read, share, and discuss on the talk page. Based on your feedback, the drafting group will refine and finalize this direction through August.

SGrabarczuk (WMF) (talk) 16:11, 8 August 2017 (UTC)

Unsorted formations

I've seen Unsorted formations in descendant trees formatted as either * Unsorted formations or ; Unsorted formations. Is there a written guideline for this? --Victar (talk) 16:48, 8 August 2017 (UTC)

The standard practice is with *, so that it's listed on the same level as all other formations. —CodeCat 15:55, 9 August 2017 (UTC)
@CodeCat: Is that outlined in a guide or something somewhere? Like I said, I've seen both, so there doesn't seem to be a true "standard". @JohnC5? --Victar (talk) 20:47, 11 August 2017 (UTC)
@Victar: I've always used ;. On an unrelated note, Victar, please don't just start moving around entries (specifically the new entries) without discussing with anyone. I'm not convinced that was a good choice and may now have to revert all that. If the is not phonemic then it should not be included; if it is then it shouldn't be subscript. It is extremely frustrating that you just did this. —JohnC5 02:43, 12 August 2017 (UTC)
@JohnC5: I only moved two entries; not the end of the world. Also, very unrelated and should have been discussed elsewhere. --Victar (talk) 03:03, 12 August 2017 (UTC)

Quotations vs. Citations

I'd like to know the protocol for using them, as well as the differences they are meant to represent --Backinstadiums (talk) 14:43, 9 August 2017 (UTC)

I don't know if we have a standard for the terms, but I have been using the term citation to refer to sources providing evidence for information stated in entries, which are usually placed in "References" or "Further reading" sections. The {{cite}} and {{R:}} groups of templates may be used for this purpose. On the other hand, a quotation is an extract from a source that is provided as an example of the entry in use, and which is placed directly under a definition. The {{quote}} and {{RQ:}} groups of templates are used for them. For example, at merlion, there is one "citation" in the "References" section, and a number of "quotations" under the various definitions. However, note that entry pages have a tab called "Citations" which really contains quotations. — SGconlaw (talk) 15:07, 9 August 2017 (UTC)
I've been confused by this as well. Seems like they are used interchangeably. I've seen plenty of quotations from books that are over a hundred years old, providing no clue of how the entry is used today or how you could use it yourself. Personally I prefer examples. They don't come in a collapsed box, there is no question about proper citing due to copyright issues and they are designed to show how the entry is and can be used without clutter. Personally I put quotes and citations all the same on the citations page. W3ird N3rd (talk) 17:16, 9 August 2017 (UTC)
here are my unpopular opinions: quotations and citations are used interchangeably, I don't think there's a meaningful distinction. "Examples" are made up and may not reflect actual usage, there are no potential copyright issues with quoting from parts of works. The citations page is at best useless and at worst actively harmful and should not be used except to collect evidence for missing words or senses. DTLHS (talk) 17:26, 9 August 2017 (UTC)
If examples don't reflect actual usage they are likely to be bad examples. Copyright issues could arise if a quotation is too long or not properly attributed and laws for this possibly vary around the world. Quotations often are not reflecting actual (common) usage either, so I don't think that's a good reason to have them. W3ird N3rd (talk) 20:18, 9 August 2017 (UTC)
My understanding is that Wiktionary's servers are based in the USA, so it is primarily US law that must be complied with. It is unlikely that the quotations we use would violate copyright for two main reasons. First, all material published before 1923 is in the public domain in the USA and can be freely reproduced. Secondly, most of our quotations are obtained from works available on Google Books and the Internet Archive. If it is possible to view either a snippet or a full page preview of a book on Google Books, then the use of that portion of the book must be fair use under the law. Ergo, quoting an even shorter portion on Wiktionary must also be fair use. — SGconlaw (talk) 11:29, 10 August 2017 (UTC)
I wouldn't go so far as to say that availability on Google Books is indicative of anything, but the amount of text in the kind of quotes we use should fall under fair use. If the quote is too long for fair use, it's way too long for our purposes. Chuck Entz (talk) 14:06, 10 August 2017 (UTC)
In the books I've been reading lately, I've come across at least one to two dozen words in each one that we don't have entries for. Is it safe to take quotations for each of those words from the same book? How many quotations should I limit myself to to avoid violating fair use? Andrew Sheedy (talk) 17:47, 10 August 2017 (UTC)
@Andrew Sheedy First of all you should obviously check those words haven't been made up by the writer and they pass WT:CFI. There is no limit. For each word you should limit yourself to one or two quotes, there is no point anyway in having more quotations from the same work. As for quoting from the same work but on different page entries on Wiktionary, I would say that if the total amount quoted from the work is less than 5% of the entire work you have absolutely nothing to worry about. For a book that means there is no practical limit. For a poem a bit more would be allowed, some poems just might accidentally end up being entirely quoted here in small bits. As long as there's no obvious intention to violate copyright by overquoting a work, you'll be safe. W3ird N3rd (talk) 19:11, 10 August 2017 (UTC)
I'd completely forgotten about the 5% rule--thanks for reminding me. And don't worry, I always make sure to find citations for words before I add them (which is the main reason I haven't gotten around to adding more...). Andrew Sheedy (talk) 17:15, 11 August 2017 (UTC)
@Andrew Sheedy not sure if you are being sarcastic, genuinely grateful, referring to the 5% rule in general or if there really is a 5% rule for quotes/citations. It's just a number I picked, it could have also been 1%, 3%, 7%, etc. The point remains the same however, for fair use (which I think includes the right to quote for the U.S.) it would generally be a reasonably safe threshold. It could be exceeded in various cases, if I wrote a review for a poem that is twice the length of the original poem, there's a good chance I could cite the entire poem in small pieces. But under 5% for all quotes combined you simply don't have to worry about it - which is the majority of the time. W3ird N3rd (talk) 18:33, 11 August 2017 (UTC)
I thought it was an actual rule (i.e. you can legally reproduce 5% of a work). Maybe it is in Canada? I'll have to look that up. Andrew Sheedy (talk) 02:40, 12 August 2017 (UTC)
Yes, Wiktionary servers are in the U.S., but Wiktionary content might be reused by people in other countries without fair use. W3ird N3rd (talk) 19:11, 10 August 2017 (UTC)
Not that it is terribly relevant to the conversation, but Wikimedia servers are not all based in the US, nor should we expect that they will reside in the US exclusively in the future. - TheDaveRoss 12:53, 11 August 2017 (UTC)
Indeed irrelevant to the discussion at hand. I suspect servers outside the U.S. are caching servers, caches have different rules, but if someone wants to know more they should start a new discussion. W3ird N3rd (talk) 05:06, 12 August 2017 (UTC)
  • I think (but this is just my interpretation) that citations and the citation page are perhaps meant for long quotes. "I have a dream", "Yes we can" or "Build a wall" would be quotes. This would explain why quotes are allowed in the main dictionary space: copyright generally shouldn't apply to a quote. For example:
We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard
This is a quote and there's pretty much no chance this is copyrighted, similar to the moon not being copyrightable. However:
We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard, because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one which we intend to win, and the others, too.
Is citing Kennedy and can likely be copyrighted, so it requires proper attribution (what counts as proper depends on the country you are in) and/or must be allowed by fair use. W3ird N3rd (talk) 19:11, 10 August 2017 (UTC)
I doubt very much if the latter quotation is a breach of copyright. It is still only a small portion of the entire speech. It might be a different matter if we reproduced, say, a third or half of the speech, but that isn't what we do anyway. I agree with Chuck that the quotations we use here at the Wiktionary are unlikely to raise copyright issues. — SGconlaw (talk) 21:50, 10 August 2017 (UTC)
Fair use is a lot more permissive than the right to quote, but is specific to the United States. W3ird N3rd (talk) 03:25, 11 August 2017 (UTC)
For what it's worth, speeches made by government officials in their official capacity are in the public domain, so none of that speech is copyrighted in the US. - TheDaveRoss 12:58, 11 August 2017 (UTC)
That's true for this example, I should have mentioned that. Thanks. W3ird N3rd (talk) 05:06, 12 August 2017 (UTC)


Do we have a page which lists all entries containing "Idiom" as a headword? If not, can we get one made? I guess we prefer Verb rather than Idiom for things like take an axe to. -WF

  • Before it was declared forbidden, I've used the ====Idioms==== header in the past for set expressions using a particular word, such as at 糞#Idioms. I see that some other JA entries have these expressions listed under ====Derived terms====, which doesn't seem quite right either, as these aren't "terms", sometimes comprising even full sentences.
What is the accepted header for these items now? ===Verb=== is not applicable for most of the Japanese expressions I can think of. ‑‑ Eiríkr Útlendi │Tala við mig 17:24, 9 August 2017 (UTC)
Please limit this to English-only. “Idiom” is the conventional translation for the Chinese part of speech of chengyu. Wyang (talk) 07:42, 10 August 2017 (UTC)
Are we going to have "Haiku" as a part of speech next? —CodeCat 09:45, 10 August 2017 (UTC)
Tangentiality. Are you OK? Wyang (talk) 09:51, 10 August 2017 (UTC)
That doesn't seem a part of speech coordinate with noun, verb, adjective, but I don't know how it would be used. Are they not used as nouns, verbs, adjectives, or something else? It seems like using "Word coined by Shakespeare" as a part of speech header. Admittedly, we have "Proverb", which might be similar. — Eru·tuon 17:32, 10 August 2017 (UTC)
A proverb is generally nothing more than a sentence. —CodeCat 17:40, 10 August 2017 (UTC)
  • Great discussion, but what I wanted was a page listing all entries with {{head|en|idiom}} —This unsigned comment was added by WF back from hols (talk • contribs) at 15:26, 10 August 2017 (UTC).
    There's no way to get a single-page listing, but you can search the wikicode for insource:/\{\{head\|en\|idiom/. [edit: Actually, it is a single page, just because there are so few.] — Eru·tuon 20:32, 10 August 2017 (UTC)
    It's also possible to de-list "idioms" from Module:headword/data. Then it would no longer be recognised as a valid POS category and end up in Category:head tracking/unrecognized pos. —CodeCat 20:35, 10 August 2017 (UTC)
    Thanks Erutuon! That's exactly what I wanted. I've been making my way through those pages. A little cleanup done, and a few of them sent to RFx. --WF back from hols (talk) 23:40, 11 August 2017 (UTC)
    Without a plan on what to do after that, that would be unwise. There are 2,305 entries using {{zh-idiom}} (insource:/\{\{zh-idiom/) and 4,317 entries in the category for Chinese idioms, so that tracking category would become cluttered. And the cooperation of editors who handle Chinese would be needed to get the entries moved to the proper part of speech. — Eru·tuon 21:16, 10 August 2017 (UTC)
    I really hate the mentality that everything that is “improper” in English is assuredly improper in other languages by default, and needs to be “fixed”. Idiom is a perfectly fine part of speech in Chinese, and is in fact the most common translation for Chinese chengyu. Chinese lexicography treats these as a separate category of words, and there are numerous dictionaries compiled just for words belonging to this category. The comprehensive Chinese dictionaries typically do not mark entries by their part of speech, due to the analyticity of the language. In those monolingual and bilingual dictionaries that do, these words are either marked by  成  (cheng, idiom) (primarily in bilingual dictionaries) or unmarked (in Chinese–Chinese dictionaries), in juxtaposition to  名  (noun),  动  (verb),  形  (adjective),  副  (adverb),  惯  (phrase),  谚  (proverb),  歇  (xiehouyu), etc. Examples include the Contemporary Chinese Dictionary, the Comprehensive Standard Chinese Dictionary, the Oxford Chinese Dictionary, the Times New Chinese–English Dictionary and so on. The same idiom can be used as noun, verb, adjective, adverb, etc. depending on the context in the sentence, and their use is different from that of proverbs, phrases, and xiehouyu. Wyang (talk) 22:27, 10 August 2017 (UTC)
    Ahh. If they can be used as multiple other parts of speech, then I can see the lexicographic usefulness of keeping them as they are rather than trying to list all the other parts of speech they can be used as. However, it would be helpful to distinguish them somehow from the concept of idiom in English, which is quite different. The description in the category page Chinese idioms is probably not correct. — Eru·tuon 22:53, 10 August 2017 (UTC)
    I guess those entries should also contain {{lb|en|idiom}} so they show up in Category:English idioms if they are indeed an idiom. Since there are so few that shouldn't be a problem. Some recently disappeared already so this looks like it's getting phased out. W3ird N3rd (talk) 20:47, 10 August 2017 (UTC)
    But is idiom a context in which the word appears? If not, then it might be a misuse of the label template. —CodeCat 20:52, 10 August 2017 (UTC)
    I agree that the POS should be based on how they are used and not where they come from. Thus, I would say "proverbs" should really have the POS "clauses". --WikiTiki89 21:00, 10 August 2017 (UTC)
    Possibly, but in that case the English idiom category will have to be populated in some different way. insource:/\{\{lb\|en\|idiom\}\}/ gives 210 hits. W3ird N3rd (talk) 21:34, 10 August 2017 (UTC)
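
Note on the two searches quoted in this thread: the insource: patterns are ordinary regular expressions, so their behaviour can be checked offline. The following is only a minimal sketch in Python, not how the on-wiki search is implemented; the page titles and wikitext in SAMPLE_PAGES are invented for illustration. It shows that the {{head|en|idiom pattern, having no closing braces, also matches longer values such as "idioms", while the {{lb|en|idiom}} pattern matches only the exact label with nothing after it.

```python
import re

# The two patterns quoted in the thread, written as Python regexes.
HEAD_IDIOM = re.compile(r"\{\{head\|en\|idiom")
LABEL_IDIOM = re.compile(r"\{\{lb\|en\|idiom\}\}")

# Hypothetical wikitext snippets, invented purely for illustration.
SAMPLE_PAGES = {
    "page one": "===Idiom===\n{{head|en|idiom}}\n# {{lb|en|idiom}} An example sense.",
    "page two": "===Verb===\n{{en-verb}}\n# {{lb|en|idiomatic}} Another example sense.",
    "page three": "===Idiom===\n{{head|en|idioms}}\n# A third example sense.",
}


def pages_matching(pattern, pages):
    """Return the sorted titles whose wikitext contains a match for pattern."""
    return sorted(title for title, text in pages.items() if pattern.search(text))


print(pages_matching(HEAD_IDIOM, SAMPLE_PAGES))   # titles using the "idiom" headword
print(pages_matching(LABEL_IDIOM, SAMPLE_PAGES))  # titles using the bare "idiom" label
```

Because the label pattern requires the closing braces immediately after "idiom", a count obtained with it would presumably miss entries where the label template carries further parameters.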

MW has a new feature to see dates of coinages

https://www.merriam-webster.com/words-by-first-known-date/1786 Justin (koavf)TCM 07:30, 10 August 2017 (UTC)

Very cool. But really, the dates are sense-specific, and hence word-specific only if the word is monosemic. Wyang (talk) 07:39, 10 August 2017 (UTC)
At first I thought you meant MediaWiki, and was worried they were up to another waste of human resources. --WikiTiki89 15:42, 10 August 2017 (UTC)

-градить and other "combining form"s

A bunch of Russian entries are appearing in Category:head tracking/unrecognized pos, because they use the POS category "verbal combining forms" which is not valid. They are also being categorised as verbs, which is even less correct because these forms don't actually exist. They are only found in compounds, and are thus comparable to creating cran for the first morpheme in cranberry, or liezen for the base verb of verliezen. Something should be done about these. They can't be moved to the reconstruction namespace, they are not reconstructions because they are not conjectured to exist; we know they don't exist. A valid POS should also be used so that they don't clog up cleanup categories anymore. —CodeCat 18:00, 10 August 2017 (UTC)

What's wrong with "Combining form"? Crom daba (talk) 18:54, 10 August 2017 (UTC)
Perhaps they should be recategorized as "combining forms" or have that category added. It is a recognized lemma type in Module:headword/data. I agree they don't really count as verbs in a sense. But I think you are against the idea of a combining form, because you've recategorized combining form entries that I've created. — Eru·tuon 19:03, 10 August 2017 (UTC)
A combining form is a non-lemma form that is used when combined with another morpheme. That's very different from this. —CodeCat 19:05, 10 August 2017 (UTC)
Why is it not a lemma? I see how you can say it's not a real word, but it is a lemma in that it is a form representative of a paradigm of related forms (i.e. the conjugated forms given in the conjugation table). --WikiTiki89 19:30, 10 August 2017 (UTC)
These are lemmas, I'm not disputing that. I'm saying that combining forms aren't lemmas. Most of the categories in Category:Combining forms by language contain nonlemmas, even though the categories themselves are categorised as lemmas. —CodeCat 20:15, 10 August 2017 (UTC)
Oh. Yeah, it looks like what we use "combining form" to mean is completely different from what these are. In fact I was actually in favor of removing the hyphen from these entry names. I think we need to come up with a special name for these. Something like "unused base verb". --WikiTiki89 20:20, 10 August 2017 (UTC)
There's also things like Judeo- which are a bit in between. They are of course combining forms of nouns in Ancient Greek, but in English they don't really belong to anything. Or do they? —CodeCat 20:28, 10 August 2017 (UTC)
I think that's a separate unrelated issue. **гради́ть (**gradítʹ) morphologically could have stood on its own if it existed, but it just so happens that it only exists with prefixes (although it's quite possible that it did exist in Proto-Slavic or earlier). Judeo- I would say is a combining form whose uncombined forms don't exist. --WikiTiki89 20:41, 10 August 2017 (UTC)

Wiktionary: a translation dictionary only?

Should we stop pretending to be a good monolingual dictionary, for the achievement of which the wiki way ("wisdom of crowds") seems ill-suited? Would we be better off playing to what seems to be our strength: translation? This would mean "translation target" would be an automatic justification for any English entry and would upgrade the importance of phrasebook entries and common collocations. It would probably benefit from simplification of complex polysemous entries like those for technical, let alone really polysemous terms. DCDuring (talk) 19:21, 11 August 2017 (UTC)

The project should probably be forked, to support the deletionist and never-delete-anything-ist camps. Equinox 19:29, 11 August 2017 (UTC)
A "good translation dictionary" necessarily describes complex polysemous English words, so no. DTLHS (talk) 23:12, 11 August 2017 (UTC)
How can we be a good translation dictionary now, then? DCDuring (talk) 04:23, 12 August 2017 (UTC)
I suspect this has been triggered (although probably not initiated) by the revert of my edit on "technical". Wiktionary:Requests_for_cleanup#technical W3ird N3rd (talk) 23:45, 11 August 2017 (UTC)
Don't take it too hard. I know that basic nouns, verbs and adjectives with multiple senses are hard and the most basic function words are harder yet. If cleaning up technical were easy, then I would have done it myself. I'm out of practice and never successfully tackled any basic function words. DCDuring (talk) 04:29, 12 August 2017 (UTC)
I think the solution, to appease both camps, is to actually allow the oft-discussed collocations section/namespace/whatever. This would allow us to be a far better translation dictionary because each collocation would have a translation section and we wouldn't have to resort to the controversial "translation target" argument. The inclusionists would also be able to include far more, since much of what is hotly debated in RFD could be kept as a collocation. Deletionists could also be satisfied because there would be less pressure from inclusionists to keep SOP collocations in the mainspace. Andrew Sheedy (talk) 04:53, 12 August 2017 (UTC)
Wiktionary has included languages other than English for a long time, if not from day 1. So it's only right that Wiktionary should be translation-oriented. One inconsistency I have found regarding SoP terms is that there are entries for vegetable soup and pea soup, yet none for tomato soup, and there's bound to be translations for that. One nice touch I have just found is translations for soft-boiled egg and hard-boiled egg listed under boiled egg. DonnanZ (talk) 13:36, 12 August 2017 (UTC)
Inclusion of collocations is at most half of the solution. If, as User:DTLHS notes, "[a] 'good translation dictionary' necessarily describes complex polysemous English words", how do we improve our entries for such terms? Or is the current state of these entries good enough for translation work and for ESL learners, with native speakers mostly ignoring such entries anyway?
If we include more collocations, can we rely on the entries for collocations to share the burden of the definitions for verbs like go (go clubbing) and "particles" like abox and away?
Is it reasonable to admit that we can't really help those users who take a component-oriented approach to looking at sentences? Just as we say that users need help in determining where morphemes break in German and other compound nouns, we should also say that users can't be expected to know which meanings are only fully captured in collocations. Expecting users to wade through derived terms in go to find go clubbing does not seem very realistic. If a user knows to go to clubbing, that user probably doesn't need the go clubbing entry at all. DCDuring (talk) 16:22, 12 August 2017 (UTC)
I don't really object to making translation our focus. However, in order to be a truly comprehensive translation dictionary, I think we also have to be a comprehensive monolingual dictionary. And I don't think we're really doing as badly as you feel. Yes, we're a long way from being another OED, but we're also good enough that I'm able to use Wiktionary as my primary dictionary. Conversely, we actually suck at translations from English into other languages (even common ones like French and Spanish). I'm really not convinced it's our strength. FL to English translations tend to be much better, but even these are often lacking. The reality is that we're a work in progress on all fronts, and always will be.
Now, if we include collocations, I think we should handle them more or less as follows:
  1. Do not move definitions from main entries over to collocation entries (some duplication is fine, and people should still be able to find what they want in the main entry);
  2. Create separate entries for them, rather than hosting them in another namespace or on the same page as any of their component words (we could potentially treat collocations like "forget about" differently from "piece of furniture", the latter having its own entry, the former sharing a page with "forget");
  3. Label them with a banner just as we do for phrasebook entries, to mark them as SOP and allow us to continue to function as a monolingual dictionary, regardless of our focus;
  4. Include full definitions in collocation entries, for clarity;
  5. Eliminate obvious information like pronunciation or etymology from collocation entries, but retain things like translations and synonyms;
  6. Link to collocations from the entries of each of their component words (excluding really basic words, like articles);
  7. Use either "Derived terms" or "Related terms" (possibly renamed) or a new "Collocations" section to host lists of collocations, subdividing the list into different categories as necessary;
  8. Allow collocations in all languages so that we can truly function as a translation dictionary: someone translating from French should be able to look up "pointe de pizza" or "se faire tuer" (or find these in the entries for pointe and pizza / faire, and tuer) and find the corresponding English collocations, "piece of pizza" and "get oneself killed".
I doubt we'll ever solve the problem of people taking a component-oriented approach to looking up multiword terms or collocations. But that doesn't mean we shouldn't try to be helpful to those who do know how to identify multiword expressions. The best we can do is list multiword terms and collocations in the entries for each of the constituent parts, and make long lists easier to navigate by splitting them up by category. Andrew Sheedy (talk) 17:46, 12 August 2017 (UTC)
I'd prefer it if we hosted collocations but they were not listed at constituent lemma pages and generally had close to zero incoming links to them. Crom daba (talk) 18:48, 12 August 2017 (UTC)
How would a person find them all then? Andrew Sheedy (talk) 21:50, 12 August 2017 (UTC)
  • One interesting aspect of mass inclusion of collocations as a new class of entry is that we would be substituting two boundaries that needed some kind of policing for one. Instead of a single include/exclude decision, we would need to decide whether to include or exclude and whether something was a collocation or an idiom. I am not confident that we would achieve any more agreement in total on these two decisions than we do now on one. Are we imagining that any collocation at all would be entered, subject to current RfV? live free or die? parlare con tono di condiscendenza? Wouldn't we be increasing the number of truly offensive items? Should we exclude full sentences that are not proverbs and not phrasebook entries? (More decisions!!!) DCDuring (talk) 19:31, 12 August 2017 (UTC)
Very true, although maybe we would be able to create a stricter set of criteria for determining whether something is SOP or not? I think a lot of RFD debates would be mostly solved if entries could be kept as collocations: those where terms are technically SOP, but not transparently so (sometimes because they use obscure senses of a word), and are not necessarily easily understood (e.g. "nature preserve"); those where an expression uses a more or less consistent word order and has become a fixed phrase; and those where the only justification for keeping an entry is its value as a translation target. I think most people could agree to keep such entries, but label them as collocations. Andrew Sheedy (talk) 21:50, 12 August 2017 (UTC)

IPA ≠ audio

Entries where the pronunciation is different from that in the audio, as for example in polemic, should be automatically detected and listed --Backinstadiums (talk) 20:25, 11 August 2017 (UTC)

How do you propose we do that? DTLHS (talk) 20:28, 11 August 2017 (UTC)
@DTLHS: Auto-generated subtitles could be created using some software, and then both columns of data could be compared. 90% of the job would be done that way; the rest could be manually reported individually as @Wyang has proposed --Backinstadiums (talk) 07:12, 12 August 2017 (UTC)
You vastly overestimate the ease of generating written transcriptions, much less IPA, from audio files. Others can probably explain better why this is so difficult. See e.g. [4]. DTLHS (talk) 07:32, 12 August 2017 (UTC)
Not automatically, but perhaps via a more accessible feedback system: “Saw an error on the page? Report it here.” Wyang (talk) 21:52, 11 August 2017 (UTC)

Merging Category:Chinese language and Category:Sinitic languages to a single category (Category:Chinese language(s)?)

Sinitic languages is just another name for the Chinese languages. It is confusing to have both categories on Wiktionary. It seems there is room for improvement in the category system for macrolanguages; there are categories such as Category:Mandarin terms derived from Sinitic languages which really should be renamed to Category:Mandarin terms derived from other Chinese languages. Wyang (talk) 07:14, 12 August 2017 (UTC)

I agree that the current situation, in which we have two sets of categories for what is basically the same entity, is confusing. It would be hard to merge the two categories, however. x language is a category created by {{langcatboiler}} that uses data from Module:languages, while x languages is a category created by {{famcatboiler}} that uses data from Module:families. And currently only a language with a data file can have entries; a family cannot. I'm not sure how to merge the two in the existing system. And what code would we use for the combined entity? How can we make something be simultaneously a language and a family? — Eru·tuon 23:41, 12 August 2017 (UTC)

Language request: Old Kannada

Old Kannada (Kannada: ಹಳೆಗನ್ನಡ (haḷegannaḍa)) needs to be included. Proposed code: okn. It is a Dravidian language. Immediate ancestor: Proto-Tamil-Kannada. Scripts: Brahmi, Kadamba, Kannada. Descendants: Middle Kannada -> Modern Kannada. ɱɑɗɦɑѵ (talk) 07:28, 12 August 2017 (UTC)

That seems like a reasonable language to add. Can you give any examples or indication of how different it is from Kannada kn? (Other notes: Exceptional codes need to be formatted differently, so it would have to be dra-okn. We cannot add Kadamba script because it seems that it is not in Unicode. Proto-Tamil-Kannada is also not registered as a language.) —Μετάknowledgediscuss/deeds 07:33, 12 August 2017 (UTC)
@Metaknowledge: There's a significant difference between Old Kannada & its modern descendant. It's barely intelligible with modern Kannada. Some sound changes (like the transformation of Proto-Dravidian *p to Kannada [h]) are not present in Old Kannada. The case-suffixes are also different. As for the script, I hope it'll be acceptable to create lemmas in Old Kannada in the Brahmi or the Kannada script. -- ɱɑɗɦɑѵ (talk) 12:26, 12 August 2017 (UTC)
@माधवपंडित There seems to be a distinction between Old Kannada and Purva Halegannada. Should we encode them separately? DerekWinters (talk) 13:27, 12 August 2017 (UTC)
@DerekWinters: I saw that as well. About 500 years of time gap. I think Pūrva-Halegannada is what we'd call pre-Old Kannada or Proto-Kannada. But the source material is small... as it is, Halegannada is poorly documented on the internet. If I'm not wrong, Proto-Kannada attestations are from just a few of the oldest inscriptions. Perhaps we can make Proto-Kannada an etymology-only language, used in etymologies but unable to have lemmas of its own. -- ɱɑɗɦɑѵ (talk) 13:35, 12 August 2017 (UTC)
It can be like Primitive Irish or Pictish, attested from very few sources. Personally I think it better to add it separately. DerekWinters (talk) 13:39, 12 August 2017 (UTC)

A quick update on changes of translation adder

I have updated the gadget to fetch language scripts from the module. Also, it fails (gracefully of course) if the input script is not in the list of scripts from the module. So, you may notice some functionality changes. Let me know if the changes are for the worse. Dixtosa (talk) 19:21, 12 August 2017 (UTC)

Is it anything to do with the annoying little +- signs that have popped up in translations sections? DonnanZ (talk) 19:30, 12 August 2017 (UTC)
Those were always there, but the spacing is off now ([5]) (Chrome) DTLHS (talk) 19:37, 12 August 2017 (UTC)
Yes. Fixed. Dixtosa (talk) 20:29, 12 August 2017 (UTC)
Also added the ability to hide the transliteration input if the language has automatic transliteration that overrides manual.--Dixtosa (talk) 13:08, 20 August 2017 (UTC)

Distinction between derived and related terms

It's been a long while since I did any serious editing here. I've been updating some of the derived words sections. I noticed that the section for "language#Derived terms" looked very sparse, so I added some more entries. After doing so, I saw that many of them were already in the Related terms section.

Has policy changed lately? My understanding has always been that Derived terms is for words formed by appending affixes ("metalanguage") and compounds ("dead language", "language lab") and that Related terms was reserved for words that are etymologically related in some other way ("linguistics", "lingua franca").

I notice that the Wiktionary:Entry_layout page doesn't make this very clear and doesn't give any examples. Perhaps it could be updated?

In the meantime, I'll tidy up Derived terms and Related terms for language, but please revert if this is no longer the way things are done.

Paul G (talk) 19:38, 12 August 2017 (UTC)

Technically a derived term is also a related term. So sometimes there are lists of terms that people have just put all together under "related terms" without distinction. Your understanding matches mine and your edits to language look fine. DTLHS (talk) 19:43, 12 August 2017 (UTC)
That, too, is my understanding of the distinction between the two terms. — SGconlaw (talk) 20:06, 12 August 2017 (UTC)
Confusion can also be caused by placing some derived terms under hyponyms and others under derived terms or related terms. Personally I would like to see hyponyms done away with - I can hear the protests already. DonnanZ (talk) 20:38, 12 August 2017 (UTC)
Thanks for the responses. There seem to be a number of pages where derived terms are words formed with affixes and related terms are compounds — rock, for example — so some editors at least seem to have thought this is what the sections are for. — Paul G (talk) 20:47, 12 August 2017 (UTC)

IPA policy

The (phonemic) English pronunciation keys in most of the major dictionaries (as well as the associated Wikipedia article) use ⟨r⟩ as a standard phoneme. I feel that if this is the common usage it ought to be a standard policy across Wiktionary pronunciation sections. In many articles people have replaced ⟨r⟩ with ⟨ɹ⟩, ⟨ɚ⟩ et al. and while this is phonetically correct, it goes against the phonemic standard, and has created a disparate mess with little to no consistency. The best solution in my opinion is to just have both phonetic and phonemic pronunciations wherever possible, and make it a policy that ⟨r⟩ belongs in /r/ and ⟨ɹ⟩ belongs in [ɹ], etc. This has the advantage of giving the maximum amount of information, while remaining in line with M-W, Collins, etc. Any input would be appreciated. --Pariah24 05:07, 13 August 2017 (UTC)

I wonder if it is possible to create {{en-IPA}} to standardise the generation of IPA for English (and represent the dialectal variation; cf. International Phonetic Alphabet chart for English dialects). Having manual IPA on all 480,000+ English lemmas would be a logistical nightmare. Wyang (talk) 05:17, 13 August 2017 (UTC)
We had a discussion about this many years ago. At first I was in favor of using /r/ in the phonemic representation of English, but eventually I came around to the idea of using /ɹ/, chiefly because we are not an English-only dictionary. If we were, if English Wiktionary had only English entries, I still would prefer /r/; but because we have entries in thousands of languages, including ones where /r/ really does stand for [r], I think it's ultimately less misleading to use /ɹ/ for English. —Aɴɢʀ (talk) 07:39, 13 August 2017 (UTC)
I agree with Angr. If we use /r/ for [ɹ], readers seeing /r/ in other languages might mistakenly believe that they represent the same, or similar sounds. We fill a different niche than other English dictionaries, and as a result, our policies might differ in some areas. Andrew Sheedy (talk) 17:14, 13 August 2017 (UTC)
I agree with Angr and Andrew. Since most English speakers pronounce ⟨r⟩ as [ɹ], it's appropriate to use /ɹ/. I'd second Wyang on creating an English IPA template. — justin(r)leung (t...) | c=› } 19:28, 13 August 2017 (UTC)
There's no need for a separate template- normalization can take place in the IPA module. DTLHS (talk) 19:29, 13 August 2017 (UTC)
How can one implement a module without a template, exactly? —Aryaman (मुझसे बात करो) 20:39, 13 August 2017 (UTC)
Huh? All IPA is already processed through Module:IPA. All we would need to do is implement specific rules for English. DTLHS (talk) 20:47, 13 August 2017 (UTC)
@DTLHS: I would assume we would make MOD:en-IPA and implement it in {{en-IPA}}, just like every other language with an IPA module. —Aryaman (मुझसे बात करो) 23:03, 14 August 2017 (UTC)
What about English dialects? How do we ensure the correct symbols are used and symbols are used in a consistent manner, for say, RP? Wyang (talk) 21:34, 13 August 2017 (UTC)
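
Note on the normalization idea discussed above: an English-specific rule of the kind DTLHS describes could be as small as rewriting ⟨r⟩ to ⟨ɹ⟩ inside phonemic (slash-delimited) transcriptions. The following is only a rough sketch in Python for illustration. Real Wiktionary modules are written in Lua, and the function name normalize_english_r is hypothetical, not part of Module:IPA. Leaving bracketed phonetic transcriptions untouched is an assumption made here, not something the thread decided.

```python
def normalize_english_r(transcription: str) -> str:
    """Rewrite plain 'r' as 'ɹ' in a slash-delimited (phonemic) transcription.

    Anything that is not wrapped in slashes, such as a bracketed phonetic
    transcription, is returned unchanged (an assumption of this sketch).
    """
    stripped = transcription.strip()
    is_phonemic = (
        len(stripped) >= 2 and stripped.startswith("/") and stripped.endswith("/")
    )
    if is_phonemic:
        # 'ɹ', 'ɚ' and similar symbols are separate characters, so only
        # the plain Latin letter 'r' is affected by this replacement.
        return stripped.replace("r", "ɹ")
    return stripped


print(normalize_english_r("/rɒk/"))  # phonemic: r is rewritten
print(normalize_english_r("[rɒk]"))  # phonetic: left as entered
```

A rule this blunt says nothing about dialects, which is the concern raised in the last comment above; per-dialect conventions would need their own rules or data.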

──────────────────────────────────────────────────────────────────────────────────────────────────── I'm more concerned with just having a standard to go by than what that standard is. I guess I'll start changing /r/ when I see it, although I still think it would be helpful—especially on pronunciations that differ significantly from the standard phonemes—to have separate /phoneme/ and [phone] pronunciations. Pariah24 (talk) 23:41, 13 August 2017 (UTC)

What do you mean by "pronunciations that differ from the standard phonemes"? — Eru·tuon 23:55, 13 August 2017 (UTC)
Sorry if that's an awkward way to put it...I mean pronunciation differences in accents/dialects, and loanwords, and cases like pun and spun whose pronunciations are both /(s)pʌn/ but phonetically are [pʰʌn] and [spʌn]. It would be helpful to someone learning English to have both versions. Pariah24 (talk) 00:11, 14 August 2017 (UTC)
Ahh, I see. Phonetic transcriptions showing the exact pronunciation of stops are welcome. I think there are some transcriptions like that already. As for accents, keep in mind that many dialectal features are phonemic, because dialects do not all share the same phonological system, and so we show them in the phonemic transcriptions. You can see examples in Appendix:English pronunciation. Not shown on that page are the phonemic transcriptions for American English dialects without the horse–hoarse merger. (See hoarse for an example.) — Eru·tuon 00:38, 14 August 2017 (UTC)
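
Note on the pun/spun example: the relationship between the phonemic and phonetic forms mentioned above can be sketched as a single toy rule, namely that a word-initial voiceless stop is aspirated, while the same stop after /s/ is not. The Python below is a deliberately simplified illustration and not a description of English phonology as a whole; the function name narrow_transcription is hypothetical, and the only cases it handles are an optional leading stress mark and the affricate /tʃ/, which it leaves alone.

```python
VOICELESS_STOPS = "ptk"
STRESS_MARKS = "ˈˌ"


def narrow_transcription(phonemic: str) -> str:
    """Turn /.../ into [...] and aspirate a word-initial voiceless stop.

    Toy rule only: it ignores medial stops, syllable structure and
    every other allophonic process.
    """
    body = phonemic.strip().strip("/")
    # Skip over a leading stress mark, if there is one.
    start = 1 if body and body[0] in STRESS_MARKS else 0
    if start < len(body) and body[start] in VOICELESS_STOPS:
        # Do not split the affricate /tʃ/ with an aspiration mark.
        if body[start + 1 : start + 2] != "ʃ":
            body = body[: start + 1] + "ʰ" + body[start + 1 :]
    return "[" + body + "]"


print(narrow_transcription("/pʌn/"))   # initial stop: aspirated
print(narrow_transcription("/spʌn/"))  # stop after s: unchanged
```

Something along these lines is what a template taking a broad phonemic input and generating phonetic output would have to do, with many more rules and per-dialect variation.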

Regarding whether to create a separate module and template for English IPA transcriptions: I think it would be much neater than adding a lot of English-specific stuff to Module:IPA. I say a lot, because I think it would be a good idea to automatically convert between different transcription systems for RP, if we could get someone who knows enough about them. For instance, automatically displaying both the OED's more old-fashioned transcription of lot, /lɒt/, and Geoff Lindsey's more modern one, /lɔt/. @Mr KEBAB proposed something like this, but I haven't done anything with the idea yet. — Eru·tuon 00:02, 14 August 2017 (UTC)

@Erutuon: Yes I did, but if we're going to use the Lindsey system here we should fully follow it, not cherry-pick some of the symbols and not others (I'm saying this because I believe I proposed a mixed system a year ago; this is not a good idea for several reasons). Mr KEBAB (talk) 00:40, 14 August 2017 (UTC)

I think it would be very nice to have something similar to what we have for Ancient Greek and Latin, with different regional or period pronunciations all indicated. The template input for English would obviously have to be the broad phonemic transcription, though, rather than the spelling of the word. We'd have to be careful, however, of cases where pronunciation variants actually represent different phonemes rather than differences of realization; in such a case we'd probably need to call the template multiple times on the page, each time with a separate phonemic transcription and corresponding generated phonetic transcriptions for various dialects – there would be parameters for which dialects/variants to include or not include. – Krun (talk) 00:51, 14 August 2017 (UTC)

An input from outside. On French Wiktionary, we had a large discussion on pronunciation and neutrality two years ago, and we renewed our policy. We started by establishing that phonetic information has to be based on audio recordings, and that there have to be several of them to describe variation (in space, time, and social groups). Phonemic information has to be based on a specific analysis, made on a specific dialect, and can't stand for a whole language. There is a diversity of phonemic representations. To be neutral from this perspective is not to select and promote one analysis (that is, one variety) but to give the different analyses, with sources. So: phonetics with audio sources, phonology with written sources (works of linguistics).
Finally, we considered the needs of the readers, and concluded that they do not need dozens of phonetic and dozens of phonological representations. They want a short piece of information giving a usual way of pronouncing a word: consensual and as unmarked as possible. So we created a third way to indicate this specific information, with backslash signs, like \θis\. This one is provided in the first part of the page, and the other ones in the second part of the page, for people eager to have more precise information. It was not a huge change, but a great improvement in the framework it offers for people to add new information without colliding with existing information. Fewer controversies over "false phonological representations" and more accurate descriptions. If you want to know more about this, I can help you, or translate some pieces of French Wiktionary policy. Noé 10:16, 15 August 2017 (UTC)
I think Wiktionnaire has a good system. I find it interesting how the very broad pronunciation is included in the header, but I don't like how more detailed pronunciation is relegated to the bottom of the entry and often neglected. Having a very broad transcription with everything else in a collapsible box might be a good solution. On the other hand, it would be hard to decide what transcription to use when a word has been affected by a merger or split in many dialects. Andrew Sheedy (talk) 16:07, 18 August 2017 (UTC)

Words with uncertain reading

Recently the Egyptian entry jsqꜣrwnj (hieroglyphs, as transcribed in the entry: i s q A rw) was added alongside our previously existing entry jsqꜣrnj (hieroglyphs: i s q A r).

But these aren’t in fact two different attestations with two different spellings; they’re both representing a single attestation from the Merneptah Stele, where the original engraver inscribed a hieroglyph very poorly and modern authors have proposed two different readings of what it was intended to be. Do we have any policy about what to do in such a case — keep only the more plausible/widely accepted entry? Keep both? (And, if so, what would they be marked as? Alternative forms, even though they really aren’t?) — Vorziblix (talk · contribs) 10:01, 14 August 2017 (UTC)

Perhaps create an "alternative reading" template, and use it in the entry for the less widely accepted reading. Then list the less widely accepted reading in the Alternative forms section for the more widely accepted form. — Eru·tuon 16:30, 14 August 2017 (UTC)
Sounds good. For now I’ll just do {{form of|Alternative reading}} rather than an altogether new template, but if more of these start cropping up, so that categorizing them becomes useful, I’ll go for a separate template. — Vorziblix (talk · contribs) 23:18, 14 August 2017 (UTC)
Thanks, that clears things up a lot. — Vorziblix (talk · contribs) 23:18, 14 August 2017 (UTC)
Another example is ᚐᚆᚓᚆᚆᚈᚈᚋᚅᚅᚅ / ᚐᚆᚓᚆᚆᚈᚈᚐᚅᚐᚅ. In most cases, it's possible to be reasonably certain how to read an inscription, but when it's not (in individual cases), the practice does seem to be to have multiple (cross-linked) entries. Whether or not it is sensible for one of the entries to be a "form of" redirect to the other entry depends on whether the difference in reading entails a difference in meaning. - -sche (discuss) 06:30, 15 August 2017 (UTC)

Flag gadget edit request

Could an admin change the URL for the Ancient Greek flag in MediaWiki:Gadget-WiktCountryFlags.css from Flag_of_Palaeologus_Dynasty.svg to Byzantine_imperial_flag,_14th_century,_square.svg? The file has been moved, and there's been an error message in the browser console because the CSS file tries to load the file using the old name. — Eru·tuon 17:41, 14 August 2017 (UTC)

Done. Dixtosa (talk) 17:55, 14 August 2017 (UTC)

What's the deal with the garbage "American Sign Language" entries?

Is there an editing tool that produces these, maybe with ASL as the first in a list of languages? I don't think it's a single vandal producing all of them. DTLHS (talk) 20:41, 14 August 2017 (UTC)

@DTLHS: Do you have a link or diff? —Justin (koavf)TCM 21:09, 14 August 2017 (UTC)
People often create fully-formed ASL entries with all the usual headings, but with no actual content or definition. Yes, there is a tool that creates these, but I can no longer find it. I've seen it before. Equinox 21:11, 14 August 2017 (UTC)
  • It's the New Entry Creator; one of its defaults is ASL. —Μετάknowledgediscuss/deeds 21:35, 14 August 2017 (UTC)
    Actually, it's the second-from-the-top entry template, on the search results page, not the New Entry Creator. There should really be an AbuseFilter to take care of those. --Yair rand (talk) 01:21, 21 August 2017 (UTC)

@DTLHS: how would you improve them? --Backinstadiums (talk) 22:12, 14 August 2017 (UTC)

I don't think you understand. They're contentless entries that are deleted on sight. —Μετάknowledgediscuss/deeds 22:16, 14 August 2017 (UTC)

Adding language code 'ghc'

Hi all, I am thinking it might be useful to add the code 'ghc' for the historic common written language of Ireland and Scotland, particularly in cases where it's not clear whether a word derives from Irish or Scottish Gaelic. Gherkinmad (talk) 21:33, 14 August 2017 (UTC)

I don't know what lect you are referring to or when it was used. We have codes for Old Irish (sga) and Middle Irish (mga), and those should suffice. —Μετάknowledgediscuss/deeds 21:37, 14 August 2017 (UTC)
I can kinda see the point. While, technically, Scottish Gaelic can be seen to be differentiating itself from Irish as early as the Book of Deer, for pretty much the entire Middle Ages you can't really tell between them. And everything after 1200 is currently classified as either ga or gd. So a Classical Gaelic could be seen as a useful intermediary step:
  • pgl Primitive Irish (–c.600)
    • sga Old Irish (c.600–c.900)
      • mga Middle Irish (c.900–c.1200)
        • ghc Classical Gaelic (c.1200–c.1800)
          • ga Modern Irish (c.1800–)
          • gd Scottish Gaelic (c.1800–)
That would require some refactoring, though. It would make etymologies slightly less messy: as it is, there appears to be an issue with taking a gd word back through ga to mga. This way, they could both branch from ghc. --Catsidhe (verba, facta) 21:55, 14 August 2017 (UTC)
Do we have a resolution? Gherkinmad (talk) 23:08, 14 August 2017 (UTC)
Resolution? We barely have the start of a discussion! Also, this sort of thing has been suggested before (by me at least once, IIRC) and not happened, so maybe a wider debate will have some impact. --Catsidhe (verba, facta) 23:26, 14 August 2017 (UTC)
(Without expressing an opinion on whether this is a good or bad idea,) it would be possible to add 'ghc' as an "etymology-only language" so that etymologies could refer to it, even if we don't want to add it as a "full language" with its own entries / language sections (which might duplicate many mga and ga entries?). - -sche (discuss) 06:46, 15 August 2017 (UTC)
  • @Angr is the expert, and he hasn't voiced a need for this as far as I've seen. But I'd like his thoughts. —Μετάknowledgediscuss/deeds 03:57, 15 August 2017 (UTC)
    For reference, the code was removed following this discussion in 2013. (I have no great knowledge of the subject and defer to people like Angr and Catsidhe who are familiar with the Irish language(s).) - -sche (discuss) 06:37, 15 August 2017 (UTC)
    My views haven't changed since that 2013 discussion. I think mga, ga, gd, and gv are sufficient to cover all Goidelic lects from the 10th century to today. The problem with making it an etymology-only language is that etymology-only languages are varieties of one particular existing language, but the whole motivation behind ghc is to avoid calling it either Irish or Scottish Gaelic (because it's basically both). —Aɴɢʀ (talk) 08:33, 15 August 2017 (UTC)
    For the purpose Catsidhe is talking about, it seems like it could be considered a variety of Middle Irish... but then, I don't see why branching Scottish Gaelic and Irish from ghc is any better than branching them both from mga, or why branching them from mga like we do now causes "an issue" — Catsidhe, can you explain? - -sche (discuss) 09:39, 15 August 2017 (UTC)
    No one considers Middle Irish going as late as 1800, though. Middle Irish is generally seen as ending around 1200 (much earlier than Middle English, for example), so we consider everything after that to belong to one of the modern languages, even though the literary language (as opposed to the colloquial language) is virtually identical in Ireland and Scotland until around 1800. —Aɴɢʀ (talk) 11:55, 15 August 2017 (UTC)
    Which is why I find the distinction between Early Modern Irish and Early Scottish Gaelic (to 1800) to be annoyingly artificial. There is nothing linguistic which distinguishes just about any given 14C Irish from 14C Scottish. The only way you can tell in most cases is by knowing beforehand where or by whom it was written.
    Also, having ga cover 800 years of history makes it tricky to use for both historical research and for current usage. Unless you're paying attention, it can be easy to miss that one word became moribund in the 16C, and another entered the language in the 1980s. The former case isn't going to help if you're writing a letter to someone in Gaoth Dobhair, the latter isn't going to help if you're doing Mediaeval research. --Catsidhe (verba, facta) 12:16, 15 August 2017 (UTC)
    Yes the motivation is to avoid calling the language either Irish or Gaelic, because in the case of the English word Gael we simply don't know which variety it came from, and we might be a little more honest if we simply said so. The OED has the word first in modern English in 1775/1810 from Scottish Gaelic, but I completely accept that the word might have a longer history in the language, and so I was thinking we could meet Angr halfway by saying it derives from Classical Irish/Gaelic, or otherwise simply that it derives from Middle Irish. Gherkinmad (talk) 16:48, 15 August 2017 (UTC)
    @Angr OK, the matter has been more or less resolved. However I would still advocate the ghc code for cases where there is a further intermediary stage, otherwise there are three words for Gael in modern Irish: Gaoidheal, Gaedheal and Gael, all covering the same time period. Any thoughts? Gherkinmad (talk) 16:03, 16 August 2017 (UTC)
    There will have to be three entries for modern Irish anyway, since the spellings Gaoidheal and Gaedheal were used up until the mid-20th century, long after ghc would be over. That's the main reason for my opposition to ghc: it would increase unnecessary redundancy. If we had it, we would have to have Gaoidheal in both ga and ghc instead of just ga; likewise we would have to have new ghc entries for common words whose spellings haven't changed, like fear, bean, mac, , , athair, máthair, and so on and so forth. It doesn't seem worth it to me to duplicate the effort. —Aɴɢʀ (talk) 16:15, 16 August 2017 (UTC)
    @Angr, @-sche Could we add ghc as an etymology-only language? Because as the matter stands, one would have to say that the modern Irish word is first attested in print in 1567 in Scotland, without further explanation. I just don't see a way to credit this properly without referring to a further intermediary stage of the language which is of course still Irish. Gherkinmad (talk) 16:53, 16 August 2017 (UTC)
    As was said before, etymology-only languages always have a parent language that they belong to. This parent is used, for example, in determining which section should be linked to. So which language does ghc belong to, Irish or Scottish Gaelic? It doesn't solve the problem at all, just moves it. —CodeCat 18:29, 16 August 2017 (UTC)
    It belongs to Middle Irish: Scottish Gaelic and (Late) Modern Irish could both branch from ghc whose parent language is mga. I'm sorry for pressing this so much, I just know you really have to make your case if you want to edit Wiktionary. Gherkinmad (talk) 19:28, 16 August 2017 (UTC)
    If ghc's parent language is mga, then this carries the implication that all ghc terms are mga terms. Every link to a ghc term in fact creates a link to a mga entry because of how the parent of an etymology language works. Since every link should have an entry behind it, it implies that any links in etymologies to ghc terms attested from 1200 to 1800 are implicit requests for Middle Irish entries to be created on those pages. So, you want us to create Middle Irish entries for terms attested as late as 1800? —CodeCat 20:05, 16 August 2017 (UTC)
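The link behaviour described in the comment above can be sketched as follows. This is only an illustration under stated assumptions: the `languages` table, the `etymologyOnly` and `parent` fields, and the `linkTarget` function are hypothetical names, not Wiktionary's actual module code, and 'ghc' with parent 'mga' is the arrangement proposed in this thread, not an existing setup.

```javascript
// Hypothetical language table, using the codes named in this thread.
const languages = {
  mga: { name: "Middle Irish" },
  ga: { name: "Irish" },
  gd: { name: "Scottish Gaelic" },
  // Assumed for illustration: an etymology-only code has no entries of
  // its own and points at a parent language instead.
  ghc: { name: "Classical Gaelic", etymologyOnly: true, parent: "mga" },
};

// Resolve where a link to `term` in language `code` would point.
// Etymology-only codes are followed up to their parent language.
function linkTarget(code, term) {
  let lang = languages[code];
  if (!lang) {
    throw new Error("unknown language code: " + code);
  }
  while (lang.etymologyOnly) {
    lang = languages[lang.parent];
  }
  return term + "#" + lang.name.replace(/ /g, "_");
}

console.log(linkTarget("ghc", "Gaoidheal")); // section of the parent language
console.log(linkTarget("ga", "Gaoidheal"));  // section of Irish itself
```

Under these assumptions, a link to a ghc term lands in the Middle Irish section of the page, which is the implication being objected to.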
    If that's what will eventuate no I don't, though this does need further discussion, as it's clear not everyone accepts the current policy as it is. Gherkinmad (talk) 21:03, 16 August 2017 (UTC)

Tagalog enclitic forms

In Tagalog, any word ending in "a, e, i, o, u, or n" has an enclitic form (sort of). For example, take the word "malaki" (big): to say a "big person" one says "malaking tao", adding an "ng" at the end. And that goes for adjectives, nouns, verbs, all words. The question is, do we make an entry for the enclitic form of every word in Tagalog that has one? --Mar vin kaiser (talk) 10:51, 15 August 2017 (UTC)

It sounds a bit like English -'s or Latin -que, i.e. a clitic that can be added to virtually anything. And we don't have entries for person's or virumque, so I'd say we shouldn't have an entry for malaking either, but just one for malaki and one for -ng. BTW, how do words ending in other sounds behave? —Aɴɢʀ (talk) 11:58, 15 August 2017 (UTC)
@Angr: Well, for example, the word "maliit" (small), to say a "small person" would be "maliit na tao". Actually some see the word "malaking" to be a contraction of the word "malaki" and the word "na" which links words together. One problem is that for example, the word "taong", it could mean four things,
  1. "taong" - a black veil for mourning
  2. "taóng" - water container (we don't write diacritics to indicate stress in Tagalog, so both are under the same entry)
  3. "taong" - the word "tao" (person) + "na"
  4. "taóng" - the word "taón" (year) + "na"
So my point is, shouldn't the last two be in the entry "taong" also? --Mar vin kaiser (talk) 13:23, 15 August 2017 (UTC)
@Mar vin kaiser: Well, look at butcher's: it has several meanings of its own, but the transparent one of butcher + the clitic -'s isn't actually listed. —Aɴɢʀ (talk) 13:42, 15 August 2017 (UTC)
@Angr: Good point. Although, the entry it's has it. But I do see your point. --Mar vin kaiser (talk) 13:45, 15 August 2017 (UTC)
@Angr: The reason why I feel it's important is because for example, any two words that are beside each other, the first one has to be in enclitic form, and think of the number of entries that have two words. For example, "free will" is "malayang loob", but there won't be any entry for "malayang", only "malaya". And that would go for all the other entries that have two words. --Mar vin kaiser (talk) 13:59, 15 August 2017 (UTC)
@Mar vin kaiser: It's probably at it's because in the standard written language, the one thing it's isn't is it + the possessive -'s, but only it + the contracted verb -'s. As for the headword line, that's not a problem. At the entry for malayang loob, just add |head=[[malaya]][[-ng|ng]] [[loob]] to the headword template. —Aɴɢʀ (talk) 14:13, 15 August 2017 (UTC)


Versageek has been inactive for more than a year, so per the WMF policy her checkuser rights have been revoked. The policy requires that any local wiki have two or more checkusers if they have any, my rights have been suspended as well pending our electing another. We can opt not to bother having local checkusers and simply rely on the stewards to take care of requests, or we can nominate one or more new checkusers and have some elections.
From my perspective it is not strictly necessary to have local checkusers, but it is convenient. Almost all of the work these days is keeping track of and blocking the long-term pests, and making sure we are actually blocking Wonderfool when we think we are. - TheDaveRoss 12:59, 15 August 2017 (UTC)

Having local checkusers is definitely a good thing. I'm surprised WF hasn't made any votes to encheckuserify anyone. —Μετάknowledgediscuss/deeds 16:57, 15 August 2017 (UTC)
"Encheckuserify"? Beware, lest you affixiate. — Kleio (t · c) 20:02, 15 August 2017 (UTC)
I thought User:Chuck Entz was a checkuser, since he does a good job of keeping track of the IPs/locations of various vandals. He seems like a good candidate for the position. - -sche (discuss) 19:47, 15 August 2017 (UTC)
Oddly enough, I probably wouldn't have as much to say if I were a checkuser, since I understand there are fairly strict rules about what information obtained with the checkuser tools can be disclosed and when you can use them. Right now, I get pretty much all my information from geolocating just about every IP that does something out of the ordinary and looking for patterns (that and monitoring the abuse filter logs). I'm not sure what I would be allowed to say/do if I spotted an IP that had earlier turned up in a checkuser investigation (though I could probably block them). That said, I'm game, if everyone thinks it's a good idea. Chuck Entz (talk) 02:43, 16 August 2017 (UTC)
I actually think Chuck is a great candidate, but I was under the impression that we had an old (unwritten?) rule that no one user should have all the user rights at en.wikt simultaneously. —Μετάknowledgediscuss/deeds 04:19, 16 August 2017 (UTC)
You both bring up good reasons for pause. Who else wants the job? We could nominate WF; then he'd have to ID himself to the Foundation to get the flag... ;) lol - -sche (discuss) 05:10, 16 August 2017 (UTC)
I think Chuck is a great choice as well. Re "having all the rights", I don't see a problem there. Our 'crats have a fairly limited scope of responsibility which doesn't much change how they might be able to (ab)use the CU tools. This is a different story than other wikis which have roles such as ombudsmen, arbcom, etc.
Re limiting your ability to act, I have not found that to be a problem. In the cases where an anonymous contributor is connected to a previously blocked logged-in account you may have to be somewhat oblique (e.g. not using the name of the blocked account, just saying that they are evading a block) but that is actually a fairly rare situation. - TheDaveRoss 14:56, 16 August 2017 (UTC)
I thought I was clear enough above, but I'll restate it: any minor concerns I may have mentioned as an aside have no bearing on my main point, which was that I'm willing to be a checkuser, if that's what the community wants. I'm not a hat collector, and I can't say that being a bureaucrat, for instance, has exactly enhanced my life, but if someone needs to do it, it might as well be me. Chuck Entz (talk) 23:16, 8 September 2017 (UTC)
Probably better to have some local ones. Equinox 19:49, 15 August 2017 (UTC)
Local is good, but what about Chuck's stated concerns? Where is it written that checkusers can't disclose publicly available info? Who can be asked about this? DCDuring (talk) 04:27, 16 August 2017 (UTC)
Perhaps we can create a new class of superuser: "Chuckuser". DCDuring (talk) 04:28, 16 August 2017 (UTC)
lol! - -sche (discuss) 05:10, 16 August 2017 (UTC)
@DCDuring: The policy dictating the use of the tool is here, and is also governed by the privacy policy and the access to nonpublic information policy. There are lots of words there, but essentially it is OK to talk about publicly available information, and it only gets tricky when your interpretation of public information is affected by nonpublic information. - TheDaveRoss 15:02, 16 August 2017 (UTC)
So Chuck's concerns are in maintaining a "Caesar's wife" standard, probably appropriate. DCDuring (talk) 22:04, 16 August 2017 (UTC)

For what it's worth, I have these user rights on Wikispecies, so I am already vetted by the WMF. I would be willing to have those tools here. —Justin (koavf)TCM 05:12, 16 August 2017 (UTC)

Metaknowledge started a vote for Koavf, and I made a comment on the discussion page there suggesting that we also vote on admin status at the same time. - TheDaveRoss 12:35, 21 August 2017 (UTC)

I would like to become a checkuser. --Daniel Carrero (talk) 07:13, 16 August 2017 (UTC)

DI CheckUser. PseudoSkull (talk) 22:09, 16 August 2017 (UTC)
@PseudoSkull, I don’t think we accept most of the specialized terminology and abbreviations used by Wikipedia/Wiktionary here, such as CheckUser, RfV, RfD, and so on, but we put them in the Wiktionary:Glossary. —Stephen (Talk) 22:27, 16 August 2017 (UTC)
If there are 3+ external citations, I would disagree. PseudoSkull (talk) 22:28, 16 August 2017 (UTC)
But let's discuss that elsewhere. Perhaps in WT:TR so that the discussion at hand can continue. PseudoSkull (talk) 22:28, 16 August 2017 (UTC)

Review of Ecjklangs (talk · contribs)' contributions

Most of these sex-related entries appear only in Urban Dictionary (OneLook backs me up on this), but some entries - such as sexcess - are somewhat citable. Not sure how durable they are though (I mean, floorcest anybody?). Anyone in the mood for a look-through? --Robbie SWE (talk) 08:47, 16 August 2017 (UTC)

Translations added by IvanScrooge98 (talk · contribs)

This erroneous edit by User:IvanScrooge98 in Recent Changes attracted my attention. A quick check of their recent additions of Chinese translations shows that he/she is certainly a non-speaker of Chinese. A large proportion of their added Chinese translations were outright incorrect, others often problematic. Some recent, outright erroneous examples include: diff, diff, diff, diff, diff, diff. It's a shame that such sloppiness was not picked up earlier and was allowed to persist for such a long time. Their additions of translations in other languages also need to be thoroughly checked. Wyang (talk) 10:23, 16 August 2017 (UTC)

Hmmm… excluding the first one (from zh.wiktionary), I based the other edits, as I usually do, on the respective Wikipedia articles. I'm sorry if there's something wrong and willing to fix my mistakes. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 14:32, 16 August 2017 (UTC)
You should be careful when using non-English Wikipedias as a source because they are full of made-up garbage. Every now and then I have to remove a Portuguese translation that you add because it doesn’t meet our attestation criteria or are inaccurate. There’s no harm in using Wikipedia as a starting point when researching translations, but you should at least check Google Books. — Ungoliant (falai) 14:48, 16 August 2017 (UTC)
Guess I should more when I can. Sorry. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 15:14, 16 August 2017 (UTC)
Most of the errors in Chinese translations are in your inferred Pinyin readings and traditional/simplified forms; these are more serious factual errors. Please see if you can fix the examples above, now knowing that they contain errors. Many of your added Chinese translations are sum-of-part terms which do not warrant inclusion on Wiktionary, but that is less serious of a problem. Wyang (talk) 00:59, 17 August 2017 (UTC)
@Wyang: is diff fine, for instance? [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 13:40, 17 August 2017 (UTC)
It's better, though both terms are SoP and should link to the individual components. Could you please also fix the six others? Wyang (talk) 21:54, 17 August 2017 (UTC)
@Wyang: would you mind checking whether my attempts are correct? Also, should I undo my additions at Warwick and Portoferraio? [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 10:22, 18 August 2017 (UTC)
Not really, you haven't fixed the factual errors on those pages. It's all right- I will fix them. Please do not add other Chinese translations. Wyang (talk) 10:26, 18 August 2017 (UTC)

definitions vs. predicates

The entry for alt-right and the Tea Room/2017/August#alt-right discussion are recent manifestations of a failure to respect the concept of a definition. We do know how to do so, but sometimes some contributors act as if they believe that any predicate about a definiendum that they or someone else puts down in writing is a potential definition.

"Headquarters of US military imperialism" is not a definition of Pentagon, whether or not you believe the truth of "The Pentagon is the headquarters of US military imperialism".

What do we have to do to see to it that this basic notion of lexicography is respected? Would voting on a policy help? A definition style guide? DCDuring (talk) 22:02, 16 August 2017 (UTC)

Statistics for numbers of etymologies

For those of you who wanted to know, as of August 16, 2017, the largest number of etymologies any entry on the English Wiktionary has is 15. That entry is zꜣ. (Wouldn't it be amazing if you ever saw "Etymology 27", "Etymology 9432", etc. LOL ROFL LMAO) PseudoSkull (talk) 01:57, 17 August 2017 (UTC)

"Etymology 4320603, Etymology 4320604" PseudoSkull (talk) 01:58, 17 August 2017 (UTC)
You're ready for Wikidata. Noé 15:29, 17 August 2017 (UTC)
Quite impressive! — Eru·tuon 19:11, 17 August 2017 (UTC)

AWB Rights

I'd like to use AWB on Wiktionary, mainly to run typo fixes, and a regex I made to split long See Also/Related terms/etc sections into columns, and to be able to search inside templates, and to dump Wiktionary data offline for faster searching and whatnot. I already have rights on English Wikipedia. Pariah24 (talk) 19:08, 17 August 2017 (UTC)

I notice that no one has even nominated you for autopatroller status, which means that all your edits are marked for review. It seems silly to have people not trusting your edits enough to stop checking all of them, but at the same time giving you the ability to make them in bulk. Chuck Entz (talk) 02:36, 18 August 2017 (UTC)
DI AWB (wiki sense). PseudoSkull (talk) 04:44, 18 August 2017 (UTC)
I really don't care if people review my edits, and the AWB policy makes no mention of that as a prerequisite. I've been editing Wikipedia for quite a while longer than I've had this account, and this "we don't trust your edits" business sounds pretty anti-AGF to me. Never had an admin say something like that to me before; do you speak this way to everyone? A simple no would have sufficed. Pariah24 (talk) 08:53, 18 August 2017 (UTC)
I am sure Chuck was not intending to offend. We have the edit patrol feature enabled here, and the general practice is that once someone has been editing for a little while the people who patrol edits notice that they make reasonable edits and don't need to be patrolled any longer. If you are not yet set to autopatrolled status it may indicate that you have not edited here sufficiently (or, sometimes, sufficiently well) to have been noticed and flagged by a patroller. I would suggest that you just continue making good edits and I am sure you will be autopatrolled and eligible for AWB in no time. - TheDaveRoss 12:21, 18 August 2017 (UTC)
Thank you. Somehow I managed to give the impression that we're so suspicious of them that we have them under surveillance, or that we think there's something wrong with their edits, or that we only talk to the cool people who already know the secret handshake... The simple fact is that AWB access requires that we know the contributor in question well enough to be sure they know Wiktionary's standards and practices well enough to avoid making mistakes, since those mistakes would be propagated much more widely using the AWB tool, and that we just don't know them well enough- yet. Chuck Entz (talk) 08:41, 20 August 2017 (UTC)

Tsolyáni language

Do we include words in this fictional language? I'm working my way through some missing French nouns and came across zaqé which our French friends define as "Troisième jour de la semaine dans le calendrier tsolyáni". SemperBlotto (talk) 05:48, 18 August 2017 (UTC)

No, see Wiktionary:Criteria_for_inclusion#Constructed_languages. DTLHS (talk) 05:50, 18 August 2017 (UTC)

Using HTML attributes instead of classes for WT:ACCEL

Currently, WT:ACCEL has data passed to it using CSS classes, so that the resulting Wikicode looks like this on bar: <span class="form-of lang-en plural-form-of"><b class="Latn" lang="en">[[bars]]</b></span>. There's a few points to note about this.

  1. There's two wrapping HTML elements, span and b, even though these could easily be combined into a single b element, as long as WT:ACCEL is modified to recognise not just span elements.
  2. If step 1 is done, then there is no more need for the lang-en CSS class, because WT:ACCEL can extract it directly from the b element's lang= attribute.
  3. HTML allows you to specify custom attributes named data- followed by any text. We can use this, rather than CSS classes, to specify the inflectional data.

All in all, the line above would end up looking like this: <b class="Latn" lang="en" data-accel-form="plural">[[bars]]</b>.
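To make the difference between the two encodings concrete, here is a minimal sketch of how the same information could be read out of each. The function names `accelInfoFromClasses` and `accelInfoFromAttributes` are hypothetical and this is not the gadget's actual code; the class names and the `data-accel-form` attribute are the ones quoted in the post above.

```javascript
// Hypothetical helpers for illustration; not the WT:ACCEL gadget itself.

// Current scheme: language and form are both encoded as CSS classes,
// e.g. "form-of lang-en plural-form-of".
function accelInfoFromClasses(className) {
  const info = { lang: null, form: null };
  for (const cls of className.split(/\s+/)) {
    if (cls.startsWith("lang-")) {
      info.lang = cls.slice("lang-".length);
    } else if (cls.endsWith("-form-of")) {
      // "plural-form-of" -> "plural"; the bare "form-of" marker is skipped
      info.form = cls.slice(0, -"-form-of".length);
    }
  }
  return info;
}

// Proposed scheme: read the lang attribute and a data- attribute directly.
// The attributes are passed as a plain object so the sketch runs without a DOM.
function accelInfoFromAttributes(attrs) {
  return {
    lang: attrs["lang"] || null,
    form: attrs["data-accel-form"] || null,
  };
}

const oldStyle = accelInfoFromClasses("Latn form-of lang-en plural-form-of");
const newStyle = accelInfoFromAttributes({
  class: "Latn",
  lang: "en",
  "data-accel-form": "plural",
});
console.log(oldStyle, newStyle);
```

Both calls yield the same language and form, but the attribute version needs no string parsing of class names.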

What do people think of this change? @Dixtosa in particular. —CodeCat 12:54, 19 August 2017 (UTC)

Looks a bit cleaner. Equinox 12:56, 19 August 2017 (UTC)
Looks cleaner, yes, but I do not see any other benefit... yet. --Dixtosa (talk) 13:21, 19 August 2017 (UTC)
I very much like this, even if there are no benefits besides the neatness. — Eru·tuon 23:46, 19 August 2017 (UTC)
@Dixtosa Can MediaWiki:Gadget-WiktAccFormCreation.js line 20 be modified to $('.form-of a.new').each(function(){, and line 23 to var formof_classnames = $(this).closest(".form-of")[0].className.split(' ');? This will allow elements other than span to contain the acceleration information, which facilitates step 1. —CodeCat 11:22, 20 August 2017 (UTC)
Step 1 has been completed, and the link on bar now looks like this: <b class="Latn form-of lang-en plural-form-of" lang="en">[[bars]]</b>. Step 2 can now be implemented. It might be as simple as putting something else in line 76 of MediaWiki:Gadget-WiktAccFormCreation.js, but I'm not sure what. The way the code is currently written, it only passes the classes to the function (the details parameter), not the wrapper element itself. Would replacing line 76 with lang: $(link).closest("[lang]").attr("lang"), be sufficient? Perhaps the code should be restructured so that the element itself is passed around instead of only the classes, but I will leave that to Dixtosa to implement. —CodeCat 18:34, 20 August 2017 (UTC)
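The lookup suggested for line 76 relies on jQuery's `closest`, which starts at the element itself and walks up through its ancestors until one matches the selector. A minimal sketch of that behaviour without jQuery or a DOM, using plain objects as stand-ins for elements (the `makeNode` and `closestLang` names are hypothetical):

```javascript
// Stand-in for a DOM element: a tag name, its attributes, and a parent
// pointer. Hypothetical structure, used only so the sketch runs anywhere.
function makeNode(tag, attrs, parent) {
  return { tag: tag, attrs: attrs, parent: parent || null };
}

// Roughly what $(link).closest("[lang]").attr("lang") would return:
// check the node itself, then each ancestor, and take the first lang found.
function closestLang(node) {
  for (let cur = node; cur !== null; cur = cur.parent) {
    if (Object.prototype.hasOwnProperty.call(cur.attrs, "lang")) {
      return cur.attrs.lang;
    }
  }
  return undefined; // no element in the chain carries a lang attribute
}

// The markup from step 1: <b class="Latn form-of ..." lang="en"><a class="new">
const wrapper = makeNode("b", { class: "Latn form-of plural-form-of", lang: "en" });
const link = makeNode("a", { class: "new" }, wrapper);

console.log(closestLang(link));
```

If this matches how the gadget is structured, the `lang-en` class would no longer be needed once the wrapper's own `lang` attribute is read this way.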

Disambiguate Wikisaurus (thesaurus) entries by language

So wikisaurus:juoppo -> wikisaurus:fi:drunkard, wikisaurus:drunkard/Finnish or wikisaurus:drunkard/fi and

wikisaurus:insane -> wikisaurus:en:insane, wikisaurus:insane/en or wikisaurus:insane/English

Whether we use English or native words in the pagetitle, collisions would quickly happen as soon as someone added non-English words (which they may have refrained from out of uncertainty). I personally prefer the first scheme, because it is similar to what we use for topical categories like Category:de:Graph theory and because it does not imply the existence of useless superpages (parent page? root page? the opposite of a subpage). As for using native versus English words: the WS entry is tied to meaning seen as abstract from specific words, so I do not see why we should not use English. Are there any large synonym groups that cannot be succinctly expressed in English?__Gamren (talk) 13:23, 19 August 2017 (UTC)

I prefer Wikisaurus:English/drunkard, following the same scheme as Rhymes pages. —CodeCat 13:26, 19 August 2017 (UTC)
I prefer to keep the current Wikisaurus setup for its simplicity until it becomes obvious that collisions are an actual problem. --Dan Polansky (talk) 14:02, 19 August 2017 (UTC)
Here are some strings that might be expected to have many synonyms in more than one language: god (Danish/English), person, gut (Nynorsk/German), pen (Welsh/Norwegian/Mindiri/Mapudungun). Is it obvious yet?__Gamren (talk) 14:18, 19 August 2017 (UTC)
From what I have seen, collisions have not become an actual problem yet. Currently, we cater for collisions by being set up for multiple languages per Wikisaurus page, on the model of the mainspace. If you start expanding the Danish part of Wikisaurus and you run into obstacles preventing you from productively expanding that part, we can see how to best remove them. --Dan Polansky (talk) 14:27, 19 August 2017 (UTC)
For reference, one of the subject home pages: Wiktionary:Wikisaurus#Multilingualism. One past discussion: Wiktionary:Beer_parlour/2009/March#Wikisaurus_-_non-English_entries - here, a suggestion was made that would lead to wikisaurus:fi:juoppo. --Dan Polansky (talk) 14:39, 19 August 2017 (UTC)
I have now edited WS:god and created WS:da:beautiful (I would be fine with CodeCat's suggestion above, as well).__Gamren (talk) 18:25, 19 August 2017 (UTC)
I support having pages like Wikisaurus:English/drunkard, per CodeCat. It would be consistent with rhymes and reconstruction pages. --Daniel Carrero (talk) 12:42, 21 August 2017 (UTC)
I also think that would be the best format. Andrew Sheedy (talk) 04:23, 25 August 2017 (UTC)
WS:Danish/pain, WS:Danish/furthermore, WS:Danish/villain.__Gamren (talk) 15:19, 10 September 2017 (UTC)
But inconsistent with topical categories. --Dan Polansky (talk) 19:14, 10 September 2017 (UTC)
I'm puzzled; the proposal is to have pages for the x language synonyms of the x language translation of an English word? — Eru·tuon 18:12, 10 September 2017 (UTC)
I don't like this change in practice. If such a change should take place, let me note that language codes are slightly winning over full language names in Wiktionary:Votes/2017-07/Rename categories; WS:da:villain would be more in keeping with that than WS:Danish/villain. However, perhaps the opposes in that vote are not only about language code vs. language name. I seem to prefer WS:da:villain over WS:Danish/villain; what I prefer most is the continuation of the preexisting practice in many non-English Wikisaurus entries, including WS:příbuzný, Finnish: WS:juoppo, French: WS:chat, Hindi: WS:मनुष्य, Polish: WS:gruchot, Portuguese: WS:duradouro, and Telugu: WS:కుక్క. --Dan Polansky (talk) 19:14, 10 September 2017 (UTC)
The current practice is supported by {{ws}} in that the template automatically links to a Wikisaurus entry if it exists; thus, in WS:bird, there is "[WS]" next to "passerine", linking to WS:passerine; this works automatically with current practice for both English and non-English WS entries, but will require additional markup with the proposed new practice. --Dan Polansky (talk) 19:20, 10 September 2017 (UTC)
@Erutuon No, this proposal is about changing the pagetitle of WS entries, such that entries in different languages are on different pages. Please read what I initially wrote.__Gamren (talk) 18:49, 11 September 2017 (UTC)
@Gamren: I did read it, but I am responding to the fact that you want to move the synonyms of juoppo to a page titled with an English translation drunkard. I was describing what this means: having a page for synonyms of a translation of an English word. (In this case, a page for Finnish synonyms of a translation of the English word drunkard.) It is kind of confusing as a concept. It seems similar to having an expanded table of translations for the various senses of an English word. So it would be less confusing if it were titled Translations:fi:drunkard.
It would get more confusing when the word (a given series of letters) has several different meanings in English and in the language of the page. How do you tell people that they can't add senses of the word in the language of the page? For instance, that WS:de:fast should not contain synonyms for a German meaning, "almost", but can contain synonyms for an English meaning, "moving at a high speed".
And which English-titled page should the content of a non-English-titled page be moved to, if it contains multiple senses that do not all belong to a single English word? (I do not know if this case exists because I haven't worked on Wikisaurus pages much, if at all.) — Eru·tuon 19:15, 11 September 2017 (UTC)
We use {{ws sense}} to specify the sense, so there should not have to be doubt as to what sense a page uses. But I do not really mind native-language titles.__Gamren (talk) 07:19, 14 September 2017 (UTC)
I have renamed the discussion from "Disambiguate WS entries by language" to "Disambiguate Wikisaurus (thesaurus) entries by language" to make it easier to find later. --Dan Polansky (talk) 08:21, 17 September 2017 (UTC)
Some more notes:
1) The Danish content of Wikisaurus:god can be at Wikisaurus:brillant; WS pages are often synonym rings, and thus you can often find a synonym for which the headword in Wikisaurus namespace is still unoccupied. This makes it possible to continue with reduced overlap even in the current setup. Admittedly, once a lot of languages get covered by Wikisaurus, it may be increasingly hard to ensure non-overlap, but we have not arrived at that stage yet, so we do not know. Alternatively, the English content of Wikisaurus:god could be at Wikisaurus:deity.
2) If editors insist/very much desire that there is a guaranteed non-overlap, the naming scheme proposed a long time ago should be considered: Portuguese Wikisaurus:duradouro would be at Wikisaurus:pt:duradouro, or there would be, for Danish, Wikisaurus:da:smerte (pain) instead of WS:Danish/pain. Then, the automatic "[WS]" hyperlinking as seen in WS:bird is easy to make work, by providing a language code to {{ws}} in entry markup. This scheme is dissimilar to the one used by categories and the one used by rhyme pages, but that is fine: Wikisaurus has problems different from those of categories and rhymepages, and therefore can use a different scheme, one tailored to meet its needs. Whether language codes or language names are used in this scheme is a separate choice to be made, and is relatively less important; Wikisaurus:Danish:smerte and Wikisaurus:Danish/smerte make it equally easy to make "[WS]" automatic interlinking work.
--Dan Polansky (talk) 08:26, 17 September 2017 (UTC)
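The naming schemes compared in this thread can be put side by side in a short sketch. This is an illustration only: the function names wikisaurus_title and colliding_titles and the scheme labels are hypothetical, and the sample entries simply reuse headwords mentioned in the discussion.

```python
def wikisaurus_title(headword, lang_code, lang_name, scheme):
    """Return the page title for one entry under a given naming scheme."""
    if scheme == "current":        # e.g. Wikisaurus:juoppo
        return f"Wikisaurus:{headword}"
    if scheme == "code-prefix":    # e.g. Wikisaurus:pt:duradouro
        return f"Wikisaurus:{lang_code}:{headword}"
    if scheme == "name-prefix":    # e.g. Wikisaurus:Danish/smerte
        return f"Wikisaurus:{lang_name}/{headword}"
    raise ValueError(f"unknown scheme: {scheme}")


def colliding_titles(entries, scheme):
    """Map each title claimed by more than one language to those languages."""
    claimed = {}
    for headword, lang_code, lang_name in entries:
        title = wikisaurus_title(headword, lang_code, lang_name, scheme)
        claimed.setdefault(title, []).append(lang_code)
    return {title: codes for title, codes in claimed.items() if len(codes) > 1}


# "god" is a headword in both Danish and English, as noted in the thread.
entries = [
    ("god", "da", "Danish"),
    ("god", "en", "English"),
    ("smerte", "da", "Danish"),
]
for scheme in ("current", "code-prefix", "name-prefix"):
    print(scheme, colliding_titles(entries, scheme))
```

Only the current scheme puts both "god" entries on one title. Either prefixed scheme separates them, which is why the choice between codes and names can be treated as a separate question.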

employment category?

Do we have a category for employment related terms like job title, trade union, severance pay, etc.? This would be eminently useful IMO. ---> Tooironic (talk) 02:16, 20 August 2017 (UTC)

English names for letters of the Arabic language

Do we include these? Our page Arabic script has a table of them, but they link to the Arabic letters themselves. SemperBlotto (talk) 04:47, 20 August 2017 (UTC) (I've just added the French zhâl - hope it's OK)

Category:en:Arabic letter names DTLHS (talk) 04:48, 20 August 2017 (UTC)
So my French term seems to be wrong - I can't figure out how to correct it. SemperBlotto (talk) 04:51, 20 August 2017 (UTC)
What do you mean wrong? DTLHS (talk) 04:54, 20 August 2017 (UTC)
We're probably missing the English names of some letters if you're concerned that it's a red link. DTLHS (talk) 04:56, 20 August 2017 (UTC)
Also some of the entries currently in Category:en:Arabic letter names are not letters; they are Arabic diacritics. Wyang (talk) 04:57, 20 August 2017 (UTC)
OK, I'll leave it alone - totally outside my comfort zone. SemperBlotto (talk) 05:00, 20 August 2017 (UTC)

Edits by

This IP user has been adding "Ancient Armenian" (= Old Armenian) terms as etymons for Modern or Ancient Greek terms, while deleting the old etymologies, as well as some other things. The etymologies are dubious: for all of them, because the terms were attested before the time of Old Armenian, and doubly so for some, because they are phonologically implausible (առասպել (aṙaspel) supposedly yielding μῦθος) or the etymology actually goes the other way (ῥινόκερως was calqued by ռնգեղջիւր (ṙngełǰiwr)). Not sure what to do here besides revert the edits, which I've done. It would probably be better manners to explain to him or her, but I don't feel like it. — Eru·tuon 06:42, 20 August 2017 (UTC)

I reverted an edit by this same idiot (using a slightly different IP) that added "ancient Armenian" to the etymology for an Old Armenian term- they're even further out there than you give them credit for. They're changing their IP, so I doubt they'll read anything you leave on their talk page, but it never hurts to try, I guess. That said, feel free to revert them- as far as I'm concerned, they're only one step removed from the vandals who randomly replace language headers with the names of their own languages. Chuck Entz (talk) 08:13, 20 August 2017 (UTC)


Hi there. I'm taking a month off Wiktionary to concentrate on things IRL. You won't be hearing from me at all. So, in the unlikely case that you see someone here who you think might be me, who's following my edit patterns or whatever, it won't be. Thanks. --WF on Holiday (talk) 17:49, 20 August 2017 (UTC)

BBC Pidgin English platform

An interesting source of vocabulary if anyone feels like taking up a new language (Category:Nigerian Pidgin language). DTLHS (talk) 18:50, 22 August 2017 (UTC)

Nigerian Pidgin English has a big problem with orthographic norms (i.e., they're all over the place). The BBC uses a very acrolectal orthography, but many sources use a phonemic orthography (like what we try to standardly use for Krio, which is an extremely similar language from Sierra Leone). —Μετάknowledgediscuss/deeds 18:56, 22 August 2017 (UTC)

Kingdom of Great Britain

Translation requests can no longer be added to this entry, I'm guessing because sense IDs are being used. How to fix this bug? ---> Tooironic (talk) 13:30, 23 August 2017 (UTC)

Category:etyl cleanup - cleanup going backwards?

This shows 109703 entries needing attention at the moment, a day or two ago it was 109682. Does this mean someone is still using {{etyl}}? DonnanZ (talk) 19:44, 23 August 2017 (UTC)

Probably. If the template still works and there are no obvious errors or warning messages people are still going to use it. DTLHS (talk) 19:48, 23 August 2017 (UTC)
Then maybe the template should be disabled to give us a chance. It's a terribly long job getting rid of it ({{etyl}}) anyway. DonnanZ (talk) 19:54, 23 August 2017 (UTC)
Could always make an edit filter that, at very least, flags entries which add {{etyl| to an entry. - TheDaveRoss 20:07, 23 August 2017 (UTC)
The person(s) still using it need(s) to be traced somehow. DonnanZ (talk) 14:37, 24 August 2017 (UTC)
I added AF71 to flag these with "etyl" so we can see who is using them. - TheDaveRoss 15:42, 24 August 2017 (UTC)
It's caught one fish already, User:Froaringus. DonnanZ (talk) 16:10, 24 August 2017 (UTC)
 :-/ Oops. Sorry! While I try to use inh, der, bor, from time to time I found etyl useful when there is a composite word... But if the idea is to reduce its presence obviously I won't use it again.--Froaringus (talk) 18:14, 24 August 2017 (UTC)
I use this. I have not been made aware of alternatives, nor that it is unacceptable to use it. Equinox 18:18, 24 August 2017 (UTC)
{{etyl}} should generally be replaced with {{der}}, {{inh}}, {{bor}}, {{cog}}, or {{noncog}}, depending on the situation. — Eru·tuon 18:34, 24 August 2017 (UTC)
Ditto. SemperBlotto (talk) 18:28, 24 August 2017 (UTC)
Personally I use {{der}} and {{cog}}. I don't believe in using {{bor}} or {{inh}}, in any case those two create excess bumph at the bottom of the page. DonnanZ (talk) 19:18, 24 August 2017 (UTC)
They only create "excess bumph" because we make them double-categorize, which can in principle be disabled if the community so decides. --Tropylium (talk) 21:18, 24 August 2017 (UTC)
There was actually a vote that decided the opposite. So apparently the majority wants this. —Rua (mew) 10:54, 25 August 2017 (UTC)
I'm also guilty...it seems to me there have been cases where none of the other templates fit, but my memory may be playing tricks on me. I can't remember exactly which entries they were. I wasn't actually aware that we were trying to get rid of {{etyl}} either. Andrew Sheedy (talk) 04:30, 25 August 2017 (UTC)
I originally had a problem with {{der}} when I didn't know the actual word it was derived from, until I hit on using a hyphen in the last field, e.g. {{der|nb|gml|-}}. DonnanZ (talk) 08:17, 25 August 2017 (UTC)
When you don't know the word, you're supposed to leave it empty. A hyphen means "no word belongs here", which is wrong. —Rua (mew) 10:55, 25 August 2017 (UTC)
If you know the language it comes from, at least that can be shown that way. Alternatively {{der|nb|gml|}} which brings up the question "term?" DonnanZ (talk) 11:03, 25 August 2017 (UTC)
Yes, that's the point. If you don't know the term, it puts down a notice so someone else can add it. —Rua (mew) 11:07, 25 August 2017 (UTC)
The first example is not necessarily "wrong", {{der|xx|yy|-}} can be used in other circumstances. DonnanZ (talk) 18:31, 29 August 2017 (UTC)
I still use it in my cleanup of Spanish entries because all I know is the Latin words they come from. Should I use der every time? Some people have mindlessly converted etyl into der in every case, but the issue still remains that we've changed the system so that it will always be clear whether terms are inherited or borrowed. Even if we deprecate etyl, we'll just have to populate cat:etyl_cleanup with entries using der but not bor or inh... It seems to me that we'll just have cleaner template parameters. Ultimateria (talk) 22:32, 29 August 2017 (UTC)
I get the impression that some users resent having a cleanup forced upon them, which may not have been necessary if it wasn't for the {{bor}} / {{inh}} brigade. I suggest that if you want to clear the list just use {{der}}, and let the purists sort out what are considered as borrowed and inherited terms later. The use of {{der}} involves fewer keystrokes than {{etyl}}, which is something in its favour. DonnanZ (talk) 09:06, 30 August 2017 (UTC)
That is not particularly helpful. That means that after {{etyl}} is "cleaned up", {{der}} will have to be cleaned up. Better to leave {{etyl}} or to learn where to use {{bor}} and {{inh}}. There are also cases where {{der}} can't be used: words that are not involved in an etymology, and have a hyphen in the second parameter of {{etyl}} (for instance, {{etyl|en|-}} {{m|en|word}}, which must be changed to {{cog|en|word}} or {{noncog|en|word}}). — Eru·tuon 22:33, 1 September 2017 (UTC)
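The hyphen case mentioned above is mechanical enough to sketch. The following is a rough illustration, not an existing bot or gadget: the function name convert_hyphen_etyl is hypothetical, it only handles {{etyl}} followed directly by {{m}}, and it assumes {{cog}} is the wanted replacement. Choosing {{noncog}} instead still needs human judgement, so the template name is a parameter.

```python
import re

# {{etyl|LANG|-}} immediately followed by {{m|LANG|TERM}}
HYPHEN_ETYL = re.compile(
    r"\{\{etyl\|([^|{}]+)\|-\}\}\s*\{\{m\|([^|{}]+)\|([^|{}]+)\}\}"
)


def convert_hyphen_etyl(wikitext, template="cog"):
    """Rewrite {{etyl|xx|-}} {{m|xx|term}} as {{cog|xx|term}}.

    Anything else, including ordinary {{etyl|xx|yy}} uses that need a
    choice between inh, bor and der, is left untouched.
    """
    def replace(match):
        etyl_lang, link_lang, term = match.groups()
        if etyl_lang != link_lang:
            return match.group(0)  # languages disagree: leave for an editor
        return "{{%s|%s|%s}}" % (template, etyl_lang, term)

    return HYPHEN_ETYL.sub(replace, wikitext)


print(convert_hyphen_etyl("Compare {{etyl|en|-}} {{m|en|word}}."))
print(convert_hyphen_etyl("From {{etyl|la|es}} {{m|la|verbum}}."))
```

The second call leaves its input unchanged, since deciding between {{inh}}, {{bor}} and {{der}} is exactly the part that cannot be done blindly.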
We can generate cleanup lists with the new templates to pick out some of the mistakes or omissions. For example, an etymology whose first etymology template is {{der}} is probably missing something; either it should be {{inh}} or {{bor}}, or a step has been left out. Conversely, {{bor}} can only appear as the first etymology template, and {{inh}} can only be preceded by other {{inh}}s. —Rua (mew) 22:41, 1 September 2017 (UTC)
@CodeCat: I suppose that would work in the majority of cases, but as it relies on the first step in the etymology coming first, there may be some cases that fall through the cracks. (Hypothetically; I can't point to any examples.) — Eru·tuon 20:47, 2 September 2017 (UTC)
Etymologies aren't always written in order (see bindiga, for example), and sometimes {{der}} is appropriate when something can neither be said to be borrowed nor inherited. —Μετάknowledgediscuss/deeds 22:35, 2 September 2017 (UTC)
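The ordering rules proposed for cleanup lists in this thread can be sketched as a simple check. This is a hedged illustration, not an existing tool: the function name check_etymology and the problem labels are hypothetical, and it only looks at the order in which {{inh}}, {{bor}} and {{der}} appear in one etymology section. As the replies above point out, etymologies are not always written in order, so the results are hints for a human reviewer rather than errors.

```python
import re

# Matches the opening of the three templates whose order is being checked.
ETY_TEMPLATE = re.compile(r"\{\{(inh|bor|der)\|")


def check_etymology(wikitext):
    """Return labels for orderings that the proposed rules would flag."""
    names = ETY_TEMPLATE.findall(wikitext)
    problems = []
    for index, name in enumerate(names):
        if name == "der" and index == 0:
            # Probably should be inh or bor, or a step has been left out.
            problems.append("der-first")
        if name == "bor" and index > 0:
            # bor is only expected as the first etymology template.
            problems.append("bor-not-first")
        if name == "inh" and any(prev != "inh" for prev in names[:index]):
            # inh is only expected after other inh templates.
            problems.append("inh-after-non-inh")
    return problems


print(check_etymology("From {{inh|es|la|verbum}}."))
print(check_etymology("From {{bor|en|fr|mot}}, from {{der|en|la|muttum}}."))
print(check_etymology("From {{der|en|fr|mot}}, from {{inh|en|la|muttum}}."))
```

Entries with an empty result are not guaranteed to be correct. The check only narrows down where an editor might look first.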

Lithuanian dalyvis participles

mylįs. Should these be categorized as lemmas or non-lemmas? DTLHS (talk) 22:11, 23 August 2017 (UTC)

Can we not give it an English name? —Rua (mew) 22:14, 23 August 2017 (UTC)
I have no idea. What would you call them? DTLHS (talk) 22:14, 23 August 2017 (UTC)
Adjectival participles. That's what Wikipedia calls them. —Rua (mew) 22:17, 23 August 2017 (UTC)
In any case, they are non-lemmas, because they are forms of verbs. The part of speech should be changed to Adjective though. —Rua (mew) 22:19, 23 August 2017 (UTC)

Please restore my admin rights

As title. It's been more than a year now. It's incredibly frustrating to not be able to delete new user vandalism or delete the original as I move entries with wrong titles. Wyang (talk) 09:38, 24 August 2017 (UTC)

Support, please give Wyang and CodeCat's rights back. --Daniel Carrero (talk) 10:27, 24 August 2017 (UTC)
Support per Daniel Carrero. --Anatoli T. (обсудить/вклад) 10:30, 24 August 2017 (UTC)
Oppose. The issue with Module:links still hasn't been solved. —Rua (mew) 10:49, 24 August 2017 (UTC)
That issue doesn't need to be resolved in order for the two of you to be admins, you just both need to agree to be reasonable adults and not wheel war when you disagree. There are lots of things on the wiki with which various admins disagree, but it is possible to disagree and also be willing to abide by the decisions of others. If what you are really saying is that you are incapable of having admin rights and acting reasonably, then I don't think you should have the admin rights. - TheDaveRoss 12:55, 24 August 2017 (UTC)
When I receive admin rights, I'm going to use them to fix problems, and that includes the problem currently in Module:links. If Wyang immediately reverts, aren't we back to square one? The issue won't be solved until Wyang agrees not to revert, and so far he hasn't. In the past, I asked for people to come to a consensus on what the desired situation is, but there is still no consensus, which means there is no decision to respect and everyone is free to do as they like. Which leads to the edit war. There must be a consensus decision before this will stop, I've said it before. —Rua (mew) 13:37, 24 August 2017 (UTC)
(Edit conflict) The sad part is that, if I were to make a shortlist of the people most qualified to solve this, these two would be at the top of the list. Both have continued to make valuable contributions for a year in spite of it all, so they do know how to rise above things- it's just when you put them together that things go wrong. The ideal solution would be for them to agree to leave the disputed part of Module:links alone for now, and to work on a solution that both can agree to. I had a proposal on the table that no one seemed to object to, but any workable solution would be great. Chuck Entz (talk) 13:47, 24 August 2017 (UTC)
I don't think that's fair. Wyang wants the status quo. I will respect the status quo if there is a clear consensus that it's how we want to do things. So far, I haven't seen any. The only reason it's the status quo is because Wyang got the last revert, not because it's what anyone wants. —Rua (mew) 13:50, 24 August 2017 (UTC)
To say it another way, I will only accept a consensus on which version is preferred, I will not accept Wyang's version unless that's also the consensus. Without consensus, I will only accept my version. —Rua (mew) 13:53, 24 August 2017 (UTC)
Support —Aryaman (मुझसे बात करो) 11:02, 24 August 2017 (UTC)
Oppose per CodeCat. DCDuring (talk) 13:03, 24 August 2017 (UTC)
Oppose --Victar (talk) 16:51, 24 August 2017 (UTC)
Support -- ɱɑɗɦɑѵ (talk) 17:09, 24 August 2017 (UTC)
Oppose. I'll repeat what was said in previous discussions, which is still true today: Neither admin has done or said anything to indicate that they will not resume the conflict when granted admin powers. Every one of the many attempts to resolve the conflict has failed magnificently. Perhaps we could restore their admin powers only on the condition that they will not make any edits related to the original conflict, but that might be difficult to enforce. --WikiTiki89 20:23, 24 August 2017 (UTC)
It can be enforced through a consensus, as I noted. —Rua (mew) 20:47, 24 August 2017 (UTC)
I think CodeCat (Rua) is right in demanding consensus. I praise her for keeping her position in the face of disagreement and would encourage her to continue. While I wish Wyang to have his admin powers restored too, I wish he was more willing to participate in consensus-building process one year ago concerning Module:links and I'd like him to focus on said consensus-building process if he wants to keep his changes on that module. --Daniel Carrero (talk) 21:11, 24 August 2017 (UTC)
I just want to point out that CodeCat is equally responsible for the failure to resolve the conflict. You can go back to the previous discussions and see for yourself. --WikiTiki89 21:16, 24 August 2017 (UTC)
OK, would you like to show us any diffs or quotes to back up that statement about CodeCat? I believe I did read those discussions in their entirety but I could have missed that part. --Daniel Carrero (talk) 21:20, 24 August 2017 (UTC)
I don't want to resurrect all the old arguments here. Go back to all the discussions and see for yourself. --WikiTiki89 14:38, 25 August 2017 (UTC)
I didn't envisage that this would transition into a vote... anyway, if what is needed is a guarantee of not editing in the relevant modules, then that can be provided on my part. The dispute is not going to resolve itself even if the current state continues for another ten years, but I'm not really bothered now- there are more important things to do in life and on Wiktionary compared to spending hours and days to argue anonymously and disinhibitedly in an online forum. Those debates were a waste of mental exercises to the detriment of collegiality, really. It's a shame that Wiktionary as an online platform has also largely given in to the shortcoming of general online disinhibition secondary to anonymous, asynchronous and non-face-to-face communication, as exemplified by the discussion above. Stances are more important than comments and opinions; they precede the latter and are boldified. There were numerous moments in the past year when I wished I could delete the original entries with incorrect titles as I moved them; those entries often stayed in the speedy delete category for days before someone heeded them. There are still numerous incorrect Pinyin entries from residual moves. Likewise Wiktionary:Requests for deletion has a backlog of Chinese, Thai, etc. requests which should have been cleared a long time ago. Reverting vandalism only to see it being readded moments later. Recently was hoping to fix the link in {{ja-suru}} too, but was unable to. Wyang (talk) 22:20, 24 August 2017 (UTC)
@Chuck Entz: It appears to me that we have reached a point where one party can agree not to engage in the conflict, and the other cannot ("Without consensus, I will only accept my version."). This to me suggests that Wyang can be trusted not to wheel-war, and therefore has achieved a level of readiness that CodeCat has not, irrespective of the resolution of the original conflict. —Μετάknowledgediscuss/deeds 00:15, 25 August 2017 (UTC)
To play the devil's advocate, this may be because the current version of the module is Wyang's preferred version. —suzukaze (tc) 02:28, 25 August 2017 (UTC)
The guarantee is “not editing in the relevant modules”, not conditional on the state of the pages. Wyang (talk) 02:50, 25 August 2017 (UTC)
Support. I'm not involved with this, but I'm a generally positive thinker (at least I'd like to say). PseudoSkull (talk) 00:20, 25 August 2017 (UTC)
Oppose. Leasnam (talk) 02:43, 25 August 2017 (UTC)
Support. I'd have preferred resolution of the specific underlying conflict, but Wyang's guarantee changes my vote. DCDuring (talk) 04:41, 25 August 2017 (UTC)
Support. In addition, perhaps Codecat could outline the two sides of their dispute so we can all understand what it's about. Then we might be able to reach a consensus. I think I knew something about it at one time, but I've completely forgotten. —Stephen (Talk) 05:29, 25 August 2017 (UTC)
I want to remove the phonetic_extraction part from Module:links. To compensate, some edits need to be made to Module:th-translit, so that it works like regular transliteration and doesn't need special module support. In general I'm opposed to coding language-specific exceptions into generic modules, especially as in this case it's completely unnecessary. —Rua (mew) 10:59, 25 August 2017 (UTC)
Support restoring the rights of the sides of the conflict, provided that they don't make direct edits related to that conflict. --Z 07:00, 25 August 2017 (UTC)
At the moment I think I oppose restoring rights to either party. While they may be willing to accept one outcome or another for this specific situation, it doesn't appear that the next similar situation which arises will have any different result. The appropriate response when two well-meaning individuals disagree is to stop taking action and engage in conversation. If no mutually agreeable solution arises then have a vote and abide by the results of that vote. If that amicable resolution path is not one that an individual is willing to follow then I don't think they should be given the power to override page protection. If one or both Wyang and CodeCat say that they are willing to resolve issues in "the wiki way" going forward then I would support the reinstatement of admin rights. - TheDaveRoss 15:00, 25 August 2017 (UTC)
Consensus is not the wiki way? I want a consensus I can abide by, but so far there isn't any. The state of Module:links is left to the last editor. I want a consensus on the state of the module, whether to include Wyang's code or to exclude it. I'll abide by a consensus as long as it's clear that there is one. —Rua (mew) 15:25, 25 August 2017 (UTC)
All of your statements in this conversation reinforce the idea that you are not willing to behave in a reasonable way. It is great that you are willing to abide by consensus, but you also say that you will immediately engage in further wheel-warring until that consensus is reached. As with all non-vandalism issues, if there is controversy the status quo remains until there is consensus to change the status quo. Your statements boil down to "I will do what I want until there is a vote to stop me" which is not at all how collaboration works. Being bold only extends to the point that someone objects. - TheDaveRoss 15:37, 25 August 2017 (UTC)
I objected to Wyang's addition of code to Module:links. When I removed it, it was reinstated by Wyang. So Wyang didn't try to find consensus for his additions: someone objected, and yet his edits remain. Since I still object to his edits, and there is no consensus to keep them or remove them, it's at my discretion to remove them again. —Rua (mew) 15:56, 25 August 2017 (UTC)
So you were both unreasonable. It takes two to make a wheel war. I think TheDaveRoss has hit the nail on the head with all of his remarks here. The reasonable, adult course of action here is to leave things as they are for now, come up with a compromise that both sides can be happy with, then remove the disputed code as part of implementing the new solution. Chuck Entz (talk) 16:33, 26 August 2017 (UTC)
@Chuck Entz, it was only in the beginning that they both were unreasonable. That's no longer the case. Only Codecat promises to resume her previous edits to the modules. Wyang has guaranteed “not editing in the relevant modules”, regardless of what Codecat does with them. Wyang's word can be trusted and his admin status should be reinstated. —Stephen (Talk) 09:25, 29 August 2017 (UTC)
@Chuck Entz Any decision? Or is it going to last till the end of time...? Wyang (talk) 00:00, 1 September 2017 (UTC)
Sorry. I've been trying to avoid heat stroke while taking public transportation in 107-degree weather (I think that's somewhere in the 41-42-degree range in centigrade), so I've been a bit pre-occupied lately (did I mention that an AwesomeMeos sock has been very active lately, and I believe an Uther Pendrogn sock has posted recently as well). I hope a long weekend in my air-conditioned apartment will give me a chance to sort all of this out. Chuck Entz (talk) 02:20, 1 September 2017 (UTC)
@Chuck Entz Any update for now? Please see history of the entry zoosexuality. Wyang (talk) 23:26, 18 September 2017 (UTC)
As said in my comment above, I'm not really bothered anymore. CodeCat is free to do whatever she wants to those pages. I would rather spend my time and energy more efficiently on other things than wasting it on this. Wyang (talk) 22:35, 25 August 2017 (UTC)
Tentatively Support, based on the character revelation above. Leasnam (talk) 16:43, 26 August 2017 (UTC)
Support for Wyang, but not for CodeCat/Rua, given their respective attitudes. Andrew Sheedy (talk) 17:26, 27 August 2017 (UTC)
  • Provisional support for both Wyang and CodeCat, or for neither. I oppose restoring admin rights for only one and not the other. And I support immediately desysopping both of them if the wheelwarring starts again. —Aɴɢʀ (talk) 20:55, 27 August 2017 (UTC)
    • CodeCat states that she will edit the modules in question as soon as she is reinstated. Wyang, however, has promised not to edit the modules again and will allow CodeCat to have her way. Since Wyang will not participate in wheelwarring, there will be no more wheelwarring. —Stephen (Talk) 00:56, 2 September 2017 (UTC)
  • If anything, CodeCat's remarks above made me realise that I should rather take a detour than have any interaction with her... Life is too short. Wyang (talk) 11:28, 28 August 2017 (UTC)
  • I oppose re-sysopping Wyang. The wheel war for which he was desysopped was pretty incredible, showing a severe lack of self-control or lack of understanding why a protracted speedy revert war or wheel war is bad. Furthermore, some of his remarks on this very page suggest that he does not support consensual decision making but rather decision making based on strength of argument, which all too often results in the holder of such a view making themselves the judge of which arguments are strong and which are not, resulting in non-consensual decision making. Wyang had plenty of time to draft a vote to help resolve the matter of disagreement with CodeCat, but he did not do that, and is unlikely to do so given his apparent opposition to votes. Wyang's adminship would provide undeniable benefits, but the things said seem to offset those benefits. --Dan Polansky (talk) 16:24, 1 September 2017 (UTC)

घॣ (ghl̥̄) and other Devanagari "Translingual" consonant + vowel combinations

These were created en masse by User:Thecurran, but there are so many things wrong on those pages:

  • The pronunciation is just the Sanskrit pronunciation, not a Translingual one. The pronunciations in modern languages are certainly not /ɡʱl̩ː/.
  • They are placed under the heading of Syllable, but they are not a list of syllables (cf. Category:Japanese syllables). They are simply a ‹consonant + vowel› combination; Sanskrit syllables are far more complex than these combinations.
  • The alternative forms section inappropriately makes use of {{pi-alt}} (a template listing the equivalents of a Pali term in other Brahmic scripts, relying on the backend of Module:pi-Latn-translit). As a result, most (if not all) of the forms produced are incorrect and the entry has been unheeded by others since creation. The author had also attempted to create Module:Deva-Latn-translit, but that was probably too ambitious.

Do we really need these entries? In this case I would argue that the benefit of absence of this information outweighs that of their presence. Wyang (talk) 11:13, 28 August 2017 (UTC)

I think they should all be deleted. घॣ (ghḹ) is definitely not Translingual, l̥̄ is unused in all of the Indic languages, Sanskrit included, since it was an artificial creation by grammarians. As for "syllables", क्षाउईएय्र् (kṣāuīeyr) is also a syllable, and I don't think anyone wants entries like that. —Aryaman (मुझसे बात करो) 11:36, 28 August 2017 (UTC)

What is the benefit of absence, Wyang? Warmest Regards, :)—thecurran Speak your mind my past 06:31, 8 September 2017 (UTC)

Do you know what it's like to decipher Devanagari without being taught it, Aryaman? In abugidas, many vowel signs inconsistently change in both position and appearance as they are applied to various consonants. For native readers these variations are readily apparent but, for non-native readers, each CV (consonant+vowel) cluster must be read as an independent grapheme. For non-natives to read or transcribe even a tiny sample text, they must isolate and look up each CV cluster as if it were from a syllabary, akin to Cherokee or hiragana. The process is so complex and tedious that humanity completely lost the ability to understand the Egyptian hieroglyphic syllabary for millennia before the Rosetta stone was found.

The reason for keeping ancient pronunciations is to allow one-to-one correspondence. Otherwise backwards compatibility is lost because accepting modern and historic mergers means that transcribing forwards and then backwards takes the user to a different wrong CV cluster than what they started with. Warmest Regards, :)—thecurran Speak your mind my past 05:31, 9 September 2017 (UTC)

@Thecurran: Frankly, don't give me any of that crap. My formal Hindi education ended when I was 6 years old, and so I never learned Devanagari at school. I taught it to myself about two years ago as I sought to relearn my native language, and it was absurdly easy. Second, Devanagari has changed very little since the age of Classical Sanskrit. The same conjuncts are used, although a few additional consonants have been accommodated in e.g. Hindi using the nuqta. Besides, we use slightly modified IAST for all Devanagari languages, so the phonemic outcome and these "mergers" and "shifts" don't really matter. Third, every piece of Devanagari in the main space is automatically transliterated with Lua anyways. Finally, do you really think someone who can't read Devanagari can type it in the search box... —Aryaman (मुझसे बात करो) 10:06, 9 September 2017 (UTC)
Logic has given way to emotion and this conversation has gotten impolite. It would probably be better if we paused this conversation to let things cool down for a fortnight or two. Warmest Regards, :)—thecurran Speak your mind my past 23:49, 9 September 2017 (UTC)
@Thecurran: Sorry for the excessive outburst, I just wanted to say that yes I know how hard it can be to learn a new script. The problem is why don't we do this for every other script then? Gujarati, Odia, Bengali, Sylheti, Kannada, Tamil, Telugu, Khmer, Thai... you get the point. Wiktionary should not be a character index or Unicode analyzer, it's a dictionary. —Aryaman (मुझसे बात करो) 04:01, 10 September 2017 (UTC)
As a non-native reader of Devanagari, deciphering and transliterating Devanagari was not difficult, compared to some of the other scripts. I'm not quite sure I understand your argument using the Egyptian hieroglyphs, but as an abugida, you just need to read Devanagari (or even just the diagram in Devanagari#Compounds) to know the basic mechanism of the script. As a dictionary, Wiktionary is not for including these units of the Devanagari script which are isolatable by the mouse cursor. There are many more complex scripts than Devanagari, such as Arabic, Burmese, Khmer, Thai and Tibetan ― we don't include Arabic لْ, Burmese ကြို, Khmer ក្បួ, Thai กั, Tibetan ཁྱུ as "Syllables", and I don't think we should, same for घॣ (ghḹ), or स्त (sta). (What is desirable, on the other hand, is a Unicode string analyser tool, which breaks a string into the Unicode characters and links to their entries.) Apart from this superfluity, what is also contributing to the benefit of absence is the erroneous content in these entries: the erroneous alternative forms, inappropriate header and pronunciation. Wyang (talk) 06:02, 9 September 2017 (UTC)

At the least, the abuse of {{pi-alt}} should be excised.—suzukaze (tc) 00:54, 18 September 2017 (UTC)
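
A rough illustration of the "Unicode string analyser" idea mentioned above: the sketch below splits a string into its code points, names each one, and builds a link to the corresponding entry. It is only a sketch under assumptions, not an existing tool. The function name analyse and the link pattern (en.wiktionary.org/wiki/ plus the percent-encoded character) are chosen here for illustration.

```python
import unicodedata
from urllib.parse import quote

# Assumed link pattern for single-character entries (illustrative only).
ENTRY_PREFIX = "https://en.wiktionary.org/wiki/"

def analyse(text):
    """Return one (code point, Unicode name, entry link) tuple per code point."""
    rows = []
    for ch in text:
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((f"U+{ord(ch):04X}", name, ENTRY_PREFIX + quote(ch)))
    return rows

# The conjunct sta is three code points: SA, VIRAMA, TA.
for code_point, name, link in analyse("\u0938\u094d\u0924"):
    print(code_point, name, link)
```

A reader who pastes an unfamiliar cluster would get the individual letters and signs it is built from, rather than needing a separate entry for every consonant + vowel combination.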

After over 90 years, 21-volume dictionary of Akkadian is completed and released for free (non-commercial)[edit]

https://oi.uchicago.edu/research/publications/assyrian-dictionary-oriental-institute-university-chicago-cad Justin (koavf)TCM 04:08, 30 August 2017 (UTC)

Thanks for sharing that. To be clear, the license they are using does not seem to be compatible with Wiktionary, so large-scale importing will not be possible. - TheDaveRoss 14:22, 30 August 2017 (UTC)
Just to be clear, that's old news. It's been up for a while. But thanks for spreading the awareness. --WikiTiki89 20:56, 31 August 2017 (UTC)

Germanic wa-stems[edit]

Can there be a category for Germanic wa-stems?
For Proto-Germanic, wa-stems could just be a-stems without any irregularity. But in MHG for example, terms derived from wa-stems (e.g. klē with gen. klēwes, and knie with gen. kniewes, knies) have a special declension.
There could be:

- 13:31, 30 August 2017 (UTC)

No objection from me. --Victar (talk) 17:31, 1 September 2017 (UTC)
Support. I don't see why not. —Stephen (Talk) 23:50, 1 September 2017 (UTC)
Support, since we already do this for e.g. Old Saxon. KarikaSlayer (talk) 23:27, 2 September 2017 (UTC)
No objection from me either...Support Leasnam (talk) 16:43, 6 September 2017 (UTC)

LDLs: Unusual spellings[edit]

For an LDL a single use is enough for attestation. But what if that use has an unusual spelling?
For example:

  • Low Germans usually use sch as in High German and Dutch (e.g. Low German Schipp and Minsch (in some dialects), High German Schiff and Mensch, Dutch schip and mensch, English ship and human). But there was at least one author who used sh (namely Robert Garbe who has Shinen, Shikksal).
  • Low Germans usually use s before l, n, m, w as in Middle High German (e.g. LG Snee and Swien, MHG snē and swīn, Dutch sneeuw and zwijn, Eng. snow and pig). But there was at least one author who used ß (Capital Sz) (namely Gloede in his Zutemoos who has Sznee, Szwien).

Should clearly unusual spellings be somehow excluded or should they be somehow normalised?
E.g. knowing Zutemoos' spellings, one could normalise his Sznee, Szwien etc. to Snee, Swien (for the lemma only, not in quotes of course). - 13:31, 30 August 2017 (UTC)

They should be included, but as rare/alternative spellings of the more usual spelling. —Rua (mew) 14:30, 30 August 2017 (UTC)
  • Whether or not a language uses normalised spellings for entries should be noted on its respective W:A page. Normalisations usually only apply to historic languages and those with foreign scripts. Korn [kʰũːɘ̃n] (talk) 15:00, 30 August 2017 (UTC)

Megleno-Romanian orthography[edit]

I'm interested in possibly adding some Megleno-Romanian vocabulary, but there doesn't seem to be a standardized orthography in use, and spellings vary considerably. Add to that the fact that very little has actually been written by the few remaining speakers themselves, and records are scant overall. The DEX Romanian etymological dictionary uses one orthography (which I think some of the few existing entries on English Wiktionary are based on), but it seems a bit inconsistent and probably differs from what native speakers and other sources may use, as it tries to approximate the Romanian spellings to offer linguistic comparisons, rather than being a serious attempt to use a certain system.

Then there's the orthography on the Romanian Wikipedia article for the language: https://ro.wikipedia.org/wiki/Limba_meglenorom%C3%A2n%C4%83, as well as the Omniglot site's brief entry on it: http://www.omniglot.com/writing/meglenoromanian.htm, which uses the wiki info.

I also noticed that there are a substantial number of Megleno-Romanian words added to the Occitan Wiktionary already https://oc.wiktionary.org/wiki/Categoria:Mots_en_meglenoroman%C3%A9s_eissits_d%E2%80%99un_mot_en_latin, interestingly, but they seem to use a slightly different orthography than these, based on Petar Atanasov, 1990, Le mégléno-roumain de nos jours, Balkan-Archiv, Neue Folge, Beiheft Band 7, Hamburg, Helmut Buske Verlag.

The main differences between these systems seem to be in the use of characters like -ḑ- for -dz- or -z- and -ț- for -ts-, as well as -ǫ-, -ă-, -ạ-, and inconsistencies in -l'- vs -ľ-.

I know there's not many people here who can probably answer this or help out, but just in case, does anyone have any input on what should be done in approaching this obscure and very endangered language? I suppose the same could also be said of the Istro-Romanian language here as well. Word dewd544 (talk) 20:36, 31 August 2017 (UTC)

Something similar to what we have currently is the orthography of Theodor Capidan's grammar and dictionary. As far as phonology goes:
  • a e i o u /a e̞ i o̞ u/. Same with the long vowels, only longer.
  • /ɐ/ occurs only in unstressed, initial position.
  • ę /æ/? /ɛ/? occurs in one dialect area as a variant of ea.
  • ǫ /ɔ/? /ɒ/? is the M-R cognate of Daco-Romanian â, ă.
  • Rising diphthongs are au̯ eu̯ iu̯ ou̯, and falling i̯a i̯e i̯o i̯u.
  • There's a palatal ľ as well as a velar ł, which mostly occurs at the end of words.
  • ts /t͡s/ < old c before e, i.
  • /t͡ʃ/ < old c after e, i.
  • /d͡ʒ/ < old g before e, i.
  • ń is /ɲ/.
This is just from the phonology section. KarikaSlayer (talk) 16:17, 1 September 2017 (UTC)

September 2017

September LexiSession: peace[edit]

An origami for peace!

The monthly suggested collective task is to make peace. September 21 is the International Day of Peace and October 2 the International Day of Non-Violence so it may be good to reinforce our content related to this topic.

By the way, LexiSession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, so that we can keep track of the evolution of LexiSession, which we started a year ago. I hope there will be some people interested in making some contributions! My plan is to try to draft a thesaurus on this topic, but picking good illustrations can be a nice challenge too. Noé 10:37, 1 September 2017 (UTC)

Hey, the International Day of Peace is today. I'm quite happy to make some publicity about the new thesaurus about peace in French! Noé 13:41, 21 September 2017 (UTC)

Addition to WT:Wikidata policy[edit]

I propose to add a 5th case in which pre-approval isn't necessary: in existing and in-use templates/modules, where the use of Wikidata does not have any effect on the output. This would allow us to do things like tracking and stuff, for testing and to explore potential new uses and the effects. —Rua (mew) 14:41, 1 September 2017 (UTC)

To repeat what I said in Wiktionary talk:Wikidata#A first experiment, I support adding this 5th exception. I think it's consistent with the spirit of the other exceptions, to allow the possibility to access Wikidata without affecting the actual content presented to readers. --Daniel Carrero (talk) 21:49, 9 September 2017 (UTC)
I've been bold and added it to the Wikidata policy page, since nobody else has shown an interest in this discussion. —Rua (mew) 19:03, 15 September 2017 (UTC)

Old Kurdish?[edit]

Please forgive my ignorance, but what is the generally accepted name for the parent form of all the Kurdish sublects? Old Kurdish? Proto-Kurdish? Is "Old Kurdish" attested and/or reconstructable?

Also, I noticed we have quite a few entries with the language code ku. Do these need to be sorted out and moved to their respective lect codes or are these entries with identical orthography across all three lects? --Victar (talk) 20:55, 1 September 2017 (UTC)

We have discussed this before, although I'm not sure where. @-sche might have a link handy. Our entries in ku are nearly all Kurmanji in Latin script, although we also have a specific code for Kurmanji. I think we would be best off committing to a single approach, and use a modified version of {{fa-regional}} to link between dialects. —Μετάknowledgediscuss/deeds 21:15, 1 September 2017 (UTC)
Pinging @Calak, who seems to be knowledgeable in all of the Kurdish dialects. —Aryaman (मुझसे बात करो) 02:42, 2 September 2017 (UTC)
Proto-Kurdish is attested.
I always use ku code when a word is common in all of the Kurdish dialects. For example the common word for "goat" in Kurdish is "bizin"; why should we separate dialects and write "bizin" four times?! ku code means Kurdish language with all its dialects.--Calak (talk) 09:03, 2 September 2017 (UTC)
Because it can get confusing where there's Southern Kurdish one place and Kurdish another, and it's doubly confusing once someone has set computers at the job and there's just raw lists of which entries have Southern Kurdish translations and which don't, without any note of Kurdish in the vicinity.--Prosfilaes (talk) 22:52, 4 September 2017 (UTC)

Proposal: install mw:Extension:PageNotice[edit]

This extension makes it possible to add headers to pages. It would mean we no longer need to add {{reconstructed}} to every reconstruction page. —Rua (mew) 12:19, 2 September 2017 (UTC)

French Wiktionary August news[edit]

Hey! The August issue of Wiktionary Actualités just came out in English!

What's up in French Wiktionary? And in the other Wiktionaries? What is a Magic Link? Are there statistics somewhere? German words that may exist? Videos? Details on tantum categories? Nice paintings from French artists? Clowns? Yes, all of this can be found in August Actualités!

As usual, it was translated into English by non-native speakers, in less than a day, and it is not perfect, but it can be improved by readers (wiki-spirit). We are very happy to celebrate a year of English translations! Twelve issues! That's not bad considering that we do not receive any money for this publication and we are not supported by any user group or chapter. It is written only by the community, and there were eleven participants for this issue! We all remain eager to receive your opinion on our publication! Noé 21:24, 2 September 2017 (UTC)

Egyptian hieroglyphics[edit]

Why are you using the HTML tag for Egyptian hieroglyphics instead of the Unicode characters? While looking at Help:WikiHiero syntax I read that the Unicode characters are only partially supported, so I guess that's why (I found the page while writing). What is missing for them to be fully supported, and when might those missing pieces be added or fixed? This turned out to be a more "I don't know anything" question than I meant... sorry 'bout that. Jonteemil (talk) 23:34, 2 September 2017 (UTC)

You answered your own question: Unicode can't support all of what we want to show. WikiHiero not only displays the characters correctly, it also does so regardless of whether the reader's computer can support special fonts (hint: most can't) and allows for flexibility in stacking, which lets us show how the language's native speakers chose to organise hieroglyphs spatially. Moreover, Egyptian dictionaries are conventionally organised by romanisation. There is no expectation that Unicode will ever fix this, which is unsurprising given that it is a long-extinct language with no use community, so our current solution is the best way to handle Egyptian going forward. —Μετάknowledgediscuss/deeds 23:45, 2 September 2017 (UTC)
Actually, positioning hieroglyphs properly might be in the works. —suzukaze (tc) 07:16, 3 September 2017 (UTC)
Fingers crossed! — Ungoliant (falai) 17:54, 4 September 2017 (UTC)
Oh, I hadn't seen this version. That's interesting, although I'm not sure it outweighs the other concerns. (E.g. that Egyptian spelling is so erratic that if people search by hieroglyph, we'd have to create entries for all the plentiful alternative spellings (and arrangements that aren't even truly alternative spellings, but have different Unicode control characters) because they'd never be able to guess what spellings we'd lemmatised.) —Μετάknowledgediscuss/deeds 18:02, 4 September 2017 (UTC)


Discussion moved from Wiktionary:Tea room/2017/September#User:TNMPChannel‎.

I just blocked them for three days for creating an entry in Vietnamese- a language they don't claim to know- by plagiarizing the definitions (without attribution) from a Chinese entry that shares the same character. This is not just dishonest, it's a copyright violation and a violation of our Creative Commons license.

It's also part of a pattern of poor judgement that I've been concerned about for a while: indiscriminate mass creation of articles from a single source without checking for attestability. Creating entries, then immediately rfding them (within minutes). Submitting one of their new entries to rfc because no one else had intervened to fix it yet. Moving a category without understanding enough about our categories to have a clue whether it was a good idea (it definitely wasn't). In general, doing stuff without knowing what they were doing, then expecting others to fix it.

I may be wrong, but to me this all looks like a child who's too young to understand the implications of what they're doing, and is used to grownups stepping in and fixing things. If not, then something is really, really wrong.

At any rate, we need to decide what to do about this- I've only blocked them for three days, and they will have read this by then. Wikis aren't all that good at dealing with contributors who sincerely believe they're helping, but don't know what they're doing. What do you think we should do? Chuck Entz (talk) 05:55, 3 September 2017 (UTC)

I think he/she should be unblocked for now. The first reminder or warning to the user about making errors in unfamiliar languages was by Justin on their talk page at 03:54, 3 September 2017 (UTC), and they haven't made similar edits after the message. From what I observed in their Chinese edits: he/she seems to be quite unfamiliar with our formatting system, though I can see they are trying to improve, and I have also received 'thanks' for my subsequent edits to their created entries. The entries have been quite useful too. It's not very often that we get new users who are native in E/SE Asian languages, so I'm more inclined to fix their new edits than discourage them. Of course, if it persists despite explanation and warning, then blocking would be indicated. Wyang (talk) 06:01, 3 September 2017 (UTC)
The plagiarism part merits at least a day. Chuck Entz (talk) 07:16, 3 September 2017 (UTC)
Sure. However, I suspect (or hope) that it may just be part of their cluelessness, rather than malicious disruption or infringement. I do hope that they could return, to help with Chinese idioms and Malay entries. Wyang (talk) 08:12, 3 September 2017 (UTC)
I find their constant page moves very worrying. They seem to misunderstand how everything works here. Maybe time should be taken to explain what's wrong on their talk page. —suzukaze (tc) 06:02, 3 September 2017 (UTC)
About 2.5 hours ago this user placed {{unblock|I promise I won't do it again.}} at User_talk:TNMPChannel#Unblock. I think we need some specific, extensive acknowledgement of what won't be done again. (Also something in the documentation that requests a complete confession or allocution before the user is entitled to the request being considered.) DCDuring (talk) 12:25, 3 September 2017 (UTC)
Also, please check if it's not an Awesomemeeos sock. --Anatoli T. (обсудить/вклад) 12:53, 3 September 2017 (UTC)
Completely different. Awesomemeeos gets all the technicalities perfect but has trouble with basic common sense. This person can't get either right. Awesomemeeos simply doesn't have the self-awareness and self-restraint to pull off an impersonation like this- for one thing, they're compulsive about upgrading templates. Chuck Entz (talk) 14:32, 3 September 2017 (UTC)

Translingual terms listed under descendants[edit]

Prompted by this seemingly innocent kerfuffle, it makes me wonder if it is I who am in the wrong, or if I'm onto something. Should translingual terms be listed as descendants? I would say no, because they're mainly taxonomic terms made up of Latin terms, not natural descendants. --Robbie SWE (talk) 17:49, 4 September 2017 (UTC)

Would you support the taxonomic terms being listed as ==Latin== instead of ==Translingual==? DTLHS (talk) 17:59, 4 September 2017 (UTC)
Hmm, I'm kind of slow today. Mind giving me an example? --Robbie SWE (talk) 18:02, 4 September 2017 (UTC)
Since you say that they are "taxonomic terms made up of Latin terms" and favor derived terms instead of descendants, it would be consistent to just call them Latin instead of "Translingual". DTLHS (talk) 18:04, 4 September 2017 (UTC)
But we already have translingual taxonomic terms. The examples I was given were bombyx#Descendants, accipiter#Descendants, aequoreus#Descendants and alauda#Descendants. I don't believe they should be listed as descendants. --Robbie SWE (talk) 18:19, 4 September 2017 (UTC)
You do not understand. If you want them to be derived terms why do you still want to call them "Translingual"? DTLHS (talk) 18:23, 4 September 2017 (UTC)

I don't want them there at all - for instance, Bombyx shouldn't be under descendants nor should it be under derived terms at bombyx. --Robbie SWE (talk) 18:47, 4 September 2017 (UTC)

Why shouldn't there be some link in the Latin section to the taxonomic term that is derived or descended from it? Why would we omit the connection? DCDuring (talk) 22:17, 4 September 2017 (UTC)
@DCDuring, I understand what you mean. The reason why I would opt for not listing them under descendants at all is because translingual isn't a language per se. According to our guidelines – [l]ist terms in other languages that have borrowed or inherited the word. The etymology of these terms should then link back to the page. – I don't think that translingual terms fall under this category. I looked through this category and a vast majority of them are not listed under descendants in their original Latin entries. --Robbie SWE (talk) 08:30, 5 September 2017 (UTC)
@Robbie SWE: In the case of CJKV characters clearly we are dealing with a script, not a language. In the case of other Translingual items we are dealing with items that are used in multiple languages. If we don't have Translingual descent shown for taxonomic names, then in principle we should show the taxonomic name as a descendant in each language in which the taxonomic name is used. This seems silly at best. DCDuring (talk) 14:34, 5 September 2017 (UTC)
"Should translingual terms be listed as descendants?"
By common practice it's done that way, and so you were "in the wrong". But the question now is asked with a "should", which is another question and topic. So what possibilities are there?
  • Don't list translingual descendants at all.
    --- This probably is not a good choice.
  • List them as derived terms.
    --- By WT:ELE#Derived terms that would only be possible if translingual terms would be mislabelled Latin or if Latin would be mislabeled translingual or if both would be merged into a single pseudo-language 'Translingualolatin' (or whatever the name would be). This probably is not a good choice either.
  • List translingual terms as descendants.
    --- Why not? I can't think of any contra reasons. Pro reasons: In a translingual entry it would also be for example "From Latin TERM", and descendant is descriptive.
  • List translingual terms at see also.
    --- This does also depend on the question of what can be listed at see also, and apparently there are different views about it. If different-language terms can be listed at see also: Why not? The only reasons I could think of would be that descendants (maybe cp. descendant#Noun) sounds fitting and is more descriptive and informative. On the other hand, as some translingual terms come with {{taxlink}} and link to the English wikispecies project and not to wiktionary, this would also be an acceptable choice.
- 02:52, 5 September 2017 (UTC)
{{taxlink}} "temporarily" links to Wikispecies (in all uses), with the hope that there will be a Translingual Wiktionary entry (unless it is decided that taxonomic names are Latin). The "See also" heading is for items that have no more specific heading, but in the past has been used for (true) external links and WM project links as well as for alternative forms, "coordinate terms", hyponyms, hypernyms, meronyms, etc. Placing the items now under "See also" under proper headings would be an excellent cleanup project (Augean stables?), but most of us are in pursuit of bright shiny objects. DCDuring (talk) 03:59, 5 September 2017 (UTC)

(literary or dialectal)[edit]

故此 is tagged as (literary or dialectal). I'd like to know whether, as the order of it seems to mean, 'literary' refers to Mandarin only, or also to the unspecified dialects implied by the tag. Is this way of inferring to be systematically understood for any such tags appearing in other entries for any other language? --Backinstadiums (talk) 21:06, 4 September 2017 (UTC)

As it is the entry is not comprehensible. @Wyang, what dialects are included in "dialectal"? DTLHS (talk) 05:17, 5 September 2017 (UTC)
Wiktionary:About Chinese#Key points: "Terms are defined in relation to Modern Standard Written Chinese. ... Senses limited to the literary language, certain dialects or regions should be marked accordingly." I changed the tag to "formal or Min Dong". I think the headers in our entries should link to the "About" pages somewhere, so that people are directed to a page which explains how a language is documented in entries on Wiktionary, also a page where people can leave their questions or feedback, or voice their interest in joining the editing team. Wyang (talk) 12:13, 5 September 2017 (UTC)
@Wyang: Isn't it uncommon for a term to be used in either formal registers or dialectal ones? --Backinstadiums (talk) 06:41, 7 September 2017 (UTC)
Not necessarily, especially if the reason for the disuse in the modern standard register is an innovation, which happens often in Chinese. Wyang (talk) 06:44, 7 September 2017 (UTC)
@Wyang: Could you please add some examples of such innovations for words used frequently in the language? thanks in advance. --Backinstadiums (talk) 14:58, 8 September 2017 (UTC)
Much of the variation in basic vocabulary (Appendix:Sino-Tibetan Swadesh lists) is due to innovations in the northern varieties of Chinese. Some examples include: (“mouth”) (later displaced by ), (“to eat”) (by ), (“to drink”) (by ), (“dog”) (by ), (“to stand”) (by ), (“he/she”) (by ). Apart from this kind of simple monosyllabic supplantation, another reason for the innovations is the process of polysyllabification, which occurred especially in the northern varieties out of the need for disambiguation, as a compensatory mechanism after the loss of many phonetic contrasts (e.g. tone) through sound changes. Examples include 石頭, (“seed”) → 種子. Wyang (talk) 04:30, 9 September 2017 (UTC)
@Wyang: Hi again, thank you for your examples but none is tagged as either "literary", "dialectal" or "literary or dialectal". Could we isolate the entries which have the tag "literary or dialectal", and then check which ones developed as innovation? --Backinstadiums (talk) 08:30, 9 September 2017 (UTC)
They don't have to be tagged. It's implied in the {{zh-dial}} boxes on those entries. Wyang (talk) 10:42, 9 September 2017 (UTC)
@Wyang: Do you mean having {{lb|zh|Min Dong}} link to the about page? — Eru·tuon 07:52, 7 September 2017 (UTC)
Not really, having <h2>Chinese</h2> link to the about page rather, in the style of fr:chinoise or something similar. Wyang (talk) 08:02, 7 September 2017 (UTC)
I suppose that could be done with JavaScript. Or with templates, if we decided to allow templates in headers like the French Wiktionary (less likely). — Eru·tuon 08:49, 7 September 2017 (UTC)

(talk)[edit]

Please someone block this. Wyang (talk) 04:40, 8 September 2017 (UTC)

@Wyang: Done, simply because I trust you. I want a reason, though — I looked over a few edits and they seemed fine, though I don't know any Uyghur. (Also, for future reference, this sort of thing can go at WT:VIP.) —Μετάknowledgediscuss/deeds 05:39, 8 September 2017 (UTC)
Hmm, is it just because it's an Australian who already knows how all our templates work? —Μετάknowledgediscuss/deeds 05:46, 8 September 2017 (UTC)
Works for me. It's not necessarily the quality of the initial batch of edits, but the camel's nose that will lead to high volumes of hard-to-check edits later on. Notice, for instance, their edits on {{Template:kk-decl-noun}}, which "coincidentally" continue edits by another IP back in July.
(Before E/C) @Wyang Hi. What edits are wrong? I have only checked some, haven't seen anything bad.

--Anatoli T. (обсудить/вклад) 05:40, 8 September 2017 (UTC)

@Atitarev you don't find it odd that an IP pops up out of nowhere and starts out by rewriting all the inflection templates- in both Kazakh and Uyghur? Chuck Entz (talk) 07:28, 8 September 2017 (UTC)
@Chuck Entz: Yes, it's suspicious; it could be a formerly blocked editor, but I didn't know that this is a reason for blocking, and none was given. Who was it, anyway? --Anatoli T. (обсудить/вклад) 07:37, 8 September 2017 (UTC)
AwesomeMeeos, of course. I'm starting to go through Category:Noun inflection-table templates by language. In the Adyghe subcategory, for instance, you'll find edits by an IP. Chuck Entz (talk) 07:57, 8 September 2017 (UTC)
He's quite inventive, LOL. --Anatoli T. (обсудить/вклад) 08:11, 8 September 2017 (UTC)

Adding accents to Italian headwords[edit]

I'm currently learning Italian and started to work on our Italian entries. I noticed that we don't display accents for irregularly stressed words. cavolo and diavolo for example are listed in other Italian dictionaries as càvolo and diàvolo (e.g. Treccani), because they don't follow the common stress pattern (next to last syllable). My suggestion is to add a headword parameter to those entries, {{it-noun|m|head=diàvolo}}. Or would that be confusing? Explicit parameter? Better alternatives? – Jberkel (talk) 09:04, 10 September 2017 (UTC)

The problem is that sometimes the accent is actually written, and there's no way for someone to tell the difference. —Rua (mew) 11:28, 10 September 2017 (UTC)
I have included the accent in the hyphenation as a possible solution: diavolo. --Vriullop (talk) 12:12, 10 September 2017 (UTC)
Hm, maybe that's a solution then, I think the information should go somewhere, and the pronunciation section is a good place. It's interesting that they don't bother using accents, even in ambiguous cases (e.g. pesca). – Jberkel (talk) 15:12, 10 September 2017 (UTC)
Theoretically, the "correct" solution would be to include an IPA pronunciation with a stress mark. But I like the accent in the hyphenation idea too. --WikiTiki89 18:06, 11 September 2017 (UTC)
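
To make the head= suggestion concrete, here is a minimal sketch of how a stress-marked display form such as diàvolo could be mapped back to the unaccented page title. It rests on a loud assumption that is not established in this thread: that only a word-final accent is part of the normal spelling (as in città), so accents anywhere else are treated as stress marks and removed. The function name page_title is hypothetical, and the sketch handles single words only.

```python
import unicodedata

# Combining grave and acute accents, as they appear after NFD decomposition.
STRESS_MARKS = {"\u0300", "\u0301"}

def page_title(head):
    """Drop grave/acute accents except on the last letter of the word.

    Assumption: a word-final accent is orthographic and must be kept;
    any other accent is only a stress aid for the headword line.
    """
    decomposed = unicodedata.normalize("NFD", head)
    last = len(decomposed) - 1
    kept = [
        ch for i, ch in enumerate(decomposed)
        if ch not in STRESS_MARKS or i == last
    ]
    return unicodedata.normalize("NFC", "".join(kept))

print(page_title("di\u00e0volo"))  # stress mark removed
print(page_title("citt\u00e0"))    # final accent kept
```

This does not settle the objection raised above that a reader cannot tell a stress aid from a written accent; it only shows that the two cases can be told apart mechanically if the final-accent assumption holds.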


This person keeps reverting my OED-sourced proper pronunciation of angstrom. Apparently he or she thinks that proper pronunciations must meet his personal litmus test of notability or whatever. Wiktionary exists for many reasons, one being a place where readers from Wikipedia can come to get the specifics of a word, an important part of that being proper pronunciations. I added this information for the specific reason of avoiding the continuation of ever-recurring arguments about the pronunciation of this word over at Wikipedia. The proper and the common pronunciations of a word do not always coincide. Their argument seems to be that [œ] does not exist in English, but a quick look over at Open-mid front rounded vowel can tell you that's not the case. In any case it is a Swedish loanword, and you can observe this vowel in the pronunciation of the Swedish version. I made sure to display the pronunciation I added [phonetically] rather than /phonemically/, and I did not mess with or remove the existing common phonemic English pronunciations, so I really don't see what the big deal is. I don't just pull this stuff out of my ass; I am well-versed in the relevant term and how IPA works. This information, while a bit obscure, could still potentially help someone. Don't get me wrong; any other day I'm all for excising superfluous crap from reference sources, but this is not one of those cases. It looks to me to be yet another age-old case of what we called “barracks lawyers” in the Army. Pariah24 (talk) 12:48, 10 September 2017 (UTC)

I would expect that pronunciation by English speakers would rarely be the same as "correct" or common pronunciation by Swedish speakers, especially for a word fully absorbed into English. At [[ångström]] we have the Swedish pronunciation. DCDuring (talk) 14:51, 10 September 2017 (UTC)
@Pariah24 The problem I have is that you don't state which accent says [ˈɔːŋstɹœm]. The LPD and CEPD, the most respected pronunciation dictionaries of English, list only the pronunciations with /ə/ and /ʌ/. We're both aware that there's no */œ/ phoneme in English, at least in most accents. Can you prove that accents that use [œː] for the NURSE vowel (or GOAT vowel, in the case of South African English) would use that vowel in 'angstrom'? I find that highly unlikely and so that argument just doesn't hold up without additional sources that would prove that. Mr KEBAB (talk) 15:10, 10 September 2017 (UTC)
@DCDuring We do, I added it there a few days ago per the LPD, which provides the Swedish IPA alongside RP and GA transcriptions. Mr KEBAB (talk) 15:10, 10 September 2017 (UTC)
The established practice on Wiktionary is to only put English pronunciations in an English entry. If no English speaker actually says the word angstrom with the pronunciation [ˈɔːŋstrœm], then that pronunciation should not be listed in the English entry. So the way to resolve this dispute is to find evidence that an English speaker says [ˈɔːŋstrœm], and to put that pronunciation in the proper context (is it rare, is it used by English speakers who also speak Swedish?). — Eru·tuon 17:48, 10 September 2017 (UTC)
I didn't notice that the pronunciation came from the OED. It is given as phonemic there: /ˈɔːŋstrœm/. On the one hand, I respect the OED; on the other, Wiktionary encourages verification of information taken from other sources, so it would be good to find out what they based this transcription on and whether it would meet our standards even if the OED didn't say it, and put it in proper perspective (that is, as above, who actually uses this pronunciation?). — Eru·tuon 18:02, 10 September 2017 (UTC)
screenshot for the plebs :) Pariah24 (talk) 03:29, 11 September 2017 (UTC)
Given its relative obscurity I think most would agree that finding a sample in the wild of a pronunciation of this term is pretty unfeasible. OED has never steered me wrong. If this were Wikipedia I wouldn't have even bothered with this nitpicky silliness, but it's a dictionary for Pete's sake, and I always lean toward the view that too much information is better than not enough, provided the source isn't questionable. I'm not over here deleting anything, I'm just adding. Pariah24 (talk)
This doesn't look like a query for an obscure word to me. —suzukaze (tc) 03:48, 11 September 2017 (UTC)
After all, there are many sources that say "X is a word that means Y" out there in the world, in particular other dictionaries. But they don't actually prove that people use a word, they just say they do, which really isn't sufficient. It wouldn't be the first time that dictionaries make up words that nobody has ever used!
suzukaze (tc) 03:40, 11 September 2017 (UTC)
Blockquoting, really? Pariah24 (talk) 03:43, 11 September 2017 (UTC)
If you want to play this game, here's a link for you. Pariah24 (talk) 03:47, 11 September 2017 (UTC)
Alright, you can have another too. —suzukaze (tc) 03:50, 11 September 2017 (UTC)
  • I, too, suspect this is a dictionary invention and I think it should not be in the entry without proof of actual use. Incidentally, I have a background in science (in the US), where normal use was /ˈæŋstɹəm/ or /ˈæŋstɹɑm/ (the latter of which I note is not in the entry). —Μετάknowledgediscuss/deeds 03:52, 11 September 2017 (UTC)
Not trying to offend but your anecdotal experience with a physics term can not possibly be comprehensive enough to be used as a valid argument for inclusion into a worldwide dictionary project. The only plausible way to attack my view that I see is to question the validity of OED and claim they would just put made-up bullshit into their dictionary, which is what you appear to be doing. Everything I know about OED from years of using it goes against this view. This is all getting really pedantic if you ask me. Sometimes it seems like all the worst parts of Wikipedia are magnified here. Pariah24 (talk) 04:24, 11 September 2017 (UTC)
If I understand the screenshot above correctly, the OED's entry for angstrom hasn't been updated since 1933. Nobody is saying that they insert made up bullshit into their entries. But their information may be outdated. DTLHS (talk) 04:28, 11 September 2017 (UTC)
Sadly, it may be bullshit. The OED makes things up a lot more than any of us would like, I'm afraid; you'll see that they're a frequent offender over at Appendix:English dictionary-only terms (a list of terms found in dictionaries that were never actually used enough to enter Wiktionary). —Μετάknowledgediscuss/deeds 04:31, 11 September 2017 (UTC)
If it needs to be removed based on this rationale I don't have a problem with it. I only have a problem with "hey guy I'm here to police you because I don't think you know what you're talking about." I'm rarely on the other side of these situations, because I almost never remove the content of others unless it obviously needs to go. Pariah24 (talk) 04:35, 11 September 2017 (UTC)
@Pariah24 Here we go again with misrepresenting what I say. I'm not interested in policing you (though you seem to strongly believe that, which isn't correct) but in the quality of Wiktionary. I removed that information and added sourced pronunciations (or sourced already existing ones, whatever it was) because I could see that the IPA was incomplete: it didn't say which accent says [ɔːŋstrœm], and I knew for a fact that neither RP nor GA speakers use [œ] in loanwords, not to mention native words. Do I really have to repeat myself over and over again? It seems like I do. I'm tired of you twisting my words and lying about my actions. Mr KEBAB (talk) 18:11, 11 September 2017 (UTC)
It's especially aggravating when you're someone who has already gone through pains to use good sources. Pariah24 (talk) 04:39, 11 September 2017 (UTC)
@DTLHS: The last quotation in the entry is from 1957, and the small text under the definition mentions a redefinition of the meter in 1960, so they must have updated at least those parts of the entry since 1933. — Eru·tuon 05:03, 11 September 2017 (UTC)
Nobody likes to say this, but we rely heavily on original research (for English at least) and generally don't give a shit what secondary sources say. Which is why you will encounter so much hostility if you try to support something based on what a dictionary says. DTLHS (talk) 04:40, 11 September 2017 (UTC)
I honestly never noticed the little update warnings off to the right, so thanks for that. Clearly my attention to detail needs some work. Pariah24 (talk) 04:47, 11 September 2017 (UTC)
However on a second look it does say Previous version OED2 (1989). I think it just means it was first published in 1933. Pariah24 (talk) 04:53, 11 September 2017 (UTC)
@DTLHS That's clearly not the case here. I sourced the IPA on angstrom, it's not OR. Mr KEBAB (talk) 18:11, 11 September 2017 (UTC)
FWIW, there have been other cases where dictionaries prescribe a pronunciation that we can't find in use; bon appétit is one. If more dictionaries than just the OED prescribed the pronunciation mentioned here (and if they were consistent, e.g. in saying it was an RP pronunciation), then it might be appropriate to add a note like the one in bon appétit. - -sche (discuss) 04:20, 19 September 2017 (UTC)
Isn't it ALWAYS true that a term originating in a FL will sometimes be pronounced by a limited number of English speakers as it is in the FL? Those who pronounce it as in the FL would be limited to those who knew the FL or were repeatedly "corrected". If the pronunciation is not fairly common, why should it be included? DCDuring (talk) 05:03, 19 September 2017 (UTC)

WM language guides[edit]

Following is text extracted from a message posted on many WM mailing lists:

Many wikis in the Wikimedia world give editors suggestions about the correct usage of each respective language: orthography, register, punctuation, and so on.

I started a page to list such language guides: https://meta.wikimedia.org/wiki/Language_guides

I added a bunch of links to Hebrew there because that's my home wiki. I also added a few pages that I could find for Catalan, Indonesian, Russian, and Bosnian.

Please add your languages there! Surely there are dozens and dozens of missing links there.

Before you ask: The linked page explains why Wikidata is not very convenient for maintaining such a list, but if you think that you can put this nicely in Wikidata, be bold.

Thank you!

-- Amir Elisha Aharoni

Note that this is not the same as article style guides. I would think folks here would be interested. Potentially English usage guides would be a valuable resource and link target for us. Perhaps some of our usage notes and other material would be useful material for such usage guides in many languages. DCDuring (talk) 14:43, 10 September 2017 (UTC)

Erm, it appears that these are, in fact, just style guides... —Μετάknowledgediscuss/deeds 18:32, 10 September 2017 (UTC)
I think the glass is about 1/4 full. At least, Croatian, Hebrew, Indonesian, and Polish include grammar and/or common spelling errors. Afrikaans has something on translation errors. DCDuring (talk) 20:03, 10 September 2017 (UTC)

Draft strategy direction. Version #2[edit]

In 2017, we initiated a broad discussion to form a strategic direction that will unite and inspire Wikimedians. This direction will be the foundation on which we will build clear plans and set priorities. More than 80 communities and groups discussed and gave feedback[strategy 1][strategy 2][strategy 3]. We researched readers and consulted more than 150 experts[strategy 4]. We looked at future trends that will affect our mission, and gathered feedback from partners and donors.

A group of community volunteers and representatives from the strategy team synthesized this feedback into an early version of the strategic direction that the broader movement can review and discuss.

The second version of the direction is ready. Again, please read, share, and discuss on the talk page on Meta. Based on your feedback, the drafting group will refine and finalize the direction.

SGrabarczuk (WMF) (talk) 10:12, 11 September 2017 (UTC)

Merge Proto-Nuclear Polynesian and Proto-Eastern Polynesian into Proto-Polynesian[edit]

The Austronesian languages suffer from what you might call matryoshka grouping: each group has a branch which then branches further, which then branches further, and so on. You end up with a lot of branches which don't have very significant differences, and a lot of different proto-language entries with very similar content. It's made more complicated by the fact that some of the branches are less well established than others. To reduce this somewhat, I propose merging Proto-Nuclear Polynesian (poz-pnp-pro) and Proto-Eastern Polynesian (poz-pep-pro) into Proto-Polynesian (poz-pol-pro). The differences between them are very small: each group is separated from its parent by only one or two sound changes, and the sound changes of individual languages are often more substantial than those separating the proto-languages. See *matuqa for example. Rapa Nui and Hawaiian are both in the Eastern Polynesian group, yet the former preserves the Proto-Polynesian form unchanged while the latter significantly changes it. Having separate entries for Proto-Polynesian *aka, Proto-Nuclear Polynesian *aka and Proto-Eastern Polynesian *aka is quite pointless. —Rua (mew) 22:47, 11 September 2017 (UTC)

So for some more background, what are the sound changes that supposedly differentiate PNP from PP, and PEP from PNP? --WikiTiki89 22:54, 11 September 2017 (UTC)
See w:Proto-Polynesian language. Nuclear has loss of *h and merging of *l and *r. Eastern also has *s > *h and partial loss of *q. —Rua (mew) 23:31, 11 September 2017 (UTC)
  • Oppose. We have terms that can be reconstructed to PNP but not to PPn. Why on earth would you make it impossible to enter reconstructible terms? —Μετάknowledgediscuss/deeds 23:53, 11 September 2017 (UTC)
    • Same reason we did it for Proto-Germanic or Proto-Uralic. —Rua (mew) 00:00, 12 September 2017 (UTC)
      So what do you do with words reconstructible to Proto-West-Germanic but not to Proto-Germanic? —Μετάknowledgediscuss/deeds 04:29, 12 September 2017 (UTC)
      • We reconstruct them for Proto-Germanic. Many linguists don't even believe in Proto-West-Germanic. —Aɴɢʀ (talk) 09:47, 12 September 2017 (UTC)
        • To clarify, we put the reconstruction in a Proto-Germanic entry and give it a context label of "West Germanic". --WikiTiki89 16:47, 12 September 2017 (UTC)
  • Oppose. There are two factors that differentiate Polynesian languages from Indo-European and other Eurasian language families.
    1. Polynesian phonotactics are extremely averse to consonant change: in any Polynesian language I'm familiar with, there's no such thing as a consonant cluster; every syllable begins with either a vowel or a single consonant followed by a vowel, and there are simply no final consonants. That means that any consonant change that does happen is really significant.
    2. Millions of square miles of open ocean. It is physically possible to walk from the Scandinavian Peninsula all the way to India, but not from Samoa to Hawaii. Proto-Germanic spread over a wide area, but contact between dialect areas prevented it from splitting up into separate branches, for the most part. There are Polynesian island groups such as the Hawaiian Islands and Rapa Nui where there has been a colonization event or two, but no other contact with the outside world- ever. There are parts of Polynesia where island groups are close enough to allow periodic contact, but there are also plenty of island groups where other peoples were a subject of oral history, but never actually encountered until the Europeans showed up. There again, patterns of sound changes are probably reflective of actual population movements, not of areal influence or borrowing- you can't have areal influences if people within an area have never had any contact with each other. Chuck Entz (talk) 01:36, 13 September 2017 (UTC)
      I don't get what your point is. You seem to be saying that we should group languages based on how different they theoretically could be rather than how different they actually are. To me it seems that what we need is to determine whether there actually are fundamental enough differences between PP, PNP, and PEP that it would be infeasible to treat them under a single language. I don't personally have an answer to this, nor do I have any evidence to share, but let's not base this on theory. --WikiTiki89 02:40, 13 September 2017 (UTC)
  • Support at minimum for inherited terms. Multiplication of entries like Proto-Polynesian *wai, Proto-Nuclear Polynesian *wai and Proto-Eastern Polynesian *wai is useless. These are nothing more than a waste of effort that makes things harder to maintain. Trying to document every possible reconstructed proto-language as if they were attested natural languages is not a part of Wiktionary's mission; it is w:scope creep and should be avoided.
    — I could agree either way on items that actually are reconstructible only for a smaller group of languages, though, such as Proto-Nuclear Polynesian *nui. Having these under their proper proto-language categories etc. is more exact; but on the other hand, keeping around two "tiers" of proto-languages seems more complicated than is really necessary, since the context label (dialect label would be more exact) approach works for almost all needs. --Tropylium (talk) 20:48, 18 September 2017 (UTC)

Category:Quotation templates to be cleaned[edit]

There are over eight thousand entries in this category. Does anyone know what we are supposed to do with them? SemperBlotto (talk) 19:27, 14 September 2017 (UTC)

It is a project of TheDaveRoss. Maybe he can tell you. DTLHS (talk) 19:28, 14 September 2017 (UTC)
The cleaning means changing {{quote-text}} to one of the specific quote templates, it is not at all urgent. - TheDaveRoss 19:57, 14 September 2017 (UTC)

"Morphologically from the root" IP editor[edit]

There seem to be a variety of IP users editing Arabic entries and, among other things, adding the text "morphologically from the root x". See for instance this and this and this and this. As you can see, there are several different IP addresses, but to me their editing style looks the same, particularly the tagline above in the etymology sections, so they might be a single very high-tech person who knows how to mess with IP addresses, or a conspiracy of people. Sometimes the edits are okay, aside from formatting (no newlines between sections, and second-level reference sections, for instance). Often they're replacing specific definitions or etymologies with generic morphological ones (with the above tagline), or replacing Arabic templates with generic ones. With the last example, أَعْلَى (ʾaʿlā), they've radically reformatted the entry in an unconventional fashion, with pronunciation sections above etymology sections that share that pronunciation. It makes sense, but it's something that needs discussion (and there's the deletion of valuable etymological material). On the whole, their edits are full of various things that need to be corrected.

Anyway, I don't know what to do about this. At least the tagline gives some way to find their edits. I'll leave it at that. — Eru·tuon 10:06, 15 September 2017 (UTC)

Proposed first use of Wikidata: categorising planets[edit]

A while ago, {{senseid}} was added to entries with Wikidata ids, but with no actual Wikidata access, it was mostly a formality. Now that Wikidata access has been enabled, I've done some experimenting in Module:senseid, detailed at Wiktionary talk:Wikidata#A first experiment. The experiment was meant to find out how to use Wikidata information to categorise entries. Categorising entries this way offers the advantage that we don't have to think about which categories something belongs in. As long as the data is present on Wikidata, and the appropriate IDs are added to entries on Wiktionary, the categories can be added automatically. Think of it as an {{auto cat}} for individual senses: just plop in the Wikidata ID and the template will figure out what needs to be done. This method can only work for "set" categories, which contain things belonging to a particular set, usually indicated in Wikidata with the "instance of" property. It doesn't work for categories that contain terms related to a particular thing/topic. Semantic relatedness is lexical data, which is not currently present in Wikidata. Thus, we'll still have to manually add "topic" categories like Category:Astronomy.

Because of the rule that uses of Wikidata should be approved first, I did this experiment by using tracking categories to stand in for actual topical categories. My finding was that in general it works quite well, but Wikidata handles certain things idiosyncratically which our modules need to take into account when "translating" the information. For example, many things that are combined into one category on Wiktionary have several different Wikidata entities, such as our Category:Planets of the Solar System, which has three corresponding Wikidata entities, outer planet of the Solar system (Q30014), inner planet of the Solar System (Q3504248) and planet of the Solar System (Q17362350). Wikidata makes frequent use of subclassing; writing code on our end to resolve sub/parent classes may help in these cases. Taxonomic data is also handled differently, with special taxonomic properties rather than the generic "subclass of" and "instance of" properties.
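The subclass handling mentioned above can be sketched roughly as follows. This is only an illustration in Python (an actual Wiktionary module would be written in Lua), and the parent links in `SUBCLASS_OF` are hand-written assumptions standing in for Wikidata's "subclass of" (P279) data, not fetched values; `is_a` is a hypothetical helper name.

```python
# Hand-written stand-in for Wikidata "subclass of" (P279) links.
# The parent links are assumed for illustration; they were not fetched.
SUBCLASS_OF = {
    "Q30014": ["Q17362350"],    # outer planet of the Solar System
    "Q3504248": ["Q17362350"],  # inner planet of the Solar System
    "Q17362350": [],            # planet of the Solar System
}


def is_a(qid, target, parents=SUBCLASS_OF):
    """Return True if qid is target or reaches it by following parent links."""
    seen = set()
    stack = [qid]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen:
            continue  # guard against cycles in the class data
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False
```

With something like this, a category would only need to name the most general entity (here Q17362350), and the more specific classes would be picked up by walking upwards.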

I would like to take a first step towards making it actually do something with the data. This needs to be approved in a discussion, so I hereby propose modifying Module:senseid/{{senseid}} to automatically place an entry into a language-specific Category:Planets of the Solar System, if it is given a Wikidata ID (Q followed by numbers) and if Wikidata indicates that the entity for this ID is a planet of the solar system. I'm choosing planets specifically because it's a very small set with exactly 8 known members, and the Wikidata data is known to be complete. This makes it easy to spot any problems if they arise. Extending the system for more categories is very easy; if you approve of doing it for more categories than just planets right from the start, please state so. —Rua (mew) 18:54, 15 September 2017 (UTC)
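As a rough illustration of the proposed rule (not the actual Module:senseid code, which would be Lua), the decision could look like the Python sketch below. The three entity IDs and the category name come from the discussion; the function name `planet_category` and its parameters are hypothetical, and the "instance of" values are assumed to have been retrieved already.

```python
import re

# The three Wikidata entities named above as corresponding to
# Category:Planets of the Solar System.
PLANET_CLASSES = {"Q30014", "Q3504248", "Q17362350"}

# A Wikidata ID is "Q followed by numbers".
QID_PATTERN = re.compile(r"Q[0-9]+")


def planet_category(lang_code, sense_id, instance_of):
    """Return the language-specific category name, or None if not applicable.

    instance_of is the list of "instance of" IDs for the entity; how it is
    obtained is outside this sketch.
    """
    if not QID_PATTERN.fullmatch(sense_id):
        return None  # an ordinary, non-Wikidata senseid: do nothing
    if PLANET_CLASSES.isdisjoint(instance_of):
        return None  # the entity is not recorded as a planet
    return f"Category:{lang_code}:Planets of the Solar System"
```

For example, `planet_category("zh", "Q313", ["Q3504248"])` would give `Category:zh:Planets of the Solar System`, assuming Q313 is recorded as an instance of Q3504248.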

I entirely approve and I especially like this example because it is something small and restrained (unlike e.g. adding it to every entry on a species) and it does include some possibly contentious data--i.e. the status of Pluto (which is not contentious to astronomers but is the sort of thing that could have actual individuals editing back-and-forth about it). —Justin (koavf)TCM 00:47, 16 September 2017 (UTC)
What benefit does this add? The members of the category are unlikely to change much over time, and if we did this it would require that every page which is not a planet checks to see if it is a planet, which seems like a lot of overhead. - TheDaveRoss 12:54, 18 September 2017 (UTC)
@TheDaveRoss: Your question answers itself: since this is a very small and stable use case, it will allow us to see in the wild how Wikidata integration would work. That is the value. —Justin (koavf)TCM 15:53, 18 September 2017 (UTC)
@Koavf: I don't disagree that this would be a good test, if it were the type of thing that I thought Wikidata was well suited for. I do not think that this is an example of a good use of Wikidata, however. If you have a small class of objects, you label the objects rather than querying all objects to see if they belong to that class. If you have a very large class, or one which changes often, then you query. - TheDaveRoss 11:39, 19 September 2017 (UTC)
It's not expensive at all to check this. All you need to do is retrieve the "instance of" property of the entity, and then check the IDs that you get for matches. IDs are just strings, so it's basic string matching, which is very fast. —Rua (mew) 15:48, 20 September 2017 (UTC)
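The check described here might be sketched like this, in Python for illustration only. The nested layout imitates the Wikidata entity JSON format, where "instance of" is property P31; `sample_entity` is a trimmed, hand-written example rather than real fetched data, and `instance_of_ids` is a hypothetical helper.

```python
def instance_of_ids(entity):
    """Collect the item IDs from an entity's "instance of" (P31) statements."""
    ids = []
    for statement in entity.get("claims", {}).get("P31", []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # skip "no value" / "unknown value" statements
        ids.append(snak["datavalue"]["value"]["id"])
    return ids


# Hand-written, trimmed imitation of an entity record (assumed content).
sample_entity = {
    "id": "Q313",
    "claims": {
        "P31": [
            {"mainsnak": {"snaktype": "value",
                          "datavalue": {"value": {"id": "Q3504248"}}}},
        ]
    },
}

# The matching itself is plain string comparison against known IDs.
is_planet = any(qid in {"Q30014", "Q3504248", "Q17362350"}
                for qid in instance_of_ids(sample_entity))
```

The per-entity work is indeed small; the sketch says nothing about the cost of doing it on every page that carries a senseid, which is the point raised in the replies.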
It is expensive due to the volume, not the task. - TheDaveRoss 17:37, 20 September 2017 (UTC)
Yes, but structured data will vastly decrease overhead in the long term. —Justin (koavf)TCM 17:41, 20 September 2017 (UTC)
Only if we were currently using any overhead at all for this sort of thing, which we are not. I agree that judicious use of Wikidata will decrease overhead. - TheDaveRoss 17:46, 20 September 2017 (UTC)
Categorizing things is overhead. Someone has to actually do it. —Justin (koavf)TCM 18:23, 20 September 2017 (UTC)
  • I too question the utility of this. The inline markup is unaesthetic, cryptic and really seems out of place, and it offers practically no additional benefit to the current categorisation system. Furthermore, it rests on the assumption that words in various languages are direct equivalents of one another; they are not, for most words in a language. Sure, Venus may be translated as 太白星 and listing it in the translations table at Venus is acceptable, but the words are far from being the same, really. There are several names for Venus in Chinese, each with a nuance in meaning, and the same situation exists for nearly every other planet and star. Wyang (talk) 13:38, 18 September 2017 (UTC)
If 太白星 doesn't mean Venus, then why does the definition say Venus? Subtle nuances in meanings should be included in entries. But in this case it's simple: either it refers to that same ball of rock floating around the sun, or it doesn't. —Rua (mew) 14:17, 18 September 2017 (UTC)
It means Venus, but only in a Chinese astronomy, astrology, or Taoist context. It is already indicated with the label in the entry. It is the Grand White Star in traditional Chinese astronomy/folk religion, governed by the Grand White Star Lord. It is conveniently translated as Venus, but its definition really should be elaborated on in the future. Venus in modern Western astronomy and in the context of Taoist Wu Xing (five elements) is called 金星, Venus in Chinese astronomy and astrology is called 太白(星), Venus in the context of Taoist mythology is called 太白金星, Venus seen in the morning is called 啟明(星), Venus seen in the evening is called 長庚(星), etc. The nuances in most of the vocabulary in a language are simply too significant to allow a bijective map to another language, even for a simple term like Venus, especially if the two languages developed from cultures which historically had very few contacts. The principle of trying to map all words of a language to specific pre-defined semantic concepts, for the purpose of classification, is methodologically problematic. Wyang (talk) 23:09, 18 September 2017 (UTC)
@Wyang: Of course. But Chinese Wikipedia has an article on the planet second-closest to the Sun. What is that named? —Justin (koavf)TCM 23:29, 18 September 2017 (UTC)
@Koavf: You can't just say "of course" and wave that off. It's similar in Hindi: अरुण (aruṇ, Uranus in astrology), युरेनस (yurenas, Uranus the planet). Some Hindi purists also use अरुण for the planet. btw the closest planet to the sun is actually Mercury. That's not the main problem with this though, the real problem is the markup will become even more opaque and hidden from the casual editor. People wonder why we don't get many new editors, it's because it takes months to learn how to use all of our templates. —Aryaman (मुझसे बात करो) 23:57, 18 September 2017 (UTC)
@Aryamanarora: I'm not suggesting that the problem is simple or trivial: I'm suggesting that it's very complicated but that this is a good first step. Do you have a better solution? —Justin (koavf)TCM 00:06, 19 September 2017 (UTC)
If I did I would have to know enough Lua to implement it. —Aryaman (मुझसे बात करो) 00:08, 19 September 2017 (UTC)
@Aryamanarora: I'm not asking if you know the technical means (God knows I don't!), just what in principle would work better. Do you have any thoughts? I'd be happy to know what would work better even in a hypothetical sense. —Justin (koavf)TCM 04:26, 19 September 2017 (UTC)
I don't understand what you mean. Wyang (talk) 23:34, 18 September 2017 (UTC)
@Wyang: zh:w:金星 corresponds to en:w:Venus, so wouldn't that be the best word to use? Also, it's not a problem to use this senseid on more than one entry. —Justin (koavf)TCM 00:06, 19 September 2017 (UTC)
Well, it is a problem to try to assign foreign words to specific semantic labels in English: e.g. Venus on Wikidata, and try to systematically generate categories based on these crude equivalents. Like I said above, the best word to use depends on context. Very rarely do words in Chinese match with senses of a word in English exactly. Wyang (talk) 00:14, 19 September 2017 (UTC)
@Wyang: Then have both words in the category. That solves the problem. If one English word is a cognate*equivalent* to two words in another language, that's okay. (Or three or vice versa, etc.) —Justin (koavf)TCM 03:13, 19 September 2017 (UTC)
They can be both put in the same category, or any category, with the current categorisation system, without having to resort to such rigid equivalence sets. The current method is also superior in usability and aesthetics. P.S. See definitions of cognate. Wyang (talk) 04:05, 19 September 2017 (UTC)
@Wyang: It is not superior in usability, because the structured approach can be exported across languages. With over 100 Wiktionaries and no fewer than 6,500 languages, using structured data to do any part of the work is far more usable. If the method of categorization that MediaWiki uses is superior, why did we ever launch Wikidata in the first place? —Justin (koavf)TCM 04:23, 19 September 2017 (UTC)
Pronunciation is templatisable, inflection is templatisable, entry layout is templatisable, but semantics is not templatisable or structurisable. Every sense of every word in a language corresponds to a semantic field or domain, and in the hypothetical 3D representation of human perception and cognition, it is a sphere in space, which spatially centres on the core, fundamental meaning of the term. What we are trying to do when we translate foreign words into English on Wiktionary is to find existing English terms with spatially close semantic areas to the source words; that's why definitions for Chinese words on Wiktionary are usually given with two or three English equivalents. Giving these multiple equivalents allows the reader to imagine the semantic area for the foreign term by superimposing the various English terms. As such, semantics across languages is not structured, and attempts to structurise it will only result in confusion and chaos. If languages were strict bijective mappings of words and grammar from one to another, machine translation would be a lot easier. Sure, it may work for water (in most languages), since the semantic fields for words for water are mostly spatially close, but it will fail for river, fluid, syrup. Wyang (talk) 04:46, 19 September 2017 (UTC)
@Wyang: So are you opposed to the notion of categorizing these terms? —Justin (koavf)TCM 16:20, 19 September 2017 (UTC)
I'm opposed to the notion of mapping senses of foreign words onto specific, pre-defined English semantic labels, and blindly achieving categorisation via those labels, as if the labels themselves were equivalent to the senses of the foreign words. Wyang (talk) 23:26, 19 September 2017 (UTC)
The Wikidata items aren't meant to encompass entire senses. They encompass referents of senses. Chinese may have different words for Venus, all with various nuances, but they all refer to the same ball of rock in space. The context in which they are used isn't relevant, that's a matter for context labels and usage notes. All that matters is that they fundamentally are different terms for that same ball of rock. So can you give concrete examples? Which terms refer to the planet and which don't? —Rua (mew) 23:45, 19 September 2017 (UTC)
They differ on a lexical level and these nuances will be reflected in their categories. The category of Category:zh:Planets of the Solar System is perfect as it is now. There is no need to dump all tens of synonyms of Venus, plus the names of all other planets in traditional Chinese astronomy into this category; these words, which are largely limited to traditional Chinese astronomy, should go into Category:zh:Planets of the Solar System in Chinese astronomy, or at least Category:zh:Stars and planets in Chinese astronomy (the reason the entry has the {{lb|zh|Chinese star}} label). One can easily adjust the categorisation in whatever way is most appropriate now. Putting an unattractive senseid next to the sense simply takes away this freedom and flexibility. Another example is the senseid at happiness, linked to Q8 on Wikidata which has 幸福 (xingfu) listed as the Chinese equivalent. This is unfortunate as xingfu is probably one of the hardest Chinese words to translate into English. Although it is typically glossed as happy; happiness, its connotations are hard to describe and not insignificant. English "I am very happy" and Chinese "我很幸福" have vastly different meanings. It would be quite silly to let the meaning conveyed by happiness blindly dictate the categories of the foreign words. Wyang (talk) 02:12, 20 September 2017 (UTC)
I oppose this, in particular the {{senseid|zh|Q313}} noise added to 太白星. Wiki markup should be free from identifier noise; it should be pleasant to edit directly. --Dan Polansky (talk) 13:53, 18 September 2017 (UTC)
It's not possible to use Wikidata without identifiers. —Rua (mew) 14:11, 18 September 2017 (UTC)
I support using Wikidata to categorize planets. One alternative idea might be using something like {{senseid|zh|Venus}} instead of {{senseid|zh|Q313}} with a data module that recognizes that "Venus" means "Q313". --Daniel Carrero (talk) 14:35, 18 September 2017 (UTC)
I strongly oppose creating a module which maps strings to Wikidata identifiers. The Lua errors are rampant enough without going down that path. - TheDaveRoss 15:02, 18 September 2017 (UTC)
I'm not a fan of it either. In any case, the senseids themselves aren't a part of this proposal. This proposal is only about modifying {{senseid}} to use them for categorising. Having Wikidata IDs on entries is beneficial even if others decide they don't want {{senseid}} to categorise. —Rua (mew) 15:08, 18 September 2017 (UTC)
Sure, I take back the idea of mapping things like "Venus" = "Q313". I prefer using "Q313" anyway, that was just an alternative idea. --Daniel Carrero (talk) 21:26, 18 September 2017 (UTC)
I strongly agree with Wyang, especially his point regarding the fact that Chinese has multiple names for Venus, each with its own connotations. --WikiTiki89 21:19, 18 September 2017 (UTC)
Is there any Chinese name for Venus that shouldn't get categorized in Category:zh:Planets of the Solar System? This is a categorization proposal, so I'd like to know how the nuances of each name affect categorization. --Daniel Carrero (talk) 21:26, 18 September 2017 (UTC)
Oppose per Dan Polansky and Wyang. —Aryaman (मुझसे बात करो) 21:22, 18 September 2017 (UTC)
See d:User:Amgine for a very old commentary on Wikidata, which aligns with Wyang's opposition. Feel free to expand if you can. - Amgine/ t·e 01:54, 21 September 2017 (UTC)

Split RfD by English/non-English as we have with RfV

I propose that we split Wiktionary:Requests for deletion into Wiktionary:Requests for deletion/English and Wiktionary:Requests for deletion/Non-English, just as we have done with Wiktionary:Requests for verification. RfD is presently over 425K, and although I can't say offhand what proportion is non-English, I would estimate it at somewhere over one third. As with RfV discussions, examination of English and non-English entries, of course, requires different skill sets, and a different set of editors are typically attracted to each kind of discussion. bd2412 T 02:04, 17 September 2017 (UTC)

When I proposed the split of RFV, I considered this as well but ultimately rejected it. The fact is that if you don't know at least a little Japanese, you just can't be of any use in gathering Japanese quotations or assessing whether they're uses. However, anyone who understands how the SOP concept works can look at a Japanese word broken into its component parts and judge it: once shown that 茶色の葉 is 茶色 (ちゃいろ) (chairo, brown colour) + の (no, possessive connector) + 葉 (ha, leaf), and that it means "brown leaf", they can see that it would be inappropriate to have a Wiktionary entry for it. That's why everyone can contribute at RFD, and why we should focus on clearing up the backlog by making judgement calls on whether a consensus has been reached rather than splitting the page. —Μετάknowledge discuss/deeds 03:58, 17 September 2017 (UTC)
Support, WT:RFD is too large already. --Daniel Carrero (talk) 21:37, 18 September 2017 (UTC)
Support --Backinstadiums (talk) 07:04, 21 September 2017 (UTC)
Support, and I also believe scriptio continua languages and some other language groups require CFI different from those for English. BTW, @Metaknowledge: Japanese idiomatic terms may get a possessive particle の, e.g. 木の葉 (konoha) or 木の葉 (kinoha). The second one looks especially like SoP ("leaf of the tree") but both terms are considered idiomatic. Languages such as Korean or Arabic, etc. (both use spaces between words) may have non-words written together, with no spaces between them, such as clitic prepositions, pronouns, etc. (Arabic) - فَقَالَ (faqāla, and (he) said) = فَ + قَالَ, غُرْفَتِي (ḡurfatī, my room) = غرفة + ي, particles and copulas (Korean) - 한국어로 (han-gugeoro, “in Korean”) = 한국어 + 로, 학생입니다 (haksaeng-imnida, “(someone) is a student”) = 학생 + 입니다. I do agree, however, that one can take part in discussions without a thorough knowledge of a given language, but one has to learn fast and listen to the arguments of native speakers or advanced learners. --Anatoli T. (обсудить/вклад) 07:44, 21 September 2017 (UTC)
Support. — Ungoliant (falai) 12:52, 21 September 2017 (UTC)
Support. --Canonicalization (talk) 13:19, 21 September 2017 (UTC)

Upcoming Wiki Science Competition

Did you hear about the Wiki Science competition, starting in November?

The competition will focus on images, but it might evolve in the near future, so users of other content platforms should take a look at it.

I've informed the village pump on Commons, since there will be an intense flow of technical uploads by newbies that will require better categorization and translation of descriptions here and there. More importantly, the images can be used for articles on specific platforms. I am thinking of some of your users who have created and take care of many technical and scientific entries and are still active, such as User:SemperBlotto.

Here are some details.

In 2015, limited to Europe, we got thousands of entries; we can expect two or three times more this year. In Italy, for example, we will send emails to many professional mailing lists, and other national Wikimedia chapters will also use their social media to inform the public.

Together with Ivo Kruusamägi of WM Estonia, we have finished preparing some of the juries. Besides people with a strong scientific background, I did my best to gather some expert Wikipedians here and there (because I asked on Wikipedia first) to take a look at the files on Commons and not just at the quality of the images. I have also informed users on English Wikipedia and English Wikiversity, and will do the same on some other Wikimedia platforms in the following weeks.

The final international jury is made up of expert researchers, usually with an interest in photography but no strong knowledge of the details of any Wikimedia platform. The main goal was to enlarge the network of "friends" of Wikimedia platforms. Some national juries should probably have enough expert Wikimedians and Wikipedians, I guess because an active national chapter was involved in setting them up, so someone might take care of some of the uploads, at least by improving some descriptions and/or using them directly, and sometimes by suggesting technical entries to be created too.

More generally, gathering users besides Wikipedians will probably help us include more platforms in the competitions.

Now that I am sure that we have enough "scientists" here and there and from different fields, maybe we can see whether we can also gather expert Wikimedia users specifically, whatever their background: for example, ordinary teachers rather than researchers, who can evaluate the quality of the images for more specific uses.

For the countries without juries, there is the possibility of creating a second-level jury to select images from the rest of the world for the experts of the final jury. For such a second-level jury I have found some names, but the number of entries could be really high, so maybe that's where we can look for more standard Wikimedia users.

If you are a citizen of a country with a national jury, you could also join it directly (rumor has it that more will appear). In many cases I don't know the details, such as whether they need more jurors or are fine as they are.

Anyone interested?--Alexmar983 (talk) 05:59, 18 September 2017 (UTC)

I am not interested in being part of a jury, but I thought it was very interesting that you knocked here. Pictures taken with Wikipedia's uses in mind are quite different from pictures usable in Wiktionary. Here we also need to illustrate verbs and the actions performed with tools (not only the tool itself), and more. I'll be very enthusiastic to integrate pictures from this competition into Wiktionaries if they fit our needs! Noé 07:36, 19 September 2017 (UTC)
With thousands of uploads, statistically some could fit some needs here too... Noé, I am happy if more people take a look; this should give better feedback for the future, when the competition will be bigger and we can make it more specific to the needs of some wiki platforms, for example with edit-a-thons. In the meantime, I have found another juror on frwikipedia, so I am close to finalizing the second-level jury. I am "sad" no one has replied from Wikiversity yet.--Alexmar983 (talk) 12:56, 19 September 2017 (UTC)
Cool. Maybe you can try to ping the French Wikiversity, if some French Wikipedians can assist you. Noé 13:24, 19 September 2017 (UTC)

Wiktionary User Group


The Tremendous Wiktionary User Group is a coalition of users of Wiktionaries that aims to create a common platform for sharing ideas and documents. It is also a way to lobby the Wikimedia Foundation to acknowledge the needs of our projects in terms of technical improvements.

This User Group is completing a revolution: its first year of existence! We are writing our first Annual Report (due September 26th). It's time to look at what was done during the year and to frame the future lines of action. There are 42 affiliates now, but the group can include many more people. I invite you to read our work and see whether you want to participate in our actions. The most visible one is LexiSession, but there is much more to do, including promotional material (leaflets, banners, stickers, etc.), inter-Wiktionary collaborations (on templates, Wikidata, policies and guidelines) and meet-ups! There are no fees or admission processes; it's open to everyone who likes Wiktionary and wants to do more for the project. Your ideas and initiatives are welcome!

Thank you for your attention. I hope to see you soon. Noé 08:14, 19 September 2017 (UTC)

Help review PulauKakatua19 (talk • contribs)'s entries

This user is editing in way too many languages for them to possibly understand all of them. I have checked and fixed all of the recent edits in Hindi, Bengali, and Sanskrit, but someone acquainted with Indonesian, Malay, and now Korean should check the rest. Atitarev (talk • contribs) warned them on their talk page about Russian a while ago too. —Aryaman (मुझसे बात करो) 01:00, 21 September 2017 (UTC)

I will check their Chinese, Korean and Malay ones. Wyang (talk) 01:11, 21 September 2017 (UTC)

Gfarnab (talk • contribs) back at it again

e.g. A recent error at . Someone please block them. Wyang (talk) 01:13, 21 September 2017 (UTC)