Wiktionary:Beer parlour/2017/August

travel game

Would travel game (a board game or card game that was modified to be playable by passengers during a trip) be considered a SoP? W3ird N3rd (talk) 05:34, 1 August 2017 (UTC)

Looks good to me. I'd class I spy and the number plate game as my favourite travel games from when I was a kid in a car. --WF on Holiday (talk) 23:04, 1 August 2017 (UTC)
That's actually not even the definition I meant. I meant games like chess or Ludo that have been modified (e.g. with magnetic game pieces) to be played in a car or on a train. Amazon link to clarify. W3ird N3rd (talk) 00:10, 2 August 2017 (UTC)
I thought about it and the definition I was originally thinking of (and yours as well) is SoP after all because there are other "travel" things. But travel didn't have an adjective section yet. It does now.
  1. (in a compound) An object or activity that has been designed or reworked for use while travelling.
    (object) I've packed the chess travel game in my travel bag and I've got my travel cup in the cupholder, I'm ready to go!
    (activity) Let's play a travel game. I spy with my little eye..

W3ird N3rd (talk) 04:26, 2 August 2017 (UTC)

Aaaaand it's gone. @SemperBlotto, is there a reason you just chucked the whole thing instead of moving it into an additional definition for the noun? I had looked at running (like "running man") and noticed it had an adjective section, but on closer inspection that is used for other meanings of running.. I think. I'm not even sure. I can't entirely explain why running is an adjective in all meanings mentioned but travel isn't. W3ird N3rd (talk) 06:28, 2 August 2017 (UTC)
(@SemperBlotto. —suzukaze (tc) 06:45, 2 August 2017 (UTC))
Thanks, I'm still learning how these things work. I looked it up: https://en.wikipedia.org/wiki/Attributive_verb. So it appears travel acts as a deverbal adjective. So it seems either SemperBlotto is wrong or somebody needs to remove the adjective section from exciting or I may be losing my marbles. W3ird N3rd (talk) 06:57, 2 August 2017 (UTC)
I dunno, travel in travel game sounds like the noun travel to me: a game used during travel. It's weird to try to think of it as a verb. — Eru·tuon 07:07, 2 August 2017 (UTC)
Right, so it's https://en.wikipedia.org/wiki/Noun_adjunct. So "travel game" can't be added because I suspect it's SoP yet the information in travel and game don't really allow one to figure out what a "travel game" would be. And this information can't be added to travel either. Okay, my marbles are definitively gone. W3ird N3rd (talk) 07:28, 2 August 2017 (UTC)
Were I looking at this naively, it would be ambiguous to me whether this meant "a game suited for travel" or "a game related to travel" or "the travel industry" or "one of a genre of games somehow related to some definition of travel", or ..... I don't think that dictionaries should act as if they are well suited to hold users' hands as they try to figure out what a phrase or sentence or larger unit of language means, unless there is true novelty or obscurity worse than what I have advanced as my own naive view of alternative meaning. DCDuring (talk) 19:41, 2 August 2017 (UTC)
Well, it's not a phrase, it's a compound noun. I think from what you're saying, it's (for a naive reader) not transparent. — Eru·tuon 19:57, 2 August 2017 (UTC)
Gee, lots of people would call it a noun phrase or NP. Do we have a policy about which school of labels we follow? DCDuring (talk) 21:35, 2 August 2017 (UTC)
Not that I'm aware of. The criterion of spacing bothers me because it means that if you happen to add spaces between the parts of a compound, then it suddenly changes to a phrase. So honeybee is a compound, while honey bee is a phrase. Utterly arbitrary. There has to be a more solid criterion than spelling. — Eru·tuon 22:02, 2 August 2017 (UTC)
If a multi-word expression is attestably spelled solid, we have decided that is sufficient evidence to say that phrase, usually a bare NP, is includable. That criterion is intended to shortcut our repetitive, amateurish arguments about including such terms. DCDuring (talk) 22:29, 2 August 2017 (UTC)
Huh. I was talking about criteria for whether something is a compound noun, not CFI. — Eru·tuon 22:36, 2 August 2017 (UTC)
I think that, in practice, we try to avoid academic discussions with only indirect application to Wiktionary. It seems to me a good practice. DCDuring (talk) 23:46, 2 August 2017 (UTC)
You're probably right. I'm quite annoyed by compounds being called phrases, but it isn't particularly useful to discuss. Back to the content of your post, you recognize potential ambiguity with travel game but still don't think it should be included. I find that baffling, given that English Wiktionary is used by lots of people who don't speak English well. I would imagine that at least some of them would misunderstand travel game in the ways you mention. — Eru·tuon 01:59, 3 August 2017 (UTC)
@Erutuon: To me those ambiguities are typical of those that arise in interpreting any NP/compound noun that one hasn't heard before. In normal speech, the context shows one definition to be the most relevant of all of the ones that are possible from the definitions of the component terms. I consider the situation to be illustrative of why we focus on transparency of meaning in the context in which a term is used, given the definitions of the component terms. DCDuring (talk) 07:48, 3 August 2017 (UTC)
It really, really, REALLY wouldn't be the first time I turn to Wiktionary (or any other dictionary) to look up a word that I have no proper context for. For example when something like this happens in a TV show:
So what do you hate most?
-Any travel game.
Why do you hate that so much?
-I JUST DO NOW DROP IT ALRIGHT?
And then the show continues. Maybe it's a running gag. Maybe it's a reference to something in a previous episode that I missed. Maybe it refers to some character trait. Maybe it refers to some event or tradition that I'm not aware of, like some scandal in the country where the show was made. Maybe it's just plain random.
Alternatively, some word will pop up in my head randomly but I can't remember the context I heard it in. Out of curiosity I try looking it up. What it comes down to is as simple as this: Wiktionary is useless to look up any ambiguous SoP so I'll be forced to go elsewhere. If that's your goal I'd say mission accomplished. W3ird N3rd (talk) 23:47, 3 August 2017 (UTC)
Why wouldn't it be arbitrary? Why should you be able to draw a clean line between a compound and a noun phrase? There's no clear line between a "canoe truck", a "turnip truck" and a "fire truck". Certainly, though, English words spelled without spaces are more likely to be organic unions with a unique meaning, whereas noun phrases are more likely to be spelled with spaces and have meanings obvious from the individual words.--Prosfilaes (talk) 23:24, 2 August 2017 (UTC)
I dunno, it seems axiomatic that syntactic categories (word, phrase) should be based on something other than spelling, such as syntactic behavior. If they coincide with spelling, great. Honey bee behaves no differently from honeybee, so it is in the same syntactic category. Maybe there are spaced-out compounds that could with more justification be called phrases. I agree, though, that there is something determining whether a compound can be written with spaces: if it would be too long as a single word, or its meaning is obvious from its constituent parts. At some point on the continuum of each characteristic, it's acceptable to write a word either way. But I don't think either characteristic has anything to do with syntactic category (word or phrase) either. — Eru·tuon 01:59, 3 August 2017 (UTC)
Given my background in computer science, it seems axiomatic that you lex before you parse, and that you have to figure out what a word is before you start figuring out what stuff means. That's sometimes not possible in computer or human languages, and pauses in audio would be more reliable than spaces in text, but things should be broken into words ideally before we get into syntax.--Prosfilaes (talk) 03:20, 3 August 2017 (UTC)
@Prosfilaes: I don't know anything about computer science or quite what lex and parse mean, but what I mean by syntactic category is word, phrase, clause, or noun, verb, adjective, etc. So which things are words is connected to syntax. Anyway, from what programming I've done (mostly on Wiktionary), programming languages are far more tightly constrained and more straightforward to analyze (if not figure out what their actual purpose is) than human languages, so I don't know how much of the process is similar to analyzing the lexical or syntactic categories of human words. — Eru·tuon 21:22, 4 August 2017 (UTC)
The basic ideas of lexing and parsing used in computer languages were designed by Chomsky for use in human linguistics. The point is, we can't talk about nouns and adjectives before we figure out what words are. In both human and computer languages, you lex (split text into words and specific punctuation marks) and then you parse, and occasionally you're forced to go back and relex the text in light of the parsing. But in both cases, you do the vast majority of breaking stuff into words before you start trying to figure out the meaning. There's a reason why spaces and verbal pauses exist in languages; it's to make it easy to clearly split things into words.--Prosfilaes (talk) 23:04, 4 August 2017 (UTC)
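As a rough illustration of the lexing step described above, here is a minimal Python sketch. It assumes a very simple notion of "word" (runs of ASCII letters, optionally joined by an apostrophe or hyphen); the pattern and the function name lex are made up for this example and are not anything Wiktionary uses.

```python
import re

# A "word" is a run of letters, optionally joined by internal apostrophes
# or hyphens; any other non-space character becomes its own token.
# (Assumed, simplified definition for illustration only.)
TOKEN = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*|[^\sA-Za-z]")

def lex(text):
    """Split raw text into word and punctuation tokens, before any parsing."""
    return TOKEN.findall(text)

print(lex("honeybee"))
print(lex("honey bee"))
print(lex("Let's play a travel game."))
```

A lexer like this sees honeybee as one token and honey bee as two purely because of the space, which is the spelling dependence being argued about here; deciding whether the two tokens form one unit is left to a later step.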
Well, it seems my use of the name syntactic category for word and phrase got you on the tangent of lexing before parsing. I don't know, maybe syntactic category isn't the right term. I have no idea. And I don't see how lexing before parsing relates to whether compounds are words or phrases. — Eru·tuon 23:22, 4 August 2017 (UTC)
I would imagine (but I can't speak for Prosfilaes) that if "travel game" is a word in your dictionary, you can just look it up and you know what it means. If it's not in your dictionary, you will assume it's just two words and you look up travel and game. From that, some systems (like Google translate, Babel Fish, etc) could probably end up being fooled into assuming this is roadkill or a really annoying basketball game. W3ird N3rd (talk) 00:01, 5 August 2017 (UTC)
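The lookup behaviour imagined above can be sketched as a greedy longest-match search. This is only a sketch under assumptions: the mini-dictionary, its glosses and the function look_up are hypothetical, and real translation systems are certainly more involved.

```python
# Hypothetical mini-dictionary; glosses paraphrase senses mentioned in this thread.
ENTRIES = {
    "travel": "the act of travelling",
    "game": "a playful activity; also wild animals hunted for food",
}

def look_up(words, entries, max_len=3):
    """Greedy longest-match lookup: prefer a multi-word entry,
    otherwise fall back to looking up each word on its own."""
    result, i = [], 0
    while i < len(words):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in entries or n == 1:
                # Unknown single words are kept with a gloss of None.
                result.append((candidate, entries.get(candidate)))
                i += n
                break
    return result

# Without a "travel game" entry, the reader is left to combine two glosses.
print(look_up(["travel", "game"], ENTRIES))

# With the entry present, the compound is found as a single unit.
with_compound = {**ENTRIES, "travel game": "a game designed or adapted for playing on a journey"}
print(look_up(["travel", "game"], with_compound))
```

In the first call the caller gets two separate glosses and has to guess how they combine, which is where the roadkill reading could creep in; in the second the compound's own definition is returned.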
DCDuring, I hadn't even thought of those interpretations yet. Thinking about that, I realized game also means wild animals hunted for food. It would depend heavily on context whether a non-native speaker could actually make that mistake, but I think it would be funny as hell. In the text "We were very hungry because we didn't pack enough food. But at least while on this trip, we enjoyed some travel game." the "travel game" could actually be interpreted as roadkill. Bon appétit! I'm hoping Wiktionary:Beer_parlour/2017/August#Allow_more_SoP_compounds.2C_similar_to_Dutch_and_German. or another rule change based on that will fix this in the future, but I don't think I'm going to hold my breath. W3ird N3rd (talk) 02:12, 3 August 2017 (UTC)
The MWE is also a synonym of away game. DCDuring (talk) 01:08, 5 August 2017 (UTC)

order Arabic disambiguating entries orthographically, not by verbal forms

Currently Arabic disambiguating entries are ordered by verbal forms instead of orthographically, which is not the optimal lexicographical approach. Thus, for ease of reference, يُوجدُ should appear just once in the page for يوجد, specifying it could belong to either verbal form I or verbal form IV. --Backinstadiums (talk) 08:47, 1 August 2017 (UTC)

It should be as easy as modifying a line of code --Backinstadiums (talk) 12:39, 4 August 2017 (UTC)

@Backinstadiums: Huh? What line of code? — Eru·tuon 18:14, 4 August 2017 (UTC)
@Erutuon: I mean it cannot be that much of a fuss, just a different grouping in a specific case. If anything should be clarified further, please let me know. --Backinstadiums (talk) 20:57, 4 August 2017 (UTC)
@Backinstadiums: To do this, the template {{ar-verb-form}} would have to no longer display the form number and many entries would have to be edited (there are 35,188 entries in Arabic verb forms, some of which will contain homophonous verbs with different Form numbers). The editing part would be a lot of work, and would probably have to be done by bot, as the entries were in large part created by bot. I'm agnostic on whether the change would be helpful or consistent with Wiktionary organizational principles, and no one else has responded: @Atitarev, Wikitiki89, Benwing2? — Eru·tuon 22:22, 4 August 2017 (UTC)

As with any issue, no matter how much has already been done, if it does not accord with the optimal lexicographical approach, which offers ease of reference and better usability, action should be taken as soon as possible so as not to worsen the resource further --Backinstadiums (talk) 15:39, 10 August 2017 (UTC)
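The regrouping requested in this section can be sketched in a few lines of Python. This is only an illustration of the data transformation, assuming each inflected form is stored as a (vocalized spelling, verb form number) pair; the record layout and the function group_by_spelling are hypothetical and say nothing about how {{ar-verb-form}} or the bot-created entries actually work.

```python
from collections import defaultdict

# Illustrative records: (vocalized spelling, verb form number).
# Only the single example from the discussion is used; the layout is assumed.
records = [
    ("يُوجدُ", "I"),
    ("يُوجدُ", "IV"),
]

def group_by_spelling(records):
    """Merge records that share a vocalized spelling,
    collecting the verb form numbers each spelling can belong to."""
    grouped = defaultdict(list)
    for spelling, form in records:
        if form not in grouped[spelling]:
            grouped[spelling].append(form)
    return dict(grouped)

grouped = group_by_spelling(records)
print(len(grouped))            # number of distinct spellings
print(list(grouped.values()))  # form numbers listed per spelling
```

The grouping itself is the easy part; as noted above, the real work would be editing the many existing entries, most likely by bot.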

August LexiSession: circus

Let's go to the circus!

The monthly suggested collective task is to collect words about the circus. I've noticed that Wikisaurus:circus does not exist, and auguste is a kind of clown, so this is a great opportunity to look around this topic together!

Let's stop clowning around and juggle some ideas together!

By the way, Lexisession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you do something this month, please let us know here or on Meta, to let people know that English Wiktionarians are doing something on this topic. I hope there will be some people interested in making some contributions! Noé 13:43, 1 August 2017 (UTC)

Here's a good start - to be added to Category:en:Circus if appropriate

Circus and sideshow attractions

Maybe this is me being slightly grumpy because some people in another discussion I started don't seem to entirely grasp what I was suggesting, but aren't some of these SoP?
I personally don't have a problem with any of these and luckily I'm not a SoP nazi, but after a few RfDs this project could end up with more red links than it had when it started. W3ird N3rd (talk) 03:30, 4 August 2017 (UTC)
  • Some of these seem lame and/or SoP. On the other hand sources such as Carny Lingo show that there is a large vocabulary of great charm and linguistic interest. I doubt that we will get very far into that highly desirable content this month, but extracting a list of terms from that and similar sources would be useful for Wiktionary, IMO. I'm not at all sure that the terms fit well into the categories suggested, many better assigned to a category based on usage context, eg, Category:English circus slang or similar. Examples, barnstorm, blow a tip, blow one's pipes, build a tip, burn the lot, carry the banner, clean the Midway, cool out, bail the counter, bat away. Unfortunately, I don't know that we have a good system of such categories, instead duplicating encyclopedic-type "topical" categories. DCDuring (talk) 08:29, 4 August 2017 (UTC)
    I suppose much of this would fit in Category:English circus slang. I hope that {{lb|en|circus slang}} or {{lb|en|circus|slang}} would work. DCDuring (talk) 08:36, 4 August 2017 (UTC)
    My hopes are in vain. I hope someone can rectify the operation of {{lb}} so anyone interested can help us play along with this cross-project effort. Note that there is a considerable overlap between criminal slang and circus slang. DCDuring (talk) 08:48, 4 August 2017 (UTC)

Next steps for Wikidata access

Hello all,

Thanks to @Daniel Carrero there's now a page to centralize all the discussions and information related to accessing Wikidata data from English Wiktionary. I hope we can improve it soon with examples and documentation :)

We also suggest an enabling date for the arbitrary access: September 7th. If you have any question or concern, feel free to ask. Thanks to the people who worked on this! Lea Lacroix (WMDE) (talk) 13:52, 1 August 2017 (UTC)

Thank you. September 7th looks good to me. --Daniel Carrero (talk) 03:18, 2 August 2017 (UTC)

Best practices for Oxford -ise/-ize variants

I just made Birminghamize. What should be put at Birminghamise? —Justin (koavf)TCM 00:58, 2 August 2017 (UTC)

It is ridiculous that Wiktionary lacks a basic policy on how to handle these variant English spellings: afraid of offending others / nobody willing to take charge / deeming status quo as good enough / etc., ... so there we go - both color and colour can evolve in parallel. Wyang (talk) 06:05, 2 August 2017 (UTC)
The problem is that someone who dares to set a standard will likely get into an edit war. So nobody touches it with a pole. —CodeCat 19:49, 2 August 2017 (UTC)
This would be a perfect application of Wikidata. —Justin (koavf)TCM 23:56, 2 August 2017 (UTC)
I have never heard of Birminghamise, so unless you can find it being used there is no point in making an entry. But normally -ise verbs are labelled "British spelling" so they can appear in Category:British English forms. DonnanZ (talk) 23:45, 5 August 2017 (UTC)

Etymology giving me problems

Can someone review:

for the etymologies that I've added? All of these words are directly taken from Spanish but I've clearly not made them all correctly formatted. Also, I'm not sure if there's a different way of noting a language which is a creole based on [x] versus a language which simply adopts one word from [x]. (E.g. the difference between a Haitian Kreyol word derived from French versus using "facade" in contemporary English). Thanks. —Justin (koavf)TCM 02:24, 2 August 2017 (UTC)

The relation between a creole word and its etymon from the lexifier doesn’t fit very well with the inherited/borrowed dichotomy. We should consider adding templates for other special kinds of derivation like this and substrate “borrowings”, semi-learned borrowings, etc. — Ungoliant (falai) 02:45, 2 August 2017 (UTC)
It's good to see someone else basically ratify that. For creoles/pidgins, it's really a different matter than to go from stages of a language (Old English → Middle English → Modern Englishes) or inheritance in a family (Proto-Germanic → English). —Justin (koavf)TCM 04:43, 2 August 2017 (UTC)

French Wiktionary monthly news - Actualités

Hello!

I am happy to inform you that the 28th issue of Wiktionary Actualités just came out in English!

As usual, Actualités is in English but talks about French Wiktionary and lexicography in general.

In this edition, the main articles are: a presentation of the Lingua Libre project to record words, a summary of a strange dictionary and a thought about lemmas and grammatical categories. And more: shorts, statistics (including new ones like the number of pages that include a link to a thesaurus) and an explanation about the Linter.

As usual, it is translated into English by non-native speakers, so it is not perfect, but it can be improved by readers (wiki-spirit as usual). Please note that we do not receive any money for this publication and we are not supported by any user group or chapter. It is only written by the community. Feel free to leave us comments! Noé 09:09, 2 August 2017 (UTC)

Allow more SoP compounds, similar to Dutch and German.

So there was a discussion last month about deleting SoP compounds in German and Dutch. Now triggered by "travel game", perhaps we could explore pros and cons for the opposite. That is, allowing English SoP compounds in ways similar to the way they would be allowed in German and Dutch.
So exactly what does that mean? Put simply, if some SoP would pass an RfV and is not using any common/universal word (like "brown" or "fan") it would be allowed. This means you still can't create brown leaf or large box, but you could create burger joint and sheep farmer. Also computer chip and lab rat, those already exist but I'm not sure how they could be justified by the current rules. Optionally you could exclude any SoP with a space that is unambiguous. (like sheep farmer)

  • Pro: while it may be possible to figure out the meaning of a SoP by looking up the parts, it's not always easy. The parts may have more than one possible meaning so you need to figure out the correct meaning for all the parts.
  • Pro: in the case of "travel game", travel and game don't really make it clear what a travel game is. Travel game is probably SoP and adding the attributive noun use to travel just results in an instant revert. So basically it's impossible to describe a "travel game" on wiktionary.. in English. I could, however, describe the Dutch word reisspel.
  • Pro: fietshater (bike hater) would pass RfV and should be allowed by the current rules. It wouldn't be allowed with these new rules because hater is universal and can apply to thousands of nouns and verbs.
  • (added August 4) Pro: translations. How would you translate juice extractor (SoP) to Dutch? Juice is sap, but how to translate extractor? The correct answer is sapcentrifuge, but would you have ever guessed it? Wiktionary is useless in this case, and this example wasn't even that ambiguous.
  • (added August 5) Pro: We can delete ex-pilot. (Wiktionary:Requests_for_deletion#ex-pilot).
  • Con: there will be more entries on wiktionary.

I'm not taking a stance on this myself yet, I just think it's worth thinking about. I may not be seeing the whole picture. I haven't made up my mind yet and I think it's a good idea. I wonder what you think. W3ird N3rd (talk) 09:18, 2 August 2017 (UTC)

Sorry W3ird N3rd, strong oppose on that. As I mentioned in the discussion you're referring to, we need to have some sort of quality control around here – having more entries on Wiktionary doesn't necessarily boost our credibility if said entries are redundant. --Robbie SWE (talk) 09:57, 2 August 2017 (UTC)
There would still be a form of quality control. RfV requirements still apply and common words are not allowed. Optionally you could add that if the SoP is fully transparent (like sheep farmer) it is still not allowed, while allowing burger joint and travel game. You talk about quality control, but you actually don't have that control right now as I could create fietshater (bike hater) and you probably couldn't do a thing about it. W3ird N3rd (talk) 10:47, 2 August 2017 (UTC)
I support allowing all attestable English compound words, and no longer making it spelling-dependent. Consequently, WT:COALMINE would be superfluous, as coal mine would no longer depend on the attestability of coalmine for inclusion. —CodeCat 10:53, 2 August 2017 (UTC)


Perhaps we could deal with SOP compounds differently than with other lemmas, effectively soft redirecting them to their constituents while keeping them for consistency, maybe like
(literally) A mine from which coal is dug
or
(literally) An exhibition (tentoonstelling) of Khoekhoe (Hottentot) tents (tent)
While being subject to usual attestation rules and linked from translation tables as hottentottententententoonstelling f where applicable (of course we won't have a page for Khoekhoe tent exhibition to link translations from).
It could get messy with languages with transliteration though. Crom daba (talk) 13:04, 2 August 2017 (UTC)
Why don't we include any rubbish as terms and forget about CFI? Who cares about this dictionary and its reputation, anyway? --Anatoli T. (обсудить/вклад) 13:33, 2 August 2017 (UTC)
We already host all sorts of rubbish, my approach would make it more manageable and invisible in most use cases. Crom daba (talk) 14:00, 2 August 2017 (UTC)
I don't see the connection between being more inclusive of compounds and quality. The more useful lexicographical content we can provide, the better. RFV provides good quality control, alongside making sure our entries are clean and properly formatted. —CodeCat 14:33, 2 August 2017 (UTC)
@Atitarev and @Crom daba, please be aware that hottentottententententoonstelling is a terrible example that isn't even related to this discussion because it is a joke word and tongue-twister. This word is not, will not and has never been used to refer to any kind of actual exposition. I used it in the other discussion to demonstrate how hard it can be to break down Dutch compound words, but hottentottententententoonstelling isn't SoP. W3ird N3rd (talk) 14:42, 2 August 2017 (UTC)
Yes, I've read the entry, I'm merely using it to show how non-idiomatic words could be formatted. Crom daba (talk) 14:49, 2 August 2017 (UTC)
You may know that, but I think Atitarev possibly doesn't and now thinks Wiktionary will be filled with thousands of rubbish words like that. W3ird N3rd (talk) 14:54, 2 August 2017 (UTC)
oppose. There's no need for most multi-word English terms in English, and nobody will look them up.--Prosfilaes (talk) 23:02, 2 August 2017 (UTC)
Wanna bet? How can you be so sure? Nobody has any idea what passive users search for. DonnanZ (talk) 14:16, 3 August 2017 (UTC)
Also, people do look them up. Pageviews for lab rat are similar to minibar. In addition, how can you say nobody will look something up when the thing in question doesn't (or isn't allowed to) exist? And another thing: translations. We can't have a juice extractor because it's SoP. So now translate juice extractor into Dutch. Good luck with that. You will correctly find sap for juice but how are you supposed to translate extractor? Here's the answer: a juice extractor in Dutch is a sapcentrifuge. Which you could have found if you had looked up juicer (it just so happens a single-word synonym exists here, this is not always the case), but you won't find that if you're looking for a juice extractor, which is the term I'm most familiar with. The very fact is that I had to look up this example on Wikipedia: w:Juice extractor which helped me find juicer. And it's just sheer luck that a juice extractor happens to be encyclopedia-worthy. W3ird N3rd (talk) 01:10, 4 August 2017 (UTC)
This is true. Professional translators seldom need ordinary dictionaries (such as collegiate dictionaries), we want dictionaries that are mainly multi-word, such as the French-English Dictionary of Petroleum Technology. Multi-word dictionaries are the gold-standard and they command high prices. My Dictionary of Petroleum Technology cost me $115 in 1980. In my translating company, we virtually never used any of the ordinary dictionaries (such as Websters, OED, Random House, American Heritage), we only purchased and used the very expensive multi-word dictionaries. Even now that I'm retired, I never use the simple word dictionaries. Almost all the terms I ever have to look up are multi-word terms, and Wiktionary does not handle those. Translators have to equip themselves with a pile of very expensive dictionaries, and all of them multi-word. —Stephen (Talk) 03:46, 4 August 2017 (UTC)
  • I find the juice extractor argument convincing. So far our CFI mainly cover "does anyone want to look it up?" and less "might anyone want to translate it?" Been on a treasure hunt within Wiktionary for translations myself in the past. Korn [kʰũːɘ̃n] (talk) 14:33, 4 August 2017 (UTC)

General question: what does SoP compound even mean?

Compounds have a continuum of transparency of meaning, but they generally do not have a single possible meaning. If they are formed from two nouns (as for instance travel game), there are several possibilities. I'm somewhat rusty on them, but I gather that travel game is a tatpurusha, where travel is added to game to signify a particular type of game, and travel has the meaning of a particular prepositional phrase (or in Sanskrit a grammatical case). Putting aside the other forms of compound, the relationship of travel to game is unknown when you're newly encountering the word. The actual relationship, in terms of grammatical cases, is locative: "a game played during travel". But there are other possible interpretations, such as "a game consisting of travel" (like, I dunno, a long-range treasure hunt?). Meh, it's not a very good example, or I'm not very good at brainstorming about possible meanings.

There are compounds that might be clearer: for instance, bike-hater. A noun combined with hater is often the object of hater (the thing that is hated). But even there, theoretically it could mean "a hater who is on a bike".

So I don't think compounds can be SoP in the same way that regular phrases or sentences are, like "some people hate bikers". There isn't one predictable semantic relationship between the elements of a compound the way there is with phrases. In the previous case, some people is the subject of hate, and bikers is the direct object of hate: that's the only way it can go down.

So what is a SoP compound? I have no idea. I think it should be well-defined for it to serve as a CFI. — Eru·tuon 21:15, 4 August 2017 (UTC)

  • "We were very hungry because we didn't pack enough food. But at least while on this trip, we enjoyed some travel game." roadkill!
  • travel game: A game to play on a journey (like I spy or punch buggy)
  • a physical game, like chess, designed for use on a journey (magnetic pieces etc)
  • geocaching / geohashing
  • travel business (Doug Parker is a big name in the travel game)
  • Something that resembles a game with rules, despite not being designed: in the travel game, being held up for security checks is becoming less of a drag and more of a routine nowadays
  • The ability to seduce someone, usually by strategy:
Watch him. He's got a great travel game.
-He's got a WHAT?
Travel game. Basically he just takes any chick he picks up to Paris. Guaranteed success.
  • The travel game that is used by airlines where they offer cheap tickets but charge extra for additional luggage, meals, toilet visits and use of the oxygen mask is really disgusting.
  • (basketball) I'm getting tired of these travel games. They just travel for most of the game time. It's not funny anymore.
  • (childbirth) There's nothing fun about the travel game, but all that is forgotten when the mother is holding her newborn baby.
Will this suffice? ;-) W3ird N3rd (talk) 23:45, 4 August 2017 (UTC)
Then again, all of these sound valid to me, which could be an argument that compounds really are a sum of parts, or rather a product. Crom daba (talk) 16:03, 5 August 2017 (UTC)
That's the trick. They may sound valid, but most of them are completely invalid.
Unlikely to pass RfV:
  • roadkill
  • a game of basketball with lots of travelling
  • the ability to seduce someone
  • game that involves travelling
  • a questionable or unethical practice
  • childbirth
Using a universal part:
  • travel business (possibly won't pass RfV either)
  • something that resembles a game with rules, despite not being designed (possibly won't pass RfV either)
Valid:
  • a game to play on a journey
  • a physical game, like chess, designed for use on a journey
So by the proposed guideline, only the last two definitions would be included. But that's not final, you could argue about exactly what should and should not be included. For example, you could argue that if a valid entry for travel game already exists, it's acceptable to add travel business (if that would pass RfV) while at the same time not allowing an entry to be created solely for travel business. W3ird N3rd (talk) 16:56, 5 August 2017 (UTC)

What about if I would word it like this:

  • SoP compounds with an irregular translation in another language are allowed. (or at least their translation section would be) This will allow juice extractor because of sapcentrifuge.
  • SoP compounds with a space or hyphen that have no irregular translations in another language, have only one meaning and this meaning can be reasonably obtained by looking up the first definition of the separate words are not allowed. This would possibly cover sheep farmer assuming there are no irregular translations in another language.
  • SoP compounds using parts that can be universally applied (the parts are not related in any way) are not allowed, unless they are idiomatic. This excludes "brown leaf", "large box", "luxury boat" and ex-pilot but allows more cowbell. (w:More Cowbell)
  • (added August 6) Compounds without a space or hyphen that have only one non-universal part (like sockless) are only allowed if their usage is vast - far beyond the current three-independent-durably-backed-up-sources rule. Common words like hopeless or pointless should be kept, but exactly how much sense does an entry for boatless make?
  • Any entry still needs to be able to pass an RfV.

Maybe this is more clear? W3ird N3rd (talk) 16:56, 5 August 2017 (UTC)

I believe we should have some rules similar to WT:COALMINE in order to avoid unproductive RFD discussions and give contributors a chance to predict if their new SOP entries will pass RFD. I guess many editors don't feel like spending time on creating entries that later on might get deleted. This will slow down the growth rate of compound term entries, which IMHO are necessary for a usable multilingual dictionary. I don't believe that a vote to allow all attested terms currently has any chance to pass. In the past some such rules have been proposed:
  • Including all terms with lesser common single-word synonyms.
  • The lemmings principle which would grant inclusion if a term is covered by a list of trusted dictionaries (which still have to be specified).
  • We already have some translations-only entries (Category:English non-idiomatic translation targets), however there is as yet no rule to prevent their deletion. We probably want to keep them if they have idiomatic translations for a number of languages.

Matthias Buchmeier (talk) 23:32, 5 August 2017 (UTC)

From what I understand, I would have to create a vote at Wiktionary:Votes. I would probably need some help to have any chance of getting that right. You say such a vote would have no chance to pass, but if there's anything I learned from politics it's this:
  • If you want something to pass, bring it up for voting when everybody who is against it is on vacation.
  • If you want something to pass, just attach it to another bill that is being voted on that will pass. (not possible on Wiktionary)
  • If neither of those is feasible but at least a third of eligible voters are in favor, just bring it up for voting again and again and again and again. Sooner or later it'll pass because either those who are against it missed the vote, those who are against it don't vote because they figure it'll never pass anyway (that's one of the reasons Trump was able to win) or some current event or hype changes what people think and the vote passes.
There are more strategies, but these are the big ones. Once it has passed, it'll be virtually impossible to take it off the books again. W3ird N3rd (talk) 03:38, 6 August 2017 (UTC)

I just came across the following: towelless, fishless, bikeless, streetless, boxless, fireless, woodless, barless, magazineless, goldless, bronzeless, schoolless, cardless, mapless, pantless, sockless, appleless, watchless, morningless, kingless, bossless, condomless, monitorless... (this goes on endlessly) Next time you see somebody saying Dutch or German needs to be treated differently on Wiktionary, slap them in the face with this list. W3ird N3rd (talk) 07:00, 6 August 2017 (UTC)

The problem is that a lot of contributors have the justified fear that allowing all attested multiword compounds would flood the database with low quality entries. I believe that the best way to overcome this problem would be some set of well-designed inclusion rules. Matthias Buchmeier (talk) 17:48, 6 August 2017 (UTC)
I think the load of -less variants is low quality. A bunch of these wouldn't even pass RfV. So be it German, multiword compounds or just plain English -less variants of words: we need better inclusion rules. The current inclusion rules allow rubbish like boatless while prohibiting travel game. They will also allow fietshater and perhaps even bike-hater while nothing prevents lab rat from being deleted. I think the five-bullet point list I made above is at the very least a good start. But if there's no chance of any change ever becoming policy, I might as well give up. In that case a completely new wiktionary needs to be started, which would be a downright shame. W3ird N3rd (talk) 06:30, 7 August 2017 (UTC)
boatless would easily pass RfV, and is a translation of an Egyptian term (iww, with a hook above the i) that would pass your translation terms argument. I recall an old dictionary has a page of un- compounds without definitions; it hardly hurts us to give stuff like that boilerplate entries.--Prosfilaes (talk) 09:23, 7 August 2017 (UTC)
Looks like boatless is a bit of an odd duck. It's not used a lot on websites (which is what I initially checked for), but quite a few books use the word. As for the translation, I wasn't aware of that and was only referring to the RfV. I don't terribly mind having such entries around, but it just feels like insanity to have those while not allowing entries that are not nearly as obvious "because SoP". W3ird N3rd (talk) 13:30, 7 August 2017 (UTC)

Kajkavian – language, dialect or something inbetween?

Recent changes to Kajkavian prove that there is a dispute in the linguistic community as to the classification of this dialect/language. I hate to see this entry be turned into a political battlefield, so let's decide once and for all – is it a dialect or language, and should this page be protected to avoid any future disputes? --Robbie SWE (talk) 10:12, 2 August 2017 (UTC)

I'd stick with the conservative option and call it a dialect, at least until it gets an army and a navy. Crom daba (talk) 10:52, 2 August 2017 (UTC)
It's still not settled. Its status has been disputed for a long time, but it has been classified as a dialect of Serbo-Croatian since about 1950 or so. Many of the Yugoslavs get very worked up about it, one way or another. I agree with Crom daba, I think we should keep it as a dialect until there is something closer to a consensus that it's a separate language. I thought about getting an opinion from User:Ivan Štambuk, but Ivan seems to be absent. I think it's been over a year since Ivan's last serious edit. —Stephen (Talk) 11:17, 2 August 2017 (UTC)
If you're interested in the opinions of other Yugos, @Vorziblix, Biblbroks might respond. Crom daba (talk) 11:31, 2 August 2017 (UTC)
Opinions from other Yugos might be helpful, but only if they are linguists and are philosophically moderate. The last time we asked for Yugoslav opinions, everybody from the Serbian, Croatian, and Bosnian Wikipedias came here and we almost had a shooting war. With User:Ivan Štambuk, we knew his education and philosophy, so he was very helpful in things such as this. Ethnologue does not recognize it yet. SIL mentions it only as a literary language. I don't know what to make of that. —Stephen (Talk) 13:35, 2 August 2017 (UTC)
I don’t have a strong opinion either way, as I’m not knowledgeable enough about Kajkavian to say whether it would be more convenient to keep it merged or split. For reference, however, here’s an old discussion of this same subject with Ivan Štambuk. — Vorziblix (talk · contribs) 21:33, 2 August 2017 (UTC)
"at least until it gets an army and a navy" Wait, all I need to have my own language is an army and a navy? Why has no one told me this before! **starts gathering troops**
On a more serious note, you may want to look at and compare the West Frisian language, a dialect that relatively recently became recognized as a language. W3ird N3rd (talk) 15:18, 2 August 2017 (UTC)
We are completely indifferent to official "recognition". We consider things separate languages (and give them separate codes) based on linguistic considerations, though admittedly our results are not always consistent: we treat all Serbo-Croatian and Chinese varieties as a single language (each), but we treat Bokmaal and Nynorsk as separate languages. —Aɴɢʀ (talk) 16:23, 2 August 2017 (UTC)
In considering these types of questions, I would like us to put more emphasis on lexicographic convenience and less on "linguistic considerations"- that is, will splitting or merging these languages make it easier to maintain the dictionary? Will it make it easier for users to find information that they want? DTLHS (talk) 16:58, 2 August 2017 (UTC)
I think the Frisian case is still interesting to look at. People who only speak Dutch can barely if at all understand Frisian, but for a long time they were (for example) not allowed to use evidence in Frisian in court. It was not until 1980 that Frisian got the status of a required subject in primary schools. I think it also took a while before they got their own Wikipedia. And they are very, very, very proud of their language and it sounds like that is a factor with Kajkavian as well. If you are curious how different it really is, try https://www.youtube.com/watch?v=m1WTTX_ITIE. The narrator is speaking Frisian, the man who appears after 14 seconds into the video is speaking regular Dutch. For written text, try https://nl.wikipedia.org/ versus https://fy.wikipedia.org/. For a long time this wasn't recognized as a separate language. W3ird N3rd (talk) 17:08, 2 August 2017 (UTC)
We're already led by convenience, Serbo-Croatian wouldn't have won out were it not massively inconvenient to quadruple our work here. Crom daba (talk) 18:28, 2 August 2017 (UTC)
Only in some cases. We have both Scots and English and two different varieties of Norwegian (as well as just "Norwegian"). DTLHS (talk) 18:36, 2 August 2017 (UTC)

New competition

Hello. If anyone wants to play Emoji-Pictionary, I set up a game at User:WF on Holiday/Comp. As with most games I started in Wiktionary, there are probably loads of mistakes, loopholes, spellos, bad grammars and confusing instructions. But once we've got used to them, we can play happily. On a side note, I'm sure some of our Previous games could be modified by some tech-savvy folks in such a way as to allow normal people to play them. --WF on Holiday (talk) 23:18, 2 August 2017 (UTC)

Arbitrary behavior of certain administrators

There is an administrator being completely arbitrary on certain entries, as you might see here for example: [1] where he eliminates a translation of a word on the basis that he does not feel that it is a good translation, and yet the example he leaves in place about an LGBT film festival contradicts his assertion. This is, sadly, a consistent pattern and not merely one example; originally I had added "queer" as a translation while citing a specific example of it being translated that way in the name of an Israeli organization, and he eliminated it on the basis that he personally felt it did not fit and was offensive. His behavior is despotic; instead of requesting verifications he just acts as an absolute authority and is nothing but combative when I ask for simple things like justifications for his actions.

It's bad for the project because there are processes. He does not seem to be holding himself to the standards that other wiktionary users are held to, but acting as if it's his personal dictionary. He disagrees with a translation so instead of putting a RFV template on it, he just deletes it and locks the page.

Furthermore he's projecting a considerable amount in his responses, acting as if I am trying to impose my personal views when I am citing specific examples and he is citing no examples other than "I speak Hebrew," which I don't think is the way things normally go on Wiktionary? Like I speak Esperanto but I still have to justify my work on Esperanto terms, as 99% of Wikimedia users have to do.

I don't think Wiktionary was created so that certain people could impose their opinions without justifying them, and people who justify their edits by giving specific examples are treated as if they are troublemakers. I think it was created for the opposite reason and that fairness and transparency are still supposed to be important. Ligata (talk) 14:04, 3 August 2017 (UTC)

I recommend that people engaging in a discussion over this take a look at the respective admin's talk page and think of the fact that Wiki-projects are known to prevent new users from joining through the stubborn, aggressive culture of long-term users. I also strongly advocate that the discussion here not get derailed by a smokescreen (talking about Hebrew definitions) but instead stay on topic (proper conduct and bureaucracy). Korn [kʰũːɘ̃n] (talk) 14:48, 4 August 2017 (UTC)
I had a similar issue with this travel edit. It may have been the wrong place, but I think those were some good examples. Instead of correcting it or requesting a fix/cleanup he just chucked it. In most cases that would be the end of it, but I mentioned him on this page asking to explain this. I don't expect most new users to be that assertive or to even notice their edit has been undone. He still hasn't shown up here and I thought he ignored it, but only just now do I see he did do something in response to that (or so the timelines would suggest): https://en.wiktionary.org/w/index.php?title=travel&diff=47159186&oldid=47158420 which is nice, but I think that would still benefit from the examples I had written. But I can't risk putting something back in that was removed by an administrator. I can understand his time is limited and he can't properly fix every mistake he finds. I get that. But isn't that what Wiktionary:Requests_for_cleanup would be for? W3ird N3rd (talk) 20:30, 4 August 2017 (UTC)
We don't even have time to resolve everything at WT:RFC as it is now (see all the archived unresolved requests). --WikiTiki89 20:38, 4 August 2017 (UTC)
Is that a valid argument for deleting/reverting edits that aren't perfect? The idea behind a wiki is that a valuable contribution doesn't have to be complete or perfect. But by reverting edits that are not perfect, you can quickly discourage any new users from hanging around. In the long term, you will indeed not have enough manpower to verify and clean edits. The cleanup request page isn't very well advertised, that may also contribute to this. W3ird N3rd (talk) 21:16, 4 August 2017 (UTC)
Some badly formatted entries are found many years after they are created. Thus, dealing with them as soon as they are noted is essential. —CodeCat 21:18, 4 August 2017 (UTC)
Some - so you just delete everything before anyone could even have a chance to fix it. If mice keep getting into your house, the solution is not to burn down your house. W3ird N3rd (talk) 01:40, 5 August 2017 (UTC)
If your house could do with some new furniture, but you can't afford any, the solution is not to fill it with mice... Equinox 10:25, 5 August 2017 (UTC)
But if many of your friends are carpenters, you might fill it with not-quite-perfect furniture and put a post-it on it to remind you something needs to be done about it, instead of sitting around in an empty house. And possibly chuck the nonperfect furniture anyway if it's still not fixed after a month. The very least IMHO is that the user who made the edit is (could possibly be partially automated) informed about what was wrong and what needs to be changed before putting that content back. Right now it's just "POOF, it's gone, and if you put it back you risk a ban". Like my examples for travel, I think they would now fit perfectly below the usage note, but I feel like it's a risk to put them back in because SemperBlotto is an administrator. I would have already done it had SemperBlotto been a regular user.
Obviously edits that you would consider mice (vandalism) are not what I'm talking about here. W3ird N3rd (talk) 13:05, 5 August 2017 (UTC)
I do sort of take your point. It's bad that we automatically revert every mess when some (10%? who knows?) messes contain something good. But the entries are public-facing. It suggests that maybe we need some kind of "limbo" or intermediate edit-o-space that allows stuff to exist before it's shown to every random visitor. I can't be the first wikidork to think of this. For now, although it's annoying, I think our approach is as good as it gets. Equinox 00:29, 7 August 2017 (UTC)
Wikipedia uses "Wikipedia:Pending changes" on controversial pages so that edits don't go live until they have been reviewed. We could perhaps apply it to all pages here, and patrol the log of pending changes needing review, instead of our current system of "patrolling" Special:RecentChanges, which some changes slip through. But the actual result might be an extremely large backlog of pending changes awaiting review. This was discussed at least once before; I don't recall many people having strong opinions, but enough opposed it that it wasn't implemented. - -sche (discuss) 06:01, 15 August 2017 (UTC)
  • Since the actions complained of are not administrative in nature, perhaps it would be better to title this section "Arbitrary behavior of certain editors". Cheers! bd2412 T 14:58, 5 August 2017 (UTC)
@BD2412 But there's a difference. If an administrator removes something, you can't put it back. Even if you slightly alter it and believe that is sufficient to fix it, you can't put it back because the user who removed it happens to be an administrator. If you do it anyway you risk a ban. This wouldn't trouble me nearly as much if a regular user had deleted it, I would just fix it and put it back without having to worry about it. W3ird N3rd (talk) 03:03, 6 August 2017 (UTC)
I don't think that's true at all. It would be a substantial misuse of administrative authority to use that authority in connection with one's own editing dispute. bd2412 T 03:06, 6 August 2017 (UTC)
This.__Gamren (talk) 08:38, 6 August 2017 (UTC)
@BD2412 User_talk:Stephen_G._Brown#Abuse_of_blocking_and_page-deleting_powers_by_SemperBlotto.3B_de-cratting_and_de-sysopping_required feels too much like that for me to risk it. While the user in question was wrong (and making silly demands), it makes it clear to me that putting back any content deleted by an administrator is risky. W3ird N3rd (talk) 09:18, 6 August 2017 (UTC)

Extinct species

Are there any categories for extinct species, or do they go in other categories? I just unearthed Kangaroo Island emu. DonnanZ (talk) 16:02, 4 August 2017 (UTC)

A taxonomic approach would just put them in existing categories (where they exist) alongside extant species. A language-centered approach would favour putting them somewhere else, and not mixing them with extant species. —CodeCat 16:12, 4 August 2017 (UTC)
There is a convention in taxonomic names to place the symbol "†" before the name unless such a symbol is not necessary due to context. (See practice on Wikispecies.) We have begun implementing the practice of putting the "†" on the inflection line for entries of extinct taxa and elsewhere if the word extinct is not already in a label.
English vernacular names do not use the symbol, so it is arguable that a categorical distinction might be useful for some purposes. For many purposes, however, the presence or absence of the word extinct together with the capabilities of search would be sufficient. DCDuring (talk) 22:12, 4 August 2017 (UTC)
There is no lexical value in saying if a particular species is extinct or not, any more than if a particular institution is defunct, or a person is deceased. All that matters is if the term still has some kind of usage or currency. —Justin (koavf)TCM 00:07, 5 August 2017 (UTC)
By what definition of lexical? Does lexical exclude definitions, ie, semantics? We have the word extinct in so many definitions. DCDuring (talk) 01:04, 5 August 2017 (UTC)

Can anyone get through to User:Jeff Weskamp?

They are adding Cherokee entries with manual transliterations, even though automatic transliterations work perfectly for Cherokee. This isn't really a big issue, but it's silly and so I left a message on their talk page. They don't seem to have noticed it at all, though, even after I sent another message. Is anyone able to get through to them? A user that ignores their talk is bad, even if they aren't currently causing trouble. —CodeCat 17:10, 4 August 2017 (UTC)

Jeff also edits other Native American languages, including Navajo. I have not checked all of his edits, but quite a few of them. Those I've checked always seem good, even if he adds transliterations unnecessarily. I have attempted to talk with him a time or two, but I don't believe he has ever replied to anyone. I've known other editors who try to avoid interpersonal communication, so it does not seem all that odd. Jeff just takes it to an extreme level. —Stephen (Talk) 22:11, 5 August 2017 (UTC)
Jeff is now adding improper categories to entries, so I hope they will start listening. —CodeCat 19:10, 19 August 2017 (UTC)

Languages distinguishing dotted and undotted i

Recently I added some code to distinguish dotted and undotted i (Iı, İi) in Turkish and Azeri sortkeys. Till now, they were merged by being converted to lowercase (→ iı, ii) and then uppercase (→ II, II) using English rules (mw.ustring.upper). Thus, words beginning with both i and ı were sorted under I when they were categorized using templates.

Currently the fix only applies to Turkish and Azeri. Are there any other languages currently on Wiktionary that distinguish dotted and undotted i? — Eru·tuon 20:42, 4 August 2017 (UTC)
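For illustration, the merging described above and one way around it can be sketched in Python. This is only a rough sketch under stated assumptions: it is not the actual module code (which is Lua and relies on mw.ustring.upper), and the names naive_sortkey and turkic_sortkey are hypothetical.

```python
# Hypothetical sketch; not the code used in the Wiktionary modules.

# Map the two lowercase letters to their Turkic uppercase partners
# before applying the default (English-style) uppercasing.
TURKIC_UPPER = str.maketrans({
    "i": "\u0130",       # dotted i  -> capital I with dot above
    "\u0131": "I",       # dotless i -> plain capital I
})


def naive_sortkey(word: str) -> str:
    """Default uppercasing: dotted i and dotless i both become 'I'."""
    return word.upper()


def turkic_sortkey(word: str) -> str:
    """Keep dotted and dotless i apart by mapping them explicitly first."""
    return word.translate(TURKIC_UPPER).upper()


if __name__ == "__main__":
    for w in ("isi", "\u0131s\u0131"):
        print(w, naive_sortkey(w), turkic_sortkey(w))
```

With the default rules both example words get the same key, so they land under the same category heading; with the explicit mapping the keys differ. Note that this only separates the two letters and does not by itself produce a complete Turkish alphabetical order.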

The following languages have entries with both dotted and undotted i's: Azeri, Crimean Tatar, Egyptian, English, Gagauz, German, Italian, Karakalpak, Tatar, Translingual, Turkish, Zazaki. DTLHS (talk) 21:10, 4 August 2017 (UTC)
Egyptian, German, Italian? And even English? Really? —CodeCat 21:14, 4 August 2017 (UTC)
Italian: dımlı, German: homurdanmayı, Egyptian: ḥtrı͗, English: Category:English terms spelled with ı. DTLHS (talk) 21:26, 4 August 2017 (UTC)
The Italian and German look like errors. The Egyptian is used with a combining diacritic, and it should just use a regular i. As for the English, most of them are probably better attested with a regular i and therefore should probably be moved to those spellings. Regardless, English speakers would not treat i and ı as different letters, so sorting them together is correct. —CodeCat 21:31, 4 August 2017 (UTC)
@DTLHS I don't think the German entry you fixed is correct, still. In the lemma entry, the inflection table says it's the definite accusative form. —CodeCat 21:54, 4 August 2017 (UTC)
I guess, then, what I'm really asking is for which languages would we actually want the sortkeys to distinguish the two? — Eru·tuon 21:30, 4 August 2017 (UTC)

I'm going to guess that all the Turkic (and Turkic-influenced) languages in the list should have dotted and dotless i distinguished: in addition to Turkish and Azeri, Crimean Tatar, Gagauz, Karakalpak, Tatar, Zazaki. — Eru·tuon 21:51, 4 August 2017 (UTC)

(edit conflict) Judging by w:Dotted and dotless i, there's the potential in Turkic languages that use the Latin script, even as an alternative, but nowhere else except for ad-hoc use in romanization. Our entry for ı lists only Azeri, Crimean Tatar, Gagauz, and Turkish. Of course, texts in other languages can have names attested in their original spelling, but such cases are so rare that I doubt there are many (if any, at all) with dotting determining their order in any confusing way. Chuck Entz (talk) 21:57, 4 August 2017 (UTC)
I've put the languages that I listed above in a table in Module:languages. I should verify that each one actually has a regular orthographic system that uses the letters, though. — Eru·tuon 00:05, 5 August 2017 (UTC)
Okay, I looked at Wikipedia articles and Wiktionary categories, and Crimean Tatar, Gagauz, Karakalpak, Tatar, and Zazaki all seem to either regularly use dotted and dotless i, or have entries that use them. — Eru·tuon 00:31, 5 August 2017 (UTC)

Category name: "words pseudosuffixed with" or "words ending in"

Which naming convention should be used for suffixlike endings: "words pseudosuffixed with" or "words ending in"? Examples for both: Category:Esperanto words pseudosuffixed with -acio; Category:Esperanto words pseudosuffixed with -enco; Category:Hungarian words ending in -ikus. --Panda10 (talk) 23:58, 4 August 2017 (UTC)

I prefer "ending with", because I haven't heard "pseudosuffix" before, but I wonder how we could prevent the creation of ridiculous categories for every sequence of letters at the end of the word: like for naming, ending with -g, ending with -ng, ending with -ing (though that's a suffix), ending with -ming, ending with -aming. That is, what counts as a "pseudosuffix" or ending such that it gets to have a category? — Eru·tuon 00:03, 5 August 2017 (UTC)
I think these things (pseudosuffixes) are called formatives. Crom daba (talk) 00:26, 5 August 2017 (UTC)
Also desinence. --Vriullop (talk) 07:41, 5 August 2017 (UTC)
See also: previous discussion in July at Etymology Scriptorium.
"Desinence" typically means "inflectional" rather than derivational. With some overlap with "formative", there's also "formant", used to refer to endings that are not known to be certainly segmentable at all (so e.g. ölyv would have a "formant" -v). "Ending in" is probably good enough a starting point, provided that we craft descriptions for these that clarify that they are not pseudo-rhyme categories (e.g. we would not want sing in a category "English words ending in -ing").
Something that specifies the etymological origin, such as "ending in Latinate -ikus" might work. This also prevents the risk of bloat through people starting to add "ending in -X" as useless "wrapper" categories for every "suffixed with -X" category.
I'm not sure how these categories should be meshed with the pre-existing suffix categories, though. Do we put them in parallel, or as a parent category for the corresponding proper suffix category? I would lean towards the former, with crosslinks from the category description, but I'm open to arguments in other directions. --Tropylium (talk) 07:48, 6 August 2017 (UTC)

Sanskrit vs. Old Indo-Aryan

Currently, Module:languages lists only Sauraseni Prakrit as a direct descendant of Sanskrit. This is IMO completely misleading because there is nothing to prove that Sauraseni is any more a descendant of the Vedic dialect of Old Indo-Aryan than any other Prakrit. A simple example is Sanskrit क्षेत्र (kṣetra, region), from Proto-Indo-Iranian *ĉšáytram. The regular outcome of *ĉš in Middle Indo-Aryan is "ch". This is found in all of the Dramatic Prakrits as "chetta" (alongside a "kh" form, that likely came later as part of artificial alignment with Sanskrit), including Sauraseni. Indeed, where Sanskrit simplifies Proto-Indo-Iranian clusters to क्ष (kṣa), the Middle Indo-Aryan languages preserve the original cluster. If Sauraseni were a direct descendant of Vedic Sanskrit we would see only "khetta", no "chetta". So, that being said, we have two options.

  1. Remove Sauraseni as a Sanskrit descendant – Note that CAT:Terms inherited from Sanskrit has been cleared out with Wyang's help, so no module errors will occur. This is keeping in line with our treatment of Sanskrit as only Vedic Sanskrit (+Classical Sanskrit), not all Old Indo-Aryan.
  2. List all of the Dramatic Prakrits (Sauraseni, Maharastri, Ardhamagadhi) as direct Sanskrit descendants – This was suggested at Category talk:Hindi Tadbhava, and would involve treating Sanskrit as a dialect continuum of all Old Indo-Aryan + Classical Sanskrit. WT:ASA would have to be modified accordingly.

Personally, I think either option is better than the status quo. —Aryaman (मुझसे बात करो) 04:00, 6 August 2017 (UTC)

Pinging @JohnC5, माधवपंडित, DerekWinters. —Aryaman (मुझसे बात करो) 04:01, 6 August 2017 (UTC)
I would prefer option #2, ie, considering Sanskrit to be the entire group of mutually intelligible dialects, for the sake of convenience. Wiktionary treats Avestan, Old Norse & Serbo-Croatian as one language while in reality they're all two or more dialects. We can do the same for Sanskrit. ɱɑɗɦɑѵ (talk) 04:46, 6 August 2017 (UTC)
Not to mention none of the non-Vedic dialects are (well-)attested. And we could always have a reconstructed entry *च्शेत्र/*च्षेत्र (cśetra/cṣetra) if it is needed. —Aryaman (मुझसे बात करो) 06:34, 6 August 2017 (UTC)
There is already dialectal diversity within "Sanskrit". Strictly speaking even Classical Sanskrit does not descend from Vedic Sanskrit precisely, but from a parallel dialect that was not written down until later. This in mind, we could probably treat all Middle Indo-Aryan (and most of New Indo-Aryan) as descendants of "Sanskrit". Where MIA diverges from Classical Sanskrit, it would be possible to create reconstructed Sanskrit forms (similar to Category:Latin reconstructed terms). Perhaps we could outright consider merging "Proto-Indo-Aryan" into Sanskrit? Same deal as how we already equate Latin with Proto-Romance. --Tropylium (talk) 07:58, 6 August 2017 (UTC)
I agree that Sanskrit should be the collection of OIA dialects put together. However we cannot merge it with PIA because we need PIA for the Mitanni language. DerekWinters (talk) 15:11, 6 August 2017 (UTC)

making Tagalog an LDL

This was supported in WT:RFVN#hagok by @Metaknowledge, Mar vin kaiser, Atitarev, Stephen G. Brown (I think). @Rgt2002, TagaSanPedroAko may also have opinions. Please discuss.__Gamren (talk) 08:19, 6 August 2017 (UTC)

I agree that Tagalog is an LDL. —Stephen (Talk) 08:42, 6 August 2017 (UTC)
I also agree that Tagalog is an LDL. --Mar vin kaiser (talk) 09:22, 6 August 2017 (UTC)
Do we have any quotations of Tagalog in use in Wiktionary? Do we know of any online corpora that we can use? Is http://sealang.net/tagalog/corpus.htm a usable corpus to find quotations in use? Can Tagalog texts be found in Google books? What methods can a third party use to verify that Tagalog is so poorly documented that we should allow single mentions for it? --Dan Polansky (talk) 09:55, 6 August 2017 (UTC)
Yes, Tagalog is very poorly documented here in Wiktionary, but thanks to my being a native speaker of Tagalog, I am making efforts to make Tagalog a largely documented language here, from being a least documented language, or an LDL. I agree that Tagalog is still an LDL, and yes, there will be efforts to add quotations showing sample use of Tagalog words for a certain sense. Maybe finding interesting quotes in Tagalog by notable persons, if not by Tagalog-language publications, may help. -TagaSanPedroAko (talk) 11:23, 6 August 2017 (UTC)
@TagaSanPedroAko: The discussion is not about whether Tagalog is well documented in the English Wiktionary but rather whether it is well enough documented on the Internet, by which the users of the phrase mean, whether there are enough quotations of Tagalog in use (not dictionaries) to be found on the Internet, since these quotations of Tagalog in use are what the English Wiktionary uses for verification, per WT:ATTEST. And there is a proposal to allow single mentions in dictionaries to suffice for verification of Tagalog; single mentions do not suffice for English, Spanish, German, and multiple other languages. --Dan Polansky (talk) 11:50, 6 August 2017 (UTC)
There are very few mainstream Internet sources for use in quotes that use Tagalog. The vast majority of Tagalog sources on the Internet will mostly be self-published, but if you can find one reliable one, like a book in Google Books or a Tagalog news website, then, here we go. I'm aware that there are reliable Tagalog (or Filipino) sources in the Net that attest use of certain words, but that will be difficult with the majority of Philippine Internet media using English. If I can dig through a reliable source, then, good. -TagaSanPedroAko (talk) 11:58, 6 August 2017 (UTC)
@Mar vin kaiser Thankfully, after I added Quiet Quentin to my user gadgets here in Wiktionary, I'll drop my position to make Tagalog an LDL. There are a lot of attestations of many Tagalog words in Google Books, so the language is, yes, well documented. -TagaSanPedroAko (talk) 12:32, 5 September 2017 (UTC)
@TagaSanPedroAko Can you attest hagok?__Gamren (talk) 06:49, 12 September 2017 (UTC)
@Gamren I found attestations for it, through Google Books (via the Quiet Quentin gadget). Looks like Tagalog is still a WDL, thanks that I used QQ for attestations. -TagaSanPedroAko (talk) 06:58, 12 September 2017 (UTC)

Unsolicited Babel requests

User_talk:Gfarnab#Babel
User_talk:Awewewe#Babel
User_talk:Pedrianaplant#Babel
User_talk:Leonardo_José_Raimundo#Babel
User_talk:ZH8000#Babel
User_talk:LexiphanicLogophile#Babel

"Could you please add {{Babel}} to your user page? I'd appreciate it. --Dan Polansky (talk) 08:41, 5 August 2017 (UTC)"

I suppose @Dan Polansky means well, but in my book this is spam. The biggest problem I have with this is that he makes it look like it's a personal message. He says he re-types it every time he posts it, but it's still the same message every time. I wouldn't mind if he wrote a personal message for every request and explained why it would be so valuable to him to see that user getting a Babel, or if he would make it clear in the message that it's not really personal.

I personally don't appreciate these messages, but maybe it's just me. W3ird N3rd (talk) 10:17, 6 August 2017 (UTC)

  • The primary purpose of user pages is to give other editors an idea of an editor's competence in a particular language. Babel boxes are the best way of achieving this. Please add a babel box to your own user page (if and when you create one). SemperBlotto (talk) 10:22, 6 August 2017 (UTC)
LOL.
I have seen plenty of users with a babel box, I thought about it and decided not to create a user page at this moment. If and when I do, I don't think I'll add a babel box. I don't really like them. W3ird N3rd (talk) 10:34, 6 August 2017 (UTC)
  • Funny how you're complaining about Dan "spamming" talk pages with something useful to the project... by spamming this forum page. —Μετάknowledgediscuss/deeds 00:44, 7 August 2017 (UTC)
  • I think you don't know what spam is. Spam means unsolicited bulk electronic messages. Being useful or not doesn't matter, although useful spam is less likely to be frowned upon. If you are getting e-mail that you didn't ask for from your local supermarket with various offers that you actually like, it's still spam. I have only brought this up here, nowhere else. I don't have any intention of posting this anywhere else either. I'm also not asking anyone to do or buy anything. You may find this pointless and you are entitled to your opinion, but that does not make this forum post spam. In my opinion the babelbox is getting enough exposure as it is. If such messages are accepted, it might lead to a slippery slope. I just wanted the community to be aware of this phenomenon, if the community thinks it's fine I'll say no more. W3ird N3rd (talk) 05:49, 7 August 2017 (UTC)
The Beer Parlour is the place to discuss these things. This discussion is not spam. That said, personally I'm OK with Dan requesting people to use babel boxes. Sometimes we need to know who speaks a certain language, and the boxes make that job easier. --Daniel Carrero (talk) 05:53, 7 August 2017 (UTC)
I think Babel boxes are a good thing, requesting them is a good thing, and not responding constructively to such a request is a bad thing. DCDuring (talk) 06:06, 7 August 2017 (UTC)
I also think that it pays for such a request to have some explanation of the purposes served. DCDuring (talk) 06:08, 7 August 2017 (UTC)
Adding a Babel table should be our standard policy, if it's not already. A standard {{welcome}} message includes that request. If users refuse to tell other users what languages they know or don't, they should go somewhere else. Not knowing a language doesn't mean that you can't edit in that language but other editors can check your edits accordingly or monitor edits. --Anatoli T. (обсудить/вклад) 06:21, 7 August 2017 (UTC)
Technically you've fulfilled the request. You've added {{Babel}} to your userpage. Wyang (talk) 06:29, 7 August 2017 (UTC)
And now that we know that you can't speak any languages, any of your contributions will be ignored. SemperBlotto (talk) 06:42, 7 August 2017 (UTC)
[2]suzukaze (tc) 06:48, 7 August 2017 (UTC)
But, at an earlier, saner time: [3]. DCDuring (talk) 11:16, 7 August 2017 (UTC)
I don't exactly like that W3ird N3rd doesn't have a Babel box, but if the user doesn't want one don't make them feel forced to have one. Some very contributing members of Wiktionary don't have user pages at all. That said, W3ird N3rd isn't exactly spamming this forum, but I just don't feel like this discussion is appropriate for the beer parlour, especially since it's targeted at one user alone (Dan). PseudoSkull (talk) 01:49, 8 August 2017 (UTC)
It feels a bit out of place indeed, but I've looked around and Wiktionary:Information desk, Wiktionary:Tea room and Wiktionary:Grease pit were clearly the wrong places. Although this post is indeed about one user, my comment was about the phenomenon. I don't know if any other users are doing this, but what I said would apply to them all the same. My biggest issue is probably this line: "I'd appreciate it." which was repeated for all users. Maybe it's because I'm Dutch (the Dutch are known for being direct), but I just can't stand it when someone pretends to care.
Just one more thing. I mentioned the possibility of a slippery slope. One of the reasons I don't want a babel box is because (depending on how many languages you know) it looks like a unicorn just barfed a rainbow. We all know the average Wikipedia user page looks like a Christmas tree and while it won't happen overnight, it must have started somewhere and the road to hell is paved with good intentions. It may not happen at all - but if users start pushing a template, even if this one now is a useful one, it might. I believe it would be more wise not to allow any users to promote templates this way and if it is believed the babel box isn't getting enough exposure, have the administrators decide on a way to inform users. But clearly, I'm standing alone on this one. W3ird N3rd (talk) 03:26, 8 August 2017 (UTC)
If the issue you have with the babel box is too much unicorn barf on user pages, then you could use a different method to give information on what your native language is and what your levels of proficiency are in other languages. — Eru·tuon 03:53, 8 August 2017 (UTC)
There's no slippery slope here: Wikipedia-style user boxes aren't allowed, with the exception of Babel, time zone, and maybe one or two others that provide useful information. That's the way it's been since long before I started here 5 years ago, and I doubt it will change. Chuck Entz (talk) 04:40, 8 August 2017 (UTC)
  • I also see no slippery slope. The Wiktionary community has been very careful to avoid unicorn barf.
And I also see no disingenuousness on Dan's part. I, too, appreciate it when users add Babel boxes to their user pages -- at least, when those Babel boxes are at least vaguely accurate, as they provide the community with useful and usable information on who understands which languages, and roughly to what degree. For a multilingual dictionary project, this kind of user metadata is very useful.
FWIW, W3ird N3rd's behavior comes across as immature, and willfully disrespectful of Wiktionary norms, albeit on a minor scale that's more of a slight annoyance than anything actionable. I suspect some of his (her?) reticence comes from the Wikipedia culture and a lack of familiarity with the Wiktionary project. On Dan's part, I see no spam, and nothing inappropriate in asking for a Babel box.
I hope W3ird N3rd can learn more about how Wiktionary functions, and grow to be a comfortable and productive member of the community. ‑‑ Eiríkr Útlendi │Tala við mig 06:09, 8 August 2017 (UTC)
If my contributions in the main dictionary space are not productive, I might as well stop contributing. It's not going to be all that much better in the future. I thought I was being productive, but thanks for pointing out to me that I'm not. I know you think this is immature, but why should I care? Either I really am not productive, in which case you should just think "good riddance" or I am but you insult me (at the very least that's how this comes across), in which case why should I stay? W3ird N3rd (talk) 14:21, 8 August 2017 (UTC)
  • My perspective: 1. Yes, distributing the same message electronically to a larger number of people is spam. Textbook definition. 2. I see no harm in every user receiving this spam message once as it is merely a request for a useful addendum. 3. This is a Wiki-project, not Lord of the Flies, Jante or a Catholic School in a Celtic country. Wiki itself is based on and centered around voluntary contributions. Of course the community can come together and regulate things to prevent harmful additions to the project, but demanding any user share any information on himself or add a specific thing, that is: Forcing involuntary contributions, is the fucking opposite of what this project is supposed to be and everyone who entertains that trail of thought is indeed about to open Pandora's Box and pervert Wiktionary (an open project where everyone can partake) into a generic online dictionary run by a junta of seniors. Korn [kʰũːɘ̃n] (talk) 10:11, 8 August 2017 (UTC)
Just the request isn't even what bothers me most. Had it been worded like "Could you please add {{Babel}} to your user page? The Wiktionary community would appreciate it." I wouldn't have been even close to as annoyed as I was now. I know what many here will say: "what am I complaining about, that's hardly any different at all, what sort of moron are you, yadda yadda yadda". To me this would make all the difference. It would make it clear Dan isn't personally asking me to do this, he is asking on behalf of the Wiktionary community. Which also means that if I decide to ignore it, I'm not letting Dan down personally. To me, that's a big difference. Again, I don't expect anyone to side with me. It's just my opinion. Yes it is a stupid opinion. I'm a stupid person and there's no need to further comment on that, I admit it, move on. W3ird N3rd (talk) 14:21, 8 August 2017 (UTC)
Instead of "Could you place Babel to your user page? I'd appreciate it," you wanted "Could you please add Babel to your user page? The Wiktionary community would appreciate it"? I can't see the difference and English is my native language. Dan is Czech and he does not have a perfect command of English. Most of our editors have a different language as their first language. It has never occurred to me to be offended by English comments that are not just so. I think most people write the best they can and they don't mean to offend or confuse. The reader should bear some of the load of communication by showing a more tolerance and understanding. It improves the atmosphere. —Stephen (Talk) 16:03, 8 August 2017 (UTC)
I tried to explain it, I'll do it again knowing full well it won't make a difference. If you say "Please do X, I'd appreciate it." I feel like I'm letting you down when I don't do it. (and the community may or may not care about X) If you say "Please do X, the community would appreciate it." it tells me the community in general would prefer this, I'm not letting you down personally if I don't. I wouldn't even think this difference, or at least what I perceive as a difference, would be language-dependent. I suppose not every individual would recognize this difference though. And maybe somehow I'm the only one. In which case I'm wrong and my faulty interpretation led to a long and useless argument of misunderstanding and contempt. Well, if my understanding of the English language is that shitty I probably shouldn't be here anyway. Which was another reason I wouldn't want to add a babel box: I can't judge to what degree I master any language. W3ird N3rd (talk) 16:38, 8 August 2017 (UTC)
@W3ird N3rd: I don't think that I would feel like I'm letting anybody down by not adding a Babelbox, no matter how the message asking for it was worded. It's really not that important to discuss this imo. —Aryaman (मुझसे बात करो) 04:55, 11 August 2017 (UTC)
It seems to me your English is just fine. Personally, I disagree that Dan's phrasing was due to him being Czech. I suspect he prefers in general not to speak on behalf of "the community". But I could be wrong. — Eru·tuon 17:23, 8 August 2017 (UTC)
Indeed, I don't like to speak on behalf of community. The Babel practice is common but the appreciation is mine. --Dan Polansky (talk) 10:47, 19 August 2017 (UTC)
For example this sentence: "The reader should bear some of the load of communication by showing a more tolerance and understanding.". To me, this seems wrong. (the most obvious fix to me would seem to be to change "a more" to "a little more") It could be a joke (writing a broken sentence to prove your point), a genuine error (even a native could make mistakes) or (which would seem more likely as English is not my native language) this is correct but I just don't understand it. I also think I don't write text the way most people do today: I don't use any kind of spell checker or autocomplete. That may also result in me looking at language in a different way. W3ird N3rd (talk) 17:01, 8 August 2017 (UTC)
(An academic discussion on what is spam) "distributing the same message electronically to a larger number of people is spam": Not really. In my job, I receive job-related emails from management that are distributed to a larger number of people, and they are obviously not spam; spam filters are not designed to remove these kinds of messages. A message related to Wiktionary purpose posted in multiple instances on Wiktionary is not necessarily a spam. The definition of spam is not so simple as some people think; I don't think I have a good comprehensive definition. Being posted to a larger number of people is a component of being a spam, but that alone does not suffice. By the way, our welcome messages are much more of a spam than these requests for Babel given how long they take to read. --Dan Polansky (talk) 10:47, 19 August 2017 (UTC)

Weird arrow next to uses of {{taxlink}}?

@Erutuon, DCDuring, Sgconlaw There is a weird arrow that sometimes appears next to the name of species and such that are formatted using {{taxlink}}. What's its purpose? It looks wrong, and is mentioned nowhere in the documentation. Can we get rid of it? For an example, see пога́нка (pogánka). Thanks! Benwing2 (talk) 20:44, 6 August 2017 (UTC)

I have categories to detect the conditions that cause them, which I consulted as soon as I saw "weird arrow" in the alerts. I found поганка in one of the categories and eliminated it. If they occur when you use taxlink, that means we already have an entry for the taxon involved and the template should be removed. Besides the situation of a new use of the templates there can be "many" entries that are affected by adding a new taxon or vernacular name. When I add either type of entry I try to eliminate any uses of the template in linked entries that would generate the "weird arrow". I will add something about this in the documentation for the two templates, though I don't expect it will be consulted, this being the first time it has come up, though I might be wrong. DCDuring (talk) 21:06, 6 August 2017 (UTC)
Also, I watch the category (as well as most other taxon-related categories) and would have detected the entry the next time I checked my watchlist. DCDuring (talk) 21:09, 6 August 2017 (UTC)
I remember seeing the "weird arrow" before. DCDuring, wouldn't it be sufficient for the template to place entries that require your intervention in the category, without the arrow also appearing? — SGconlaw (talk) 21:23, 6 August 2017 (UTC)
We already have such categories, which I aggressively police to keep empty.
The trouble is that it takes me quite a while to find the instances of redundant templates without using ctrl-f on the displayed text to find "=>". It is always at least a bit faster with the "=>". The problem is worst in entries with unusually large Hyponyms or Derived terms sections, with multiple L2 sections, with the use of {{taxlink}} or {{vern}} in the middle of definitions for polysemous terms or in unexpected locations.
If someone knew a way so that something displayed in the entry that optionally only an anointed few (me included) could see, we could eliminate the need for anyone to consult and grasp the documentation to eliminate the offending "=>". DCDuring (talk) 21:40, 6 August 2017 (UTC)
No idea how to do that. Maybe it could be made more understandable by replacing it with some reduced-size text like "needs attention" (compare the "Invalid ISBN" warning generated by {{ISBN}}), but I don't know whether you think that would make the warning too prominent. — SGconlaw (talk) 21:49, 6 August 2017 (UTC)
The offending "=>" can be eliminated with CSS. If we enclose this symbol in a HTML tag with a unique class name (say class="taxlink-redundant"), and create a CSS style rule that vanishes it (display: none;), which can either be placed in the HTML tag or in MediaWiki:Common.css, then the symbol can be un-vanished at will. Putting the style rule in MediaWiki:Common.css requires the help of an admin. Let me know which option you would prefer and I can give further help. — Eru·tuon 21:56, 6 August 2017 (UTC)
@Erutuon:'s solution seems great. I'm an admin. I would just need to be instructed as to what to put where so that I could still see the "=>" (which has the advantage of being easy to type and rarely used except for this purpose). The name for the style could be something like "redundant template finding aid" or a comprehensible abbreviation of that. DCDuring (talk) 22:16, 6 August 2017 (UTC)
I guess its value as a recruitment tool for proper (non-redundant) use of {{taxlink}} and {{vern}} is not much of a consideration. DCDuring (talk) 22:18, 6 August 2017 (UTC)
Why shouldn't I be taking the approach of having some red text telling folks that they should remove the offending template? I think there is precedent for that. It might even be in continuing use. DCDuring (talk) 22:21, 6 August 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── (edit conflict) Well, I suggested the class name taxlink-redundant, but you can choose a different one. Another idea: redundant-taxlink-mark? Whatever you choose, it should be made up of basic Latin and hyphens. Spaces will be misinterpreted. The code to add to MediaWiki:Common.css (the period . indicates that what follows is a class name):

.taxlink-redundant {
    display: none;
}

And the code to add to your Special:MyPage/common.css:

.taxlink-redundant {
    display: inline;
}

And then, in the template code for {{taxlink}}, replace <sup>=></sup> with <sup class="taxlink-redundant">=></sup>.

If you want to use a different class name, just replace taxlink-redundant in each of the three code snippets with whatever class name you choose. — Eru·tuon 22:33, 6 August 2017 (UTC)

If you want to keep the mark, how about changing it to a message with instructions that only displays in preview mode? For example, <sup class="error previewonly"><small>(Replace {{temp|taxlink}} with a regular link.)</small></sup>. Admittedly, that will be even more annoying than the little arrow thingy. — Eru·tuon 22:37, 6 August 2017 (UTC)

I was hoping to use the same class for both {{vern}} and {{taxlink}}. It might be useful for other similar applications though I don't know of any.
We have plenty of instances of the much more annoying technique used to enforce correct use of templates by displaying 80 or more characters of red text, sometimes with incomprehensible messages buried in them.
I will sleep on this before implementing and give others a chance to weigh in, but thanks for the implementation suggestion. It seems to fit the bill perfectly. I take it that CSS is not much more burdensome on server resources than HTML and doesn't raise the risk of latency problems like JS. DCDuring (talk) 23:15, 6 August 2017 (UTC)
@Erutuon: You had mentioned above that we could accomplish the optional display of default-hidden text if we "create a CSS style rule that vanishes it (display: none;), which can either be placed in the HTML tag or [] ". Where exactly would the HTML tag reside? DCDuring (talk) 18:47, 8 August 2017 (UTC)
The HTML tag that I mean is the <sup>=></sup> that appears in the template source code. — Eru·tuon 18:51, 8 August 2017 (UTC)
That seems like a better implementation, since evidently I am the only one using and virtually the only one aware of this. I could include a reference to the decloaking technique in the documentation for {{taxlink}} and {{vern}}. No adminship required either. DCDuring (talk) 20:01, 8 August 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Well, if you choose that option, you will have to use the following code in your common.css:

.taxlink-redundant {
    display: inline !important;
}

The !important makes the style rule overrule the CSS in the HTML tag; otherwise, the CSS in the tag will win out. — Eru·tuon 20:37, 8 August 2017 (UTC)
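For reference, a minimal sketch of what the tag in the template source might look like under this inline-style option. It assumes the class name taxlink-redundant suggested above; the real source of {{taxlink}} and {{vern}} may differ, so treat the exact markup as hypothetical:

```html
<!-- Hypothetical template markup: the inline style hides the mark for everyone by default. -->
<!-- The class name is the one suggested in this thread; any other name works if used consistently. -->
<sup class="taxlink-redundant" style="display: none;">=></sup>
```

With the hiding rule carried by the tag itself, nothing needs to go into MediaWiki:Common.css, and only users who add the !important rule to their own common.css would see the "=>".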

That is what I will do. Thanks for the help. If I have problems, I will see you on your talk page. DCDuring (talk) 20:53, 8 August 2017 (UTC)

reading out "-"

In results, such as 2-0, how is the hyphen spelled out? I think such a pronunciation should be added to its entry --Backinstadiums (talk) 21:16, 6 August 2017 (UTC)

Isn't it silent in many cases? The score would just be read as "two nil". I suppose on occasion it would be read "two to nil". (Also, it should really be an en dash.) — SGconlaw (talk) 21:24, 6 August 2017 (UTC)
@Sgconlaw: The singer Tee Grizzley, in his song First Day Out, says "two and o" for 2-0 at min. 3:48 --Backinstadiums (talk) 22:10, 6 August 2017 (UTC)
I agree with SGconlaw - two nil. It can be different in broadcast results, if the home team loses it would be "Team xx nil, Team yy two". DonnanZ (talk) 23:31, 6 August 2017 (UTC)
If we're talking about sports scores and the like, no one would ever say "two nil" or "two to nil" in Canada. It would be one of the following (in rough order of frequency): "two nothing", "two to nothing", "two to zero", or possibly "two zero". Andrew Sheedy (talk) 18:40, 7 August 2017 (UTC)
I just realized this is irrelevant, as the discussion is about the hyphen, not the 0, but oh well. Andrew Sheedy (talk) 19:03, 7 August 2017 (UTC)
:-D — SGconlaw (talk) 03:46, 8 August 2017 (UTC)
It could be either read as "to" or as nothing. When counting game wins rather than the in-game score, it's often read as "and" (in the US at least), as in "My friend and I are five-and-three", although this is more often done for wins-vs-losses of a single party (this is the case in Backinstadiums's song reference above, even though those are trials and not literal "games"). --WikiTiki89 18:55, 7 August 2017 (UTC)

Missing category?

I can't find any category for Washington, D.C. or District of Columbia, only for the state of Washington. I guess there should be one, but what should the name be? DonnanZ (talk) 23:17, 6 August 2017 (UTC)

  • The main form is at "Washington, D.C." so I made a category for that. There are plenty of words that refer to the District. Good call. —Justin (koavf)TCM 23:29, 6 August 2017 (UTC)
Brilliant, thanks. DonnanZ (talk) 23:37, 6 August 2017 (UTC)

Share your thoughts on the draft strategy direction

At the beginning of this year, we initiated a broad discussion to form a strategic direction that will unite and inspire people across the entire movement. This direction will be the foundation on which we will build clear plans and set priorities. More than 80 communities and groups have discussed and given feedback on-wiki, in person, virtually, and through private surveys[strategy 1][strategy 2]. We researched readers and consulted more than 150 experts[strategy 3]. We looked at future trends that will affect our mission, and gathered feedback from partners and donors.

In July, a group of community volunteers and representatives from the strategy team took on a task of synthesizing this feedback into an early version of the strategic direction that the broader movement can review and discuss.

The first draft is ready. Please read, share, and discuss on the talk page. Based on your feedback, the drafting group will refine and finalize this direction through August.

SGrabarczuk (WMF) (talk) 16:11, 8 August 2017 (UTC)

Unsorted formations

I've seen Unsorted formations in descendant trees formatted as either * Unsorted formations or ; Unsorted formations. Is there a written guideline for this? --Victar (talk) 16:48, 8 August 2017 (UTC)

The standard practice is with *, so that it's listed on the same level as all other formations. —CodeCat 15:55, 9 August 2017 (UTC)
@CodeCat: Is that outlined in a guide or something somewhere? Like I said, I've seen both, so there doesn't seem to be a true "standard". @JohnC5? --Victar (talk) 20:47, 11 August 2017 (UTC)
@Victar: I've always used ;. On an unrelated note, Victar, please don't just start moving around entries (specifically the new entries) without discussing with anyone. I'm not convinced that was a good choice and may now have to revert all that. If the is not phonemic then it should not be included; if it is then it shouldn't be subscript. It is extremely frustrating that you just did this. —JohnC5 02:43, 12 August 2017 (UTC)
@JohnC5: I only moved two entries; not the end of the world. Also, very unrelated and should have been discussed elsewhere. --Victar (talk) 03:03, 12 August 2017 (UTC)

Quotations vs. Citations

I'd like to know the protocol for using them, as well as the differences they are meant to represent --Backinstadiums (talk) 14:43, 9 August 2017 (UTC)

I don't know if we have a standard for the terms, but I have been using the term citation to refer to sources providing evidence for information stated in entries, which are usually placed in "References" or "Further reading" sections. The {{cite}} and {{R:}} groups of templates may be used for this purpose. On the other hand, a quotation is an extract from a source that is provided as an example of the entry in use, and which is placed directly under a definition. The {{quote}} and {{RQ:}} groups of templates are used for them. For example, at merlion, there is one "citation" in the "References" section, and a number of "quotations" under the various definitions. However, note that entry pages have a tab called "Citations" which really contains quotations. — SGconlaw (talk) 15:07, 9 August 2017 (UTC)
I've been confused by this as well. Seems like they are used interchangeably. I've seen plenty of quotations from books that are over a hundred years old, providing no clue of how the entry is used today or how you could use it yourself. Personally I prefer examples. They don't come in a collapsed box, there is no question about proper citing due to copyright issues and they are designed to show how the entry is and can be used without clutter. Personally I put quotes and citations all the same on the citations page. W3ird N3rd (talk) 17:16, 9 August 2017 (UTC)
here are my unpopular opinions: quotations and citations are used interchangeably, I don't think there's a meaningful distinction. "Examples" are made up and may not reflect actual usage, there are no potential copyright issues with quoting from parts of works. The citations page is at best useless and at worst actively harmful and should not be used except to collect evidence for missing words or senses. DTLHS (talk) 17:26, 9 August 2017 (UTC)
If examples don't reflect actual usage they are likely to be bad examples. Copyright issues could arise if a quotation is too long or not properly attributed and laws for this possibly vary around the world. Quotations often are not reflecting actual (common) usage either, so I don't think that's a good reason to have them. W3ird N3rd (talk) 20:18, 9 August 2017 (UTC)
My understanding is that Wiktionary's servers are based in the USA, so it is primarily US law that must be complied with. It is unlikely that the quotations we use would violate copyright for two main reasons. First, all material published before 1923 is in the public domain in the USA and can be freely reproduced. Secondly, most of our quotations are obtained from works available on Google Books and the Internet Archive. If it is possible to view either a snippet or a full page preview of a book on Google Books, then the use of that portion of the book must be fair use under the law. Ergo, quoting an even shorter portion on Wiktionary must also be fair use. — SGconlaw (talk) 11:29, 10 August 2017 (UTC)
I wouldn't go so far as to say that availability on Google Books is indicative of anything, but the amount of text in the kind of quotes we use should fall under fair use. If the quote is too long for fair use, it's way too long for our purposes. Chuck Entz (talk) 14:06, 10 August 2017 (UTC)
In the books I've been reading lately, I've come across at least one to two dozen words in each one that we don't have entries for. Is it safe to take quotations for each of those words from the same book? How many quotations should I limit myself to to avoid violating fair use? Andrew Sheedy (talk) 17:47, 10 August 2017 (UTC)
@Andrew Sheedy First of all you should obviously check those words haven't been made up by the writer and they pass WT:CFI. There is no limit. For each word you should limit yourself to one or two quotes, there is no point anyway in having more quotations from the same work. As for quoting from the same work but on different page entries on Wiktionary, I would say that if the total amount quoted from the work is less than 5% of the entire work you have absolutely nothing to worry about. For a book that means there is no practical limit. For a poem a bit more would be allowed, some poems just might accidentally end up being entirely quoted here in small bits. As long as there's no obvious intention to violate copyright by overquoting a work, you'll be safe. W3ird N3rd (talk) 19:11, 10 August 2017 (UTC)
I'd completely forgotten about the 5% rule--thanks for reminding me. And don't worry, I always make sure to find citations for words before I add them (which is the main reason I haven't gotten around to adding more...). Andrew Sheedy (talk) 17:15, 11 August 2017 (UTC)
@Andrew Sheedy not sure if you are being sarcastic, genuinely grateful, referring to the 5% rule in general or if there really is a 5% rule for quotes/citations. It's just a number I picked, it could have also been 1%, 3%, 7%, etc. The point remains the same however, for fair use (which I think includes the right to quote for the U.S.) it would generally be a reasonable safe threshold. It could be exceeded in various cases, if I wrote a review for a poem that is twice the length of the original poem, there's a good chance I could cite the entire poem in small pieces. But under 5% for all quotes combined you simply don't have to worry about it - which is the majority of the time. W3ird N3rd (talk) 18:33, 11 August 2017 (UTC)
I thought it was an actual rule (i.e. you can legally reproduce 5% of a work). Maybe it is in Canada? I'll have to look that up. Andrew Sheedy (talk) 02:40, 12 August 2017 (UTC)
Yes, Wiktionary servers are in the U.S., but Wiktionary content might be reused by people in other countries without fair use. W3ird N3rd (talk) 19:11, 10 August 2017 (UTC)
Not that it is terribly relevant to the conversation, but Wikimedia servers are not all based in the US, nor should we expect that they will reside in the US exclusively in the future. - TheDaveRoss 12:53, 11 August 2017 (UTC)
Indeed irrelevant to the discussion at hand. I suspect servers outside the U.S. are caching servers, caches have different rules, but if someone wants to know more they should start a new discussion. W3ird N3rd (talk) 05:06, 12 August 2017 (UTC)
  • I think (but this is just my interpretation) that citations and the citation page are perhaps meant for long quotes. "I have a dream", "Yes we can" or "Build a wall" would be quotes. This would explain why quotes are allowed in the main dictionary space: copyright generally shouldn't apply to a quote. For example:
We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard
This is a quote and there's pretty much no chance this is copyrighted, similar to the moon not being copyrightable. However:
We choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard, because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one which we intend to win, and the others, too.
This is citing Kennedy and can likely be copyrighted, so it requires proper attribution (what counts as proper depends on the country you are in) and/or must be allowed by fair use. W3ird N3rd (talk) 19:11, 10 August 2017 (UTC)
I doubt very much if the latter quotation is a breach of copyright. It is still only a small portion of the entire speech. It might be a different matter if we reproduced, say, a third or half of the speech, but that isn't what we do anyway. I agree with Chuck that the quotations we use here at the Wiktionary are unlikely to raise copyright issues. — SGconlaw (talk) 21:50, 10 August 2017 (UTC)
Fair use is a lot more permissive than the right to quote, but is specific to the United States. W3ird N3rd (talk) 03:25, 11 August 2017 (UTC)
For what it's worth, speeches made by government officials in their official capacity are in the public domain, so none of that speech is copyrighted in the US. - TheDaveRoss 12:58, 11 August 2017 (UTC)
That's true for this example, I should have mentioned that. Thanks. W3ird N3rd (talk) 05:06, 12 August 2017 (UTC)
Late: The terms "quotation" and "citation" are usually used interchangeably in en wikt to refer to attesting quotations. WT:CFI says "it is better to cite sources" and "but we may use quotations found on them", and it has "Number of citations" section; we have Citations: namespace for attesting quotations. There is Wiktionary:Quotations, which says "Quotations, also called citations, serve two purposes ...". --Dan Polansky (talk) 12:57, 19 November 2017 (UTC)

Idiom

Do we have a page which lists all entries containing "Idiom" as a headword? If not, can we get one made? I guess we prefer Verb rather than Idiom for things like take an axe to. -WF

  • Before it was declared forbidden, I've used the ====Idioms==== header in the past for set expressions using a particular word, such as at 糞#Idioms. I see that some other JA entries have these expressions listed under ====Derived terms====, which doesn't seem quite right either, as these aren't "terms", sometimes comprising even full sentences.
What is the accepted header for these items now? ===Verb=== is not applicable for most of the Japanese expressions I can think of. ‑‑ Eiríkr Útlendi │Tala við mig 17:24, 9 August 2017 (UTC)
Please limit this to English-only. “Idiom” is the conventional translation for the Chinese part of speech of chengyu. Wyang (talk) 07:42, 10 August 2017 (UTC)
Are we going to have "Haiku" as a part of speech next? —CodeCat 09:45, 10 August 2017 (UTC)
Tangentiality. Are you OK? Wyang (talk) 09:51, 10 August 2017 (UTC)
That doesn't seem a part of speech coordinate with noun, verb, adjective, but I don't know how it would be used. Are they not used as nouns, verbs, adjectives, or something else? It seems like using "Word coined by Shakespeare" as a part of speech header. Admittedly, we have "Proverb", which might be similar. — Eru·tuon 17:32, 10 August 2017 (UTC)
A proverb is generally nothing more than a sentence. —CodeCat 17:40, 10 August 2017 (UTC)
  • Great discussion, but what I wanted was a page listing all entries with {{head|en|idiom}} —This unsigned comment was added by WF back from hols (talk • contribs) at 15:26, 10 August 2017 (UTC).
    There's no way to get a single-page listing, but you can search the wikicode for insource:/\{\{head\|en\|idiom/. [edit: Actually, it is a single page, just because there are so few.] — Eru·tuon 20:32, 10 August 2017 (UTC)
    It's also possible to de-list "idioms" from Module:headword/data. Then it would no longer be recognised as a valid POS category and end up in Category:head tracking/unrecognized pos. —CodeCat 20:35, 10 August 2017 (UTC)
    Thanks Erutuon! That's exactly what I wanted. I've been making my way through those pages. A little cleanup done, and a few of them sent to RFx. --WF back from hols (talk) 23:40, 11 August 2017 (UTC)
    Without a plan on what to do after that, that would be unwise. There are 2,305 entries using {{zh-idiom}} (insource:/\{\{zh-idiom/) and 4,317 entries in the category for Chinese idioms, so that tracking category would become cluttered. And the cooperation of editors who handle Chinese would be needed to get the entries moved to the proper part of speech. — Eru·tuon 21:16, 10 August 2017 (UTC)
    I really hate the mentality that everything that is “improper” in English is assuredly improper in other languages by default, and needs to be “fixed”. Idiom is a perfectly fine part of speech in Chinese, and is in fact the most common translation for Chinese chengyu. Chinese lexicography treats these as a separate category of words, and there are numerous dictionaries compiled just for words belonging to this category. The comprehensive Chinese dictionaries typically do not mark entries by their part of speech, due to the analyticity of the language. In those monolingual and bilingual dictionaries that do, these words are either marked by  成  (cheng, idiom) (primarily in bilingual dictionaries) or unmarked (in Chinese–Chinese dictionaries), in juxtaposition to  名  (noun),  动  (verb),  形  (adjective),  副  (adverb),  惯  (phrase),  谚  (proverb),  歇  (xiehouyu), etc. Examples include the Contemporary Chinese Dictionary, the Comprehensive Standard Chinese Dictionary, the Oxford Chinese Dictionary, the Times New Chinese–English Dictionary and so on. The same idiom can be used as noun, verb, adjective, adverb, etc. depending on the context in the sentence, and their use is different from that of proverbs, phrases, and xiehouyu. Wyang (talk) 22:27, 10 August 2017 (UTC)
    Ahh. If they can be used as multiple other parts of speech, then I can see the lexicographic usefulness of keeping them as they are rather than trying to list all the other parts of speech they can be used as. However, it would be helpful to distinguish them somehow from the concept of idiom in English, which is quite different. The description in the category page Chinese idioms is probably not correct. — Eru·tuon 22:53, 10 August 2017 (UTC)
    I guess those entries should also contain {{lb|en|idiom}} so they show up in Category:English idioms if they are indeed an idiom. Since there are so few that shouldn't be a problem. Some recently disappeared already so this looks like it's getting phased out. W3ird N3rd (talk) 20:47, 10 August 2017 (UTC)
    But is idiom a context in which the word appears? If not, then it might be a misuse of the label template. —CodeCat 20:52, 10 August 2017 (UTC)
    I agree that the POS should be based on how they are used and not where they come from. Thus, I would say "proverbs" should really have the POS "clauses". --WikiTiki89 21:00, 10 August 2017 (UTC)
    Possibly, but in that case the English idiom category will have to be populated in some different way. insource:/\{\{lb\|en\|idiom\}\}/ gives 210 hits. W3ird N3rd (talk) 21:34, 10 August 2017 (UTC)
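For anyone who wants to try the two insource: patterns quoted in this thread on local wikitext (for example, text extracted from a dump), the same regular expressions can be run with Python's re module. This is only a sketch: the sample pages below are invented, and on-wiki search has its own regex engine whose behaviour may differ in details.

```python
import re

# Pattern from the thread: headword lines using {{head|en|idiom ...
HEAD_IDIOM = re.compile(r"\{\{head\|en\|idiom")
# Pattern from the thread: the exact label template {{lb|en|idiom}}
LABEL_IDIOM = re.compile(r"\{\{lb\|en\|idiom\}\}")


def find_matches(pages, pattern):
    """Return the sorted titles of pages whose wikitext matches the pattern."""
    return sorted(title for title, text in pages.items() if pattern.search(text))


# Hypothetical sample wikitext, not copied from real entries.
SAMPLE_PAGES = {
    "take an axe to": "==English==\n===Idiom===\n{{head|en|idiom}}\n# ...",
    "kick the bucket": "==English==\n===Verb===\n{{en-verb}}\n# {{lb|en|idiom}} To die.",
    "dog": "==English==\n===Noun===\n{{en-noun}}\n# An animal.",
}

head_hits = find_matches(SAMPLE_PAGES, HEAD_IDIOM)
label_hits = find_matches(SAMPLE_PAGES, LABEL_IDIOM)
```

Keeping the two patterns separate mirrors the distinction made above between "Idiom" as a headword category and "idiom" as a label on a sense.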

MW has a new feature to see dates of coinages

https://www.merriam-webster.com/words-by-first-known-date/1786 Justin (koavf)TCM 07:30, 10 August 2017 (UTC)

Very cool. But really, the dates are sense-specific, and hence word-specific only if the word is monosemic. Wyang (talk) 07:39, 10 August 2017 (UTC)
At first I thought you meant MediaWiki, and was worried they were up to another waste of human resources. --WikiTiki89 15:42, 10 August 2017 (UTC)

-градить and other "combining form"s

A bunch of Russian entries are appearing in Category:head tracking/unrecognized pos, because they use the POS category "verbal combining forms" which is not valid. They are also being categorised as verbs, which is even less correct because these forms don't actually exist. They are only found in compounds, and are thus comparable to creating cran for the first morpheme in cranberry, or liezen for the base verb of verliezen. Something should be done about these. They can't be moved to the reconstruction namespace, they are not reconstructions because they are not conjectured to exist; we know they don't exist. A valid POS should also be used so that they don't clog up cleanup categories anymore. —CodeCat 18:00, 10 August 2017 (UTC)

What's wrong with "Combining form"? Crom daba (talk) 18:54, 10 August 2017 (UTC)
Perhaps they should be recategorized as "combining forms" or have that category added. It is a recognized lemma type in Module:headword/data. I agree they don't really count as verbs in a sense. But I think you are against the idea of a combining form, because you've recategorized combining form entries that I've created. — Eru·tuon 19:03, 10 August 2017 (UTC)
A combining form is a non-lemma form that is used when combined with another morpheme. That's very different from this. —CodeCat 19:05, 10 August 2017 (UTC)
Why is it not a lemma? I see how you can say it's not a real word, but it is a lemma in that it is a form representative of a paradigm of related forms (i.e. the conjugated forms given in the conjugation table). --WikiTiki89 19:30, 10 August 2017 (UTC)
These are lemmas, I'm not disputing that. I'm saying that combining forms aren't lemmas. Most of the categories in Category:Combining forms by language contain nonlemmas, even though the categories themselves are categorised as lemmas. —CodeCat 20:15, 10 August 2017 (UTC)
Oh. Yeah, it looks like what we use "combining form" to mean is completely different from what these are. In fact I was actually in favor of removing the hyphen from these entry names. I think we need to come up with a special name for these. Something like "unused base verb". --WikiTiki89 20:20, 10 August 2017 (UTC)
There's also things like Judeo- which are a bit in between. They are of course combining forms of nouns in Ancient Greek, but in English they don't really belong to anything. Or do they? —CodeCat 20:28, 10 August 2017 (UTC)
I think that's a separate unrelated issue. **гради́ть (**gradítʹ) morphologically could have stood on its own if it existed, but it just so happens that it only exists with prefixes (although it's quite possible that it did exist in Proto-Slavic or earlier). Judeo- I would say is a combining form whose uncombined forms don't exist. --WikiTiki89 20:41, 10 August 2017 (UTC)

Wiktionary: a translation dictionary only?

Should we stop pretending to be a good monolingual dictionary, for the achievement of which the wiki way ("wisdom of crowds") seems ill-suited? Would we be better off playing to what seems to be our strength: translation? This would mean "translation target" would be an automatic justification for any English entry and would upgrade the importance of phrasebook entries and common collocations. It would probably benefit from simplification of complex polysemous entries like those for technical, let alone really polysemous terms. DCDuring (talk) 19:21, 11 August 2017 (UTC)

The project should probably be forked, to support the deletionist and never-delete-anything-ist camps. Equinox 19:29, 11 August 2017 (UTC)
A "good translation dictionary" necessarily describes complex polysemous English words, so no. DTLHS (talk) 23:12, 11 August 2017 (UTC)
How can we be a good translation dictionary now, then? DCDuring (talk) 04:23, 12 August 2017 (UTC)
I suspect this has been triggered (although probably not initiated) by the revert of my edit on "technical". Wiktionary:Requests_for_cleanup#technical W3ird N3rd (talk) 23:45, 11 August 2017 (UTC)
Don't take it too hard. I know that basic nouns, verbs and adjectives with multiple senses are hard and the most basic function words are harder yet. If cleaning up technical were easy, then I would have done it myself. I'm out of practice and never successfully tackled any basic function words. DCDuring (talk) 04:29, 12 August 2017 (UTC)
I think the solution, to appease both camps, is to actually allow the oft-discussed collocations section/namespace/whatever. This would allow us to be a far better translations dictionary because each collocation would have a translation section and we wouldn't have to resort to the controversial "translation target argument." The inclusionists would also be able to include far more, since much of what is hotly debated in RFD could be kept as a collocation. Deletionists could also be satisfied because there would be less pressure from inclusionists to keep SOP collocations in the mainspace. Andrew Sheedy (talk) 04:53, 12 August 2017 (UTC)
Wiktionary has included languages other than English for a long time, if not from day 1. So it's only right that Wiktionary should be translation-oriented. One inconsistency I have found regarding SoP terms is that there are entries for vegetable soup and pea soup, yet none for tomato soup, and there's bound to be translations for that. One nice touch I have just found is translations for soft-boiled egg and hard-boiled egg listed under boiled egg. DonnanZ (talk) 13:36, 12 August 2017 (UTC)
Inclusion of collocations is at most half of the solution. If, as User:DTLHS notes, "[a] 'good translation dictionary' necessarily describes complex polysemous English words", how do we improve our entries for such terms? Or is the current state of these entries good enough for translation work and for ESL learners, with native speakers mostly ignoring such entries anyway?
If we include more collocations, can we rely on the entries for collocations to share the burden of the definitions for verbs like go (go clubbing) and "particles" like abox and away?
Is it reasonable to admit that we can't really help those users who take a component-oriented approach to looking at sentences? Just as we say that users need help in determining where morphemes break in German and other compound nouns, we should also say that users can't be expected to know which meanings are only fully captured in collocations. Expecting users to wade through derived terms in go to find go clubbing does not seem very realistic. If a user knows to go to clubbing, that user probably doesn't need the go clubbing entry at all. DCDuring (talk) 16:22, 12 August 2017 (UTC)
I don't really object to making translation our focus. However, in order to be a truly comprehensive translation dictionary, I think we also have to be a comprehensive monolingual dictionary. And I don't think we're really doing as badly as you feel. Yes, we're a long way from being another OED, but we're also good enough that I'm able to use Wiktionary as my primary dictionary. Conversely, we actually suck at translations from English into other languages (even common ones like French and Spanish). I'm really not convinced it's our strength. FL to English translations tend to be much better, but even these are often lacking. The reality is that we're a work in progress on all fronts, and always will be.
Now, if we include collocations, I think we should handle them more or less as follows:
  1. Do not move definitions from main entries over to collocation entries (some duplication is fine, and people should still be able to find what they want in the main entry);
  2. Create separate entries for them, rather than hosting them in another mainspace or on the same page as any of their component words (we could potentially treat collocations like "forget about" differently from "piece of furniture", the latter having its own entry, the former sharing a page with "forget");
  3. Label them with a banner just as we do for phrasebook entries, to mark them as SOP and allow us to continue to function as a monolingual dictionary, regardless of our focus;
  4. Include full definitions in collocation entries, for clarity;
  5. Eliminate obvious information like pronunciation or etymology from collocation entries, but retain things like translations and synonyms;
  6. Link to collocations from the entries of each of their component words (excluding really basic words, like articles);
  7. Use either "Derived terms" or "Related terms" (possibly renamed) or a new "Collocations" section to host lists of collocations, subdividing the list into different categories as necessary;
  8. Allow collocations in all languages so that we can truly function as a translation dictionary: someone translating from French should be able to look up "pointe de pizza" or "se faire tuer" (or find these in the entries for pointe and pizza / faire, and tuer) and find the corresponding English collocations, "piece of pizza" and "get oneself killed".
I doubt we'll ever solve the problem of people taking a component-oriented approach to looking up multiword terms or collocations. But that doesn't mean we shouldn't try to be helpful to those who do know how to identify multiword expressions. The best we can do is list multiword terms and collocations in the entries for each of the constituent parts, and make long lists easier to navigate by splitting them up by category. Andrew Sheedy (talk) 17:46, 12 August 2017 (UTC)
I'd prefer it if we hosted collocations but they were not listed at constituent lemma pages and generally had close to zero incoming links to them. Crom daba (talk) 18:48, 12 August 2017 (UTC)
How would a person find them all then? Andrew Sheedy (talk) 21:50, 12 August 2017 (UTC)
  • One interesting aspect of mass inclusion of collocations as a new class of entry is that we would be substituting two boundaries that needed some kind of policing for one. Instead of a single include/exclude decision, we would need to decide whether to include or exclude and whether something was a collocation or an idiom. I am not confident that we would achieve any more agreement in total on these two decisions than we do now on one. Are we imagining that any collocation at all would be entered, subject to current RfV? live free or die? parlare con tono di condiscendenza? Wouldn't we be increasing the number of truly offensive items? Should we exclude full sentences that are not proverbs and not phrasebook entries? (More decisions!!!) DCDuring (talk) 19:31, 12 August 2017 (UTC)
Very true, although maybe we would be able to create a stricter set of criteria for determining whether something is SOP or not? I think a lot of RFD debates would be mostly solved if entries could be kept as collocations: those where terms are technically SOP, but not transparently so (sometimes because they use obscure senses of a word), and are not necessarily easily understood (e.g. "nature preserve"); those where an expression uses a more or less consistent word order and has become a fixed phrase; and those where the only justification for keeping an entry is its value as a translation target. I think most people could agree to keep such entries, but label them as collocations. Andrew Sheedy (talk) 21:50, 12 August 2017 (UTC)

IPA ≠ audio

Entries where the pronunciation is different from that in the audio, as for example in polemic, should be automatically detected and listed --Backinstadiums (talk) 20:25, 11 August 2017 (UTC)

How do you propose we do that? DTLHS (talk) 20:28, 11 August 2017 (UTC)
@DTLHS: Auto-generated subtitles could be created using some software, and both columns of data could then be compared. 90% of the job would be done that way; the rest could be manually reported individually, as @Wyang has proposed --Backinstadiums (talk) 07:12, 12 August 2017 (UTC)
You vastly overestimate the ease of generating written transcriptions, much less IPA, from audio files. Others can probably explain better why this is so difficult. See e.g. [4]. DTLHS (talk) 07:32, 12 August 2017 (UTC)
Not automatically, but perhaps via a more accessible feedback system: “Saw an error on the page? Report it here.” Wyang (talk) 21:52, 11 August 2017 (UTC)

Merging Category:Chinese language and Category:Sinitic languages to a single category (Category:Chinese language(s)?)

Sinitic languages is just another name for the Chinese languages. It is confusing to have both categories on Wiktionary. It seems there is room for improvement in the category system for macrolanguages; there are categories such as Category:Mandarin terms derived from Sinitic languages which really should be renamed to Category:Mandarin terms derived from other Chinese languages. Wyang (talk) 07:14, 12 August 2017 (UTC)

I agree that the current situation, in which we have two sets of categories for what is basically the same entity, is confusing. It would be hard to merge the two categories, however. x language is a category created by {{langcatboiler}} that uses data from Module:languages, while x languages is a category created by {{famcatboiler}} that uses data from Module:families. And currently only a language with a data file can have entries; a family cannot. I'm not sure how to merge the two in the existing system. And what code would we use for the combined entity? How can we make something be simultaneously a language and a family? — Eru·tuon 23:41, 12 August 2017 (UTC)

Language request: Old Kannada

Old Kannada (Kannada: ಹಳೆಗನ್ನಡ (haḷegannaḍa)) needs to be included. Proposed code: okn. It is a Dravidian language. Immediate ancestor: Proto-Tamil-Kannada. Scripts: Brahmi, Kadamba, Kannada. Descendants: Middle Kannada -> Modern Kannada. ɱɑɗɦɑѵ (talk) 07:28, 12 August 2017 (UTC)

That seems like a reasonable language to add. Can you give any examples or indication of how different it is from Kannada kn? (Other notes: Exceptional codes need to be formatted differently, so it would have to be dra-okn. We cannot add Kadamba script because it seems that it is not in Unicode. Proto-Tamil-Kannada is also not registered as a language.) —Μετάknowledgediscuss/deeds 07:33, 12 August 2017 (UTC)
@Metaknowledge: There's a significant difference between Old Kannada & its modern descendant. It's barely intelligible with modern Kannada. Some sound changes (like the transformation of Proto-Dravidian *p to Kannada [h]) are not present in Old Kannada. The case-suffixes are also different. As for the script, I hope it'll be acceptable to create lemmas in Old Kannada in the brahmi or the kannada script. -- ɱɑɗɦɑѵ (talk) 12:26, 12 August 2017 (UTC)
@माधवपंडित There seems to be a distinction between Old Kannada and Purva Halegannada. Should we encode them separately? DerekWinters (talk) 13:27, 12 August 2017 (UTC)
@DerekWinters: I saw that as well. About 500 years of time gap. I think Pūrva-Halegannada is what we'd call pre-Old Kannada or Proto-Kannada. But the source material is small... as it is, Halegannada is poorly documented on the internet. If I'm not wrong, Proto-Kannada attestations are from just a few of the oldest inscriptions. Perhaps we can make Proto-Kannada an etymology-only language, used in etymologies but unable to have lemmas of its own. -- ɱɑɗɦɑѵ (talk) 13:35, 12 August 2017 (UTC)
It can be like Primitive Irish or Pictish, attested from very few sources. Personally I think it better to add it separately. DerekWinters (talk) 13:39, 12 August 2017 (UTC)

A quick update on changes of translation adder

I have updated the gadget to fetch language scripts from the module. Also, it fails (gracefully of course) if the input script is not in the list of scripts from module. So, you may notice some functionality changes. Let me know if the changes are for the worse. Dixtosa (talk) 19:21, 12 August 2017 (UTC)

Is it anything to do with the annoying little +- signs that have popped up in translations sections? DonnanZ (talk) 19:30, 12 August 2017 (UTC)
Those were always there, but the spacing is off now ([5]) (Chrome) DTLHS (talk) 19:37, 12 August 2017 (UTC)
Yes. Fixed. Dixtosa (talk) 20:29, 12 August 2017 (UTC)
Also added the ability to hide the transliteration input if the language has automatic transliteration that overrides manual.--Dixtosa (talk) 13:08, 20 August 2017 (UTC)

Distinction between derived and related terms

It's been a long while since I did any serious editing here. I've been updating some of the derived words sections. I noticed that the section for "language#Derived terms" looked very sparse, so I added some more entries. After doing so, I saw that many of them were already in the Related terms section.

Has policy changed lately? My understanding has always been that Derived terms is for words formed by appending affixes ("metalanguage") and compounds ("dead language", "language lab") and that Related terms was reserved for words that are etymologically related in some other way ("linguistics", "lingua franca").

I notice that the Wiktionary:Entry_layout page doesn't make this very clear and doesn't give any examples. Perhaps it could be updated?

In the meantime, I'll tidy up Derived terms and Related terms for language, but please revert if this is no longer the way things are done.

Paul G (talk) 19:38, 12 August 2017 (UTC)

Technically a derived term is also a related term. So sometimes there are lists of terms that people have just put all together under "related terms" without distinction. Your understanding matches mine and your edits to language look fine. DTLHS (talk) 19:43, 12 August 2017 (UTC)
That, too, is my understanding of the distinction between the two terms. — SGconlaw (talk) 20:06, 12 August 2017 (UTC)
Confusion can also be caused by placing some derived terms under hyponyms and others under derived terms or related terms. Personally I would like to see hyponyms done away with - I can hear the protests already. DonnanZ (talk) 20:38, 12 August 2017 (UTC)
Thanks for the responses. There seem to be a number of pages where derived terms are words formed with affixes and related terms are compounds — rock, for example — so some editors at least seem to have thought this is what the sections are for. — Paul G (talk) 20:47, 12 August 2017 (UTC)

IPA policy

The (phonemic) English pronunciation keys in most of the major dictionaries (as well as the associated Wikipedia article) use ⟨r⟩ as a standard phoneme. I feel that if this is the common usage it ought to be a standard policy across Wiktionary pronunciation sections. In many articles people have replaced ⟨r⟩ with ⟨ɹ⟩, ⟨ɚ⟩ et al. and while this is phonetically correct, it goes against the phonemic standard, and has created a disparate mess with little to no consistency. The best solution in my opinion is to just have both phonetic and phonemic pronunciations wherever possible, and make it a policy that ⟨r⟩ belongs in /r/ and ⟨ɹ⟩ belongs in [ɹ], etc. This has the advantage of giving the maximum amount of information, while remaining in standard with M-W, Collins, etc. Any input would be appreciated. --Pariah24 05:07, 13 August 2017 (UTC)

I wonder if it is possible to create {{en-IPA}} to standardise the generation of IPA for English (and represent the dialectal variation; cf. International Phonetic Alphabet chart for English dialects). Having manual IPA on all 480,000+ English lemmas would be a logistical nightmare. Wyang (talk) 05:17, 13 August 2017 (UTC)
We had a discussion about this many years ago. At first I was in favor of using /r/ in the phonemic representation of English, but eventually I came around to the idea of using /ɹ/, chiefly because we are not an English-only dictionary. If we were, if English Wiktionary had only English entries, I still would prefer /r/; but because we have entries in thousands of languages, including ones where /r/ really does stand for [r], I think it's ultimately less misleading to use /ɹ/ for English. —Aɴɢʀ (talk) 07:39, 13 August 2017 (UTC)
I agree with Angr. If we use /r/ for [ɹ], readers seeing /r/ in other languages might mistakenly believe that they represent the same, or similar sounds. We fill a different niche than other English dictionaries, and as a result, our policies might differ in some areas. Andrew Sheedy (talk) 17:14, 13 August 2017 (UTC)
I agree with Angr and Andrew. Since most English speakers pronounce ⟨r⟩ as [ɹ], it's appropriate to use /ɹ/. I'd second Wyang on creating an English IPA template. — justin(r)leung (t...) | c=› } 19:28, 13 August 2017 (UTC)
There's no need for a separate template- normalization can take place in the IPA module. DTLHS (talk) 19:29, 13 August 2017 (UTC)
How can one implement a module without a template, exactly? —Aryaman (मुझसे बात करो) 20:39, 13 August 2017 (UTC)
Huh? All IPA is already processed through Module:IPA. All we would need to do is implement specific rules for English. DTLHS (talk) 20:47, 13 August 2017 (UTC)
@DTLHS: I would assume we would make MOD:en-IPA and implement it in {{en-IPA}}, just like every other language with an IPA module. —Aryaman (मुझसे बात करो) 23:03, 14 August 2017 (UTC)
What about English dialects? How do we ensure the correct symbols are used and symbols are used in a consistent manner, for say, RP? Wyang (talk) 21:34, 13 August 2017 (UTC)
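To make the normalization idea in this thread concrete, here is a minimal sketch of a language-specific rule that rewrites plain r as ɹ in English transcriptions only. It is written in Python purely for illustration; Module:IPA itself is a Lua module, and the function name and the sample transcriptions are assumptions, not existing code.

```python
def normalize_r(transcription, lang):
    """Replace plain 'r' with 'ɹ' in English transcriptions; leave other languages alone."""
    if lang != "en":
        # In languages where /r/ really is a trill, nothing should change.
        return transcription
    return transcription.replace("r", "ɹ")


# Hypothetical inputs: an English transcription written with /r/,
# and a Spanish one that must stay untouched.
english = normalize_r("/ˈrʌnɪŋ/", "en")
spanish = normalize_r("/ˈpero/", "es")
```

A rule this blunt would not address the dialect question raised above; it only shows that the change can be tied to the language code rather than to a separate template.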

──────────────────────────────────────────────────────────────────────────────────────────────────── I'm more concerned with just having a standard to go by than what that standard is. I guess I'll start changing /r/ when I see it, although I still think it would be helpful—especially on pronunciations that differ significantly from the standard phonemes—to have separate /phoneme/ and [phone] pronunciations. Pariah24 (talk) 23:41, 13 August 2017 (UTC)

What do you mean by "pronunciations that differ from the standard phonemes"? — Eru·tuon 23:55, 13 August 2017 (UTC)
Sorry if that's an awkward way to put it...I mean pronunciation differences in accents/dialects, and loanwords, and cases like pun and spun whose pronunciations are both /(s)pʌn/ but phonetically are [pʰʌn] and [spʌn]. It would be helpful to someone learning English to have both versions. Pariah24 (talk) 00:11, 14 August 2017 (UTC)
Ahh, I see. Phonetic transcriptions showing the exact pronunciation of stops are welcome. I think there are some transcriptions like that already. As for accents, keep in mind that many dialectal features are phonemic, because dialects do not all share the same phonological system, and so we show them in the phonemic transcriptions. You can see examples in Appendix:English pronunciation. Not shown on that page are the phonemic transcriptions for American English dialects without the horse–hoarse merger. (See hoarse for an example.) — Eru·tuon 00:38, 14 August 2017 (UTC)
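The pun/spun example can be sketched as a small rule that derives a phonetic transcription from a phonemic one. This is a deliberate simplification that assumes aspiration applies only to a word-initial voiceless stop; real English aspiration also depends on stress and syllable position, and the function name is hypothetical.

```python
VOICELESS_STOPS = {"p", "t", "k"}


def aspirate_initial_stop(phonemic):
    """Turn a /phonemic/ string into a [phonetic] one, aspirating an initial p, t or k."""
    body = phonemic.strip("/")
    if body and body[0] in VOICELESS_STOPS:
        # Word-initial voiceless stop: add the aspiration mark.
        body = body[0] + "ʰ" + body[1:]
    return "[" + body + "]"


pun = aspirate_initial_stop("/pʌn/")
spun = aspirate_initial_stop("/spʌn/")
```

Because the stop in spun follows /s/ and is not word-initial, it is left unaspirated, which is the contrast described above.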

Regarding whether to create a separate module and template for English IPA transcriptions: I think it would be much neater than adding a lot of English-specific stuff to Module:IPA. I say a lot, because I think it would be a good idea to automatically convert between different transcription systems for RP, if we could get someone who knows enough about them. For instance, automatically displaying both the OED's more old-fashioned transcription of lot, /lɒt/, and Geoff Lindsey's more modern one, /lɔt/. @Mr KEBAB proposed something like this, but I haven't done anything with the idea yet. — Eru·tuon 00:02, 14 August 2017 (UTC)

@Erutuon: Yes I did, but if we're going to use Lindsey system here we should fully follow it, not cherry-pick some of the symbols and not others (I'm saying this because I believe I proposed a mixed system a year ago, this is not a good idea for several reasons). Mr KEBAB (talk) 00:40, 14 August 2017 (UTC)
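A sketch of what automatic conversion between RP transcription systems might look like is given below. The mapping table contains only the single correspondence mentioned in this thread (ɒ in the older style, ɔ in Lindsey's); a usable table would have to cover the whole system, as noted above, and all names here are hypothetical.

```python
# Only the one correspondence mentioned in the thread; deliberately incomplete.
OLDER_TO_LINDSEY = {"ɒ": "ɔ"}


def convert(transcription, table):
    """Rewrite each symbol found in the table; pass every other symbol through."""
    return "".join(table.get(symbol, symbol) for symbol in transcription)


def both_systems(transcription):
    """Return the input transcription alongside its converted counterpart."""
    return transcription, convert(transcription, OLDER_TO_LINDSEY)


older, lindsey = both_systems("/lɒt/")
```

Keeping the table as plain data would make it easy to replace the partial mapping with a complete one without touching the conversion function.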

I think it would be very nice to have something similar to what we have for Ancient Greek and Latin, with different regional or period pronunciations all indicated. The template input for English would obviously have to be the broad phonemic transcription, though, rather than the spelling of the word. We'd have to be careful, however, of cases where pronunciation variants actually represent different phonemes rather than differences of realization; in such a case we'd probably need to call the template multiple times on the page, each time with a separate phonemic transcription and corresponding generated phonetic transcriptions for various dialects – there would be parameters for which dialects/variants to include or not include. – Krun (talk) 00:51, 14 August 2017 (UTC)

Hi,
An input from outside. On French Wiktionary, we had a large discussion on pronunciation and neutrality two years ago, and we renewed our policy. We started by defining that phonetic information has to be based on audio recordings, and that there have to be several of them to describe variety (in space, time, social groups). Phonemic information has to be based on a specific analysis, made on a specific dialect, and can't stand for a whole language. There is a diversity of phonemic representations. To be neutral in this respect is not to select and promote one analysis (which equals one variety) but to give the different analyses, with sources. So: phonetics with audio sources, phonology with written sources (works of linguistics).
Finally, we considered the needs of the readers, and concluded that they do not need dozens of phonetic and dozens of phonological representations. They want short information giving a usual way of pronouncing a word, consensual and as unmarked as possible, so we created a third way to indicate this specific information, with backslash signs like \θis\. This last one is provided in the first part of the page, and the other ones in the second part of the page, for people eager to have more precise information. It was not a huge change, but it was a great improvement in the framework it offers for people to add new information without colliding with existing information. Fewer controversies over "false phonological representation" and more accurate descriptions. If you want to know more about this, I can help you, or translate some pieces of French Wiktionary policy :) Noé 10:16, 15 August 2017 (UTC)
I think Wiktionnaire has a good system. I find it interesting how the very broad pronunciation is included in the header, but I don't like how more detailed pronunciation is relegated to the bottom of the entry and often neglected. Having a very broad transcription with everything else in a collapsable box might be a good solution. On the other hand, it would be hard to decide what transcription to use when a word has been affected by a merger or split in many dialects. Andrew Sheedy (talk) 16:07, 18 August 2017 (UTC)

Words with uncertain reading

Recently the Egyptian entry jsqꜣrwnj (hieroglyph codes: isqArw n y T14 N25) was added alongside our previously existing entry jsqꜣrnj (hieroglyph codes: isqAr n y T14 N25).

But these aren’t in fact two different attestations with two different spellings; they’re both representing a single attestation from the Merneptah Stele, where the original engraver inscribed a hieroglyph very poorly and modern authors have proposed two different readings of what it was intended to be. Do we have any policy about what to do in such a case — keep only the more plausible/widely accepted entry? Keep both? (And, if so, what would they be marked as? Alternative forms, even though they really aren’t?) — Vorziblix (talk · contribs) 10:01, 14 August 2017 (UTC)

Perhaps create an "alternative reading" template, and use it in the entry for the less widely accepted reading. Then list the less widely accepted reading in the Alternative forms section for the more widely accepted form. — Eru·tuon 16:30, 14 August 2017 (UTC)
Sounds good. For now I’ll just do {{form of|Alternative reading}} rather than an altogether new template, but if more of these start cropping up, so that categorizing them becomes useful, I’ll go for a separate template. — Vorziblix (talk · contribs) 23:18, 14 August 2017 (UTC)
Thanks, that clears things up a lot. — Vorziblix (talk · contribs) 23:18, 14 August 2017 (UTC)
Another example is ᚐᚆᚓᚆᚆᚈᚈᚋᚅᚅᚅ / ᚐᚆᚓᚆᚆᚈᚈᚐᚅᚐᚅ. In most cases, it's possible to be reasonably certain how to read an inscription, but when it's not (in individual cases), the practice does seem to be to have multiple (cross-linked) entries. Whether or not it is sensible for one of the entries to be a "form of" redirect to the other entry depends on whether the difference in reading entails a difference in meaning. - -sche (discuss) 06:30, 15 August 2017 (UTC)

Flag gadget edit request

Could an admin change the URL for the Ancient Greek flag in MediaWiki:Gadget-WiktCountryFlags.css from Flag_of_Palaeologus_Dynasty.svg to Byzantine_imperial_flag,_14th_century,_square.svg? The file has been moved, and there's been an error message in the browser console because the CSS file tries to load the file using the old name. — Eru·tuon 17:41, 14 August 2017 (UTC)

Done. Dixtosa (talk) 17:55, 14 August 2017 (UTC)

What's the deal with the garbage "American Sign Language" entries?

Is there an editing tool that produces these, maybe with ASL as the first in a list of languages? I don't think it's a single vandal producing all of them. DTLHS (talk) 20:41, 14 August 2017 (UTC)

@DTLHS: Do you have a link or diff? —Justin (koavf)TCM 21:09, 14 August 2017 (UTC)
People often create fully-formed ASL entries with all the usual headings, but with no actual content or definition. Yes, there is a tool that creates these, but I can no longer find it. I've seen it before. Equinox 21:11, 14 August 2017 (UTC)
  • It's the New Entry Creator; one of its defaults is ASL. —Μετάknowledgediscuss/deeds 21:35, 14 August 2017 (UTC)
    Actually, it's the second-from-the-top entry template, on the search results page, not the New Entry Creator. There should really be an AbuseFilter to take care of those. --Yair rand (talk) 01:21, 21 August 2017 (UTC)

@DTLHS: how would you improve them? --Backinstadiums (talk) 22:12, 14 August 2017 (UTC)

I don't think you understand. They're contentless entries that are deleted on sight. —Μετάknowledgediscuss/deeds 22:16, 14 August 2017 (UTC)

Adding language code 'ghc'

Hi all, I am thinking it might be useful to add the code 'ghc' for the historic common written language of Ireland and Scotland, particularly in cases where it's not clear whether a word derives from Irish or Scottish Gaelic. Gherkinmad (talk) 21:33, 14 August 2017 (UTC)

I don't know what lect you are referring to or when it was used. We have codes for Old Irish (sga) and Middle Irish (mga), and those should suffice. —Μετάknowledgediscuss/deeds 21:37, 14 August 2017 (UTC)
I can kinda see the point. While, technically, Scottish Gaelic can be seen to be differentiating itself from Irish as early as the Book of Deer, for pretty much the entire Middle Ages you can't really tell between them. And everything after 1200 is currently classified as either ga or gd. So a Classical Gaelic could be seen as a useful intermediary step:
  • pgl Primitive Irish (–c.600)
    • sga Old Irish (c.600–c.900)
      • mga Middle Irish (c.900–c.1200)
        • ghc Classical Gaelic (c.1200–c.1800)
          • ga Modern Irish (c.1800–)
          • gd Scottish Gaelic (c.1800–)
That would require some refactoring, though. It would make etymologies slightly less messy: as it is, there appears to be an issue with taking a gd word back through ga to mga. This way, they could both branch from ghc. --Catsidhe (verba, facta) 21:55, 14 August 2017 (UTC)
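The branching described in the list above can be sketched as a simple parent table. This is only an illustration in JavaScript of the proposed arrangement: `PARENT` and `ancestors` are hypothetical names, and 'ghc' is the code being proposed, not one that currently exists.

```javascript
// Proposed ancestry from the list above: each code maps to its parent.
// "ghc" (Classical Gaelic) is the proposed addition, not an existing code.
const PARENT = new Map([
  ["sga", "pgl"],
  ["mga", "sga"],
  ["ghc", "mga"],
  ["ga", "ghc"],
  ["gd", "ghc"],
]);

// Return the chain of ancestors for a language code, nearest first.
function ancestors(code) {
  const chain = [];
  let current = PARENT.get(code);
  while (current !== undefined) {
    chain.push(current);
    current = PARENT.get(current);
  }
  return chain;
}

console.log("gd < " + ancestors("gd").join(" < "));
console.log("ga < " + ancestors("ga").join(" < "));
```

Under this table both modern languages share the same chain, so an etymology for a gd word would no longer need to pass through ga on its way back to mga.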
Do we have a resolution? Gherkinmad (talk) 23:08, 14 August 2017 (UTC)
Resolution? We barely have the start of a discussion! Also, this sort of thing has been suggested before (by me at least once, IIRC) and not happened, so maybe a wider debate will have some impact. --Catsidhe (verba, facta) 23:26, 14 August 2017 (UTC)
(Without expressing an opinion on whether this is a good or bad idea,) it would be possible to add 'ghc' as an "etymology-only language" so that etymologies could refer to it, even if we don't want to add it as a "full language" with its own entries / language sections (which might duplicate many mga and ga entries?). - -sche (discuss) 06:46, 15 August 2017 (UTC)
  • @Angr is the expert, and he hasn't voiced a need for this as far as I've seen. But I'd like his thoughts. —Μετάknowledgediscuss/deeds 03:57, 15 August 2017 (UTC)
    For reference, the code was removed following this discussion in 2013. (I have no great knowledge of the subject and defer to people like Angr and Catsidhe who are familiar with the Irish language(s).) - -sche (discuss) 06:37, 15 August 2017 (UTC)
    My views haven't changed since that 2013 discussion. I think mga, ga, gd, and gv are sufficient to cover all Goidelic lects from the 10th century to today. The problem with making it an etymology-only language is that etymology-only languages are varieties of one particular existing language, but the whole motivation behind ghc is to avoid calling it either Irish or Scottish Gaelic (because it's basically both). —Aɴɢʀ (talk) 08:33, 15 August 2017 (UTC)
    For the purpose Catsidhe is talking about, it seems like it could be considered a variety of Middle Irish... but then, I don't see why branching Scottish Gaelic and Irish from ghc is any better than branching them both from mga, or why branching them from mga like we do now causes "an issue" — Catsidhe, can you explain? - -sche (discuss) 09:39, 15 August 2017 (UTC)
    No one considers Middle Irish going as late as 1800, though. Middle Irish is generally seen as ending around 1200 (much earlier than Middle English, for example), so we consider everything after that to belong to one of the modern languages, even though the literary language (as opposed to the colloquial language) is virtually identical in Ireland and Scotland until around 1800. —Aɴɢʀ (talk) 11:55, 15 August 2017 (UTC)
    Which is why I find the distinction between Early Modern Irish and Early Scottish Gaelic (to 1800) to be annoyingly artificial. There is nothing linguistic which distinguishes just about any given 14C Irish from 14C Scottish. The only way you can tell in most cases is by knowing beforehand where or by whom it was written.
    Also, having ga cover 800 years of history makes it tricky to use for both historical research and for current usage. Unless you're paying attention, it can be easy to miss that one word became moribund in the 16C, and another entered the language in the 1980s. The former case isn't going to help if you're writing a letter to someone in Gaoth Dobhair, the latter isn't going to help if you're doing Mediaeval research. --Catsidhe (verba, facta) 12:16, 15 August 2017 (UTC)
    Yes the motivation is to avoid calling the language either Irish or Gaelic, because in the case of the English word Gael we simply don't know which variety it came from, and we might be a little more honest if we simply said so. The OED has the word first in modern English in 1775/1810 from Scottish Gaelic, but I completely accept that the word might have a longer history in the language, and so I was thinking we could meet Angr halfway by saying it derives from Classical Irish/Gaelic, or otherwise simply that it derives from Middle Irish. Gherkinmad (talk) 16:48, 15 August 2017 (UTC)
    @Angr OK, the matter has been more or less resolved. However I would still advocate the ghc code for cases where there is a further intermediary stage, otherwise there are three words for Gael in modern Irish: Gaoidheal, Gaedheal and Gael, all covering the same time period. Any thoughts? Gherkinmad (talk) 16:03, 16 August 2017 (UTC)
    There will have to be three entries for modern Irish anyway, since the spellings Gaoidheal and Gaedheal were used up until the mid-20th century, long after ghc would be over. That's the main reason for my opposition to ghc: it would increase unnecessary redundancy. If we had it, we would have to have Gaoidheal in both ga and ghc instead of just ga; likewise we would have to have new ghc entries for common words whose spellings haven't changed, like fear, bean, mac, , , athair, máthair, and so on and so forth. It doesn't seem worth it to me to duplicate the effort. —Aɴɢʀ (talk) 16:15, 16 August 2017 (UTC)
    @Angr, @-sche Could we add ghc as an etymology-only language? Because as the matter stands, one would have to say that the modern Irish word is first attested in print in 1567 in Scotland, without further explanation. I just don't see a way to credit this properly without referring to a further intermediary stage of the language which is of course still Irish. Gherkinmad (talk) 16:53, 16 August 2017 (UTC)
    As was said before, etymology-only languages always have a parent language that they belong to. This parent is used, for example, in determining which section should be linked to. So which language does ghc belong to, Irish or Scottish Gaelic? It doesn't solve the problem at all, just moves it. —CodeCat 18:29, 16 August 2017 (UTC)
    It belongs to Middle Irish: Scottish Gaelic and (Late) Modern Irish could both branch from ghc whose parent language is mga. I'm sorry for pressing this so much, I just know you really have to make your case if you want to edit Wiktionary. Gherkinmad (talk) 19:28, 16 August 2017 (UTC)
    If ghc's parent language is mga, then this carries the implication that all ghc terms are mga terms. Every link to a ghc term in fact creates a link to a mga entry because of how the parent of an etymology language works. Since every link should have an entry behind it, it implies that any links in etymologies to ghc terms attested from 1200 to 1800 are implicit requests for Middle Irish entries to be created on those pages. So, you want us to create Middle Irish entries for terms attested as late as 1800? —CodeCat 20:05, 16 August 2017 (UTC)
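The consequence described here can be illustrated with a small sketch. It is written in JavaScript and does not reproduce the real link modules; `FULL_LANGUAGES`, `ETYMOLOGY_ONLY` and `linkTarget` are hypothetical names, and the 'ghc' row is the arrangement under discussion, not an existing setup. The term used is only a placeholder.

```javascript
// Hypothetical data: full languages have section names of their own;
// etymology-only codes merely point to the full language they belong to.
const FULL_LANGUAGES = new Map([
  ["mga", "Middle Irish"],
  ["ga", "Irish"],
  ["gd", "Scottish Gaelic"],
]);
const ETYMOLOGY_ONLY = new Map([
  ["ghc", "mga"], // the arrangement discussed above, not an existing one
]);

// Build the wikilink for a term: an etymology-only code is swapped for its
// parent, so the link lands on the parent language's section of the page.
function linkTarget(code, term) {
  const fullCode = ETYMOLOGY_ONLY.has(code) ? ETYMOLOGY_ONLY.get(code) : code;
  const name = FULL_LANGUAGES.get(fullCode);
  if (name === undefined) throw new Error(`unknown language code: ${code}`);
  return `[[${term}#${name}|${term}]]`;
}

console.log(linkTarget("ghc", "Gaoidheal"));
```

With 'ghc' set up this way, every link written with that code points at a Middle Irish section, which is the implicit entry request described above.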
    If that's what will eventuate no I don't, though this does need further discussion, as it's clear not everyone accepts the current policy as it is. Gherkinmad (talk) 21:03, 16 August 2017 (UTC)

Tagalog enclitic forms

In Tagalog, any word ending in "a, e, i, o, u, or n" has an enclitic form (sort of). For example, take the word "malaki" (big): to say "a big person" one says "malaking tao", adding "ng" at the end. That goes for adjectives, nouns, verbs, all words. The question is, do we make an entry for the enclitic form of every Tagalog word that has one? --Mar vin kaiser (talk) 10:51, 15 August 2017 (UTC)

It sounds a bit like English -'s or Latin -que, i.e. a clitic that can be added to virtually anything. And we don't have entries for person's or virumque, so I'd say we shouldn't have an entry for malaking either, but just one for malaki and one for -ng. BTW, how do words ending in other sounds behave? —Aɴɢʀ (talk) 11:58, 15 August 2017 (UTC)
@Angr: Well, for example, the word "maliit" (small), to say a "small person" would be "maliit na tao". Actually some see the word "malaking" to be a contraction of the word "malaki" and the word "na" which links words together. One problem is that for example, the word "taong", it could mean four things,
  1. "taong" - a black veil for mourning
  2. "taóng" - water container (we don't write diacritics to indicate stress in Tagalog, so both are under the same entry)
  3. "taong" - the word "tao" (person) + "na"
  4. "taóng" - the word "taón" (year) + "na"
So my point is, shouldn't the last two be in the entry "taong" also? --Mar vin kaiser (talk) 13:23, 15 August 2017 (UTC)
@Mar vin kaiser: Well, look at butcher's: it has several meanings of its own, but the transparent one of butcher + the clitic -'s isn't actually listed. —Aɴɢʀ (talk) 13:42, 15 August 2017 (UTC)
@Angr: Good point. Although, the entry it's has it. But I do see your point. --Mar vin kaiser (talk) 13:45, 15 August 2017 (UTC)
@Angr: The reason why I feel it's important is because for example, any two words that are beside each other, the first one has to be in enclitic form, and think of the number of entries that have two words. For example, "free will" is "malayang loob", but there won't be any entry for "malayang", only "malaya". And that would go for all the other entries that have two words. --Mar vin kaiser (talk) 13:59, 15 August 2017 (UTC)
@Mar vin kaiser: It's probably at it's because in the standard written language, the one thing it's isn't is it + the possessive -'s, but only it + the contracted verb -'s. As for the headword line, that's not a problem. At the entry for malayang loob, just add |head=[[malaya]][[-ng|ng]] [[loob]] to the headword template. —Aɴɢʀ (talk) 14:13, 15 August 2017 (UTC)
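The rule described in this thread is regular enough to sketch in a few lines of JavaScript. This is only an illustration built from the examples given above (malaking tao, maliit na tao, taong, taóng, malayang loob), not a complete account of Tagalog; `linkedForm` and `linkWords` are hypothetical names.

```javascript
const VOWELS = new Set(["a", "e", "i", "o", "u"]);

// Final letter with any stress diacritic removed, so "taón" is tested as "taon".
function lastPlainLetter(word) {
  const plain = word.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
  return plain.slice(-1).toLowerCase();
}

// Return the form with the linker attached, or null when the word
// takes the separate linker "na" instead.
function linkedForm(word) {
  const last = lastPlainLetter(word);
  if (VOWELS.has(last)) return word + "ng"; // malaki -> malaking
  if (last === "n") return word + "g";      // taón -> taóng
  return null;                              // maliit -> maliit na ...
}

// Join a word to the word that follows it.
function linkWords(first, second) {
  const attached = linkedForm(first);
  return attached !== null ? `${attached} ${second}` : `${first} na ${second}`;
}

console.log(linkWords("malaki", "tao"));
console.log(linkWords("maliit", "tao"));
console.log(linkWords("malaya", "loob"));
```

Because the attached form is fully predictable from the base word, a sketch like this supports the view that the linker can be handled in the headword line rather than by separate entries.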

Checkusers

Versageek has been inactive for more than a year, so per the WMF policy her checkuser rights have been revoked. The policy requires that any local wiki have two or more checkusers if it has any, so my rights have been suspended as well pending our electing another. We can opt not to bother having local checkusers and simply rely on the stewards to take care of requests, or we can nominate one or more new checkusers and have some elections.
From my perspective it is not strictly necessary to have local checkusers, but it is convenient. Almost all of the work these days is keeping track of and blocking the long-term pests, and making sure we are actually blocking Wonderfool when we think we are. - TheDaveRoss 12:59, 15 August 2017 (UTC)

Having local checkusers is definitely a good thing. I'm surprised WF hasn't made any votes to encheckuserify anyone. —Μετάknowledgediscuss/deeds 16:57, 15 August 2017 (UTC)
"Encheckuserify"? Beware, lest you affixiate. — Kleio (t · c) 20:02, 15 August 2017 (UTC)
I thought User:Chuck Entz was a checkuser, since he does a good job of keeping track of the IPs/locations of various vandals. He seems like a good candidate for the position. - -sche (discuss) 19:47, 15 August 2017 (UTC)
Oddly enough, I probably wouldn't have as much to say if I were a checkuser, since I understand there are fairly strict rules about what information obtained with the checkuser tools can be disclosed and when you can use them. Right now, I get pretty much all my information from geolocating just about every IP that does something out of the ordinary and looking for patterns (that and monitoring the abuse filter logs). I'm not sure what I would be allowed to say/do if I spotted an IP that had earlier turned up in a checkuser investigation (though I could probably block them). That said, I'm game, if everyone thinks it's a good idea. Chuck Entz (talk) 02:43, 16 August 2017 (UTC)
I actually think Chuck is a great candidate, but I was under the impression that we had an old (unwritten?) rule that no one user should have all the user rights at en.wikt simultaneously. —Μετάknowledgediscuss/deeds 04:19, 16 August 2017 (UTC)
You both bring up good reasons for pause. Who else wants the job? We could nominate WF; then he'd have to ID himself to the Foundation to get the flag... ;) lol - -sche (discuss) 05:10, 16 August 2017 (UTC)
I think Chuck is a great choice as well. Re "having all the rights", I don't see a problem there. Our 'crats have a fairly limited scope of responsibility which doesn't much change how they might be able to (ab)use the CU tools. This is a different story than other wikis which have roles such as ombudsmen, arbcom, etc.
Re limiting your ability to act, I have not found that to be a problem. In the cases where an anonymous contributor is connected to a previously blocked logged-in account you may have to be somewhat oblique (e.g. not using the name of the blocked account, just saying that they are evading a block) but that is actually a fairly rare situation. - TheDaveRoss 14:56, 16 August 2017 (UTC)
I thought I was clear enough above, but I'll restate it: any minor concerns I may have mentioned as an aside have no bearing on my main point, which was that I'm willing to be a checkuser, if that's what the community wants. I'm not a hat collector, and I can't say that being a bureaucrat, for instance, has exactly enhanced my life, but if someone needs to do it, it might as well be me. Chuck Entz (talk) 23:16, 8 September 2017 (UTC)
Probably better to have some local ones. Equinox 19:49, 15 August 2017 (UTC)
Local is good, but what about Chuck's stated concerns. Where is it written that checkusers can't disclose publicly available info? Who can be asked about this? DCDuring (talk) 04:27, 16 August 2017 (UTC)
Perhaps we can create a new class of superuser: "Chuckuser". DCDuring (talk) 04:28, 16 August 2017 (UTC)
lol! - -sche (discuss) 05:10, 16 August 2017 (UTC)
@DCDuring: The policy dictating the use of the tool is here, and is also governed by the privacy policy and the access to nonpublic information policy. There are lots of words there, but essentially it is OK to talk about publicly available information, and it only gets tricky when your interpretation of public information is affected by nonpublic information. - TheDaveRoss 15:02, 16 August 2017 (UTC)
So Chuck's concerns are in maintaining a "Caesar's wife" standard, probably appropriate. DCDuring (talk) 22:04, 16 August 2017 (UTC)

For what it's worth, I have these user rights on Wikispecies, so I am already vetted by the WMF. I would be willing to have those tools here. —Justin (koavf)TCM 05:12, 16 August 2017 (UTC)

Metaknowledge started a vote for Koavf, and I made a comment on the discussion page there suggesting that we also vote on admin status at the same time. - TheDaveRoss 12:35, 21 August 2017 (UTC)

I would like to become a checkuser. --Daniel Carrero (talk) 07:13, 16 August 2017 (UTC)

DI CheckUser. PseudoSkull (talk) 22:09, 16 August 2017 (UTC)
@PseudoSkull, I don’t think we accept most of the specialized terminology and abbreviations used by Wikipedia/Wiktionary here, such as CheckUser, RfV, RfD, and so on, but we put them in the Wiktionary:Glossary. —Stephen (Talk) 22:27, 16 August 2017 (UTC)
If there are 3+ external citations, I would disagree. PseudoSkull (talk) 22:28, 16 August 2017 (UTC)
But let's discuss that elsewhere. Perhaps in WT:TR so that the discussion at hand can continue. PseudoSkull (talk) 22:28, 16 August 2017 (UTC)
I nominated User:Chuck Entz here. - TheDaveRoss 20:40, 11 October 2017 (UTC)

Review of Ecjklangs (talk • contribs)' contributions

Most of these sex-related entries appear only in Urban Dictionary (OneLook backs me up on this), but some entries - such as sexcess - are somewhat citable. Not sure how durable they are though (I mean, floorcest anybody?). Anyone in the mood for a look-through? --Robbie SWE (talk) 08:47, 16 August 2017 (UTC)

Translations added by IvanScrooge98 (talk • contribs)

This erroneous edit by User:IvanScrooge98 in Recent Changes attracted my attention. A quick check of their recent additions of Chinese translations shows that he/she is certainly a non-speaker of Chinese. A large proportion of their added Chinese translations were outright incorrect, others often problematic. Some recent, outright erroneous examples include: diff, diff, diff, diff, diff, diff. It's a shame that such sloppiness was not picked up earlier and was allowed to persist for such a long time. Their additions of translations in other languages also need to be thoroughly checked. Wyang (talk) 10:23, 16 August 2017 (UTC)

Hmmm… excluding the first one (from zh.wiktionary), I based the other edits, as I usually do, on the respective Wikipedia articles. I'm sorry if there's something wrong and willing to fix my mistakes. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 14:32, 16 August 2017 (UTC)
You should be careful when using non-English Wikipedias as a source because they are full of made-up garbage. Every now and then I have to remove a Portuguese translation that you add because it doesn’t meet our attestation criteria or is inaccurate. There’s no harm in using Wikipedia as a starting point when researching translations, but you should at least check Google Books. — Ungoliant (falai) 14:48, 16 August 2017 (UTC)
Guess I should check more when I can. Sorry. [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 15:14, 16 August 2017 (UTC)
Most of the errors in Chinese translations are in your inferred Pinyin readings and traditional/simplified forms; these are more serious factual errors. Please see if you can fix the examples above, now knowing that they contain errors. Many of your added Chinese translations are sum-of-part terms which do not warrant inclusion on Wiktionary, but that is less serious of a problem. Wyang (talk) 00:59, 17 August 2017 (UTC)
@Wyang: is diff fine, for instance? [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 13:40, 17 August 2017 (UTC)
It's better, though both terms are SoP and should link to the individual components. Could you please also fix the six others? Wyang (talk) 21:54, 17 August 2017 (UTC)
@Wyang: would you mind checking whether my attempts are correct? Also, should I undo my additions at Warwick and Portoferraio? [ˌiˑvã̠n̪ˑˈs̪kr̺ud͡ʒʔˌn̺ovã̠n̪ˑˈt̪ɔ̟t̪ːo] (parla con me) 10:22, 18 August 2017 (UTC)
Not really, you haven't fixed the factual errors on those pages. It's all right; I will fix them. Please do not add other Chinese translations. Wyang (talk) 10:26, 18 August 2017 (UTC)

definitions vs. predicates

The entry for alt-right and the Tea Room/2017/August#alt-right discussion are recent manifestations of a failure to respect the concept of a definition. We do know how to do so, but sometimes some contributors act as if they believe that any predicate about a definiendum that they or someone else puts down in writing is a potential definition.

"Headquarters of US military imperialism" is not a definition of Pentagon, whether or not you believe the truth of "The Pentagon is the headquarters of US military imperialism".

What do we have to do to see to it that this basic notion of lexicography is respected? Would voting on a policy help? A definition style guide? DCDuring (talk) 22:02, 16 August 2017 (UTC)

Statistics for numbers of etymologies

For those of you who wanted to know, as of August 16, 2017, the largest number of etymologies any entry on the English Wiktionary has is 15. That entry is zꜣ. (Wouldn't it be amazing if you ever saw "Etymology 27", "Etymology 9432", etc. LOL ROFL LMAO) PseudoSkull (talk) 01:57, 17 August 2017 (UTC)

"Etymology 4320603, Etymology 4320604" PseudoSkull (talk) 01:58, 17 August 2017 (UTC)
You're ready for Wikidata :) Noé 15:29, 17 August 2017 (UTC)
Quite impressive! — Eru·tuon 19:11, 17 August 2017 (UTC)

AWB Rights

I'd like to use AWB on Wiktionary, mainly to run typo fixes, and a regex I made to split long See Also/Related terms/etc sections into columns, and to be able to search inside templates, and to dump Wiktionary data offline for faster searching and whatnot. I already have rights on English Wikipedia. Pariah24 (talk) 19:08, 17 August 2017 (UTC)

I notice that no one has even nominated you for autopatroller status, which means that all your edits are marked for review. It seems silly to have people not trusting your edits enough to stop checking all of them, but at the same time giving you the ability to make them in bulk. Chuck Entz (talk) 02:36, 18 August 2017 (UTC)
DI AWB (wiki sense). PseudoSkull (talk) 04:44, 18 August 2017 (UTC)
I really don't care if people review my edits, and the AWB policy makes no mention of that as a prerequisite. I've been editing Wikipedia for quite a while longer than I've had this account, and this "we don't trust your edits" business sounds pretty anti-AGF to me. Never had an admin say something like that to me before; do you speak this way to everyone? A simple no would have sufficed. Pariah24 (talk) 08:53, 18 August 2017 (UTC)
I am sure Chuck was not intending to offend. We have the edit patrol feature enabled here, and the general practice is that once someone has been editing for a little while the people who patrol edits notice that they make reasonable edits and don't need to be patrolled any longer. If you are not yet set to autopatrolled status it may indicate that you have not edited here sufficiently (or, sometimes, sufficiently well) to have been noticed and flagged by a patroller. I would suggest that you just continue making good edits and I am sure you will be autopatrolled and eligible for AWB in no time. - TheDaveRoss 12:21, 18 August 2017 (UTC)
Thank you. Somehow I managed to give the impression that we're so suspicious of them that we have them under surveillance, or that we think there's something wrong with their edits, or that we only talk to the cool people who already know the secret handshake... The simple fact is that AWB access requires that we know the contributor in question well enough to be sure they know Wiktionary's standards and practices well enough to avoid making mistakes, since those mistakes would be propagated much more widely using the AWB tool, and that we just don't know them well enough- yet. Chuck Entz (talk) 08:41, 20 August 2017 (UTC)

Tsolyáni language

Do we include words in this fictional language? I'm working my way through some missing French nouns and came across zaqé which our French friends define as "Troisième jour de la semaine dans le calendrier tsolyáni". SemperBlotto (talk) 05:48, 18 August 2017 (UTC)

No, see Wiktionary:Criteria_for_inclusion#Constructed_languages. DTLHS (talk) 05:50, 18 August 2017 (UTC)

Using HTML attributes instead of classes for WT:ACCEL

Currently, WT:ACCEL has data passed to it using CSS classes, so that the resulting Wikicode looks like this on bar: <span class="form-of lang-en plural-form-of"><b class="Latn" lang="en">[[bars]]</b></span>. There's a few points to note about this.

  1. There's two wrapping HTML elements, span and b, even though these could easily be combined into a single b element, as long as WT:ACCEL is modified to recognise not just span elements.
  2. If step 1 is done, then there is no more need for the lang-en CSS class, because WT:ACCEL can extract it directly from the b element's lang= attribute.
  3. HTML allows you to specify custom attributes named data- followed by any text. We can use this, rather than CSS classes, to specify the inflectional data.

All in all, the line above would end up looking like this: <b class="Latn" lang="en" data-accel-form="plural">[[bars]]</b>.
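As a rough illustration of what reading the proposed markup could look like, here is a minimal JavaScript sketch. It is not the gadget's code: `readAccelData` and `fakeElement` are hypothetical names, and the stand-in element exists only so the example runs outside a browser. In a page script, a real DOM element could be passed instead, since only the standard `getAttribute` method is used.

```javascript
// Read acceleration data from an element that follows the proposed markup:
// <b class="Latn" lang="en" data-accel-form="plural">...</b>
// `el` only needs a DOM-style getAttribute method.
function readAccelData(el) {
  const lang = el.getAttribute("lang");
  const form = el.getAttribute("data-accel-form");
  if (lang === null || form === null) return null; // not an accelerated link
  return { lang, form };
}

// Stand-in for a DOM element; returns null for missing attributes, as the DOM does.
function fakeElement(attrs) {
  return {
    getAttribute: (name) =>
      Object.prototype.hasOwnProperty.call(attrs, name) ? attrs[name] : null,
  };
}

const bars = fakeElement({ class: "Latn", lang: "en", "data-accel-form": "plural" });
console.log(readAccelData(bars));
```

With this layout no class names need to be split or pattern-matched, which is the neatness argument made above.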

What do people think of this change? @Dixtosa in particular. —CodeCat 12:54, 19 August 2017 (UTC)

Looks a bit cleaner. Equinox 12:56, 19 August 2017 (UTC)
Looks cleaner, yes, but I do not see any other benefit... yet. --Dixtosa (talk) 13:21, 19 August 2017 (UTC)
I very much like this, even if there are no benefits besides the neatness. — Eru·tuon 23:46, 19 August 2017 (UTC)
@Dixtosa Can MediaWiki:Gadget-WiktAccFormCreation.js line 20 be modified to $('.form-of a.new').each(function(){, and line 23 to var formof_classnames = $(this).closest(".form-of")[0].className.split(' ');? This will allow elements other than span to contain the acceleration information, which facilitates step 1. —CodeCat 11:22, 20 August 2017 (UTC)
Step 1 has been completed, and the link on bar now looks like this: <b class="Latn form-of lang-en plural-form-of" lang="en">[[bars]]</b>. Step 2 can now be implemented. It might be as simple as putting something else in line 76 of MediaWiki:Gadget-WiktAccFormCreation.js, but I'm not sure what. The way the code is currently written, it only passes the classes to the function (the details parameter), not the wrapper element itself. Would replacing line 76 with lang: $(link).closest("[lang]").attr("lang"), be sufficient? Perhaps the code should be restructured so that the element itself is passed around instead of only the classes, but I will leave that to Dixtosa to implement. —CodeCat 18:34, 20 August 2017 (UTC)
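For the transitional markup quoted above, step 2 amounts to preferring the `lang` attribute and treating the `lang-xx` class only as a fallback. The JavaScript sketch below shows that logic on plain strings; it is not the code in MediaWiki:Gadget-WiktAccFormCreation.js, and `parseAccelClasses` is a hypothetical name.

```javascript
// Parse the transitional markup:
//   class="Latn form-of lang-en plural-form-of" lang="en"
// The language comes from the lang attribute when present,
// otherwise from a lang-xx class (the older convention).
function parseAccelClasses(className, langAttr) {
  const classes = className.split(/\s+/).filter(Boolean);
  if (!classes.includes("form-of")) return null; // no acceleration data here
  let lang = langAttr || null;
  let form = null;
  for (const cls of classes) {
    const langMatch = /^lang-(.+)$/.exec(cls);
    const formMatch = /^(.+)-form-of$/.exec(cls);
    if (langMatch) {
      if (lang === null) lang = langMatch[1];
    } else if (formMatch) {
      form = formMatch[1];
    }
  }
  return { lang, form };
}

console.log(parseAccelClasses("Latn form-of lang-en plural-form-of", "en"));
console.log(parseAccelClasses("form-of lang-en plural-form-of", null));
```

Once every template emits the attribute, the fallback branch and the `lang-xx` class could both be dropped.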

Disambiguate Wikisaurus (thesaurus) entries by language

So wikisaurus:juoppo -> wikisaurus:fi:drunkard, wikisaurus:drunkard/Finnish or wikisaurus:drunkard/fi and

wikisaurus:insane -> wikisaurus:en:insane, wikisaurus:insane/en or wikisaurus:insane/English

Whether we use English or native words in the pagetitle, collisions would quickly happen as soon as someone added non-English words (which they may have refrained from out of uncertainty). I personally prefer the first scheme, because it is similar to what we use for topical categories like Category:de:Graph theory and because it does not imply the existence of useless superpages (parent page? root page? the opposite of a subpage). As for using native versus English words: the WS entry is tied to meaning seen as abstract from specific words, so I do not see why we should not use English. Are there any large synonym groups that cannot be succinctly expressed in English?__Gamren (talk) 13:23, 19 August 2017 (UTC)

I prefer Wikisaurus:English/drunkard, following the same scheme as Rhymes pages. —CodeCat 13:26, 19 August 2017 (UTC)
I prefer to keep the current Wikisaurus setup for its simplicity until it becomes obvious that collisions are an actual problem. --Dan Polansky (talk) 14:02, 19 August 2017 (UTC)
Here are some strings that might be expected to have many synonyms in more than one language: god (Danish/English), person, gut (Nynorsk/German), pen (Welsh/Norwegian/Mindiri/Mapudungun). Is it obvious yet?__Gamren (talk) 14:18, 19 August 2017 (UTC)
From what I have seen, collisions have not become an actual problem yet. Currently, we cater for collisions by being set up for multiple languages per Wikisaurus page, on the model of the mainspace. If you start expanding the Danish part of Wikisaurus and you run into obstacles preventing you from productively expanding that part, we can see how to best remove them. --Dan Polansky (talk) 14:27, 19 August 2017 (UTC)
For reference, one of the subject home pages: Wiktionary:Wikisaurus#Multilingualism. One past discussion: Wiktionary:Beer_parlour/2009/March#Wikisaurus_-_non-English_entries - here, a suggestion was made that would lead to wikisaurus:fi:juoppo. --Dan Polansky (talk) 14:39, 19 August 2017 (UTC)
I have now edited WS:god and created WS:da:beautiful (I would be fine with CodeCat's suggestion above, as well).__Gamren (talk) 18:25, 19 August 2017 (UTC)
I support having pages like Wikisaurus:English/drunkard, per CodeCat. It would be consistent with rhymes and reconstruction pages. --Daniel Carrero (talk) 12:42, 21 August 2017 (UTC)
I also think that would be the best format. Andrew Sheedy (talk) 04:23, 25 August 2017 (UTC)
WS:Danish/pain, WS:Danish/furthermore, WS:Danish/villain.__Gamren (talk) 15:19, 10 September 2017 (UTC)
But inconsistent with topical categories. --Dan Polansky (talk) 19:14, 10 September 2017 (UTC)
I'm puzzled; the proposal is to have pages for the x language synonyms of the x language translation of an English word? — Eru·tuon 18:12, 10 September 2017 (UTC)
I don't like this change in practice. If such a change should take place, let me note that language codes are slightly winning over full language names in Wiktionary:Votes/2017-07/Rename categories; WS:da:villain would be more in keeping with that than WS:Danish/villain. However, perhaps the opposes in that vote are not only about language code vs. language name. I seem to prefer WS:da:villain over WS:Danish/villain; what I prefer most is the continuation of the preexisting practice in many non-English Wikisaurus entries, including Czech: WS:příbuzný, Finnish: WS:juoppo, French: WS:chat, Hindi: WS:मनुष्य, Polish: WS:gruchot, Portuguese: WS:duradouro, and Telugu: WS:కుక్క. --Dan Polansky (talk) 19:14, 10 September 2017 (UTC)
The current practice is supported by {{ws}} in that the template automatically links to a Wikisaurus entry if it exists; thus, in WS:bird, there is "[WS]" next to "passerine", linking to WS:passerine; this works automatically with current practice for both English and non-English WS entries, but will require additional markup with the proposed new practice. --Dan Polansky (talk) 19:20, 10 September 2017 (UTC)
@Erutuon No, this proposal is about changing the pagetitle of WS entries, such that entries in different languages are on different pages. Please read what I initially wrote.__Gamren (talk) 18:49, 11 September 2017 (UTC)
@Gamren: I did read it, but I am responding to the fact that you want to move the synonyms of juoppo to a page titled with an English translation drunkard. I was describing what this means: having a page for synonyms of a translation of an English word. (In this case, a page for Finnish synonyms of a translation of the English word drunkard.) It is kind of confusing as a concept. It seems similar to having an expanded table of translations for the various senses of an English word. So it would be less confusing if it were titled Translations:fi:drunkard.
It would get more confusing when the word (a given series of letters) has several different meanings in English and in the language of the page. How do you tell people that they can't add senses of the word in the language of the page? For instance, that WS:de:fast should not contain synonyms for a German meaning, "almost", but can contain synonyms for an English meaning, "moving at a high speed".
And which English-titled page should the content of a non-English-titled page be moved to, if it contains multiple senses that do not all belong to a single English word? (I do not know if this case exists because I haven't worked on Wikisaurus pages much, if at all.) — Eru·tuon 19:15, 11 September 2017 (UTC)
We use {{ws sense}} to specify the sense, so there should not have to be doubt as to what sense a page uses. But I do not really mind native-language titles.__Gamren (talk) 07:19, 14 September 2017 (UTC)
I have renamed the discussion from "Disambiguate WS entries by language" to "Disambiguate Wikisaurus (thesaurus) entries by language" to make it easier to find later. --Dan Polansky (talk) 08:21, 17 September 2017 (UTC)
Some more notes:
1) The Danish content of Wikisaurus:god can be at Wikisaurus:brillant; WS pages are often synonym rings, and thus you can often find a synonym for which the headword in Wikisaurus namespace is still unoccupied. This makes it possible to continue with reduced overlap even in the current setup. Admittedly, once a lot of languages get covered by Wikisaurus, it may be increasingly hard to ensure non-overlap, but we have not arrived at that stage yet, so we do not know. Alternatively, the English content of Wikisaurus:god could be at Wikisaurus:deity.
2) If editors insist on, or very much desire, a guaranteed non-overlap, the naming scheme proposed a long time ago should be considered: Portuguese Wikisaurus:duradouro would be at Wikisaurus:pt:duradouro, or there would be, for Danish, Wikisaurus:da:smerte (pain) instead of WS:Danish/pain. Then, the automatic "[WS]" hyperlinking as seen in WS:bird is easy to make work, by providing language code to {{ws}} in entry markup. This scheme is dissimilar to the one used by categories and the one used by rhyme pages, but that is fine: Wikisaurus has problems different from those of categories and rhyme pages, and therefore can use a different scheme, one tailored to meet its needs. Whether language codes or language names are used in this scheme is a separate choice to be made, and is relatively less important; Wikisaurus:Danish:smerte and Wikisaurus:Danish/smerte make it equally easy to make "[WS]" automatic interlinking work.
--Dan Polansky (talk) 08:26, 17 September 2017 (UTC)
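To make the naming schemes in note 2 concrete, here is a small sketch of how a language-prefixed title could be built and how an automatic "[WS]" link could be decided from it. This is only an illustration under stated assumptions: `wsTitle` and `wsLinkFor` are hypothetical helpers, the set of existing pages is made up for the example, and nothing here reflects how {{ws}} is actually implemented.

```javascript
// Build a Wikisaurus page title under a language-prefixed scheme.
// separator ":" gives Wikisaurus:da:smerte; "/" gives Wikisaurus:Danish/smerte.
function wsTitle(langLabel, headword, separator) {
  return "Wikisaurus:" + langLabel + separator + headword;
}

// Decide whether a term gets an automatic "[WS]" link: return the prefixed
// title if such a page exists, otherwise null.
// existingPages is a hypothetical Set of page titles, assumed for illustration.
function wsLinkFor(term, langLabel, separator, existingPages) {
  const title = wsTitle(langLabel, term, separator);
  return existingPages.has(title) ? title : null;
}

// Made-up page list for the example.
const existing = new Set(["Wikisaurus:pt:duradouro", "Wikisaurus:da:smerte"]);
console.log(wsLinkFor("smerte", "da", ":", existing));
console.log(wsLinkFor("juoppo", "fi", ":", existing));
```

The only extra input compared with current practice is the language label, which matches the point above that the entry markup would need to supply a language code (or name) to the template.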

employment category?

Do we have a category for employment-related terms like job title, trade union, severance pay, etc.? This would be eminently useful IMO. ---> Tooironic (talk) 02:16, 20 August 2017 (UTC)

English names for letters of the Arabic language

Do we include these? Our page Arabic script has a table of them, but they link to the Arabic letters themselves. SemperBlotto (talk) 04:47, 20 August 2017 (UTC) (I've just added the French zhâl - hope it's OK)

Category:en:Arabic letter names DTLHS (talk) 04:48, 20 August 2017 (UTC)
So my French term seems to be wrong - I can't figure out how to correct it. SemperBlotto (talk) 04:51, 20 August 2017 (UTC)
What do you mean wrong? DTLHS (talk) 04:54, 20 August 2017 (UTC)
We're probably missing the English names of some letters if you're concerned that it's a red link. DTLHS (talk) 04:56, 20 August 2017 (UTC)
Also some of the entries currently in Category:en:Arabic letter names are not letters; they are Arabic diacritics. Wyang (talk) 04:57, 20 August 2017 (UTC)
OK, I'll leave it alone - totally outside my comfort zone. SemperBlotto (talk) 05:00, 20 August 2017 (UTC)

Edits by 217.76.10.22

This IP user has been adding "Ancient Armenian" (= Old Armenian) terms as etymons for Modern or Ancient Greek terms, while deleting the old etymologies, as well as some other things. The etymologies are dubious: for all of them, because the terms were attested before the time of Old Armenian, and doubly so for some, because they are phonologically implausible (առասպել (aṙaspel) supposedly yielding μῦθος) or the etymology actually goes the other way (ῥινόκερως was calqued by ռնգեղջիւր (ṙngełǰiwr)). Not sure what to do here besides revert the edits, which I've done. It would probably be better manners to explain to him or her, but I don't feel like it. — Eru·tuon 06:42, 20 August 2017 (UTC)

I reverted an edit by this same idiot (using a slightly different IP) that added "ancient Armenian" to the etymology for an Old Armenian term- they're even further out there than you give them credit for. They're changing their IP, so I doubt they'll read anything you leave on their talk page, but it never hurts to try, I guess. That said, feel free to revert them- as far as I'm concerned, they're only one step removed from the vandals who randomly replace language headers with the names of their own languages. Chuck Entz (talk) 08:13, 20 August 2017 (UTC)
@Erutuon, Chuck Entz: They're still at it: Special:Contributions/108.41.0.140. Not effectively range-blockable, I presume? —Μετάknowledgediscuss/deeds 05:50, 20 October 2017 (UTC)
Given that 217.76.10.2 geolocates to Armenia and 108.41.0.140 geolocates to Ohio, I don't think a range block is a good idea... My impression is that the Armenian IP is an idiot, while the American one just has no idea at all about what they're doing. They posted to Talk:iber, so I was able to respond with a whole laundry list of what they got wrong (pretty much everything). With any luck they'll read it and realize they don't quite have it all figured out yet, after all. Chuck Entz (talk) 07:23, 20 October 2017 (UTC)

break

Hi there. I'm taking a month off Wiktionary to concentrate on things IRL. You won't be hearing from me at all. So, in the unlikely case that you see someone here who you think might be me, who's following my edit patterns or whatever, it won't be. Thanks. --WF on Holiday (talk) 17:49, 20 August 2017 (UTC)

BBC Pidgin English platform

An interesting source of vocabulary if anyone feels like taking up a new language (Category:Nigerian Pidgin language). DTLHS (talk) 18:50, 22 August 2017 (UTC)

Nigerian Pidgin English has a big problem with orthographic norms (i.e., they're all over the place). The BBC uses a very acrolectal orthography, but many sources use a phonemic orthography (like what we try to standardly use for Krio, which is an extremely similar language from Sierra Leone). —Μετάknowledgediscuss/deeds 18:56, 22 August 2017 (UTC)

Kingdom of Great Britain

Translation requests can no longer be added to this entry, I'm guessing because sense IDs are being used. How to fix this bug? ---> Tooironic (talk) 13:30, 23 August 2017 (UTC)

Category:etyl cleanup - cleanup going backwards?

This shows 109703 entries needing attention at the moment, a day or two ago it was 109682. Does this mean someone is still using {{etyl}}? DonnanZ (talk) 19:44, 23 August 2017 (UTC)

Probably. If the template still works and there are no obvious errors or warning messages people are still going to use it. DTLHS (talk) 19:48, 23 August 2017 (UTC)
Then maybe the template should be disabled to give us a chance. It's a terribly long job getting rid of it ({{etyl}}) anyway. DonnanZ (talk) 19:54, 23 August 2017 (UTC)
Could always make an edit filter that, at very least, flags entries which add {{etyl| to an entry. - TheDaveRoss 20:07, 23 August 2017 (UTC)
The person(s) still using it need(s) to be traced somehow. DonnanZ (talk) 14:37, 24 August 2017 (UTC)
I added AF71 to flag these with "etyl" so we can see who is using them. - TheDaveRoss 15:42, 24 August 2017 (UTC)
It's caught one fish already, User:Froaringus. DonnanZ (talk) 16:10, 24 August 2017 (UTC)
 :-/ Oops. Sorry! While I try to use inh, der, bor, from time to time I found etyl useful when there is a composite word... But if the idea is to reduce its presence obviously I won't use it again.--Froaringus (talk) 18:14, 24 August 2017 (UTC)
I use this. I have not been made aware of alternatives, nor that it is unacceptable to use it. Equinox 18:18, 24 August 2017 (UTC)
{{etyl}} should generally be replaced with {{der}}, {{inh}}, {{bor}}, {{cog}}, or {{noncog}}, depending on the situation. — Eru·tuon 18:34, 24 August 2017 (UTC)
Ditto. SemperBlotto (talk) 18:28, 24 August 2017 (UTC)
Personally I use {{der}} and {{cog}}. I don't believe in using {{bor}} or {{inh}}, in any case those two create excess bumph at the bottom of the page. DonnanZ (talk) 19:18, 24 August 2017 (UTC)
They only create "excess bumph" because we make them double-categorize, which can in principle be disabled if the community so decides. --Tropylium (talk) 21:18, 24 August 2017 (UTC)
There was actually a vote that decided the opposite. So apparently the majority wants this. —Rua (mew) 10:54, 25 August 2017 (UTC)
I'm also guilty...it seems to me there have been cases where none of the other templates fit, but my memory may be playing tricks on me. I can't remember exactly which entries they were. I wasn't actually aware that we were trying to get rid of {{etyl}} either. Andrew Sheedy (talk) 04:30, 25 August 2017 (UTC)
I originally had a problem with {{der}} when I didn't know the actual word it was derived from, until I hit on using a hyphen in the last field, e.g. {{der|nb|gml|-}}. DonnanZ (talk) 08:17, 25 August 2017 (UTC)
When you don't know the word, you're supposed to leave it empty. A hyphen means "no word belongs here", which is wrong. —Rua (mew) 10:55, 25 August 2017 (UTC)
If you know the language it comes from, at least that can be shown that way. Alternatively {{der|nb|gml|}} which brings up the question "term?" DonnanZ (talk) 11:03, 25 August 2017 (UTC)
Yes, that's the point. If you don't know the term, it puts down a notice so someone else can add it. —Rua (mew) 11:07, 25 August 2017 (UTC)
The first example is not necessarily "wrong", {{der|xx|yy|-}} can be used in other circumstances. DonnanZ (talk) 18:31, 29 August 2017 (UTC)
I still use it in my cleanup of Spanish entries because all I know is the Latin words they come from. Should I use der every time? Some people have mindlessly converted etyl into der in every case, but the issue still remains that we've changed the system so that it will always be clear whether terms are inherited or borrowed. Even if we deprecate etyl, we'll just have to populate cat:etyl_cleanup with entries using der but not bor or inh... It seems to me that we'll just have cleaner template parameters. Ultimateria (talk) 22:32, 29 August 2017 (UTC)
I get the impression that some users resent having a cleanup forced upon them, which may not have been necessary if it wasn't for the {{bor}} / {{inh}} brigade. I suggest that if you want to clear the list just use {{der}}, and let the purists sort out what are considered as borrowed and inherited terms later. The use of {{der}} involves less keystrokes than {{etyl}}, which is something in its favour. DonnanZ (talk) 09:06, 30 August 2017 (UTC)
That is not particularly helpful. That means that after {{etyl}} is "cleaned up", {{der}} will have to be cleaned up. Better to leave {{etyl}} or to learn where to use {{bor}} and {{inh}}. There are also cases where {{der}} can't be used: words that are not involved in an etymology, and have a hyphen in the second parameter of {{etyl}} (for instance, {{etyl|en|-}} {{m|en|word}}, which must be changed to {{cog|en|word}} or {{noncog|en|word}}). — Eru·tuon 22:33, 1 September 2017 (UTC)
We can generate cleanup lists with the new templates to pick out some of the mistakes or omissions. For example, an etymology whose first etymology template is {{der}} is probably missing something; either it should be {{inh}} or {{bor}}, or a step has been left out. Conversely, {{bor}} can only appear as the first etymology template, and {{inh}} can only be preceded by other {{inh}}s. —Rua (mew) 22:41, 1 September 2017 (UTC)
@CodeCat: I suppose that would work in the majority of cases, but as it relies on the first step in the etymology coming first, there may be some cases that fall through the cracks. (Hypothetically; I can't point to any examples.) — Eru·tuon 20:47, 2 September 2017 (UTC)
Etymologies aren't always written in order (see bindiga, for example), and sometimes {{der}} is appropriate when something can neither be said to be borrowed nor inherited. —Μετάknowledgediscuss/deeds 22:35, 2 September 2017 (UTC)
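The ordering rules proposed above for cleanup lists can be sketched as a small checker. This is a heuristic illustration only, not an existing tool: `etymologyWarnings` is a hypothetical function, it looks only at {{der}}, {{inh}} and {{bor}}, and it assumes the templates appear in etymological order, which (as noted in the replies) is not always the case. Any hit is therefore a candidate for human review rather than a confirmed error.

```javascript
// Return a list of warnings for an etymology section, based on three rules:
//  1. a first template of {{der}} probably means {{inh}}/{{bor}} or a step is missing;
//  2. {{bor}} should only appear as the first etymology template;
//  3. {{inh}} should only be preceded by other {{inh}} templates.
function etymologyWarnings(wikitext) {
  const names = [];
  const pattern = /\{\{(der|inh|bor)\|/g;
  let match;
  while ((match = pattern.exec(wikitext)) !== null) {
    names.push(match[1]);
  }

  const warnings = [];
  names.forEach((name, i) => {
    if (i === 0 && name === "der") {
      warnings.push("first template is der");
    }
    if (name === "bor" && i > 0) {
      warnings.push("bor at position " + (i + 1) + " is not first");
    }
    if (name === "inh" && names.slice(0, i).some((p) => p !== "inh")) {
      warnings.push("inh at position " + (i + 1) + " follows a non-inh template");
    }
  });
  return warnings;
}

console.log(etymologyWarnings("From {{der|nb|gml|}}."));
console.log(etymologyWarnings("From {{inh|es|la|aqua}}."));
```

Templates such as {{cog}} and {{noncog}} are deliberately ignored here, since they mark words that are not part of the derivation chain.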

Lithuanian dalyvis participles

mylįs. Should these be categorized as lemmas or non-lemmas? DTLHS (talk) 22:11, 23 August 2017 (UTC)

Can we not give it an English name? —Rua (mew) 22:14, 23 August 2017 (UTC)
I have no idea. What would you call them? DTLHS (talk) 22:14, 23 August 2017 (UTC)
Adjectival participles. That's what Wikipedia calls them. —Rua (mew) 22:17, 23 August 2017 (UTC)
In any case, they are non-lemmas, because they are forms of verbs. The part of speech should be changed to Adjective though. —Rua (mew) 22:19, 23 August 2017 (UTC)

Please restore my admin rights

As title. It's been more than a year now. It's incredibly frustrating to not be able to delete new user vandalism or delete the original as I move entries with wrong titles. Wyang (talk) 09:38, 24 August 2017 (UTC)

Support, please give Wyang and CodeCat's rights back. --Daniel Carrero (talk) 10:27, 24 August 2017 (UTC)
Support per Daniel Carrero. --Anatoli T. (обсудить/вклад) 10:30, 24 August 2017 (UTC)
Oppose. The issue with Module:links still hasn't been solved. —Rua (mew) 10:49, 24 August 2017 (UTC)
That issue doesn't need to be resolved in order for the two of you to be admins, you just both need to agree to be reasonable adults and not wheel war when you disagree. There are lots of things on the wiki with which various admins disagree, but it is possible to disagree and also be willing to abide by the decisions of others. If what you are really saying is that you are incapable of having admin rights and acting reasonably, then I don't think you should have the admin rights. - TheDaveRoss 12:55, 24 August 2017 (UTC)
When I receive admin rights, I'm going to use them to fix problems, and that includes the problem currently in Module:links. If Wyang immediately reverts, aren't we back to square one? The issue won't be solved until Wyang agrees not to revert, and so far he hasn't. In the past, I asked for people to come to a consensus on what the desired situation is, but there is still no consensus, which means there is no decision to respect and everyone is free to do as they like. Which leads to the edit war. There must be a consensus decision before this will stop, I've said it before. —Rua (mew) 13:37, 24 August 2017 (UTC)
(Edit conflict) The sad part is that, if I were to make a shortlist of the people most qualified to solve this, these two would be at the top of the list. Both have continued to make valuable contributions for a year in spite of it all, so they do know how to rise above things- it's just when you put them together that things go wrong. The ideal solution would be for them to agree to leave the disputed part of Module:links alone for now, and to work on a solution that both can agree to. I had a proposal on the table that no one seemed to object to, but any workable solution would be great. Chuck Entz (talk) 13:47, 24 August 2017 (UTC)
I don't think that's fair. Wyang wants the status quo. I will respect the status quo if there is a clear consensus that it's how we want to do things. So far, I haven't seen any. The only reason it's the status quo is because Wyang got the last revert, not because it's what anyone wants. —Rua (mew) 13:50, 24 August 2017 (UTC)
To say it another way, I will only accept a consensus on which version is preferred, I will not accept Wyang's version unless that's also the consensus. Without consensus, I will only accept my version. —Rua (mew) 13:53, 24 August 2017 (UTC)
Support. Aryaman (मुझसे बात करो) 11:02, 24 August 2017 (UTC)
Oppose per CodeCat. DCDuring (talk) 13:03, 24 August 2017 (UTC)
Oppose --Victar (talk) 16:51, 24 August 2017 (UTC)
Support -- ɱɑɗɦɑѵ (talk) 17:09, 24 August 2017 (UTC)
Oppose. I'll repeat what was said in previous discussions, which is still true today: Neither admin has done or said anything to indicate that they will not resume the conflict when granted admin powers. Every one of the many attempts to resolve the conflict has failed magnificently. Perhaps we could restore their admin powers only on the condition that they will not make any edits related to the original conflict, but that might be difficult to enforce. --WikiTiki89 20:23, 24 August 2017 (UTC)
It can be enforced through a consensus, as I noted. —Rua (mew) 20:47, 24 August 2017 (UTC)
I think CodeCat (Rua) is right in demanding consensus. I praise her for keeping her position in the face of disagreement and would encourage her to continue. While I wish Wyang to have his admin powers restored too, I wish he was more willing to participate in consensus-building process one year ago concerning Module:links and I'd like him to focus on said consensus-building process if he wants to keep his changes on that module. --Daniel Carrero (talk) 21:11, 24 August 2017 (UTC)
I just want to point out that CodeCat is equally responsible for the failure to resolve the conflict. You can go back to the previous discussions and see for yourself. --WikiTiki89 21:16, 24 August 2017 (UTC)
OK, would you like to show us any diffs or quotes to back up that statement about CodeCat? I believe I did read those discussions in their entirety but I could have missed that part. --Daniel Carrero (talk) 21:20, 24 August 2017 (UTC)
I don't want to resurrect all the old arguments here. Go back to all the discussions and see for yourself. --WikiTiki89 14:38, 25 August 2017 (UTC)
I didn't envisage that this would transition into a vote... anyway, if what is needed is a guarantee of not editing in the relevant modules, then that can be provided on my part. The dispute is not going to resolve itself even if the current state continues for another ten years, but I'm not really bothered now- there are more important things to do in life and on Wiktionary compared to spending hours and days to argue anonymously and disinhibitedly in an online forum. Those debates were a waste of mental exercises to the detriment of collegiality, really. It's a shame that Wiktionary as an online platform has also largely given in to the shortcoming of general online disinhibition secondary to anonymous, asynchronous and non-face-to-face communication, as exemplified by the discussion above. Stances are more important than comments and opinions; they precede the latter and are boldified. There were numerous moments in the past year when I wished I could delete the original entries with incorrect titles as I moved them; those entries often stayed in the speedy delete category for days before someone heeded them. There are still numerous incorrect Pinyin entries from residual moves. Likewise Wiktionary:Requests for deletion has a backlog of Chinese, Thai, etc. requests which should have been cleared a long time ago. Reverting vandalism only to see it being readded moments later. Recently was hoping to fix the link in {{ja-suru}} too, but was unable to. Wyang (talk) 22:20, 24 August 2017 (UTC)
@Chuck Entz: It appears to me that we have reached a point where one party can agree not to engage in the conflict, and the other cannot ("Without consensus, I will only accept my version."). This to me suggests that Wyang can be trusted not to wheel-war, and therefore has achieved a level of readiness that CodeCat has not, irrespective of the resolution of the original conflict. —Μετάknowledgediscuss/deeds 00:15, 25 August 2017 (UTC)
To play the devil's advocate, this may be because the current version of the module is Wyang's preferred version. —suzukaze (tc) 02:28, 25 August 2017 (UTC)
The guarantee is “not editing in the relevant modules”, not conditional on the state of the pages. Wyang (talk) 02:50, 25 August 2017 (UTC)
Support. I'm not involved with this, but I'm a generally positive thinker (at least I'd like to say). PseudoSkull (talk) 00:20, 25 August 2017 (UTC)
Oppose. Leasnam (talk) 02:43, 25 August 2017 (UTC)
Support. I'd have preferred resolution of the specific underlying conflict, but Wyang's guarantee changes my vote. DCDuring (talk) 04:41, 25 August 2017 (UTC)
Support. In addition, perhaps Codecat could outline the two sides of their dispute so we can all understand what it's about. Then we might be able to reach a consensus. I think I knew something about it at one time, but I've completely forgotten. —Stephen (Talk) 05:29, 25 August 2017 (UTC)
I want to remove the phonetic_extraction part from Module:links. To compensate, some edits need to be made to Module:th-translit, so that it works like regular transliteration and doesn't need special module support. In general I'm opposed to coding language-specific exceptions into generic modules, especially as in this case it's completely unnecessary. —Rua (mew) 10:59, 25 August 2017 (UTC)
Support restoring the rights of the sides of the conflict, provided that they don't make direct edits related to that conflict. --Z 07:00, 25 August 2017 (UTC)
At the moment I think I oppose restoring rights to either party. While they may be willing to accept one outcome or another for this specific situation, it doesn't appear that the next similar situation which arises will have any different result. The appropriate response when two well-meaning individuals disagree is to stop taking action and engage in conversation. If no mutually agreeable solution arises then have a vote and abide by the results of that vote. If that amicable resolution path is not one that an individual is willing to follow then I don't think they should be given the power to override page protection. If one or both Wyang and CodeCat say that they are willing to resolve issues in "the wiki way" going forward then I would support the reinstatement of admin rights. - TheDaveRoss 15:00, 25 August 2017 (UTC)
Consensus is not the wiki way? I want a consensus I can abide by, but so far there isn't any. The state of Module:links is left to the last editor. I want a consensus on the state of the module, whether to include Wyang's code or to exclude it. I'll abide by a consensus as long as it's clear that there is one. —Rua (mew) 15:25, 25 August 2017 (UTC)
All of your statements in this conversation reinforce the idea that you are not willing to behave in a reasonable way. It is great that you are willing to abide by consensus, but you also say that you will immediately engage in further wheel-warring until that consensus is reached. As with all non-vandalism issues, if there is controversy the status quo remains until there is consensus to change the status quo. Your statements boil down to "I will do what I want until there is a vote to stop me" which is not at all how collaboration works. Being bold only extends to the point that someone objects. - TheDaveRoss 15:37, 25 August 2017 (UTC)
I objected to Wyang's addition of code to Module:links. When I removed it, it was reinstated by Wyang. So Wyang didn't try to find consensus for his additions: someone objected, and yet his edits remain. Since I still object to his edits, and there is no consensus to keep them or remove them, it's at my discretion to remove them again. —Rua (mew) 15:56, 25 August 2017 (UTC)
So you were both unreasonable. It takes two to make a wheel war. I think TheDaveRoss has hit the nail on the head with all of his remarks here. The reasonable, adult course of action here is to leave things as they are for now, come up with a compromise that both sides can be happy with, then remove the disputed code as part of implementing the new solution. Chuck Entz (talk) 16:33, 26 August 2017 (UTC)
@Chuck Entz, it was only in the beginning that they both were unreasonable. That's no longer the case. Only Codecat promises to resume her previous edits to the modules. Wyang has guaranteed “not editing in the relevant modules”, regardless of what Codecat does with them. Wyang's word can be trusted and his admin status should be reinstated. —Stephen (Talk) 09:25, 29 August 2017 (UTC)
@Chuck Entz Any decision? Or is it going to last till the end of time...? Wyang (talk) 00:00, 1 September 2017 (UTC)
Sorry. I've been trying to avoid heat stroke while taking public transportation in 107-degree weather (I think that's somewhere in the 41-42-degree range in centigrade), so I've been a bit pre-occupied lately (did I mention that an AwesomeMeos sock has been very active lately, and I believe an Uther Pendrogn sock has posted recently as well). I hope a long weekend in my air-conditioned apartment will give me a chance to sort all of this out. Chuck Entz (talk) 02:20, 1 September 2017 (UTC)
@Chuck Entz Any update for now? Please see history of the entry zoosexuality. Wyang (talk) 23:26, 18 September 2017 (UTC)
As said in my comment above, I'm not really bothered anymore. CodeCat is free to do whatever she wants to those pages. I would rather spend my time and energy more efficiently on other things than wasting it on this. Wyang (talk) 22:35, 25 August 2017 (UTC)
Tentatively Support, based on the character revelation above. Leasnam (talk) 16:43, 26 August 2017 (UTC)
Support for Wyang, but not for CodeCat/Rua, given their respective attitudes. Andrew Sheedy (talk) 17:26, 27 August 2017 (UTC)
  • Provisional support for both Wyang and CodeCat, or for neither. I oppose restoring admin rights for only one and not the other. And I support immediately desysopping both of them if the wheelwarring starts again. —Aɴɢʀ (talk) 20:55, 27 August 2017 (UTC)
    • CodeCat states that she will edit the modules in question as soon as she is reinstated. Wyang, however, has promised not to edit the modules again and will allow CodeCat to have her way. Since Wyang will not participate in wheelwarring, there will be no more wheelwarring. —Stephen (Talk) 00:56, 2 September 2017 (UTC)
  • If anything, CodeCat's remarks above made me realise that I should rather take a detour than have any interaction with her... Life is too short. Wyang (talk) 11:28, 28 August 2017 (UTC)
  • I oppose re-sysopping Wyang. The wheel war for which he was desysopped was pretty incredible, showing a severe lack of self-control or lack of understanding why a protracted speedy revert war or wheel war is bad. Furthermore, some of his remarks on this very page suggest that he does not support consensual decision making but rather decision making based on strength of argument, which all too often results in the holder of such a view making themselves the judge of which arguments are strong and which are not, resulting in non-consensual decision making. Wyang had plenty of time to draft a vote to help resolve the matter of disagreement with CodeCat, but he did not do that, and is unlikely to do so given his apparent oppositions to votes. Wyang's adminship would provide undeniable benefits, but the things said seem to offset those benefits. --Dan Polansky (talk) 16:24, 1 September 2017 (UTC)
  • Sorry to take so long- the past couple of months have been very difficult for me IRL, and I'm only getting to this now because I'm off work due to a foot infection and have the time.
  • I would have preferred for the issues leading to the conflict to have been resolved before acting, but I don't feel I should be overriding the decisions of the community that made these two people admins. Basically, I intervened because of a wheel war that was spiraling out of control. Now that @Wyang has committed to not continuing the wheel war, that is no longer an issue. I will now restore both to adminship.
  • I will just say that @CodeCat/Rua should refrain from any kind of victory dance, because she owes the restoration of her admin rights strictly to Wyang's actions, not her own. I would highly recommend for both to strive to outdo each other in maturity and magnanimity instead of in petty stubbornness, as before, but that's not something I can or should enforce. Chuck Entz (talk) 21:36, 21 September 2017 (UTC)

I would like to state again for the record that I strongly oppose the reinstatement of admin rights to CodeCat/Rua. She has consistently shown that she lacks key skills in working well with others, and has no capability and/or desire to change that. I also believe that this resolution of reinstatement should have been voted upon on a proper vote page. --Victar (talk) 21:31, 4 November 2017 (UTC)

Note that this is probably triggered by Victar's repeated and unexplained removal of descendants from an entry (diff, diff, diff, diff), and my attempts to restore the removed content. —Rua (mew) 21:38, 4 November 2017 (UTC)
You're absolutely right that this incident reminded me of your character, and the danger of giving you admin rights, but this has been a longstanding opinion of mine and of others, which you can easily read above. --Victar (talk) 21:56, 4 November 2017 (UTC)
Although I have the ability to remove admin rights, my authority to do so comes from the community, who voted to make her an admin. My earlier action was solely for the purpose of stopping a disruptive and inexcusable wheel war, not to overrule the community's decisions. If you want to remove her admin rights, you'll have to get the community to vote to do so.
Not that I'm endorsing her attitude or actions- parts of which I indeed dislike- but I'm not the king of Wiktionary. I'm only going to do what the community asks me to do, or what I assume they would ask me to do if there were time for a vote, in the case of emergencies such as this one. Sorry to disappoint you. Chuck Entz (talk) 23:47, 4 November 2017 (UTC)
I have created Wiktionary:Votes/sy-2017-11/Desysopping CodeCat aka Rua, seeing no better option. --Dan Polansky (talk) 23:59, 4 November 2017 (UTC)
Thanks, @Dan Polansky. --Victar (talk) 04:50, 5 November 2017 (UTC)
I understand, @Chuck Entz. I was just surprised that this discussion didn't first become a formal vote, with a set voting period and a vote minimum, especially for such a contentious issue. I fault myself for not speaking up. --Victar (talk) 04:50, 5 November 2017 (UTC)

घॣ (ghl̥̄) and other Devanagari "Translingual" consonant + vowel combinations[edit]

These were created en masse by User:Thecurran, but there are so many things wrong on those pages:

  • The pronunciation is just the Sanskrit pronunciation, not a Translingual one. The pronunciations in modern languages are certainly not /ɡʱl̩ː/.
  • They are placed under the heading of Syllable, but they are not a list of syllables (cf. Category:Japanese syllables). They are simply a ‹consonant + vowel› combination; Sanskrit syllables are far more complex than these combinations.
  • The alternative forms section inappropriately makes use of {{pi-alt}} (a template listing the equivalents of a Pali term in other Brahmic scripts, relying on the backend of Module:pi-Latn-translit). As a result, most (if not all) of the forms produced are incorrect and the entry has been unheeded by others since creation. The author had also attempted to create Module:Deva-Latn-translit, but that was probably too ambitious.

Do we really need these entries? In this case I would argue that the benefit of absence of this information outweighs that of their presence. Wyang (talk) 11:13, 28 August 2017 (UTC)

I think they should all be deleted. घॣ (ghḹ) is definitely not Translingual, l̥̄ is unused in all of the Indic languages, Sanskrit included, since it was an artificial creation by grammarians. As for "syllables", क्षाउईएय्र् (kṣāuīeyr) is also a syllable, and I don't think anyone wants entries like that. —Aryaman (मुझसे बात करो) 11:36, 28 August 2017 (UTC)

What is the benefit of absence, Wyang? Warmest Regards, :)—thecurran Speak your mind my past 06:31, 8 September 2017 (UTC)

Do You know what it's like to decipher Devanagari without being taught it, Aryaman? In abugidas, many vowel signs inconsistently change in both position and appearance as they are applied to various consonants. For native readers these variations are readily apparent but, for non-native readers, each CV (consonant+vowel) cluster must be read as an independent grapheme. For non-natives to read or transcribe even a tiny sample text, they must isolate and look up each CV cluster as if it was from a syllabary; akin to Cherokee or hiragana. The process is so complex and tedious that humanity completely lost the ability to understand the Egyptian hieroglyphic syllabary for millennia before the Rosetta stone was found.

The reason for keeping ancient pronunciations is to allow one-to-one correspondence. Otherwise backwards compatibility is lost because accepting modern and historic mergers means that transcribing forwards and then backwards takes the user to a different wrong CV cluster than what they started with. Warmest Regards, :)—thecurran Speak your mind my past 05:31, 9 September 2017 (UTC)

@Thecurran: Frankly, don't give me any of that crap. My formal Hindi education ended when I was 6 years old, and so I never learned Devanagari at school. I taught it to myself about two years ago as I sought to relearn my native language, and it was absurdly easy. Second, Devanagari has changed very little since the age of Classical Sanskrit. The same conjuncts are used, while a few additional consonants have been accommodated in e.g. Hindi using the nuqta. Besides, we use slightly modified IAST for all Devanagari languages, so the phonemic outcome and these "mergers" and "shifts" don't really matter. Third, every piece of Devanagari in the main space is automatically transliterated with Lua anyways. Finally, do you really think someone who can't read Devanagari can type it in the search box... —Aryaman (मुझसे बात करो) 10:06, 9 September 2017 (UTC)
Logic has given way to emotion and this conversation has gotten impolite. It would probably be better if we paused this conversation to let things cool down for a fortnight or two. Warmest Regards, :)—thecurran Speak your mind my past 23:49, 9 September 2017 (UTC)
@Thecurran: Sorry for the excessive outburst, I just wanted to say that yes I know how hard it can be to learn a new script. The problem is why don't we do this for every other script then? Gujarati, Odia, Bengali, Sylheti, Kannada, Tamil, Telugu, Khmer, Thai... you get the point. Wiktionary should not be a character index or Unicode analyzer, it's a dictionary. —Aryaman (मुझसे बात करो) 04:01, 10 September 2017 (UTC)
As a non-native reader of Devanagari, deciphering and transliterating Devanagari was not difficult, compared to some of the other scripts. I'm not quite sure I understand your argument using the Egyptian hieroglyphs, but as an abugida, you just need to read Devanagari (or even just the diagram in Devanagari#Compounds) to know the basic mechanism of the script. As a dictionary, Wiktionary is not for including these units of the Devanagari script which are isolatable by the mouse cursor. There are many more complex scripts than Devanagari, such as Arabic, Burmese, Khmer, Thai and Tibetan ― we don't include Arabic لْ, Burmese ကြို, Khmer ក្បួ, Thai กั, Tibetan ཁྱུ as "Syllables", and I don't think we should, same for घॣ (ghḹ), or स्त (sta). (What is desirable, on the other hand, is a Unicode string analyser tool, which breaks a string into the Unicode characters and links to their entries.) Apart from this superfluity, what is also contributing to the benefit of absence is the erroneous content in these entries: the erroneous alternative forms, inappropriate header and pronunciation. Wyang (talk) 06:02, 9 September 2017 (UTC)
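A minimal sketch of the kind of Unicode string analyser mentioned above, written in Python with only the standard `unicodedata` module. The function name `analyse` and the output format are assumptions made for illustration; this is not an existing Wiktionary tool. The sample string is built from explicit escapes so that the code points are unambiguous.

```python
import unicodedata

def analyse(text):
    """Return (code point, Unicode name) pairs, one per character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]

# The unit discussed in this thread, spelled out as two code points:
# the letter GHA followed by the vowel sign for vocalic LL.
sample = "\u0918\u0963"

for codepoint, name in analyse(sample):
    print(codepoint, name)
```

Each code point listed this way could then be linked to its own single-character entry, which is the behaviour the comment above asks for.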

At the least, the abuse of {{pi-alt}} should be excised.—suzukaze (tc) 00:54, 18 September 2017 (UTC)

Despite the colocal survival of liturgical Coptic, nobody from Africa, Europe, or Western Asia for millennia was able to decipher Egyptian hieroglyphics because they (like most people on Earth) only had familiarity with abjads or alphabets, instead of abugidas or syllabaries; logographic or phonetic. This is one of the most profound effects of the Sapir–Whorf theory of linguistic relativity; humans who had never learned syllable-based writing simply could not mentally grasp the possibility of its existence. Even after the Rosetta Stone was found, it took a whole generation before someone could finally start cracking the code and even more generations before anyone could become fluent.
Having roots anywhere from Southern Asia to Eastern Asia gives You a monolithic advantage in Devanāgarī/Pāli/Sanskrit over the rest of us, who mentally compartmentalize glyphs only as consonants or vowels.
Even today the alphabetization order of the Japanese syllabaries is a direct descendant of Devanāgarī's: 'a 'i 'u 'e 'o k- s- t- n- h- m- y- r- w- 'n
As long as there exists a merger-less list/catalogue of the Devanāgarī syllables (exactly the same as our treatment of the Hangul Jamo), our readers will be able to decipher text art that a mouse cannot highlight; like in greeting cards or GIFs.
Incidentally, whenever a person tries to copy and paste into Google a syllable that they mistake for a letter, nothing relevant comes up to explain it to them. As such, it would be a very positive move for Wiktionary to cover these because Wiktionarians are the best placed people on Earth to bridge this knowledge gulf and because it increases Wiktionary's web presence; see a need, fill a need. Warmest Regards, :)—thecurran Speak your mind my past 09:30, 28 September 2017 (UTC)
“Having roots in Eastern Asia gives you a monolithic advantage in Devanāgarī/Pāli/Sanskrit” ― not at all. The issue is that Wiktionary has certain criteria dictating what is includable on the dictionary, and cursor-separatable units in a text string are simply not within the scope. What can be included on Wiktionary is individual Unicode characters, and the modern Hangul syllables are included because they are one Unicode character each. There is no precedent for including cursor-separatable units. There is no entry for
which are all valid strings someone may look up on Google.
For Devanagari, these units are not 'syllables' as written on the entry of घॣ (ghḹ) at all, and the units can be incredibly complex: a single cursor-separated unit can range from the 4-character द्वै (dvai) (here), to 6-character ग्न्या (gnyā) (here) and ष्ट्रि (ṣṭri) (here), to even the 9-character unit *र्ष्ट्रीं (rṣṭrīṃ).
There is absolutely no reason to include these sequences of characters per your rationale that someone may look the sequence of characters up; it's unestablished practice, a misuse of resources and only reduces the credibility of the site, especially if the information is presented like घॣ (ghḹ). Wyang (talk) 23:48, 28 September 2017 (UTC)
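The character counts quoted above can be checked with a short Python sketch. The dictionary keys are plain ASCII labels chosen here for illustration, and the strings are assembled from escapes under the assumption that each unit is encoded as consonants joined by the virama (U+094D) plus trailing vowel signs.

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA, joins consonants into a conjunct

# Hypothetical labels mapped to the cursor-separated units cited above.
units = {
    "dvai": "\u0926" + VIRAMA + "\u0935" + "\u0948",
    "gnya": "\u0917" + VIRAMA + "\u0928" + VIRAMA + "\u092F" + "\u093E",
    "stri": "\u0937" + VIRAMA + "\u091F" + VIRAMA + "\u0930" + "\u093F",
    "rstrim": ("\u0930" + VIRAMA + "\u0937" + VIRAMA + "\u091F" + VIRAMA
               + "\u0930" + "\u0940" + "\u0902"),
}

# In Python 3, len() on a str counts Unicode code points.
lengths = {label: len(text) for label, text in units.items()}

for label, count in lengths.items():
    print(label, count)
```

A single unit selected by the cursor can therefore span many code points, which is why such units do not map onto one-character entries.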
@Thecurran: The comparisons to Egyptian and Coptic are hyperbole; Devanagari (or any other abugida) is no harder to master than any other alphabet, abjad, or syllabary. And I've lived in the United States since childhood, I am not particularly immersed in Indian culture (which is a world of difference from Chinese and Japanese culture; the idea of a common East Asian cultural sphere that includes India is tenuous... India has more contacts in the Middle East), and I taught myself Devanagari in just the past few years. As far as I can tell my brain is not wired to Devanagari or any abugida at all (I use an alphabetic keyboard to type in Devanagari, for instance). As for it being alien to Europeans:
"The first Hindi books, using the Devanagari script or Nāgarī script were one Heera Lal's treatise on Ain-i-Akbari, called Ain e Akbari ki Bhasha Vachanika, and Rewa Mharaja's treatise on Kabir. Both books came out in 1795. Munshi Lallu Lal's Hindi translation of Sanskrit Hitopadesha was published in 1809. Lala Srinivas Das published a novel in Hindi Pariksha guru in the Nāgarī script in 1886.[5] Shardha Ram Phillauri wrote a Hindi novel Bhagyawati which was published in 1888." – Hindi literature on Wikipedia
For reference, in 1799 the British (who introduced modern typesetting to India) defeated the quickly-modernizing Kingdom of Mysore. Their reach on the Indian subcontinent was quite tenuous, but already they were able to typeset Devanagari. The "alien-ness" of abugidas was not really a problem.
As for written Devanagari that a computer cannot highlight, Google Translate has image translation and transliteration for all the major Indian languages, and its OCR is pretty good in my experience. I don't see much added value in having these entries. We're a dictionary, not a sound database. —Aryaman (मुझसे बात करो) 00:35, 29 September 2017 (UTC)
@Thecurran You asked what would be the benefit of these entries' absence: it's very simple. These entries misrepresent the nature of the writing system. The POS is given as "Syllable", but they're not syllables. The language is given as "Translingual", but they're not translingual. Your "Alternative forms" aren't alternative forms. To be blunt, you don't understand what you're working with, and it shows.
Let me give you an illustration: Sanskrit अधर्म (adharma) is made up of three syllables, but they're not divided like you might think: the last -र्म consists of the r from the second syllable, combined with the consonant and (half) vowel of the third syllable- so much for syllables.
Now, when you take the same characters and make them Hindi अधर्म (adharm), suddenly there are only two syllables, with the vowel going away and the third syllable becoming part of the second syllable- two languages, two different relationships between characters and syllables. So much for Translingual.
Finally, your "Alternative forms" section is really a "how to spell this in other scripts" section. Yes, Sanskrit is a language written in multiple scripts, and Devanagari is a script used to write multiple languages, but you can't combine the two. Your "Syllables" are (mis)represented as being a characteristic of the Devanagari script, but your "Alternative forms" are characteristic of Sanskrit- you don't use the Thai script to spell Hindi, for instance. They can't be real Alternative forms and Translingual at the same time.
So, to sum it up, you've managed to take multiple concepts and muddle them all together in a way that is more likely to confuse non-speakers than educate them. I'm all for bringing separate ideas and paradigms together- but you have to keep straight what's what. Chuck Entz (talk) 02:52, 29 September 2017 (UTC)

My short opinion: Devanagari and friends are abugida systems, while kana are a syllabic system. There is no need to combine every abugida letter to present every possible syllable (no one will look them up either), whereas kana are syllables by nature. --Octahedron80 (talk) 03:23, 29 September 2017 (UTC)

Who can explain how to copy the text out of an image to insert straight into Google? I believe the difficulty of the process and the hiccoughs in OCR on non-uniform media are the reasons it is used on many sites as a method of detecting non-robots. What about changing a heading to ISO 15919? It is untrue that no one would ever search for these syllables because I googled for these syllables to read text from a greeting card image, which is what inspired me to add it in the first place. Sapir–Whorf linguistic relativity has a great deal of scientific evidence, and meta-analyses show that we can determine with confidence a lot about a person's background from minutiae as small as how they draw a circle. The modern decipherment of Egyptian hieroglyphics is historical fact. The difficulty of removing our individual backgrounds from our perception is actually one part of the reason that we are in such disagreement about these edits. There is an important feature in Wikipedia called dicdef, which tells Wikipedians that some types of articles belong out of Wikipedia on Wiktionary because they're better classed as dictionary definitions. It is for this purpose that Wiktionary came into existence. Wikipedia also has a feature called BEBOLD, which tells us that the fact that something has no established practice is actually not a good reason to delete it; it is better to put effort into honing it or fixing it instead. Warmest Regards, :)—thecurran Speak your mind my past 19:57, 29 September 2017 (UTC)
Questions:
1. How did you get the syllables into Google? If you can use a keyboard (instead of copying-and-pasting), you should be able to realize that the vowels are separate from the base letter.
2. Alright, somehow you've put the syllable into the Google/Wiktionary search box. How would looking up the individual syllables aid comprehension of the words that compose the text on the greeting card? —suzukaze (tc) 20:55, 29 September 2017 (UTC)
@Suzukaze, Those are great questions ^_^ !
  1. First I looked through Wiktionary's Devanāgarī letters to find something that resembled each syllable body (CV cluster) I saw;
    since Wiktionary did not list syllable bodies, my landmark letters were all consonants, the syllable onsets.
  2. Then I would copy one of those consonants into Google.
  3. I would originally visually search through dozens of words on dozens of pages to find the actual syllable body;
    this quickly stretched to thousands of words, so I started Ctrl+F-ing those consonants, but each one still took hours.
  4. Next, I would paste them together into a notepad, because I could only get through two or three syllable bodies per day.
  5. After getting through a single word, I would Google translate them; I would only get through a couple each week.
  6. Then I would guess the most reasonable word translation and copy this too, to give me hope.
  7. Whenever I had completed a sentence, I would re-translate it to pick up the word-by-word errors.
  8. After finishing the first message, I started transliterating it, but most sites only listed the क-compounds; w:Devanagari#Compounds.
  9. Then I started a spreadsheet so I could lay out the non-क compounds (syllable bodies) to lessen the hours of transliteration.
  10. When I started seeing patterns in the spreadsheet, I tried to predict some of the smaller gaps.
  11. I had to translate and transliterate several greeting cards before I found the one I wanted to send.
  12. Next, I spent some more time researching to fill out the remaining gaps to be prepared for next time.
  13. Then I realized that I could save several days of other people's lives by finding a way to upload my spreadsheet to Wiktionary.
  14. After hours of transcription, I saw that several historic mergers had made the spreadsheet non-invertible.
  15. After hours of tweaking, I spent a few days fleshing out these pages so they would not get deleted for being too short.
The task was so arduous that I am sure many people would give up before they even got halfway, but it need not have been so hard. If there was a list, a page, or a collection of pages that were freely available and completely invertible, it would make it easier for us native abjadists and alphabetists to dabble in Devanāgarī and much easier for us to study Devanāgarī by ourselves online.
Please help me to hone it or to consolidate it instead of just deleting it en masse. Warmest Regards, :)—thecurran Speak your mind my past 07:33, 30 September 2017 (UTC)
@Thecurran: Why would you do all this!? Hours and hours spent for no real reason... did you not think to even use a tool such as [6]? Or even the built-in Wikimedia Devanagari keyboard (accessible with ctrl-M)? You're treating Devanagari totally wrong. No one learns Devanagari using the method you described, so I doubt you "could save several days of other people's lives". There are only ~30 consonants, ~20 vowels and ~19 vowel matras. There are not thousands and thousands of ligatures; these are just combinations of consonants and matras. You clearly know nothing about abugidas. —Aryaman (मुझसे बात करो) 15:14, 30 September 2017 (UTC)
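To illustrate the point that these units are just a consonant followed by a matra, here is a small Python sketch. The helper name `cv_table` and the short sample lists are assumptions for illustration only; they are a handful of letters and vowel signs, not a complete inventory of the script.

```python
def cv_table(consonants, matras):
    """Pair every consonant with every matra by simple concatenation."""
    return [consonant + matra for consonant in consonants for matra in matras]

# A small sample only: KA, KHA, GA and the matras AA, I, II, U.
sample_consonants = ["\u0915", "\u0916", "\u0917"]
sample_matras = ["\u093E", "\u093F", "\u0940", "\u0941"]

table = cv_table(sample_consonants, sample_matras)
print(len(table), "combinations from",
      len(sample_consonants), "consonants and", len(sample_matras), "matras")
print(" ".join(table))
```

The number of combinations is the product of the two list sizes, so the whole table follows from the short lists of consonants and matras rather than needing to be looked up unit by unit.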
@Aryaman, You are being unnecessarily rude. Please stop assuming my idiocy, so that You can read what I wrote properly. I ploughed through thousands of words of text; not thousands of ligatures, because I couldn't find any catalog of compounds. Warmest Regards, :)—thecurran Speak your mind my past 16:44, 30 September 2017 (UTC)
@Thecurran: That isn't much better, when there are plenty of input tools that can do the same kind of work in seconds. Perhaps I am being rude, I just find it difficult to discuss rationally with someone who compares the script I use every day to hieroglyphics and cuneiform. —Aryaman (मुझसे बात करो) 16:48, 30 September 2017 (UTC)
Hindi is incredibly vital to humankind; I revere it and I would like to see it join the 6 official languages of the UN. I did not mean to cause offence or disrespect, so I apologize. That being said, please remember that most of the people on Earth cannot read Abugidas. Most people know only Abjads and Alphabets; or less. Please understand that Abugidas can be hard for those of us with no formal training. Please remember that Wiktionary is also for novices; not just experts like You. Where and when do You think we are supposed to learn that these tools exist? How are we supposed to figure out which ones are good? I did not compare Devanāgarī to cuneiform. Perhaps You could better understand the difficulty between yourself and me if I phrased it in terms of the Heisenberg Uncertainty Principle; every observation affects and is affected by both the observer and that which is being observed. We need to try to walk in one another's shoes; to see through one another's eyes; or progress becomes unlikely. Warmest Regards, :)—thecurran Speak your mind my past 17:04, 30 September 2017 (UTC)
He wasn't taking offense at your attitudes toward Hindi, but toward your making things seem unnecessarily arcane and complicated.
Simply put, there's no substitute for reading up on the target language to learn how the writing system and orthography work. That's what I did when I was teaching myself Hebrew and Greek back in my teens, long before I knew anything about linguistics. Providing a deceptively simplistic solution like this gives the impression that you don't need to do that.
There are lots of potential approaches to learning a new language or writing system, and most of them just make things worse. For instance, one could analyze English by making lists of words that have each possible two-letter sequence. That would capture digraphs like "th", but cases such as knothole would cause confusion, and sifting through combinations such as the "hr" in three would be a waste of time.
There are lots of approaches, but very few that are any good- let's not promote the unhelpful ones. Chuck Entz (talk) 17:56, 30 September 2017 (UTC)
@Thecurran Your replies involve a lot of tangential (and incorrect) arguments. The dependence of the outcome of observations on both the observer and observee is the observer effect in quantum physics, not Heisenberg's uncertainty principle. The involvement of cuneiform, Egyptian hieroglyphs and the Rosetta Stone is only marginally relevant at best; it doesn't answer the following concerns:
  1. That the inclusion of these 'Translingual syllables' is outside the scope of Wiktionary and unjustified ― I have illustrated using many examples in the replies above;
  2. That the headings of 'Syllable' are a misnomer ― they are 'cursor-separatable units of Devanagari';
  3. That it is combinatorially and logistically unfeasible to include all Sanskrit syllables, or all Devanagari units separated by the mouse cursor (or an unfamiliar person) that a user may search in Google;
  4. That the pronunciation on the page is misinformation ― it does not represent 'Translingual pronunciation of the sequence of characters in the title';
  5. That the alternative forms are an abuse of {{pi-alt}} and generate erroneous content on the pages.
Wyang (talk) 05:24, 1 October 2017 (UTC)

After over 90 years, 21-volume dictionary of Akkadian is completed and released for free (non-commercial)[edit]

https://oi.uchicago.edu/research/publications/assyrian-dictionary-oriental-institute-university-chicago-cad Justin (koavf) TCM 04:08, 30 August 2017 (UTC)

Thanks for sharing that. To be clear, the license they are using does not seem to be compatible with Wiktionary, so large-scale importing will not be possible. - TheDaveRoss 14:22, 30 August 2017 (UTC)
Just to be clear, that's old news. It's been up for a while. But thanks for spreading the awareness. --WikiTiki89 20:56, 31 August 2017 (UTC)

Germanic wa-stems[edit]

Can there be a category for Germanic wa-stems?
For Proto-Germanic, wa-stems could just be a-stems without any irregularity. But in MHG for example, terms derived from wa-stems (e.g. klē with gen. klēwes, and knie with gen. kniewes, knies) have a special declension.
There could be:

-80.133.104.168 13:31, 30 August 2017 (UTC)

No objection from me. --Victar (talk) 17:31, 1 September 2017 (UTC)
Support. I don't see why not. —Stephen (Talk) 23:50, 1 September 2017 (UTC)
Support, since we already do this for e.g. Old Saxon. KarikaSlayer (talk) 23:27, 2 September 2017 (UTC)
No objection from me either...Support Leasnam (talk) 16:43, 6 September 2017 (UTC)

LDLs: Unusual spellings[edit]

For an LDL, a single use is enough for attestation. But what if that use has an unusual spelling?
For example:

  • Low Germans usually use sch as in High German and Dutch (e.g. Low German Schipp and Minsch (in some dialects), High German Schiff and Mensch, Dutch schip and mensch, English ship and human). But there was at least one author who used sh (namely Robert Garbe who has Shinen, Shikksal).
  • Low Germans usually use s before l, n, m, w as in Middle High German (e.g. LG Snee and Swien, MHG snē and swīn, Dutch sneeuw and zwijn, Eng. snow and pig). But there was at least one author who used ß (Capital Sz) (namely Gloede in his Zutemoos who has Sznee, Szwien).

Should clearly unusual spellings be somehow excluded or should they be somehow normalised?
E.g. knowing Zutemoos' spellings, one could normalise his Sznee, Szwien etc. to Snee, Swien (for the lemma only, not in quotes of course). -80.133.104.168 13:31, 30 August 2017 (UTC)

They should be included, but as rare/alternative spellings of the more usual spelling. —Rua (mew) 14:30, 30 August 2017 (UTC)
  • Whether or not a language uses normalised spellings for entries should be noted on its respective W:A page. Normalisations usually only apply to historic languages and those with foreign scripts. Korn [kʰũːɘ̃n] (talk) 15:00, 30 August 2017 (UTC)

Megleno-Romanian orthography[edit]

I'm interested in possibly adding some Megleno-Romanian vocabulary, but there doesn't seem to be a standardized orthography in use, and spellings vary considerably. Add to that the fact that there is rather little actually written by the few remaining speakers themselves, and scant records overall. The DEX Romanian etymological dictionary uses one orthography (which I think some of the few existing entries on English Wiktionary are based on), but it seems a bit inconsistent and probably differs from what native speakers and other sources may use, as it tries to approximate the Romanian spellings to offer linguistic comparisons, rather than being a serious attempt to use a certain system.

Then there's the orthography on the Romanian Wikipedia article for the language: https://ro.wikipedia.org/wiki/Limba_meglenorom%C3%A2n%C4%83, as well as the Omniglot site's brief entry on it: http://www.omniglot.com/writing/meglenoromanian.htm, which uses the wiki info.

I also noticed that a substantial number of Megleno-Romanian words have already been added to the Occitan Wiktionary https://oc.wiktionary.org/wiki/Categoria:Mots_en_meglenoroman%C3%A9s_eissits_d%E2%80%99un_mot_en_latin, interestingly, but they seem to use a slightly different orthography than these, based on Petar Atanasov, 1990, Le mégléno-roumain de nos jours, Balkan-Archiv, Neue Folge, Beiheft Band 7, Hamburg, Helmut Buske Verlag.

The main differences between these systems seem to be in the use of characters like -ḑ- for -dz- or -z- and -ț- for -ts-, as well as -ǫ-, -ă-, -ạ-, and inconsistencies in -l'- vs -ľ-.

I know there's not many people here who can probably answer this or help out, but just in case, does anyone have any input on what should be done in approaching this obscure and very endangered language? I suppose the same could also be said of the Istro-Romanian language here as well. Word dewd544 (talk) 20:36, 31 August 2017 (UTC)

Something similar to what we have currently is the orthography of Theodor Capidan's grammar and dictionary. As far as phonology goes:
  • a e i o u /a e̞ i o̞ u/. Same with the long vowels, only longer.
  • /ɐ/ occurs only in unstressed, initial position.
  • ę /æ/? /ɛ/? occurs in one dialect area as a variant of ea.
  • ǫ /ɔ/? /ɒ/? is the M-R cognate of Daco-Romanian â, ă.
  • Falling diphthongs are au̯ eu̯ iu̯ ou̯, and rising i̯a i̯e i̯o i̯u.
  • There's a palatal ľ as well as a velar ł, which mostly occurs at the end of words.
  • ts /t͡s/ < old c before e, i.
  • /t͡ʃ/ < old c after e, i.
  • /d͡ʒ/ < old g before e, i.
  • ń is /ɲ/.
This is just from the phonology section. KarikaSlayer (talk) 16:17, 1 September 2017 (UTC)