User talk:Benwing2/2012-2019

Welcome!

Hello, welcome to Wiktionary, and thank you for your contributions so far. Here are a few good links for newcomers:

  • How to edit a page is a concise list of technical guidelines to the wiki format we use here: how to, for example, make text boldfaced or create hyperlinks. Feel free to practice in the sandbox. If you would like a slower introduction we have a short tutorial.
  • Entry layout explained (ELE) is a detailed policy documenting how Wiktionary pages should be formatted. All entries should conform to this standard; the easiest way to do this is to copy exactly an existing page for a similar word.
  • Our Criteria for inclusion (CFI) define exactly which words Wiktionary is interested in including. There is also a list of things that Wiktionary is not for a higher level overview.
  • If you already have some experience with editing our sister project Wikipedia, then you may find our guide to Wikipedia users useful.
  • The FAQ aims to answer most of your remaining questions, and there are several help pages that you can browse for more information.
  • We have discussion rooms in which you can ask any question about Wiktionary or its entries, a glossary of our technical jargon, and some hints for dealing with the more common communication issues.

Also, please add a BabelBox to your userpage so we can help you with the languages you'll be working in.

I hope you enjoy editing here and being a Wiktionarian! If you have any questions, bring them to the Wiktionary:Information desk, or ask me on my talk page. If you do so, please sign your posts with four tildes: ~~~~ which automatically produces your username and the current date and time.

Again, welcome!

RuakhTALK 14:07, 22 August 2012 (UTC)Reply

Vowel length

Please do not edit policy or guideline pages to reflect your personal opinion on this matter without discussing with other editors with experience in Ancient Greek entries first. —Μετάknowledgediscuss/deeds 02:15, 15 November 2013 (UTC)Reply

Hello, Benwing2. You have new messages at Metaknowledge's talk page.
You can remove this notice at any time by removing the {{talkback}} template.

Μετάknowledgediscuss/deeds 02:29, 15 November 2013 (UTC)Reply

I was the one who wrote many of the original orthography and transliteration standards, though they have undergone some changes in the intervening time. In any case, I'm happy to address some of your issues. However, I think we need to set some things straight first. To begin with, we have had a great many self-proclaimed experts come and go on this project. You must understand that the anonymous context of the internet forces us to treat claims of authority with a grain of salt. Additionally, assertions of what absolutely needs to happen right now simply won't do. Things are done here based on consensus. If you would like things to change, that's completely reasonable. However, you must present your evidence, and win allies with discussion. Personally, I think that vowel length and accent are real components of Ancient Greek phonology, and are something that merits note in our entries; however, I think it's important to understand what the purpose of transliterations is here on Wiktionary. Transliterations are never used here as a substitute for the original script, as they are in many other contexts. They are a pedagogic tool, used to help those who don't understand the original script, which they accompany. So, they are an approximation for the uninformed. A highly precise technical transliteration is unnecessary, and serves only to confuse those whom it is meant to help. -Atelaes λάλει ἐμοί 03:04, 15 November 2013 (UTC)
Sorry to barge in like this but this issue with vowel length is one of many issues with Wiktionary which (esp. compared to the English Wikipedia) make it look rather amateurish. (Lack of references is another one.)
I understand your concern about self-proclaimed experts. But go look at my contributions on the English Wikipedia and you will see that I do actually know a bit about the subjects at hand. Ask User:CodeCat, User:Angr and others who contribute to Wikipedia linguistics/language articles about me, if you want.
I'm also guessing that you are not an expert in linguistics, but may have some Classicist knowledge of Ancient Greek. The Classicist viewpoint comes through in various things you say (denigration of transcriptions as an "approximation for the uninformed", insistence on use and importance of the original script, apparent unconcern with not noting vowel length explicitly in all cases). However, Wiktionary is a linguistic work; this goes especially for etymologies. Hence we need to be following linguistic standards, not Classicist standards.
On top of this, your statements about transcriptions are wrong on a number of counts:
  1. In technical linguistics articles esp. on historical linguistics and etymology, it is not reasonable to expect that readers can handle every script out there. Transcription (not "transliteration", which refers to letter-for-letter representation in Latin script, although for Greek the difference isn't too great) is the norm and is the only reasonable way e.g. for even a knowledgeable reader to handle the different languages. Hence, something like the etymology of Old Irish ibid "he drinks" that makes references to Latin bibō and pōtō, Greek pī́nō, Armenian ǝmpǝm, Sanskrit pibati, Old Church Slavonic piti will make everyone go crazy if they are written in four different scripts (Greek, Armenian, Devanagari, Cyrillic) with the expectation that the readers "should" know all these scripts and are "uninformed" (your words) if they don't.
  2. Furthermore, the problem here is that the original Greek script wasn't properly reflecting long vowels, either. This is evidently due to your assertion, made into policy, that vowel length doesn't need to be noted in the Greek script or transcription, a typically Classicist viewpoint that is quite reasonable in the context of interpreting a work of Ancient Greek literature but not appropriate to a linguistic work.
Where should this discussion take place? I'm not asserting, and never asserted, that this change must happen "right now", but it does indeed need to happen at some point, hopefully soon. I am almost positive that all the other linguists working here (I've seen CodeCat and Angr here, there must be others) will agree with me, so I imagine consensus is not too hard to reach on this.
For reference, compare what's done in Latin, Old English, Old High German, etc. where long vowels are always indicated in all uses of every word including in head words, even though the original texts didn't have length marks any more than the original Greek texts did. Greek should follow what every other language does.

Benwing (talk) 09:07, 15 November 2013 (UTC)Reply

If you would like to gain official consensus the Beer parlour is the appropriate place. There are indeed a number of other editors who seem to prefer the more involved transcriptions. I have held them off thus far, but it's quite possible that a determined and eloquent proponent could cause a shift in policy. Until such time, though, I would ask that you refrain from editing existing entries to conform to your view, as I will continue to undo such edits. If you wish to create new content, you are more at liberty to do so as you wish. -Atelaes λάλει ἐμοί 16:43, 15 November 2013 (UTC)Reply

Moving pages

We generally avoid redirects on Wiktionary, so when you move a page to correct the spelling, could you place {{delete}} on the redirect that's left behind? —CodeCat 10:47, 2 July 2014 (UTC)Reply

Will do. Does it matter where I put it in the redirect page? Presumably after the redirect itself, on the next line?

Rollback

Rollback link is very close to patrol link so I misclick them sometimes. I use a browser extension which enables me to select a screen region with a mouse and "click" all of the selected links at once upon release. Edits here are high volume and people often make mistakes... Cheers --Ivan Štambuk (talk) 12:16, 2 July 2014 (UTC)Reply

Old French

I'd be interested to know what your background is. Renard Migrant (talk) 17:58, 12 July 2014 (UTC)Reply

With {{fro-conj-er}}, could you fix it if possible to not need any parameters? Like {{fro-conj-er|dress}}, using Lua can't it deduce that the stem is dress by taking off the final -er? Renard Migrant (talk) 10:30, 24 July 2014 (UTC)Reply
You're right, this is possible. I'll look into it. Benwing (talk) 11:21, 24 July 2014 (UTC)Reply
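
The stem deduction requested above can be illustrated with a minimal sketch. It is written in Python rather than Lua, and the function name `infer_stem` is hypothetical; it is not taken from the actual template or module. It only assumes that the stem is the page name minus a final -er.

```python
def infer_stem(infinitive: str, ending: str = "er") -> str:
    """Return the stem obtained by stripping the infinitive ending.

    Hypothetical helper: assumes the stem is simply the infinitive
    without its final ending, as suggested in the thread.
    """
    if not infinitive.endswith(ending):
        raise ValueError(f"{infinitive!r} does not end in -{ending}")
    return infinitive[: -len(ending)]


print(infer_stem("dresser"))
```

With a helper like this, a call with no parameters could fall back on the page name, while an explicit stem would still be needed for verbs whose stem cannot be read off the infinitive.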
Please don't delete words that definitely exist. I have no idea why you would do that, I can only assume you haven't read WT:CFI#Attestation. Renard Migrant (talk) 12:19, 24 July 2014 (UTC)Reply
I have undone the deletions. They appear to be Anglo-Norman words, not standard OF words. Standard OF has -gier, not -ger. Benwing (talk) 12:21, 24 July 2014 (UTC)Reply
As I'm sure you know, there's no such thing as standard Old French. Standardization didn't exist yet, by a few hundred years at that. Renard Migrant (talk) 12:22, 24 July 2014 (UTC)Reply
But use of -gier is pretty consistent in Francien works. And there is such a thing as standard spellings in the handbooks. e.g. amer is standard, aimer is not. I actually question whether aimer is a spurious form based on the later language. Yes, you might find occasional places where 'aim-' and 'am-' intrude on each other, but that doesn't (IMO) justify having an entry for aimer. In general, I've been trying to correct a mess of mistakes, e.g. non-standard forms like herberger having entries while standard herbergier doesn't, or std forms being claimed as alternatives to non-standard forms, etc. The current situation has the appearance that someone didn't really know OF very well when creating the entries. Certainly the conjugations were completely and utterly wrong; whoever did them just copied Modern French declensions and hoped they were the same (oops ...). Since you complain, I will not delete these forms but I'll continue to redirect non-standard to standard forms, to try and reduce the chaos of these forms. Benwing (talk) 12:30, 24 July 2014 (UTC)Reply
These are scholarly standard forms. Basically the forms preferred by scholars. We include all words whether included by scholars or not. Less common forms are not mistakes! Do you propose to delete honor because we already have an entry for honour? If your 'corrections' involve deleting truthful information then please stop. If you're just not well enough informed on the subject, also stop. Renard Migrant (talk) 12:34, 24 July 2014 (UTC)Reply
If you want to nominate these forms for deletion, your rationale will have to be "these definitely exist but I don't like them". See what response you get from other people. Renard Migrant (talk) 12:35, 24 July 2014 (UTC)Reply
Look, I already told you I won't be deleting any entries. And my corrections aren't deleting truthful info. You're welcome to look over my changes and critique them if you really want. BTW I'm going to bed now so if you don't hear any more responses from me for awhile it's not because I'm ignoring you or anything but just because I need sleep. Benwing (talk) 12:51, 24 July 2014 (UTC)Reply
Sorry you're right. Renard Migrant (talk) 13:24, 24 July 2014 (UTC)Reply
We class Anglo-Norman as a dialect of Old French. See Template talk:xno. — Ungoliant (falai) 19:08, 24 July 2014 (UTC)Reply

Category:Pages with module errors

Seems to be Module:fro-verb's fault. Keφr 10:21, 28 July 2014 (UTC)Reply

Thanks, I fixed it. Benwing (talk) 19:53, 28 July 2014 (UTC)Reply

These are named contrary to our template naming customs. Also, User:CodeCat has been developing a category boilerplate infrastructure recently, into which it might be desirable to integrate these two. Keφr 08:05, 12 August 2014 (UTC)Reply

How are they supposed to be named? I couldn't figure that out from the link you posted. I named them based on Template:Spanish conjugation boiler. Is that also misnamed? Benwing (talk) 08:39, 12 August 2014 (UTC)Reply
I guess {{fro-preterite catboiler}} or something similar. And I never saw these templates, but I suppose yes; though these naming conventions are not actually strict policies. They are just a codification of some coding practices, some of which are relatively recent. Keφr 09:43, 12 August 2014 (UTC)Reply
How about {{fro-preterite type catboiler}} and {{fro-verb ending catboiler}}? Benwing (talk) 05:48, 13 August 2014 (UTC)Reply
Fine by me. Keφr 09:42, 13 August 2014 (UTC)Reply
OK, they've been changed. Benwing (talk) 10:55, 13 August 2014 (UTC)Reply

حقق

I created حَقَّقَ (ḥaqqaqa) because of your edits on حق. By the way, how much do you know about the Arabic language? --Lo Ximiendo (talk) 00:00, 15 August 2014 (UTC)Reply

I studied Arabic for a couple of years and I know the verb conjugations reasonably well. I wonder why they aren't automated? Seems like a perfect opportunity since the conjugations are so systematic. I've written code in other circumstances to generate Arabic verb conjugations and it isn't all that hard. Benwing (talk) 03:17, 15 August 2014 (UTC)Reply
Hi and welcome. You're more than welcome to take over the work on Module:ar-verb (there are many existing working templates too, which cover various conjugations). Even if the module doesn't provide transliterations, it would be great to have it. Please don't underestimate the amount of work required for this module to cover all types of conjugations. Pls add a Babel to your user page, so that people know which languages you speak. --Anatoli T. (обсудить/вклад) 23:18, 24 August 2014 (UTC)Reply
Hi Anatoli. You might have noticed I've done a bunch of changes to Module:ar-verb, generalizing the code (e.g. you can specify an arbitrary number of verbal nouns), finishing form I geminate (including the alternative jussive forms), and adding form II and III strong. It should be easier to expand from now on and it does provide transliterations, using Module:ar-translit. You're right that it's a lot of work to get all the conjugations. Potentially especially problematic are the hamzated ones. I think the best thing here is to write a module that substitutes the correct hamza seat based on the surrounding vowels. This is definitely possible, and there are detailed rules (which I wrote) on the Wikipedia page on "hamza". I'll look into adding Babel stuff; not quite sure how to do it but I'll look at some existing user pages. I already have this info on my Wikipedia user page. Benwing (talk) 01:14, 25 August 2014 (UTC)Reply
Oh, I didn't notice that you edited the module page. At some stage I just lost motivation. I've got Arabic grammar books though, so I can help with testing the module for specific conjugation types and might add some types, once all the infrastructure is there and we have some working examples. I won't be able to fix any issues with the wrong display for diacritics. I hope User:ZxxZxxZ can also help. Good luck! --Anatoli T. (обсудить/вклад) 01:23, 25 August 2014 (UTC)Reply
Thanks. I'm not sure what the issue is with the diacritics; I notice a comment about shadda + fatha getting displayed wrong, but I don't see this, regardless of whether I put the diacritics in shadda-fatha order or in the fatha-shadda order that you stuck in using dia.sh_a. Possibly this bug has been fixed in the software? Benwing (talk) 01:29, 25 August 2014 (UTC)Reply
BTW there's also a detailed Wikipedia page on w:Arabic verbs which I wrote awhile ago; it lists all the conjugations with all the weaknesses. It's largely in transliterated form so the hamza issue doesn't come up and isn't treated as a weakness. Benwing (talk) 01:32, 25 August 2014 (UTC)Reply
The diacritics bugs are not consistent and they are visible when testing with different OS and browsers. I think it's best to use the correct logical order and address the issues when they happen. Your WP page looks very good. The focus should be on the Arabic script, though, so hamzated verbs should take into account spelling changes. --Anatoli T. (обсудить/вклад) 01:47, 25 August 2014 (UTC)Reply
Thanks. Agreed on targeting the Arabic script. If the diacritic bugs are still there and simply requiring reversing the order of shadda-fatha and such, then the correct way to deal with them is to postprocess the output, applying the reversals as necessary. Do you see the errors on your machine? (If so, what is your OS and browser? I'm using Chrome on Mac OS X, and no problems for me.) Take a look at User:Atitarev/ar-conjug-I-geminate-test and tell me if you see the errors in any of the numerous forms with shadda-fatha (e.g. 'dalla' or 'dallā') or combinations with other short vowels. Benwing (talk) 02:29, 25 August 2014 (UTC)Reply
I currently see User:Atitarev/ar-conjug-I-geminate-test correctly on Windows 7, Firefox 31. --Anatoli T. (обсудить/вклад) 02:36, 25 August 2014 (UTC)Reply
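
The post-processing idea raised in this thread (reversing the order of shadda and a short vowel only where a rendering bug requires it) could look roughly like the sketch below. It is written in Python for illustration, not as module code; the function names are hypothetical, and only fatha, damma and kasra are handled.

```python
import re

SHADDA = "\u0651"
SHORT_VOWELS = "\u064E\u064F\u0650"  # fatha, damma, kasra


def vowel_before_shadda(text: str) -> str:
    """Rewrite shadda + short vowel as short vowel + shadda."""
    return re.sub(f"{SHADDA}([{SHORT_VOWELS}])", rf"\1{SHADDA}", text)


def shadda_before_vowel(text: str) -> str:
    """Rewrite short vowel + shadda as shadda + short vowel."""
    return re.sub(f"([{SHORT_VOWELS}]){SHADDA}", rf"{SHADDA}\1", text)


# 'dalla' stored in shadda-fatha order
dalla = "\u062F\u064E\u0644\u0651\u064E"
print(vowel_before_shadda(dalla) != dalla)
```

Keeping the stored forms in one logical order and applying a function like this only at output time means the workaround can be dropped without touching the conjugation code.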

About moving Arabic verbs

I moved two verbs and a noun to أصل from اصل. Would you like to create entries for the two verbs that are listed on the latter? --Lo Ximiendo (talk) 22:35, 31 August 2014 (UTC)Reply

Done. Benwing (talk) 22:40, 31 August 2014 (UTC)Reply
I actually mean the verbs تأصل and استأصل. --Lo Ximiendo (talk) 23:49, 31 August 2014 (UTC)Reply
I'm confused. If you can move those two verbs to where they belong, I can add the conjugations. Benwing (talk) 01:21, 1 September 2014 (UTC)Reply
I added the verbs already. *gulp* --Lo Ximiendo (talk) 11:02, 1 September 2014 (UTC)Reply
Thank you very much! I went ahead and added the conj. Benwing (talk) 11:26, 1 September 2014 (UTC)Reply

أجاب

I wasn't too sure about the imperfect, especially in the automated conjugation table that was given to the entry. Have you noticed that? I did. --Lo Ximiendo (talk) 10:21, 1 September 2014 (UTC)Reply

Noticed what? I just checked my verb tables and it looks correct. I have tables for the verb أقام and the ones I generate for that verb look correct, and أجاب should follow exactly the same conjugation. Is there anything in particular that seems wrong to you?
BTW which automated tool are you using to do the edits such as you did on أجاب? I only know of AWB but usually it announces itself in edit entries. Benwing (talk) 11:14, 1 September 2014 (UTC)Reply
The ar-conj template gives out yujību instead of ar-verb's yajību. That's what I noticed. --Lo Ximiendo (talk) 11:24, 1 September 2014 (UTC)Reply
ar-verb is wrong, ar-conj is correct. Forms II, III, IV and Iq take prefixes with -u- in the active imperfect, whereas all the others take -a-. There may be lots of other errors in ar-verb but I'm pretty confident in the correctness of ar-conj. Benwing (talk) 11:30, 1 September 2014 (UTC)Reply
Maybe it's just the editor's fault that they used yajību instead of yujību? --Lo Ximiendo (talk) 11:33, 1 September 2014 (UTC)Reply
Probably ... I'm thinking actually that ar-verb needs to be automated like ar-conj so you don't have to type in any more info than what you type into ar-conj (except to clarify the radicals in a few cases), and it automatically figures out the radicals from the headword and generates the 3rd-person masculine singular past and non-past indicative. I added a comment to your talk page about this. Benwing (talk) 11:38, 1 September 2014 (UTC)Reply
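
The prefix-vowel rule stated in this thread (forms II, III, IV and Iq take -u- in the active imperfect, all other forms take -a-) is small enough to sketch directly. This is an illustration in Python with a hypothetical function name, not the code used by ar-conj, and it covers the active voice only.

```python
# Forms whose active imperfect prefix takes -u-, per the thread
U_PREFIX_FORMS = {"II", "III", "IV", "Iq"}


def imperfect_prefix_vowel(form: str) -> str:
    """Return the vowel of the active imperfect prefix for a verb form."""
    return "u" if form in U_PREFIX_FORMS else "a"


# Form IV, as in the verb discussed above: yu-jību, not ya-jību
print("y" + imperfect_prefix_vowel("IV") + "jību")
```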

تأمل

I also created تأمل to move it from أمل. Cheers. --Lo Ximiendo (talk) 10:08, 2 September 2014 (UTC)Reply

ar-verb forms for ط و ع

I think there should be a way to modify {{ar-verb forms}} so that it accommodates Arabic roots such as ط و ع. --Lo Ximiendo (talk) 10:30, 2 September 2014 (UTC)Reply

I don't have a very good understanding of all those templates. Can you explain how {{ar-verb forms}} is used? Do you call it directly or is it called from another template? Where is it used (in the headword line, etc.)?
However, all the code to handle all types of Arabic roots is already in Module:ar-verb. In the process of generating conjugation tables it generates all the forms that {{ar-verb forms}} generates and it handles all the types of roots and in general does all sorts of things way better than any of the current templates. Notice for example that in a non-form-I verb, all I have to do is write e.g. {{ar-conj|III}} and it automatically infers the appropriate radicals and generates all the forms, with all the vowels and also automatically transliterated. There's no reason that {{ar-verb}} couldn't take similar parameters and automatically generate the vocalized head word, the vocalized 3rd-person masculine singular imperfect indicative to display in the headword line, plus automatic transliteration, etc. Benwing (talk) 11:03, 2 September 2014 (UTC)Reply
You could have a look at ج ه د for an example of {{ar-verb forms}} at work. --Lo Ximiendo (talk) 11:56, 2 September 2014 (UTC)Reply
I moved the red link verbs to their new homes, along with those that were already created. They also request definitions (maybe not the form I and II verbs?). --Lo Ximiendo (talk) 10:43, 3 September 2014 (UTC)Reply

Beer parlour

These discussions you are starting at the BP about Arabic templates don't really belong there. The BP is sort of like the Supreme Court in that discussions there should affect all of Wiktionary. --WikiTiki89 14:18, 3 September 2014 (UTC)

@Atitarev Actually you were the one who started the latest one. --WikiTiki89 14:20, 3 September 2014 (UTC)Reply

Two Arabic verb categories

I created Category:Arabic form-? verbs and re-created Category:Arabic geminate form-II verbs because they have members, but I can easily delete them if you think we shouldn't have them. In the latter case, you should make the corrections necessary so the entries don't get categorized in them. Thanks! Chuck Entz (talk) 15:47, 5 September 2014 (UTC)

Yeah, these categories should be there, thanks. The first one indicates a mistake in the entry (missing form= param) but it's still useful. I have no idea why I deleted the second one. Benwing (talk) 18:01, 5 September 2014 (UTC)Reply

Entries created from the list of Arabic Quranic Verbs

Hi, in case you haven't noticed, I created the verb مَكَثَ (makaṯa) from the second half (501-1000) of the aforementioned list. --Lo Ximiendo (talk) 13:17, 14 September 2014 (UTC)Reply

Also created نَفِدَ (nafida) some time ago. --Lo Ximiendo (talk) 02:27, 15 September 2014 (UTC)Reply

Arabic collective nouns and their category

I wish {{ar-coll-noun}} gets its Category:Arabic collective nouns sorting back. Any thoughts about that? --Lo Ximiendo (talk) 13:04, 16 September 2014 (UTC)Reply

Fixed. Benwing (talk) 13:13, 16 September 2014 (UTC)Reply
Thank you. :) So it was just a simple bug... --Lo Ximiendo (talk) 13:22, 16 September 2014 (UTC)Reply
I don't know why, but the templates {{ar-coll-noun}} and {{ar-sing-noun}} now sort Arabic nouns into a single red link category instead of Category:Arabic collective nouns and Category:Arabic singulative nouns. --Lo Ximiendo (talk) 04:01, 19 September 2014 (UTC)
Oops. That is now fixed. Benwing (talk) 04:09, 19 September 2014 (UTC)Reply
Thank you again. Besides, I'm going on a vacation to Topsail Island and will be back in about a week (I think). --Lo Ximiendo (talk) 04:34, 19 September 2014 (UTC)
Have fun!!! Benwing (talk) 04:36, 19 September 2014 (UTC)Reply

Category:Old French verbs with partial overrides

Were you intending to use this category for anything? —CodeCat 20:38, 25 September 2014 (UTC)Reply

I went ahead and created it. It's intended to signal a particular practice that should be avoided as much as possible. Benwing (talk) 21:13, 25 September 2014 (UTC)Reply

Arabic head parameters

If I understand it correctly, all Arabic headword lines should eventually have this parameter? If so, then it may be more efficient to make it the first positional parameter. We've already done this for Russian, Ukrainian and Slovene, which need accent marks for most words. What do you think of this? —CodeCat 20:58, 5 October 2014 (UTC)Reply

Yes, all Arabic words should have it. However, there's a complication in that sometimes there are multiple possible vocalizations, which are currently implemented using head2=, head3=, head4=, etc. If we make head= the first positional parameter, what do we do about the remainder? One possibility is to allow multiple heads to be specified in a single head= parameter, separated by e.g. commas (this means in the unlikely case where a comma appears in a headword, it needs to be HTML-escaped, but that seems no big deal). It also shortens the typing effort. I suppose we could also have the first positional param be the head, and other ones still use head2=, head3=, etc.
Also keep in mind the effort required to fix all the various Arabic headword templates and usages of those templates if you make this change. Benwing (talk) 21:07, 5 October 2014 (UTC)Reply
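
The comma-separated alternative floated above (several vocalizations in one parameter, with a literal comma HTML-escaped) could be parsed along these lines. This is only a sketch in Python; the function name `split_heads` and the choice of `&#44;` as the escape are assumptions, not an existing template convention.

```python
import html


def split_heads(value: str) -> list[str]:
    """Split a comma-separated head value into individual headwords.

    Assumes a literal comma inside a headword is written as the HTML
    entity &#44;, which is decoded only after splitting.
    """
    parts = (part.strip() for part in value.split(","))
    return [html.unescape(part) for part in parts if part]


print(split_heads("first, second&#44;with-comma"))
```

Splitting before unescaping is what keeps an escaped comma from being treated as a separator.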
Yes I was thinking the only change would be head= to 1=, but the other headword parameters wouldn't change. This kind of "paradigm" is relatively common in Wiktionary templates. I am considering making Module:ar-headword for this. —CodeCat 21:12, 5 October 2014 (UTC)Reply
If you're willing to fix everything up yourself, go ahead. Keep in mind there are many templates in Category:Arabic headword-line templates that make use of the param head= in various ways, and would all need to be fixed. Benwing (talk) 21:15, 5 October 2014 (UTC)Reply
Yes, I'm aware of that. But it's fairly easy to rename and move around parameters with a bot, combined with tracking categories. —CodeCat 21:18, 5 October 2014 (UTC)Reply
OK. Benwing (talk) 21:19, 5 October 2014 (UTC)Reply
I've made the change to all Arabic headword-line parameters, except (for now) {{ar-nisba}}, {{ar-verb}} and {{ar-verb-part}}. It turned out that none of the templates used the 1= parameter for anything yet, so I didn't need to shift anything around. This means that for now, both head= and 1= work. But of course the former is deprecated now. Could you update the documentation of the templates? —CodeCat 23:34, 5 October 2014 (UTC)Reply
Done. Benwing (talk) 00:08, 6 October 2014 (UTC)Reply
Thank you. We could change more of the parameters to positional too. g= is probably a candidate, and maybe other {{ar-noun}} parameters too. —CodeCat 00:11, 6 October 2014 (UTC)Reply
I'm wary of too much of this. At least, there should be some logic to parameters that are positional so it's not just a random collection in a hard-to-remember order (or to remember which are positional and which aren't). Benwing (talk) 00:15, 6 October 2014 (UTC)Reply
A lot of templates already have the gender as the first positional parameter, and I noted above that for some, the headword is the first; the gender is the second then. So this is not so hard to remember. —CodeCat 00:20, 6 October 2014 (UTC)Reply
OK, if you're gonna write the bot code to fix up the calls, go ahead. Benwing (talk) 00:24, 6 October 2014 (UTC)Reply
Just to add... On Wiktionary, a somewhat general practice in writing templates is that the most frequently used and non-optional parameters are positional, while more rarely used or optional ones are named. In principle, every call to {{ar-noun}} should have a gender specified, so it's a good candidate for making it positional. That's actually the same reason I offered to make the headword parameter positional too. —CodeCat 00:30, 6 October 2014 (UTC)Reply

As it happens, in the case of gender, it could be made optional. The large majority of nouns have their gender in accordance with their ending, and we could potentially list only the exceptions. This is what Arabic dictionaries typically do, for example. Benwing (talk) 00:37, 6 October 2014 (UTC)

I think it's a good idea to add gender, anyway, even if it's largely predictable and can be loaded automatically. That way, Wiktionary will be better than other dictionaries, which don't show genders. I sometimes have doubts about nouns ending in ه (which may be silent or stand for ة), ا or ء. The noun gender for humans is often determined semantically, not by endings, and it's somewhat confusing for place and country names. --Anatoli T. (обсудить/вклад) 00:45, 6 October 2014 (UTC)
We could automatically determine gender for nouns in quite a few languages. But in practice we don't do this because gender tends to be somewhat unpredictable even then. Every language has exceptions. —CodeCat 00:47, 6 October 2014 (UTC)Reply

Arabic adjective genders

I noticed that {{ar-adj}} takes a gender parameter, but I'm not sure why. I imagine that Arabic adjectives, like those in Indo-European languages, take the gender of the thing they refer to. I examined the entries that provide this parameter, and apparently the vast majority specify g=m but a few have g=f. I don't know what the practices are regarding which form is considered the lemma, but if I assume right that it's the masculine singular form, then the entries which specify feminine gender should probably be looked at and converted into a {{feminine of}} type entry. Could you have a look at these? They're at Special:WhatLinksHere/Template:tracking/ar-head/adj g/f.

Regarding the remainder, would it be correct to eliminate the g= parameter altogether for {{ar-adj}}, and assume that all entries that use this template are masculine singular adjectives? —CodeCat 19:46, 6 October 2014 (UTC)Reply

Yes, you are right that masculine singular forms are the lemma, and gender in Arabic adjectives does work essentially like Indo-European languages. I think it's correct to eliminate the gender code from them. Those forms marked as feminine are non-lemma feminine singular forms and should be converted as you specify. Benwing (talk) 01:05, 7 October 2014 (UTC)Reply
But I don't know the corresponding masculine forms, so I don't know how to fix them. —CodeCat 01:07, 7 October 2014 (UTC)Reply
@CodeCat For all terms except أنثى and عليا, the masculine is the term minus the final ة, which acts as a feminine marker, e.g. أوروبية is a feminine form of أوروبي. Yes, they should use {{feminine of}}. --Anatoli T. (обсудить/вклад)
@CodeCat All the terms should be fixed up. أنثى appears to possibly be feminine tantum and so I listed it just as an adjective (no gender); the others are listed as "adjective form"s with g=f and use of {{feminine of}}. Benwing (talk) 09:28, 8 October 2014 (UTC)Reply
I've removed the genders from adjectives now. But can you check something? The template {{ar-adj-color}} had feminine and plural forms, but which plural form is this? Is it the masculine plural or the common plural? —CodeCat 19:25, 8 October 2014 (UTC)Reply

Entries where the xxhead= parameter is not the xx= parameter + vowels

I'm working on removing the redundant xxhead= parameters now. But there are a few entries, listed at Special:WhatLinksHere/Template:tracking/ar-head/xhead/needed, where, if vowels are removed from the xxhead= parameter, the result is not identical to the xx= parameter. I don't know much at all about Arabic and my knowledge of the writing is very basic, so I'm not able to fix these. Could you have a look? —CodeCat 21:04, 6 October 2014 (UTC)
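
The check described above (strip the vowels from the vocalized value and compare with the unvocalized one) might be sketched as follows. This is a Python illustration, not the tracking code itself; the function names are hypothetical, and the diacritic range U+064B to U+0652 (tanwin, short vowels, shadda, sukun) is an assumption about which marks count as "vowels" here.

```python
import re

# Assumed set of vowel marks: tanwin, fatha, damma, kasra, shadda, sukun
DIACRITICS = re.compile("[\u064B-\u0652]")


def strip_vowels(text: str) -> str:
    """Remove Arabic vowel diacritics from text."""
    return DIACRITICS.sub("", text)


def head_matches_base(base: str, vocalized: str) -> bool:
    """True if the vocalized form reduces to the unvocalized one."""
    return strip_vowels(vocalized) == base


print(head_matches_base("صوت", "صَوْت"))
```

A mismatch usually points at a spelling difference in the base letters themselves (for example a different hamza seat), which is the kind of entry that needs a human to look at it.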

Benwing, are we making headwords with or without ʾiʿrāb? Also, do we need to add fatḥa before alif: إِمْتِحَان or إِمْتِحان? It seems to work without. I can fix some. --Anatoli T. (обсудить/вклад) 23:49, 6 October 2014 (UTC)
There are cases with irregular transliterations خَطَر xaṭar (x should be replaced with ḵ, ħ with ḥ) or missing vowels صَوت (missing sukūn), should be صَوْت, etc. --Anatoli T. (обсудить/вклад) 23:54, 6 October 2014 (UTC)Reply
(Edit conflict) I have fixed most of them, not sure why سريع is still appearing there. I don't know the vowels for the second word in شبح ظل. Is it a SoP? --Anatoli T. (обсудить/вклад) 00:37, 7 October 2014 (UTC)Reply
We don't currently have any consensus about whether to put ʾiʿrāb in headwords. What do you think? There seems to be a sort-of convention to include ʾiʿrāb in noun headwords but not in the transliterations, but that will require some special-case hacking to distinguish verbs from nouns, since we do want the ʾiʿrāb in verbs. Convention in most dictionaries seems to be to omit the ʾiʿrāb in nouns, but then diptotes need to be marked in some special fashion. For example, the Hans Wehr dictionary puts a subscript 2 by diptotes. Adding ʾiʿrāb is one way of indicating this. I guess probably we should include ʾiʿrāb and if necessary go ahead and include it in the transliterations as well, but I'm not sure.
There are unfortunately various systems being used for transliteration. x in place of ḵ and ħ instead of ẖ are some of the most common substitutions. If you look in Module:ar-translit you'll see I handle lots of transliteration conventions in the code that generates vowels from transliteration. This should be fixed with a bot.
Missing vowels should be added. This can be done from the transliteration usually. As for fatḥa before alif, there is special casing in the transliteration code to handle this case and a few other cases where there's no ambiguity when the vowels are omitted, but they should be there still. Benwing (talk) 00:29, 7 October 2014 (UTC)Reply
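As a rough illustration of the bot fix mentioned here, the following Python sketch normalises the two substitutions Anatoli lists above (x to ḵ, ħ to ḥ). The table and function name are hypothetical and far from exhaustive; this is not how Module:ar-translit does it.

```python
# Illustrative table only: the two substitutions named in this thread.
NONSTANDARD_TO_STANDARD = {
    "x": "\u1e35",       # x -> ḵ
    "\u0127": "\u1e25",  # ħ -> ḥ
}


def normalize_translit(tr: str) -> str:
    """Replace nonstandard transliteration letters one character at a time."""
    return "".join(NONSTANDARD_TO_STANDARD.get(ch, ch) for ch in tr)


print(normalize_translit("xa\u1e6dar"))  # the xaṭar example from this thread
```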
CodeCat (talkcontribs), I fixed the two cases I saw in Special:WhatLinksHere/Template:tracking/ar-head/xhead/needed. Benwing (talk) 00:33, 7 October 2014 (UTC)Reply
I think it's OK to put ʾiʿrāb in headwords AND transliterate it. It was already agreed on, I think. Not sure why inflected forms are not transliterated. I don't like tāʾ marbūṭa transliterated as "-a(t)". Not sure it was discussed and agreed on. I think it's better to use "-a" or "-atun" if ʾiʿrāb is given. The page about Arabic can teach the actual (pausal, informal) pronunciations. --Anatoli T. (обсудить/вклад) 00:37, 7 October 2014 (UTC)Reply
I'm ok with ʾiʿrāb in headwords and transliterations. Inflected forms aren't transliterated because of changes that CodeCat made; I've asked her to undo these changes or incorporate them into {{head}}. The transliteration of tāʾ marbūṭa as "-a(t)" should occur only when it appears as the first word in a multi-word expression. When appearing at the end of text, it should appear as "-a", and as "-atun" with ʾiʿrāb vowels. Benwing (talk) 00:45, 7 October 2014 (UTC)Reply
Thanks. In ʾiḍāfa, the genitive construct, it should be "-at", not "-a(t)" and "-āt", if it follows an alif. We discussed this as well. --Anatoli T. (обсудить/вклад) 00:50, 7 October 2014 (UTC)Reply
I think the best compromise about ʾiʿrāb is to de-emphasize them, either by graying them out—سَنَةٌ (sanatun)—or by superscripting them—سَنَةٌ (sanatun). The superscript currently looks ugly and I am not sure why. The remaining question is what to do with the fatḥatān-ʾalif ending, which when omitted leaves behind a long -ā. I think the best solution is to simply transliterate all fatḥatān occurrences normally—مَعًا (maʿan). --WikiTiki89 00:51, 7 October 2014 (UTC)Reply
So, you oppose ʾiʿrāb transliteration? It's easier to arrive at the pausal form than the other way around. ʾiʿrāb can simply be omitted in pronunciation, but users will know the full form and know which one is a diptote or triptote. Also, I just thought that it won't be possible to determine programmatically whether something is ʾiḍāfa or a noun + adjective. A flag could be used for that, I think. --Anatoli T. (обсудить/вклад) 00:55, 7 October 2014 (UTC)Reply
Graying them out is OK with me, if you don't like the normal way. --Anatoli T. (обсудить/вклад) 00:57, 7 October 2014 (UTC)Reply
The only thing about graying them or superscripting them is that it's a bit tricky to do this without adding manual tr= params everywhere, which defeats the purpose of automatic transliteration -- at least that's the case if we want to cite verbs with ʾiʿrāb. There are ways around this but they might not work consistently. (Alternatively, we could always gray, even for verbs.) As for fatḥatān-ʾalif, they should definitely appear normally as -an. Benwing (talk) 01:01, 7 October 2014 (UTC)Reply
Why is it tricky to do it without manual tr= params? The module can return html tags as part of the transliteration. --WikiTiki89 03:59, 9 October 2014 (UTC)Reply
The problem isn't with returning html from the module. What's tricky is if we want to gray out or omit ʾiʿrāb in nouns but not verbs, because the transliteration module doesn't know what's a noun and what's a verb. Verbs are traditionally cited in forms with full ʾiʿrāb -- certainly the dictionary form is. If you think we should gray out all ʾiʿrāb, including in dictionary-form verbs, then doing it automatically is not an issue. Benwing (talk) 04:26, 9 October 2014 (UTC)Reply
If we are graying out ʾiʿrāb, there is no reason not to also gray it out for verbs. The ʾiʿrāb on verbs is omitted in the same contexts as nouns (i.e. in pausal position or in colloquial speech). --WikiTiki89 04:32, 9 October 2014 (UTC)Reply
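A minimal Python sketch of the "gray everything, verbs included" option discussed here, assuming the transliteration has already been split into a stem and an ʾiʿrāb ending. The function name, the split into two arguments, and the inline style are all hypothetical, not the module's actual output.

```python
from html import escape


def gray_out_irab(stem: str, irab: str) -> str:
    """Return HTML with the ʾiʿrāb ending wrapped in a gray span."""
    if not irab:
        return escape(stem)
    return f'{escape(stem)}<span style="color: gray">{escape(irab)}</span>'


print(gray_out_irab("sanat", "un"))
```

Because the same rule is applied regardless of part of speech, nothing in this sketch needs to know whether the word is a noun or a verb.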
Anatoli -- The problem with transliterating -at in the genitive construct is that it's not programmatically obvious when such a construct occurs and when it doesn't, e.g. غُرْفَة البَيْت vs. الغُرْفَة الكَبِيرَة. Benwing (talk) 01:01, 7 October 2014 (UTC)Reply
Ah, I see you noticed this too. Benwing (talk) 01:08, 7 October 2014 (UTC)Reply
Yes, in that case, just "-a(t)" is fine, since it's not possible to determine if they are ʾiḍāfa or a noun + adjective. --Anatoli T. (обсудить/вклад) 05:24, 8 October 2014 (UTC)Reply
In fully vowelated text, ʾiḍāfa is easy to identify as it lacks both nunation and the definite article. --WikiTiki89 03:59, 9 October 2014 (UTC)Reply
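The heuristic in this comment could be sketched in Python as follows. It assumes fully vowelled input, the function name is hypothetical, and the article test is crude: any word that merely begins with the letters ال is treated as definite.

```python
TANWIN = {"\u064b", "\u064c", "\u064d"}  # fatḥatān, ḍammatān, kasratān
DEFINITE_ARTICLE = "\u0627\u0644"        # ال


def may_head_idafa(first_word: str) -> bool:
    """True if a fully vowelled word has neither nunation nor the article."""
    has_nunation = any(ch in TANWIN for ch in first_word)
    has_article = first_word.startswith(DEFINITE_ARTICLE)
    return not has_nunation and not has_article


ghurfa = "\u063a\u064f\u0631\u0652\u0641\u064e\u0629"  # غُرْفَة without a case ending
print(may_head_idafa(ghurfa + "\u064f"))            # ends in ḍamma: candidate construct
print(may_head_idafa(ghurfa + "\u064c"))            # ends in ḍammatān: indefinite
print(may_head_idafa(DEFINITE_ARTICLE + ghurfa + "\u064f"))  # definite
```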
Yes, you're right. That means we need to provide full vowels. For terms without ʾiʿrāb it's OK to leave "-a(t)" if greying out is not used. --Anatoli T. (обсудить/вклад) 04:13, 9 October 2014 (UTC)Reply
I agree. Graying out is only relevant with ʾiʿrāb anyway. --WikiTiki89 04:17, 9 October 2014 (UTC)Reply
The display of "-a(t)" already only occurs without ʾiʿrāb; it only occurs when ة is at the very end of the word followed by a space, or when ʾiʿrāb display is turned off and a space follows. If ʾiʿrāb vowels are supplied, ة is always displayed as a "t". However, your suggestion is useful when graying out to determine whether to gray out the "t". Benwing (talk) 04:26, 9 October 2014 (UTC)Reply
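The three outcomes for tāʾ marbūṭa described in this thread ("-a(t)", "-a", and "-at" plus case ending) can be summarised in a small Python sketch. The function and its parameters are hypothetical and stand in for whatever the transliteration module actually inspects.

```python
def render_ta_marbuta(irab: str = "", word_follows: bool = False) -> str:
    """Transliterate a word-final tāʾ marbūṭa.

    irab: transliterated case ending such as "un"; empty when none is written.
    word_follows: True when a space and another word come next.
    """
    if irab:
        return "at" + irab   # the t is always shown when ʾiʿrāb is supplied
    if word_follows:
        return "a(t)"        # could be a construct or a noun + adjective
    return "a"               # end of text


print(render_ta_marbuta("un"))
print(render_ta_marbuta(word_follows=True))
print(render_ta_marbuta())
```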
All the old xxhead= parameters have now been removed, with their old values transferred over to the regular parameter name. —CodeCat 19:18, 8 October 2014 (UTC)Reply

Arabic genders of numerals, collective and singulative nouns

I've now converted all uses of the g= parameter to the second positional parameter. But I came across a few things that I wonder if you could clarify.

  • All singulative nouns I came across were feminine. If this is a rule, then I suppose it could be made the default. Are there any exceptions?
  • Most collective nouns were masculine, except for ذرة and بوم. Again, this is something that could be made the default, but there are apparently exceptions, unless those are errors, which I can't judge.
  • Currently, numerals also have a gender parameter. Do numerals have inherent gender like nouns, or do they adapt their gender like adjectives? Most of them were masculine, but some were both masculine and feminine.

CodeCat 22:05, 8 October 2014 (UTC)Reply

As far as I know, singulative nouns are always feminine. They are formed from collective nouns by adding the feminine ending -ة. Collective nouns are generally always masculine, and are distinguished by having a singular form but plural meaning. (The corresponding singulative noun has a singular meaning.) The Wehr dictionary doesn't indicate ذرة as collective, so I'm not sure why it's marked as such, and it does indicate بوم as collective, but not as feminine. So it's possible those are both errors.
Numerals in Arabic are complicated. It's rather like Russian, where the cardinal numbers become progressively more noun-like and less adjective-like as they get higher. I went through them all recently and marked gender, which I think is correct, but it's questionable because some forms are in between nouns and adjectives. "One" and "two" are pure adjectives; "three" through "ten" behave like nouns in that the corresponding noun (e.g. in "three men") is in the genitive plural, but they also agree in gender with the governing noun. 11 through 19 are similar but govern the accusative singular. 20 through 90 again govern the accusative singular but don't agree with the governing noun, or alternatively their form is invariable in gender, which is why I marked them as both masculine and feminine. 100 and 1000 are clearly pure nouns and govern the genitive singular; 100 is feminine and 1000 is masculine, which can be seen by the agreement of smaller numbers in forms like 300 and 3000, where the word for 3 is feminine in 300 but masculine in 3000. The whole system is a huge mess. Benwing (talk) 23:02, 8 October 2014 (UTC)Reply
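The government rules summarised in this comment can be laid out as a small Python lookup. This is only a restatement of the comment, with a hypothetical function name; values the comment does not cover (1, 2, compound numbers and so on) are deliberately rejected.

```python
from typing import Tuple


def counted_noun_form(n: int) -> Tuple[str, str]:
    """Return (case, number) of the counted noun, per the summary above."""
    if 3 <= n <= 10:
        return ("genitive", "plural")
    if 11 <= n <= 19:
        return ("accusative", "singular")
    if 20 <= n <= 90 and n % 10 == 0:
        return ("accusative", "singular")
    if n in (100, 1000):
        return ("genitive", "singular")
    raise ValueError(f"{n} is not covered by the summary")


for n in (3, 15, 40, 100):
    print(n, counted_noun_form(n))
```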
Maybe we should indicate numerals using the part of speech they actually belong to then, rather than "numeral". After all, if they really are nouns or adjectives, then we should mark them as such. Concerning collectives, I wonder if the template could have no gender parameter at all, and always assume that they are masculine with no way to override. That assumption is only valid if there are no exceptions of course. For singulatives, I've already done this. —CodeCat 23:06, 8 October 2014 (UTC)Reply
The issue with this is the forms that are partly noun-like and partly adjective-like, like 3 through 10 ... which do you declare them as? Benwing (talk) 23:09, 8 October 2014 (UTC)Reply
I don't really know. Numerals are always a bit strange that way in many languages, that's why we use the "numeral" part of speech. It's kind of a catch-all for all the weirdity that goes on with such words in various languages. Of course that doesn't mean every single cardinal number term in a language has to be called "numeral". For example, miljoen (million) is marked as a noun, while duizend (thousand) and honderd (hundred) are both noun and numeral, and tien (ten) is a numeral only. So I'd suggest using adjective or noun for those where those terms clearly fit, and use numeral for the remainder?
And what about collectives? —CodeCat 23:15, 8 October 2014 (UTC)Reply
What happens in Russian re. numerals? That's probably the closest to Arabic. As for collectives, بوم is apparently masculine in reality. No indication that it's feminine in any of the three dicts I looked in. ذرة is claimed to be simultaneously collective and singulative in Lane's comprehensive and verbose dictionary. I don't know what to do about that. I guess make the gender default to masculine but let it be overridden. Benwing (talk) 23:33, 8 October 2014 (UTC)Reply
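A minimal Python sketch of the defaults proposed so far: singulatives always feminine, collectives masculine unless overridden. The function name and the "kind" strings are hypothetical, not template parameters.

```python
from typing import Optional


def default_gender(kind: str, override: Optional[str] = None) -> str:
    """Pick a gender code for a singulative or collective noun."""
    if kind == "singulative":
        return "f"              # no override: treated as always feminine
    if kind == "collective":
        return override or "m"  # masculine unless an entry says otherwise
    raise ValueError(f"unknown kind: {kind}")


print(default_gender("singulative"))
print(default_gender("collective"))
print(default_gender("collective", override="f"))
```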
OK, Russian has all numerals as just "numeral" or "cardinal number". 1 and 2 are given with masculine and feminine forms; 100, 1000, etc. are tagged with their inherent gender, and the in-between ones, which are gender-invariable like Arabic 20 through 90, are marked without gender. I think this is probably the right solution for Arabic as well. Most languages appear to be consistent in using "numeral" etc. for all numbers; Dutch is the odd case out apparently. The Russian entries are also very well documented, including extensive usage notes on all the complications, so I think they're a good model to follow. Benwing (talk) 23:45, 8 October 2014 (UTC)Reply
Thanks :) The complexity of Arabic and Russian numerals is often used in debates and comparisons. They are a bit similar in usage, only feminine and masculine are confusingly reversed in usage, where feminine خمسة is used with masculine nouns and masculine خمس with feminine nouns. Russian numerals usually use the genitive (singular or plural depending on the number). The number "one" is identical in usage, only Russian also has a neuter. --Anatoli T. (обсудить/вклад) 23:50, 8 October 2014 (UTC)Reply
I think more care should be taken regarding the part of speech. Dutch being the odd one out is not a good thing for the other languages I would say. German closely parallels Dutch for example so the entries should be similar. Dutch numbers don't inflect for gender or number, but the noun-ness is apparent from other syntactical structures. 100 and 1000 have plurals, for example. And "million" must be preceded by an article like any other counting noun (such as liter, dozijn (dozen), stapel (pile)). The entries themselves are a bit sparse, but w:Dutch grammar#Numerals goes into some detail. I've also tried to be exact for Proto-Slavic entries, so accordingly 1-4 are adjectives with full three-gender paradigms, 5-10 are feminine nouns with paradigms for only that gender. —CodeCat 23:55, 8 October 2014 (UTC)Reply
See Cherine's second post here [1]. Examples: عشر نساء "ten women" (masculine numeral with feminine noun in plural), ستة أيام "six days" (feminine numeral with masculine noun in plural). --Anatoli T. (обсудить/вклад) 23:57, 8 October 2014 (UTC)Reply
I just think it's a bit clunky to try to assign noun or adjective to numerals that don't behave quite as either. To assign "numeral" to 3-90 whereas "adjective" to 1 and 2 and "noun" to 100 up seems really ugly. All are numerals; they also behave similar to adjectives and/or nouns, but with enough special cases that this should probably be treated as usage info. For example, the word مئة "hundred" behaves mostly as a noun, but irregularly has a plural that's the same as its singular, which no other feminine noun does. Benwing (talk) 00:04, 9 October 2014 (UTC)Reply
Yes, treat them as numerals regardless of behaviour. We'll need to explain why خَمْسَة (ḵamsa) (feminine-looking numeral) is a masculine and خَمْس (ḵams) (masculine-looking numeral) is a feminine. Usage notes, appendix, something else? --Anatoli T. (обсудить/вклад) 05:18, 9 October 2014 (UTC)Reply
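The reversed agreement described in this section can be shown with a tiny Python sketch limited to the two forms of "five" cited above. The function name is hypothetical, and no attempt is made to generalise to other numerals.

```python
# Unvocalised forms of "five" as cited in this discussion.
FIVE_WITH_TA_MARBUTA = "\u062e\u0645\u0633\u0629"  # خمسة, feminine-looking
FIVE_BARE = "\u062e\u0645\u0633"                   # خمس, masculine-looking


def five_for(noun_gender: str) -> str:
    """Pick the form of "five" used with a noun of the given gender."""
    if noun_gender == "m":
        return FIVE_WITH_TA_MARBUTA  # feminine-looking form with masculine nouns
    if noun_gender == "f":
        return FIVE_BARE             # masculine-looking form with feminine nouns
    raise ValueError(f"unknown gender: {noun_gender}")


print(five_for("m"), five_for("f"))
```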

Plural of inanimate nouns

Hi,

Also @CodeCat I've edited رِيَاح شَمْسِيَّة (riyāḥ šamsiyya) and رِيَاح نَجْمِيَّة (riyāḥ najmiyya). What I don't like is the gender "m-pl". Inanimate objects and animals in plural are grammatically feminine, aren't they (which is reflected in the adjectives used)? And there's no distinction between masc. and fem. plural for objects. "m-pl" and "f-pl" should probably only be used for humans, IMO. Did I miss anything? I can't use simply "p" for plural. --Anatoli T. (обсудить/вклад) 22:56, 8 October 2014 (UTC)Reply

I've edited Module:ar-headword so that it recognises "p" as the plural gender, rather than "m-p" or "f-p". —CodeCat 23:02, 8 October 2014 (UTC)Reply
Yes, plural inanimate objects take feminine singular agreement in Arabic, regardless of what their singular gender is. Plural adjectives are used only for people. I'm not sure about animals, might depend on whether they are higher or lower animals, who knows? Probably just "plural" is correct as the gender. Benwing (talk) 23:06, 8 October 2014 (UTC)Reply
OK, I noticed you deleted m-p and f-p as possibilities. They still apply to animate nouns, so should remain as possibilities. Benwing (talk) 23:48, 8 October 2014 (UTC)Reply
@CodeCat Yes, please. Inanimate plural nouns are grammatically feminine singular (referred to as "she" - "هي" and use feminine adj. endings, have "broken" plural forms for nouns) but not humans or some animals, which use "they" pronoun (there is a masculine and feminine "they" - "هم" "m" and "هن" "f") and use plural noun and adjective endings (broken and sound). --Anatoli T. (обсудить/вклад) 03:52, 9 October 2014 (UTC)Reply
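A minimal Python sketch of the distinction discussed so far: human plurals keep "m-p" or "f-p" and take plural agreement, while other plurals get plain "p" and take feminine singular agreement. The function names are hypothetical, and the single "human" flag is a simplification, since the status of animals is left open above.

```python
def plural_gender_code(human: bool, gender: str) -> str:
    """Headword gender code for a plural noun ("m-p", "f-p" or "p")."""
    return f"{gender}-p" if human else "p"


def adjective_agreement(human: bool, gender: str) -> str:
    """Agreement an adjective shows with a plural noun."""
    if human:
        return {"m": "masculine plural", "f": "feminine plural"}[gender]
    return "feminine singular"


print(plural_gender_code(False, "m"), "/", adjective_agreement(False, "m"))
print(plural_gender_code(True, "f"), "/", adjective_agreement(True, "f"))
```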
Why can't we simply consider non-human plurals to be grammatically feminine singular (f), rather than "plural" (p)? --WikiTiki89 04:22, 9 October 2014 (UTC)Reply
When plural nouns occur as dictionary entries, I think there should be some indication that these are plural rather than feminine singular. Perhaps they should be identified as plural inanimate. Benwing (talk) 04:30, 9 October 2014 (UTC)Reply
But what makes them plural other than their meaning? The meaning is indicated in the definition. Also, we should not use the word inanimate since this applies to animal plurals as well. --WikiTiki89 04:34, 9 October 2014 (UTC)Reply
The examples that Anatoli gave above were رِيَاح شَمْسِيَّة (riyāḥ šamsiyya) and رِيَاح نَجْمِيَّة (riyāḥ najmiyya), translated as "solar wind" and "stellar wind" even though the word "wind" in Arabic is plural. So the definition doesn't always indicate the plurality. The plurality is indicated in the fact that the word for wind is a broken plural. This explains why, e.g., a word that doesn't have a feminine ending has feminine agreement, and it also tells you that you can't pluralize these forms because they're already plural (contrary to English where terms "solar winds" and "stellar winds" exist and have the expected plural meaning). If you object to "inanimate" we could say "non-human" abbreviated "nonhum" or "non-hum" or something. Benwing (talk) 04:44, 9 October 2014 (UTC)Reply
So then other than their etymologies, what makes these examples "plural"? Take for example English crossroads, which is grammatically singular. Other than its etymology, there is nothing "plural" about it. The only thing in my mind that makes رِيَاحٌ (riyāḥun) plural is that it has a singular رِيحٌ (rīḥun). If رِيَاح شَمْسِيَّة (riyāḥ šamsiyya) does not have a singular, there is no basis left for me to call it a plural. (Now if we were discussing colloquial Arabic, these would all be grammatically plural and there would be no further confusion.) --WikiTiki89 04:56, 9 October 2014 (UTC)Reply
I agree with Benwing's argument that it should be marked as plural, even if it's a plurale tantum, which doesn't have a singular by definition. Some consider them feminine singular but I think it's better to treat broken plurals as plurals. A note in "About Arabic" on non-human plurals would suffice, I think. --Anatoli T. (обсудить/вклад) 05:07, 9 October 2014 (UTC)Reply
But what is it that makes it plural? That's what I really want to know. Saying "it is a broken plural therefore it is a plural" is just a circular argument (and a "broken plural" is really just a singular noun that is used in place of the plural). --WikiTiki89 05:32, 9 October 2014 (UTC)Reply
Convention or agreement between dictionary creators, if you wish. What do YOU wish to make them? Feminine singular? It's just another option. What about the fact that you can't make it plural, anymore, the etymology (plural for "wind") or that ALL non-human plurals behave like that, e.g. بُيُوت (buyūt)? It's not feminine sg but plural, isn't it? رِيَاح شَمْسِيَّة (riyāḥ šamsiyya) just doesn't have singular, if we consider it a plurale tantum. --Anatoli T. (обсудить/вклад) 05:51, 9 October 2014 (UTC)Reply
Well here's the problem (yes, it's theoretical, but this whole discussion is pretty theoretical): Suppose we have a word whose etymology is unknown or ambiguous, it is used with feminine-singular agreement, it itself has no plural and no singular, and it does not exist in the colloquial language. What criteria do we use to determine whether it is a feminine singular noun or a non-human broken plural? --WikiTiki89 06:02, 9 October 2014 (UTC)Reply
Something to be added here is that broken plurals often have a form that tells you they're broken plurals, e.g. أَرْوَاح "souls" (plural of روح) is of a traditionally plural form. Other examples are صَحَارَى "deserts" (origin of Sahara) and كُتَّاب "writers". In this case, رِيَاح is less obvious because you have singular كِتَاب with the same construction. In any case, if you really have something that has all the characteristics you describe, plus the fact that its form doesn't tell you whether it's singular or plural, and the fact that its meaning doesn't tell you that either, then you have no call to say something is singular or plural, that's all, and you'd have to go by what the dictionaries say or just omit it entirely. Benwing (talk) 06:52, 9 October 2014 (UTC)Reply
What about رُمَّانٌ (rummānun)? But I think you are right about أَرْوَاحٌ (ʔarwāḥun) and صَحَارَى (ṣaḥārā). If we look to other dictionaries, then the question remains about how those dictionaries determine whether the term is plural. And then this raises another question: Why do we need to know whether it is plural? In other words, what will our readers do with this information? --WikiTiki89 07:25, 9 October 2014 (UTC)Reply

Providing gender and plurality is important, IMO, even if it's only for the etymology. Broken plural forms seldom look like feminine singular but are used grammatically as such. If we don't provide this info, then users may ask for it, even if it doesn't make much difference for communication. If the gender or plurality is not known, it's fine to show "?" - meaning it's not known. --Anatoli T. (обсудить/вклад) 01:25, 10 October 2014 (UTC)Reply

حَيِيَ

Shouldn't the 3mp past of this kind of verb be حَيِيُوا (ḥayiyū) rather than حَيُّوا (ḥayyū)? --WikiTiki89 16:04, 21 October 2014 (UTC)Reply

The expected 3mp past would actually be حَيُوا (ḥayū). Take a look at رَضِيَ (raḍiya). The form حَيُّوا (ḥayyū) is explicitly given in John Mace's book on Arabic verbs. Barron's "201 Arabic Verbs" on the other hand has حَيُوا (ḥayū) without gemination; presumably one of these is a misprint. I can't find any other book that lists the full conjugation of this verb. Benwing (talk) 20:59, 21 October 2014 (UTC)Reply
Then are you sure that the conjugation at رَضِيَ (raḍiya) is correct? It seems to me that either the 3fs past should be رَضَتْ (raḍat) or the 3mp past should be رَضِيُوا (raḍiyū), but I may be wrong. --WikiTiki89 21:54, 21 October 2014 (UTC)Reply
I'm pretty sure the conjugation is correct. I'll take a look when I have access to my verb tables but I remember encountering this exact situation. The page on w:Arabic verbs also has this conjugation. Benwing (talk) 00:37, 22 October 2014 (UTC)Reply
I've verified that the conjugation is correct. Something similar happens with final-weak active participles ending in -in, where the -iy- drops before u and i in masculine plural -ūna and -īna but not before a in feminine plural -iyātun/-iyātin or dual -iyāni/-iyayni. Benwing (talk) 15:33, 22 October 2014 (UTC)Reply
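The dropping of -iy- described here can be sketched in Python on transliterated stems. The function name is hypothetical, and the rule is reduced to "drop -iy- before a suffix beginning with ū or ī", which is only what this thread states, not a full account of final-weak morphology.

```python
LONG_U = "\u016b"  # ū
LONG_I = "\u012b"  # ī


def attach_suffix(stem: str, suffix: str) -> str:
    """Join a stem ending in -iy to a suffix, dropping -iy- before ū or ī."""
    if stem.endswith("iy") and suffix[:1] in (LONG_U, LONG_I):
        return stem[:-2] + suffix
    return stem + suffix


print(attach_suffix("ra\u1e0diy", LONG_U))  # 3mp past: -iy- drops
print(attach_suffix("ra\u1e0diy", "at"))    # 3fs past: -iy- stays
print(attach_suffix("\u1e25ayiy", LONG_U))  # the "expected" form without gemination
```

Under this rule the stem of حَيِيَ yields the ungeminated form, which is why the shadda in the attested حَيُّوا still needs a separate explanation.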
In that case, I'm still confused why there is a shadda in حَيُّوا (ḥayyū); it makes sense in the conjugation of حَيَّ (ḥayya), but not in that of حَيِيَ (ḥayiya). --WikiTiki89 21:46, 25 October 2014 (UTC)Reply

Deletion requests

Could you explain your deletion requests such as this one: https://en.wiktionary.org/w/index.php?title=%D8%A7%D9%84%D9%85%D9%84%D8%A7%D8%A6%D9%83%D9%8A&diff=prev&oldid=29152226

Could you confirm that the form exists, and that information provided was correct? Note that a separate page is normal for all forms of words... Lmaltier (talk) 20:57, 25 October 2014 (UTC)Reply

There has been an agreement that the lemma form does not include a definite article unless it's an inherent part of the lemma, and that we don't include forms with added definite article unless it has some special meaning. It's similar to not including "the cat", "the dog", "the octopus", etc. as lemma entries. Benwing (talk) 05:17, 26 October 2014 (UTC)Reply
It's not the lemma form. But is it a form of the word? In Bulgarian, forms including the definite article are actual forms of the word, just like a plural form. Is it the same? What agreement do you refer to? Lmaltier (talk) 18:45, 26 October 2014 (UTC)Reply
The definite article is a clitic attached to the beginning of a word. In formal Classical Arabic, sometimes the ending also changes slightly, although usually without changing the unvowelled spelling under which words are entered in the dictionary. I brought this issue up in the Grease Pit I think, and asked whether these forms should generally be deleted, and there was agreement to do so. The definite form is not like the plural form in Arabic because the plural is often highly unpredictable whereas the definite is totally predictable by fairly simple rules. I think the situation is different in Bulgarian because for Bulgarian the definite isn't always predictable from the indefinite, e.g. sometimes the stress moves onto the definite. I personally think that only the few cases where the unvoweled spelling changes in the definite should be included; in all other cases the lemma can be found from the definite by simply removing the al- (Arabic ال) from the beginning of the word. Including these forms for all words would seem to clutter things up needlessly. Benwing (talk) 20:30, 26 October 2014 (UTC)Reply
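As a rough illustration of "removing the al- from the beginning of the word", here is a Python sketch that works on the unvowelled spelling. The function name is hypothetical, and the result is only a candidate, since some words begin with ال without being definite.

```python
DEFINITE_ARTICLE = "\u0627\u0644"  # ال


def candidate_lemma(unvowelled: str) -> str:
    """Strip a leading ال, if present, to get a candidate lemma spelling."""
    if unvowelled.startswith(DEFINITE_ARTICLE) and len(unvowelled) > len(DEFINITE_ARTICLE):
        return unvowelled[len(DEFINITE_ARTICLE):]
    return unvowelled


# The page title from the deletion request linked above.
print(candidate_lemma("\u0627\u0644\u0645\u0644\u0627\u0626\u0643\u064a"))
```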
A discussion is here Wiktionary:Beer_parlour/2014/October#Category:Arabic_definitive_nouns.3F.3F.3F. It's been a general consensus not to include words with proclitic definite articles. Besides the definite article, monosyllabic prepositions, consisting of only one written consonant, question marker أَ (ʔa), enclitic pronouns are also written without a space. They don't belong to the word. It's different from Bulgarian/Macedonian, Albanian and Scandinavian languages, where these endings are considered inflections. Korean particles and copulas are the same story - written together but don't belong to words. --Anatoli T. (обсудить/вклад) 21:27, 26 October 2014 (UTC)Reply

Arabic ǰuna

Hello. I am looking for an Arabic word transliterated as ǰuna, meaning perhaps “tanner, skin-dresser” or “hatter”. Does it exist and what is the spelling? It is needed for ճոն (čon). --Vahag (talk) 07:11, 26 October 2014 (UTC)Reply

It would be spelled جُنَة but I can't find any such word in any of my dictionaries. I looked at a lot of variations and there are words like jauna "disc of the sun" and jūn, jūna "bay" and junāh "sinners, gatherers" but nothing meaning "tanner" or "hatter". I even checked things like junʿa, junʾa, juʿna, juʾna, junẖa, juẖna, junha, juhna on the assumption that one of these weak consonants might have been omitted in borrowing but no such luck. Benwing (talk) 21:36, 26 October 2014 (UTC)Reply
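The set of variants checked here can be generated mechanically. The Python sketch below is illustrative only: the function name is hypothetical, and it simply inserts each of the four consonants listed above either before or after the n.

```python
from typing import List

WEAK_CONSONANTS = ["ʿ", "ʾ", "ẖ", "h"]  # as written in the comment above


def weak_consonant_variants() -> List[str]:
    """Variants of juna with one possibly omitted consonant restored."""
    variants = []
    for c in WEAK_CONSONANTS:
        variants.append(f"jun{c}a")  # consonant after the n
        variants.append(f"ju{c}na")  # consonant before the n
    return variants


print(weak_consonant_variants())
```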
Thanks, my source is possibly unreliable in this case. --Vahag (talk) 09:13, 27 October 2014 (UTC)Reply

Arabic phrasebook entries

Hi,

I have fixed صَبَاح الخَيْر (ṣabāḥ al-ḵayr) and صَبَاح النُور (ṣabāḥ an-nūr) as examples of SoP entries, such as phrasebook entries. It's cumbersome to add links to individual words, though. --Anatoli T. (обсудить/вклад) 00:31, 28 October 2014 (UTC)Reply

forte possible

Wow...I must have been half-asleep! My source doesn't even say "fp", so I'm not sure where I got that from. But it does say "forte possible" without indicating the language. Quote: "forte possible. As loud as possible." This is from "The Modern Conductor" by Elizabeth Green. The source is a trusted standard for conductors. Bob the Wikipedian (talkcontribs) 13:33, 7 November 2014 (UTC)Reply

Oops, it says "possibile". Didn't even see the 'i' there! Bob the Wikipedian (talkcontribs) 22:01, 7 November 2014 (UTC)Reply
OK, well then forte possibile must be a real term (although it sounds odd to me). Benwing (talk) 09:26, 8 November 2014 (UTC)Reply

Automatic translit and entering Arabic vocalisation

Hi,

I noticed that you sometimes leave the manual transliterations, even on fully vocalised native Arabic words, why is that? Do you think it's still inaccurate, especially with tāʾ marbūṭa? Also, I'd like to share with you that I use Firefox plug-in "Character palette" to enter Arabic diacritics - highly recommended if you use Firefox. It's quite convenient and easy. :) --Anatoli T. (обсудить/вклад) 06:07, 14 November 2014 (UTC)Reply

Yeah, it's because of the tāʾ marbūṭa, so it gets rendered properly instead of as (t). I enter Arabic diacritics using the Arabic keyboard layout on the Mac, which has almost all the necessary stuff ... just missing dagger alif and hamzat al-waṣl. These ones, along with the left and right half-rings, get entered using the built-in Mac character palette (Control-Command-Space). If I find myself using Firefox, however, I'll definitely check out the "Character palette" plug in. Benwing (talk) 23:45, 14 November 2014 (UTC)Reply

ʾiʿrāb

Hi,

What's the deal with ʾiʿrāb? Are we supposed to use it in headwords and translations from English? I can see both - with and without. Is it still undecided? Sorry, don't remember the outcome of discussions. --Anatoli T. (обсудить/вклад) 04:41, 18 November 2014 (UTC)Reply

Also, marking hamzat al-waṣl is usually problematic but I see the module can handle the elision without the diacritic. Do you mark it? --Anatoli T. (обсудить/вклад) 04:42, 18 November 2014 (UTC)Reply
I still haven't quite decided what to do about ʾiʿrāb. Mostly, I've entered words without ʾiʿrāb because it looks strange to me to include it, and most existing entries don't include it. (The main exceptions are in {{ar-nisba}}, which includes ʾiʿrāb in its auto-generated entries, and in verbal nouns for verbs, which always have ʾiʿrāb in them.) I like the solution used in Hans Wehr, which leaves triptotes unmarked and marks diptotes with a superscript 2; possibly we could adopt this solution, and I could fix Module:ar-translit to ignore a superscript 2 when transliterating. What do you think?
As for hamzat al-waṣl, you're right that the translit module can generally manage to elide it when necessary, although I've still been inserting it. I don't feel strongly about this, though, and we could choose to leave it out. Why do you think it's problematic? Benwing (talk) 08:08, 18 November 2014 (UTC)Reply
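The superscript-2 idea could work roughly as in the Python sketch below: the mark is removed before transliteration and remembered as a flag. This is a hypothetical illustration with an invented function name, not a description of Module:ar-translit.

```python
from typing import Tuple

DIPTOTE_MARK = "\u00b2"  # superscript 2, as in Hans Wehr


def split_diptote_mark(head: str) -> Tuple[str, bool]:
    """Return the head without the mark, plus whether the mark was present."""
    is_diptote = DIPTOTE_MARK in head
    return head.replace(DIPTOTE_MARK, ""), is_diptote


# مصر followed by a superscript 2
print(split_diptote_mark("\u0645\u0635\u0631\u00b2"))
```

The cleaned string is what would be passed on for transliteration; the flag could drive a link to an appendix on diptotes.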
I think we should include them, and I have been doing so. --WikiTiki89 16:37, 18 November 2014 (UTC)Reply
I see we still have disagreement on this. Superscript 2 for diptotes is a great idea! Adding hamzat al-waṣl is no longer problematic but non-ligature form لله is not displayed correctly with any diacritic before or after or when alif is missing. --Anatoli T. (обсудить/вклад) 21:19, 18 November 2014 (UTC)Reply
I don't like the superscript 2 idea. If we want to be explicit in headwords, we can just put the word diptote with a link to an appendix. The ʾiʿrāb will look less weird once we start graying them out. Also, if we choose not to include ʾiʿrāb, should we make an exception for words like نَادٍ (nādin)? --WikiTiki89 22:58, 18 November 2014 (UTC)Reply
I still hesitate about ʾiʿrāb and am undecided myself, but if most entries don't have them, then we won't get consistency. We won't be able to grey it out in translations or other places with automatic transliteration, will we? Hans Wehr and the rare web references with vocalisation don't use them either. Terms like نَادٍ (nādin) could be exceptions.
A superscript 2 could link to a diptote appendix or About Arabic page. --Anatoli T. (обсудить/вклад) 23:28, 18 November 2014 (UTC)Reply
Yes we can gray them out with automatic transliterations. --WikiTiki89 23:37, 18 November 2014 (UTC)Reply
I implemented graying out ʾiʿrāb but I don't like the idea much because it can't really be done properly automatically. For example, adverbial accusatives need the ʾiʿrāb displayed normally, and in Koranic quotes we presumably want to do so as well. We also display ʾiʿrāb in verbs, among other things. I would rather display the ʾiʿrāb when it belongs in the translit and leave it out otherwise. I also think the graying out looks a bit strange in {{ar-nisba}} examples like عَرَبِيّ (ʕarabiyy). For cases like وَادٍ (wādin) we should probably make an exception and include the ʾiʿrāb; likewise for words like مُسْتَشْفًى (mustašfan). This is also the convention used in Wehr's dictionary; or at least, the translit includes the ʾiʿrāb, when it doesn't for most words. Likewise, this dictionary displays ʾiʿrāb in translit of verbs consistently, in adverbial accusatives and sometimes in phrases when it's necessary to clarify the case relations, but not otherwise. For cases like وَادٍ (wādin) I also try to put an entry at وَادِي (wādī) that says it's the construct state, given the way these words are normally pronounced.
As for displaying the word diptote, this isn't a bad idea although the problem is that it can't so easily be done for plural inflections, which are the most common cases of diptotes (well, I suppose it could, with some hacking of Module:headword, although it's not clear whether it will look bad). Benwing (talk) 23:48, 18 November 2014 (UTC)Reply
Plural inflections don't need to say that because they are a regular part of the grammar. It is only words that are lexically diptotes, such as مِصْرُ (miṣru) that need an indication. I don't think we should base everything off of Hans Wehr. Note that there is no logical reason why ʾiʿrāb should be included for verbs, but not for nouns. Also note that it is much easier for an Arabic beginner to remove the ʾiʿrāb than to add it. --WikiTiki89 00:06, 19 November 2014 (UTC)Reply
I don't think it would be easy, in practical terms, to add ʾiʿrāb to all entries consistently, unless someone commits to making a bot to do this. Re: it is much easier for an Arabic beginner to remove the ʾiʿrāb than to add it. Yes, totally, that's the main pro argument. --Anatoli T. (обсудить/вклад) 02:00, 19 November 2014 (UTC)Reply
If we decide to do it, then we can worry about how to do it. But it shouldn't be too hard anyway. Arabic doesn't have nearly as many entries as English or Russian, for example. --WikiTiki89 02:04, 19 November 2014 (UTC)Reply
Why don't plural inflections need it? Some broken plurals are diptotes, some are triptotes. For example, 4-character plurals of the form CaCāCiC and CaCāCīC are generally diptotes (including words like فَوَاكِه and جرَائِد which are based off of 3-character singulars), as are plurals of the form ʾaCCiCāʾ (and ʾaCiCCāʾ for geminate roots) and CuCaCāʾ, and words of the form CaCCān, generally intensive adjectives (but not words of the form CiCCān or CuCCān), and masculine color/defect/elatives of the form ʾaCCaC (and ʾaCaCC for geminate roots) and feminine color/defect adjectives of the form CaCCāʾ, and probably other cases as well. This is independent of the predictable declension of words in -ūn, -āt and -in, which are technically diptotes because they have only two distinct case forms but which have their own declensions separate from the normal diptote declension.
As for following Hans Wehr or not, John Mace's book on Arabic Verbs likewise includes ʾiʿrāb for verbs but not nouns and adjectives (including verbal nouns and participles), and the book "Introduction to Koranic and Classical Arabic" by Thackston does something similar, where verbs are transcribed with ʾiʿrāb as e.g. rajaʿa "to return" but nouns and adjectives are written with ʾiʿrāb only if they are diptotes, e.g. ğarīb- pl. ğurabāʾu "strange" (with a hyphen in place of the triptote ending -un). So I think there's a lot of precedent for something like this. I'm not opposed to the idea of writing ʾiʿrāb only for diptotes, as Thackston does; this would be an alternative to using a superscript 2. All of these books are likewise consistent in writing ʾiʿrāb for prepositions and particles, which I think is a good idea. I imagine one reason for this is that spoken MSA may be more likely to drop the case endings of nouns and adjectives than the ʾiʿrāb of verbs. Benwing (talk) 05:42, 19 November 2014 (UTC)Reply
Now I'm learning something new. I thought that CaCāCiC/CaCāCīC and the others you've listed were ordinary triptotes. I thought that only sound plurals were diptotes. I'm assuming that the patterns you mention use the same declension as the sound -āt plural (i.e. -u(n) for nominative and -i(n) for genitive and accusative)? But in that case, indicating them with ʾiʿrāb is probably not a good idea. Maybe we should just explicitly indicate the accusative case of diptotes in the headword line. --WikiTiki89 13:33, 19 November 2014 (UTC)Reply
They don't use the same declension as the sound -āt plural. They have indefinite nominative -u (no nunation), indefinite genitive/accusative -a, while the definite uses -u/-i/-a like for triptotes. This is the same declension as diptotes like مِصْرُ (miṣru) and أَحْمَدُ (ʔaḥmadu). This is all documented in w:Arabic nouns and adjectives. I imagine few Arabic speakers actually know these rules nowadays. Benwing (talk) 21:21, 20 November 2014 (UTC)Reply
Thanks! That explains my previous confusion about words like أَحْمَدُ (ʔaḥmadu). --WikiTiki89 15:39, 21 November 2014 (UTC)Reply
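
The declension rules stated above can be summarised in a small sketch. It only encodes what the thread says (indefinite diptotes take -u/-a/-a without nunation, definite forms take -u/-i/-a, and indefinite triptotes take -un/-in/-an); the table layout and the function name `decline` are illustrative assumptions, and the stems are given in transliteration.

```python
# Case endings in transliteration, as described in the discussion above.
# The table layout and the function name are illustrative assumptions.
ENDINGS = {
    # (declension, definite?) -> endings for nominative, genitive, accusative
    ("triptote", False): {"nom": "un", "gen": "in", "acc": "an"},
    ("triptote", True): {"nom": "u", "gen": "i", "acc": "a"},
    ("diptote", False): {"nom": "u", "gen": "a", "acc": "a"},
    ("diptote", True): {"nom": "u", "gen": "i", "acc": "a"},
}


def decline(stem, declension, definite=False):
    """Attach the case endings for the given declension to a transliterated stem."""
    endings = ENDINGS[(declension, definite)]
    return {case: stem + ending for case, ending in endings.items()}


if __name__ == "__main__":
    print(decline("miṣr", "diptote"))
    print(decline("kitāb", "triptote"))
```

The sketch does not add the definite article or handle the sound plural and -in declensions, which the thread treats as separate cases.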

I also have "Introduction to Koranic and Classical Arabic" with answer keys :). --Anatoli T. (обсудить/вклад) 05:52, 19 November 2014 (UTC)Reply

Arabic vowels and consonants

If I'm not mistaken, Arabic normally writes only consonants, but three of the consonant letters can also be used to indicate long vowels. Assuming that the word is fully vocalised, I wonder if there is a reliable way to tell whether a given consonant represents an actual vowel or its consonantal equivalent? I am asking this because I would like to write a function that extracts the consonants or vowels from a word. This means knowing which letters are vowels and which are consonants, obviously. —CodeCat 19:08, 25 November 2014 (UTC)Reply

For ي and و, if there is another short vowel written on them, then they are consonants, otherwise they are long vowels. If there is a sukuun (the null vowel) on them, it is debatable whether to analyze them as the second element of a diphthong or just as a consonant. For ا, the situation is a bit trickier. It almost always indicates a long vowel, but at the beginning of a word, it indicates an elidable epenthetic vowel before a consonant cluster. The tricky part is that if there is a prefix, the ا is still written but represents no sound at all (كَاسْمٍ (kasmin, like a name)) rather than a long vowel. This can be detected only by knowing that consonant clusters are forbidden after long vowels (except in the active participle of geminate roots, e.g. خَاصٌّ (ḵāṣṣun), but these have the very particular form C1āC2C2-). But I'm curious as to why you're writing this function. It may or may not be important to keep in mind that long vowels can interchange with semivowels within the same consonantal root (e.g. نُونٌ (nūnun) > تَنْوِينٌ (tanwīnun)). --WikiTiki89 19:33, 25 November 2014 (UTC)Reply
Re: why you're writing this function: pls, see Module_talk:ar-headword#Plural_forms.2C_dual_forms.2C_etc. --Anatoli T. (обсудить/вклад) 21:31, 25 November 2014 (UTC)Reply
One conceivable way is to use Module:ar-translit and then parse the transliteration. This already implements all the rules required to distinguish consonants from vowels. (Except that it doesn't handle cases like كَاسْمٍ (kasmin, like a name) but these won't show up in single lemmas -- this occurs only because ka- "like" is a clitic.) I don't know whether this is doable in reality, as you'd have to map back to the Arabic text somehow. If not then you should at least be able to reuse the code in Module:ar-translit that does the transliteration. Benwing (talk) 05:23, 26 November 2014 (UTC)Reply
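
The rule for ي and و given earlier in this thread can be sketched roughly as follows. This is a minimal illustration that assumes fully vocalised input; it is not the logic of Module:ar-translit. The function name `classify_glides` and the role labels are made up for the example, and treating a shadda as marking a consonant is an added assumption that the thread does not state. The special cases for ا are not handled.

```python
# Arabic diacritics and letters, written as escapes to avoid lookalike confusion.
FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
FATHATAN, DAMMATAN, KASRATAN = "\u064B", "\u064C", "\u064D"
SHADDA, SUKUN = "\u0651", "\u0652"
WAW, YEH = "\u0648", "\u064A"

SHORT_VOWELS = {FATHA, DAMMA, KASRA, FATHATAN, DAMMATAN, KASRATAN}
DIACRITICS = SHORT_VOWELS | {SHADDA, SUKUN}


def classify_glides(word):
    """Return (index, letter, role) for each waw/yeh in a fully vocalised word."""
    results = []
    for i, ch in enumerate(word):
        if ch not in (WAW, YEH):
            continue
        # Collect the diacritics written directly on this letter.
        marks = set()
        j = i + 1
        while j < len(word) and word[j] in DIACRITICS:
            marks.add(word[j])
            j += 1
        if marks & SHORT_VOWELS or SHADDA in marks:
            role = "consonant"  # shadda case is an assumption of this sketch
        elif SUKUN in marks:
            role = "ambiguous"  # diphthong element or consonant, as noted above
        else:
            role = "long vowel"
        results.append((i, ch, role))
    return results
```

Applied to the two examples from the thread, نُونٌ should yield a single long-vowel و, and تَنْوِينٌ a consonantal و followed by a long-vowel ي.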

Gender and number of adjectives

Hi,

Re: diff. Normally, adjectives (lemmas) don't display gender in the headword in any language. The masculine singular is used as the lemma, and other forms use form templates. --Anatoli T. (обсудить/вклад) 07:19, 28 November 2014 (UTC)Reply

The template is for non-lemma plural adjective forms, e.g. أَغْبِيَاء (ʔaḡbiyāʔ), which is the masculine plural of adjective غَبِيّ (ḡabiyy), and the change is made to reflect the fact that the most common usage will be with masculine (broken) plurals. For non-lemma forms it seems reasonable to display their gender. Benwing (talk) 10:35, 28 November 2014 (UTC)Reply

حذاء

Hi, What's the gender/number of حِذَاء (ḥiḏāʔ)? Hans Wehr says "(pair of) leather boots or shoes", plural أَحْذِيَة (ʔaḥḏiya). --Anatoli T. (обсудить/вклад) 01:42, 29 November 2014 (UTC)Reply

Hmmm, it's singular, I'm guessing masculine. Words ending in اء are often feminine but in this case it's not an ending but rather a form of the root consonant و, meaning that the word has the same pattern as كِتَاب. Benwing (talk) 05:09, 29 November 2014 (UTC)Reply

A few WingerBot clinkers

ريالات, فرميونات, كامرات, كواركات, and ميكروبات all have the same module error. Chuck Entz (talk) 06:39, 30 November 2014 (UTC)Reply

Thanks. Fixed them. Happened when the singular noun entry had a blank head. Benwing (talk) 08:38, 30 November 2014 (UTC)Reply

ذكرى and دنيا

Just curious etymologically, do you know why ذِكْرَى (ḏikrā) and دُنْيَا (dunyā) don't take tanween (i.e. why aren't they ذِكْرًى (ḏikran) and دُنْيًا (dunyan))? --WikiTiki89 16:57, 12 December 2014 (UTC)Reply

دُنْيَا (dunyā) is a nominalized feminine elative of دَنِيّ (daniyy, low), literally "the lowest (place)". It has the same pattern as كُبْرَى (kubrā), feminine of أَكْبَر (ʔakbar). It takes tall alif by the rule that alif maqṣūra is written as tall alif after yāʾ. I don't know what the proto-forms of these words are, nor of ذِكْرَى (ḏikrā), but I guess the reason for no tanween is that these words are underlyingly diptotes. Similarly, the masculine elative of دَنِيّ (daniyy, low) is أَدْنَى (ʔadnā) without tanween, underlyingly *ʾadnayu (cf. ʾakbaru) whereas a word like مَعْنًى (maʕnan, meaning) is underlyingly *maʿnayun (cf. maktabun). I'm not sure the reason why ذِكْرَى (ḏikrā) is a diptote. I'm also not sure why certain words like عَصًا (ʕaṣan, stick) have tall alif independently of a preceding yāʾ. (In the dialect of Mecca, alif maqṣūra was pronounced something like [eː] whereas tall alif was [aː].) Benwing (talk) 00:51, 13 December 2014 (UTC)Reply
I would guess that عَصًا (ʕaṣan) is underlyingly *ʿaṣawun, not **ʿaṣayun, which is why it has tall alif. You answered my question with "these words are underlyingly diptotes", "دُنْيَا (dunyā) is a nominalized feminine elative of دَنِيّ (daniyy, low)", and "I'm not sure the reason why ذِكْرَى (ḏikrā) is a diptote". Thanks! --WikiTiki89 02:47, 13 December 2014 (UTC)Reply
Good call on عَصًا (ʕaṣan). Benwing (talk) 08:13, 13 December 2014 (UTC)Reply

صري

What's the transliteration of صري, if the term is real? --Lo Ximiendo (talk) 16:09, 16 December 2014 (UTC)Reply

I don't think this word actually exists. It's not in my dictionary. Benwing (talk) 18:58, 16 December 2014 (UTC)Reply
The closest in Wehr appears to be مُصِرّ (muṣirr, persistent, resolute). Lane has a noun something like صِرِّي (ṣirrī) or صِرِّيّ (ṣirriyy) (?) meaning something like "a serious assertion, not a jest", occurring in the expression هِيَ مِنِّي صِرِّي (hiya minnī ṣirrī) in various variants meaning approximately "It is a serious assertion from me" said of an oath. It's clearly an archaic word which is why it isn't in Wehr. I still think we should delete this word. Benwing (talk) 19:10, 16 December 2014 (UTC)Reply
Then what's the transliteration of حماقة? --Lo Ximiendo (talk) 21:58, 17 December 2014 (UTC)Reply
@Lo Ximiendo fixed. --Anatoli T. (обсудить/вклад) 22:14, 17 December 2014 (UTC)Reply
Can anyone verify the term مندثر and its transliteration? --Lo Ximiendo (talk) 12:15, 18 December 2014 (UTC)Reply
Added translit. Benwing (talk) 11:24, 19 December 2014 (UTC)Reply

Arabic for "to fry"

Is the following term قلا really an Arabic verb that means "to fry"? --Lo Ximiendo (talk) 11:21, 16 January 2015 (UTC)Reply

@Lo Ximiendo Yes it is. I cleaned up the entry and added conjugation. Benwing (talk) 07:11, 17 January 2015 (UTC)Reply

شجرة التفاح and كوكا كولا

I'm not sure whether the module errors on these entries are your fault- but I'm pretty sure you can fix whatever the problem is. Thanks! Chuck Entz (talk) 22:22, 18 January 2015 (UTC)Reply

Fixed. Benwing (talk) 01:34, 19 January 2015 (UTC)Reply
Since كوكا كولا is indeclinable, like some loanwords and many words ending in alif, maybe the inflection table is unnecessary, but the header should say so and add an indeclinable category? --Anatoli T. (обсудить/вклад) 01:39, 19 January 2015 (UTC)Reply
A Russian indeclinable example is бюро́ (bjuró). Setting a header parameter to "-" adds the entry to Category:Russian indeclinable nouns. Just a suggestion, it may reduce the editing time. --Anatoli T. (обсудить/вклад) 02:02, 19 January 2015 (UTC)Reply

Requested entry

Hi, are you able to make an entry for مُفْتٍ (muftin), please? I'm not sure about the plural form and don't have my HW handy. :) --Anatoli T. (обсудить/вклад) 05:59, 19 January 2015 (UTC)Reply

Oops!

Please take a look at Category:Pages with module errors (currently 55 entries) and fix the problem. Thanks! Chuck Entz (talk) 14:45, 21 January 2015 (UTC)Reply

Sorry about that. Stupid typo. I wish the stuff in Category:Pages with module errors showed up faster. It seems to take quite a while for it to cycle through, longer than it used to. E.g. I fixed the error 20 minutes ago and still see all the pages listed. Benwing (talk) 07:41, 22 January 2015 (UTC)Reply
I just do null edits on all the entries. If you open a bunch of them in separate tabs, you can do things with the first ones you opened while the more recent ones are still getting around to responding, and keep doing each step that way until you're done- it averages out to only a second or two per entry on a reasonably fast computer. All those errors are cleared, but there's a single one with a new error. Chuck Entz (talk) 03:54, 23 January 2015 (UTC)Reply

schnitzel

Hi,

Do you know why شْنِیتْزَل is not working? All internal diacritics are there, although I'm not 100% sure it's a sukūn or kasra after šīn. It should probably be manually transliterated as "š(i)nitzal" but the automatic test fails. --Anatoli T. (обсудить/вклад) 00:06, 6 February 2015 (UTC)Reply

You had FARSI YEH in place of the YEH. If you change that, you get شْنِيتْزَل (šnītzal) and it works. Benwing (talk) 04:00, 6 February 2015 (UTC)Reply
Oops. Thank you. :) I wonder how I managed to get it there... --Anatoli T. (обсудить/вклад) 04:04, 6 February 2015 (UTC)Reply
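
A mix-up like the one above (FARSI YEH, U+06CC, in place of YEH, U+064A) is hard to see by eye but easy to detect by code point. The following is a minimal sketch using Python's standard `unicodedata` module; the `LOOKALIKES` table and both function names are hypothetical, and the table lists only two pairs for illustration.

```python
import unicodedata

# Hypothetical lookalike table: code points used in Persian and other languages
# that render much like a standard Arabic letter. Only two pairs are listed.
LOOKALIKES = {
    "\u06CC": "\u064A",  # ARABIC LETTER FARSI YEH -> ARABIC LETTER YEH
    "\u06A9": "\u0643",  # ARABIC LETTER KEHEH -> ARABIC LETTER KAF
}


def find_lookalikes(text):
    """List (index, found name, expected name) for each suspicious character."""
    return [
        (i, unicodedata.name(ch), unicodedata.name(LOOKALIKES[ch]))
        for i, ch in enumerate(text)
        if ch in LOOKALIKES
    ]


def fix_lookalikes(text):
    """Replace each lookalike with its standard Arabic counterpart."""
    return text.translate(str.maketrans(LOOKALIKES))
```

Whether such a replacement is appropriate depends on the language of the entry, since FARSI YEH is the correct letter in Persian text.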

سجن

Hi BW,

Why is the entry reporting "Arabic nouns with sound masculine plural" when it is a broken plural and not -ūn(a)/-īn(a)? Did I miss something or is it confusing سُجُون for an SMP? TIA. :) --Anatoli T. (обсудить/вклад) 13:02, 5 March 2015 (UTC)Reply

It thinks the -ūn ending indicates a strong plural. I fixed it by adding an explicit ":tri" (triptote) notation. This also occurs for a few other words ending in -n, e.g. عَيْن pl. عُيُون and قَرْن pl. قُرُون. Possibly I should add a check for the form فُعُون and treat it as a broken plural. Benwing (talk) 14:39, 5 March 2015 (UTC)Reply
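
The check for the form فُعُون suggested above could look roughly like this. It is a heuristic sketch in Python, not the code in the module: it assumes a fully vocalised plural and treats exactly "consonant + ḍamma + consonant + ḍamma + و + ن" as a broken plural, and any other word ending in ḍamma + و + ن as a sound masculine plural. The names `FUUN_RE` and `classify_un_plural` are invented for the example.

```python
import re

DAMMA = "\u064F"
WAW = "\u0648"
NOON = "\u0646"
LETTER = "[\u0621-\u063A\u0641-\u064A]"  # basic Arabic letters, excluding tatweel
TRAILING_MARKS = "[\u064B-\u0652]*"      # optional final case vowel or sukun

# C1 + damma + C2 + damma + waw + noon, i.e. the shape of سُجُون, عُيُون, قُرُون
FUUN_RE = re.compile(f"^{LETTER}{DAMMA}{LETTER}{DAMMA}{WAW}{NOON}{TRAILING_MARKS}$")
UN_RE = re.compile(f"{DAMMA}{WAW}{NOON}{TRAILING_MARKS}$")


def classify_un_plural(plural):
    """Guess whether a vocalised plural in -ūn is broken or sound masculine."""
    if FUUN_RE.match(plural):
        return "broken"
    if UN_RE.search(plural):
        return "sound masculine"
    return "other"
```

A sound masculine plural built on a two-consonant stem would be misclassified by this test, so an explicit override such as the ":tri" notation mentioned above would still be needed.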

Arabic etyma of Swahili terms

I don't have any Arabic resources nor do I know Arabic script, so I have been having a hard time finding the etyma of Swahili words that I've been adding that (to me, at least) seem very strongly like they derive from Arabic. I was wondering whether you'd be willing to help me out with finding the Arabic origins of words in Category:Swahili entries needing etymology, or at least recommend a good online resource for Arabic that I can search in Latin script. —Μετάknowledgediscuss/deeds 21:50, 9 March 2015 (UTC)Reply

I can try to help you. There are a whole host of dictionaries here: [2] and you can search by Latin using the search button, although they tend to be sorted by Arabic root, which requires that you have some knowledge of how Arabic words are structured, because Arabic roots are generally three consonants, with the vowels inserted between them. I don't see too many Arabic-looking words among the category you linked to above, although maskini definitely comes from Arabic مِسْكِين (miskīn). Benwing (talk) 07:13, 10 March 2015 (UTC)Reply
aidha is from أَيْضًا (ʔayḍan). Benwing (talk) 07:18, 10 March 2015 (UTC)Reply
sahani is from صَحْن (ṣaḥn). Benwing (talk) 07:44, 10 March 2015 (UTC)Reply
Thank you! There are also some entries, at least one of which I see you have noticed, that need some improving of their etymologies, like tajiri.
I'm pretty sure the following words in that category are from Arabic or via Arabic: ghasia, hafifu, hodari, imara, karamu, laini, ruhusa, shikamoo. If any of those have deducible etyma, that would be very helpful. I'll try the dictionary you linked to. —Μετάknowledgediscuss/deeds 08:05, 10 March 2015 (UTC)Reply
hafifu is possibly خَفِيف (ḵafīf, light, slight, thin). ghasia is possibly غَاشِيَة (ḡāšiya, misfortune, faint, stupor, attendants) (?). Can't find any obvious etyma for hodari or karamu. imara is possibly إِمَارَة (ʔimāra, emirate, authority, power), although this is a noun not an adjective, and the power talked about is power of command rather than physical power. laini is possibly لَيِّن (layyin, soft, feeble, tender, gentle, supple). ruhusa is definitely رُخْصَة (ruḵṣa, permission). shikamoo I have no idea about. Benwing (talk) 09:06, 10 March 2015 (UTC)Reply
Most of those match the regular sound changes, but imara seems off and ghasia would have turned out as *ghashia if that were the etymon, unless there is a dialectal form in Arabic with /s/. Thank you for all your trouble! —Μετάknowledgediscuss/deeds 17:02, 10 March 2015 (UTC)Reply
You're welcome! Sorry I couldn't find better etyma. As for ghasia, Arabic is pretty strict about keeping /s/ and /ʃ/ apart so I don't think there are any dialectal forms with /ʃ/ -> /s/ in them. Benwing (talk) 18:39, 10 March 2015 (UTC)Reply
I got my hands on some better Swahili resources, including an etymological dictionary. That said, I may have to learn Arabic script if I want them to be of any use to me in this regard. —Μετάknowledgediscuss/deeds 08:05, 11 March 2015 (UTC)Reply
If you put up some screen shots I might be able to help. Benwing (talk) 23:08, 11 March 2015 (UTC)Reply
Once I'm not terribly busy, I'll learn Arabic script so I don't have to be as reliant. Perhaps next week. —Μετάknowledgediscuss/deeds 07:46, 13 March 2015 (UTC)Reply

تحمل

Apparently the module disagrees with the conjugation type you gave it. Chuck Entz (talk) 08:57, 29 March 2015 (UTC)Reply

Thanks. The module was correct; I fixed the conjugation type. Benwing (talk) 09:27, 29 March 2015 (UTC)Reply

Automatic transliteration of بالـ

I thought this worked before, but I may be wrong: بِالتَّوْفِيق (bi-t-tawfīq). --WikiTiki89 15:26, 15 May 2015 (UTC)Reply

It seems to work if you change the alif into an alif waṣla. I don't think I ever got it working in the case you give. Benwing (talk) 06:46, 16 May 2015 (UTC)Reply
Yes, Module:ar-translit/testcases has a case with an ʾalif waṣla - بِٱلتَّأْكِيد (bi-t-taʔkīd). --Anatoli T. (обсудить/вклад) 09:22, 16 May 2015 (UTC)Reply
Ok. That's probably why I remember it working. --WikiTiki89 19:01, 19 May 2015 (UTC)Reply
Considering that "ٱ" is such a rare symbol, perhaps the rule should be that if "ال" follows a kasra or a ḍamma, then it should be considered an ʾalif waṣla? I don't know if it's hard to implement for Benwing. Shall I add بِالتَّوْفِيق (bi-t-tawfīq) (or similar) to test cases? --Anatoli T. (обсудить/вклад) 01:22, 21 May 2015 (UTC)Reply
OK, I implemented this. 19:58, 21 May 2015 (UTC)
Cool! Maybe we should have it work for مِائَة as well? Or is that too risky and not common enough to be beneficial? --WikiTiki89 21:31, 21 May 2015 (UTC)Reply
I think that's too much work for just one single case, and every new regex slows things down and risks leading to module errors on certain long appendix pages. Benwing (talk) 23:02, 21 May 2015 (UTC)Reply
Well I was just thinking of making the regex you just added less restrictive (i.e. changing {"([\217\143\217\144])\216\167\217\132", "%1\217\177\217\132"}, to {"([\217\143\217\144])\216\167", "%1\217\177"},), but like I said, that might be too risky and not worth it for such a rare case. --WikiTiki89 17:15, 22 May 2015 (UTC)Reply
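
For readers who find the byte escapes hard to follow: \217\143 and \217\144 are the UTF-8 encodings of ḍamma (U+064F) and kasra (U+0650), \216\167 is alif (U+0627), \217\132 is lām (U+0644) and \217\177 is alif waṣla (U+0671). The two variants discussed above can be sketched in Python as follows. This is an illustration of the substitution only, not the module's code, and the function name and the `restrict_to_article` flag are assumptions.

```python
import re

DAMMA = "\u064F"
KASRA = "\u0650"
ALIF = "\u0627"
LAM = "\u0644"
ALIF_WASLA = "\u0671"


def normalize_alif_wasla(text, restrict_to_article=True):
    """Rewrite plain alif as alif wasla after a damma or kasra.

    With restrict_to_article=True only alif + lam is rewritten (the rule
    as implemented above); with False any alif after damma/kasra is
    rewritten (the less restrictive variant that was considered).
    """
    if restrict_to_article:
        return re.sub(f"([{DAMMA}{KASRA}]){ALIF}{LAM}", rf"\1{ALIF_WASLA}{LAM}", text)
    return re.sub(f"([{DAMMA}{KASRA}]){ALIF}", rf"\1{ALIF_WASLA}", text)
```

With the restrictive rule, بِالتَّوْفِيق is rewritten but مِائَة is left alone; the looser rule would rewrite both, which is the risk mentioned above.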

Parameters of Arabic headword-line templates

I've been trying to figure out how the Arabic headword-line templates work and what parameters they take. It seems to me that many of them show a rather excessive number of forms on the headword line. For example, {{ar-noun}} apparently can list:

  1. construct state
  2. definite state
  3. oblique
  4. informal
  5. dual
  6. dual construct state
  7. dual definite state
  8. dual oblique
  9. dual informal
  10. plural
  11. plural construct state
  12. plural definite state
  13. plural oblique
  14. plural informal
  15. feminine
  16. feminine construct state
  17. feminine definite state
  18. feminine oblique
  19. feminine informal
  20. masculine
  21. masculine construct state
  22. masculine definite state
  23. masculine oblique
  24. masculine informal

This really is way way way too many forms to list on a single headword line. These templates should be trimmed down to only show the bare basics and the rest should be shown in an inflection table. —CodeCat 18:38, 19 May 2015 (UTC)Reply

Yeah, it's a lot of potential forms, but most of them aren't used. In practice only the forms that can't be predicted are listed, and that's a small number. At least, that's the practice I've been following, and it was more or less the same in the existing entries before I came along, so it's pretty consistent. Generally, for nouns, only the plural is given; dual is given only when it can't be predicted, which is fairly rare (basically, only nouns ending in -ā, where the dual can be either -awān or -ayān or sometimes both). Masculine is used only for feminine nouns referring to people, where there is a corresponding masculine noun. Construct state is given only for nouns ending in -in (which can appear in the singular or broken plural), and informal is similarly given for adjectives in -in (which can likewise appear in the singular or broken masculine plural; adjectives don't have a construct state). The reason for this is the -in is written with a diacritic ـٍ (two slanted lines below the letter), and hence doesn't appear in unvocalized text or in the unvocalized page title; whereas the construct state, informal and definite all appear with -ī, written with an extra letter ي. Giving both forms emphasizes and clarifies the relation between the two, esp. since many users may be more familiar with the version with attached ي. Overall, there are typically only a couple of forms listed in the headword line, and if there are more it's usually because there are multiple broken plurals (in the extreme case, رَاجِل (rājil, pedestrian, footsoldier) has 13!).
This means that at least the following could potentially be removed:
  1. oblique (always predictable)
  2. definite state
  3. dual construct and informal
  4. feminine construct and informal

Benwing (talk) 01:06, 20 May 2015 (UTC)Reply

The feminine and masculine forms probably have their own lemma page don't they? If so, then we don't need to list all the forms of them as they'll already be covered on that other page. As for the rest, I don't think we should be showing all possibly-unpredictable forms in the headword line. The idea of the headword line is to give a quick overview of the inflection, but listing all the irregularities is just too much. Consider for example what would happen if we tried that for Latin deus! So we really need to make a choice: which forms are the most essential and least predictable? Forms that are only unpredictable for a handful of words don't need to be in the headword line, that's what inflection tables are for. —CodeCat 01:12, 20 May 2015 (UTC)Reply
How about you find an example of an Arabic entry that has too many forms in the headword line (other than plurals), and then you can complain. Arabic doesn't have words that are as irregular as Latin deus. --WikiTiki89 12:59, 20 May 2015 (UTC)Reply
I agree with Wikitiki; in practice this isn't really a big issue. As for feminine and masculine forms, in the case of nouns yes they have their own lemma page, but in this case we don't list the forms of them. Feminine plurals are only given for adjectives and even then only sometimes, generally when the dictionary gives them (which is only for adjectives that can modify people and generally only when the feminine plural is irregular). I'd say, things aren't broke so let's not try to fix them. Benwing (talk) 03:47, 21 May 2015 (UTC)Reply
BTW what is the need for your latest changes to Module:ar-headword? Why the need to explicitly list def/def2/def3/def4 etc.? This seems very hacky, and something similar won't work for plurals, where there may well be more than 4 possibilities. The current code works fine without needing to do any of this. Benwing (talk) 03:53, 21 May 2015 (UTC)Reply

آجر

Do you know what the difference in meaning is between forms III and IV at آجَرَ (ʔājara)? I tried reading the definitions in both of the cited references, but the old-fashioned, dry, concise English makes no sense to me. --WikiTiki89 21:45, 2 July 2015 (UTC)Reply

Probably not that much. Form III usually takes a person as an object, so form III might have the meaning "hire out" rather than exactly "rent out", although according to Wehr (see this site, which has all the dictionaries: [3]) both "rent out" and "hire out" are possible form IV meanings as well. Benwing (talk) 09:23, 3 July 2015 (UTC)Reply
It seems that Wehr doesn't even have form III. And it seems that Lane was saying that there was confusion between the two due to the coincidence of the past tense forms. --WikiTiki89 17:17, 6 July 2015 (UTC)Reply

WingerBot

I note that your bot can remove redundant transliterations. Can it also delete transliterations for all items in Category:Terms with redundant transliterations/hy and Category:Terms with manual transliterations different from the automated ones/hy? --Vahag (talk) 16:09, 5 July 2015 (UTC)Reply

It could potentially do this. With Arabic it uses some sophistication to decide whether to remove the translit or canonicalize it. It sounds like this isn't necessary for Armenian? Can it just unilaterally remove all the translits in the words in those two categories? Benwing (talk) 00:57, 6 July 2015 (UTC)Reply
Yes, all transliteration should be blindly removed from everywhere. --Vahag (talk) 07:05, 6 July 2015 (UTC)Reply
The same with Category:Terms with redundant transliterations/xcl, Category:Terms with manual transliterations different from the automated ones/ka, Category:Terms with redundant transliterations/ka. If the transliteration remains in unusual places (e.g. the headword format is not standard for xcl), don't bother with additional coding. Just pick the lowest hanging fruit. --Vahag (talk) 07:10, 6 July 2015 (UTC)Reply
@Benwing, in case you wish to help other languages as well, here is a way to make things easier and more generic: for all languages in which manual (hardcoded) transliterations don't override the automatic ones, all manual transliterations could and should be removed from everywhere when templates are used. The list of language codes of such languages (ONLY THOSE) is in Module:links just after the line starting with "local override_translit". You'll see that Russian (and other Cyrillic-based Slavic languages), Hebrew, Yiddish, Hindi are NOT in that list, so their transliterations shouldn't be removed automatically. The list in Module:links is the list of languages for which nobody objects to using 100% automatic transliterations. --Anatoli T. (обсудить/вклад) 07:25, 6 July 2015 (UTC)Reply

──────────────────────────────────────────────────────────────────────────────────────────────────── @Atitarev @Vahagn Petrosyan I wrote the code to do it but I want to have you guys verify some things about it. I wrote the code so it can do all the languages listed in override_translit in Module:links, but currently I'm only having it do hy, xcl, ka, el and grc. I'm doing all the pages listed in the categories "Terms with manual transliterations different from the automated ones/X", "Terms with redundant transliterations/X", "X lemmas" and "X non-lemma forms". I have some logic to determine which parameters to remove from which templates (see below). Currently 275 lines of Python (not including a separate, already-existing generic helper library). A few issues:

  • Currently I have it written to remove sc=Armn from hy and xcl, and sc=Grek from el. I already do something similar with removing sc=Arab from Arabic-language templates, because it's definitely redundant in this case. Is it safe for hy, xcl and el? I don't remove sc=Geor from ka or sc=polytonic from grc because in each case there are two scripts listed in the relevant entry in Module:languages/data2 and Module:languages/data3/g. Does anyone know what these script parameters are used for? Is it only in the transliteration modules?
  • The code will change stuff in every namespace, including User, Talk, User talk, Wiktionary:Beer parlour/*, etc. Is this OK?
  • PLEASE VERIFY THE FOLLOWING (Vahagn): As well as removing tr= and sc=Grek and sc=Geor, I remove various numbered parameters from various hy and xcl templates, which appear to be unused translit params. Before removing any such param, I double-check to make sure non-Latin characters aren't present, but I'd like someone (e.g. Vahagn) to look over this list:
    • Even-numbered params from xcl-noun-*pl*
    • Odd-numbered params from hy-noun-* and remaining xcl-noun-*
    • Odd-numbered params from xcl-decl-verb as well as all the hy-* and xcl-* headword templates except for hy-letter (where param 1 doesn't appear to be used but is Armenian)
    • Even-numbered params from hy-conj
    • Odd-numbered params from xcl-conj
  • I have general code to remove all tr= parameters from FOO-* where FOO is any language code; but so far I need a special case for {{grc-alt}} with |dial=muk because in this case it makes a call to {{head|gmy}} for Mycenaean Greek, which isn't on the list of override_translit languages. Perhaps it should be? (Before I run the script with --save, I will check the full list of modified templates to see if there are any other such cases.)
  • Many templates still use the tr= parameter, passing it to {{head}}. I assume that it will be ignored and hence is safe to remove. These templates should be changed to ignore the tr parameter entirely. The list of templates I've found so far that do this are:
    • hy-particle, hy-personal_pronoun, hy-phrase, hy-postp, hy-postp-form, hy-prefix, hy-proper-noun-form, hy-suffix
    • xcl-adj, xcl-adj-form, xcl-adv, xcl-con, xcl-interj, xcl-noun-form, xcl-numeral, xcl-particle, xcl-postp, xcl-prefix, xcl-prep, xcl-pron, xcl-pron-form, xcl-proper_noun, xcl-proper-noun-form, xcl-root, xcl-suffix, xcl-verb, xcl-verb-form
    • axm-adj, axm-adv, axm-interj, axm-noun, axm-prefix, axm-suffix, axm-verb
    • ka-adj, ka-adv, ka-pron, ka-proper noun, ka-verb
    • oge-noun
  • A more complex issue is with the template ka-decl-noun. All the even-numbered parameters are translit parameters, but the template manually inserts them after the corresponding Georgian text rather than using auto-transliteration. As a result I don't touch these parameters; but the template should be rewritten to use auto-transliteration, so these params can be removed.


The list of params removed is:

  • No params removed whose value is - or contains non-Western chars, as described above
  • sc=Grek and sc=Geor as described above
  • odd/even params from some (Old) Armenian templates, as described above
  • tr= from all FOO-* templates where FOO is any of the processed language codes, except for some cases with grc-alt, as described above
  • tr= from all templates where 1=FOO or lang=FOO where FOO is any of the processed language codes, except for {{borrowing}}, where |lang= is ignored
  • tr1=, tr2=, etc. from {{suffix}}, {{prefix}}, {{confix}}, {{affix}}, {{compound}}

Benwing (talk) 07:31, 7 July 2015 (UTC)Reply
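
The tr= removal rules listed above can be illustrated with a much simplified sketch. This is not the bot's actual code: it handles only flat template calls (no nested templates or links), the function names are invented, and "non-Western chars" is approximated as "any letter whose Unicode name does not start with LATIN or MODIFIER LETTER", which is an assumption.

```python
import unicodedata

# Language codes being processed, per the description above.
PROCESSED_LANGS = {"hy", "xcl", "ka", "el", "grc"}


def is_latin_only(value):
    """True if every letter in value is Latin or a modifier letter (assumed test)."""
    for ch in value:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if not name.startswith(("LATIN", "MODIFIER LETTER")):
                return False
    return True


def strip_translit(template):
    """Remove tr= from a flat template call such as {{m|hy|...|tr=...}}."""
    if not (template.startswith("{{") and template.endswith("}}")):
        return template
    name, *params = template[2:-2].split("|")
    named = dict(p.split("=", 1) for p in params if "=" in p)
    positional = [p for p in params if "=" not in p]
    # Candidate language codes: 1=, lang= (ignored for {{borrowing}}),
    # and the FOO in FOO-* template names.
    candidates = set(positional[:1])
    if name != "borrowing" and "lang" in named:
        candidates.add(named["lang"])
    if "-" in name:
        candidates.add(name.split("-", 1)[0])
    if not candidates & PROCESSED_LANGS:
        return template
    kept = []
    for p in params:
        if p.startswith("tr="):
            value = p[3:]
            if value != "-" and is_latin_only(value):
                continue  # drop the manual transliteration
        kept.append(p)
    return "{{" + "|".join([name] + kept) + "}}"
```

The numbered translit parameters and the sc= removal described above are left out of the sketch; a real run would also need a proper wikitext parser rather than string splitting.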


OK, I'm now ignoring stuff with these prefixes: "User:", "User talk:", "Talk:", "Appendix talk:", "Template talk:", "Wiktionary:Beer parlour", "Wiktionary:Translation requests", "Wiktionary:Grease pit", "Wiktionary:Etymology scriptorium", "Wiktionary:Information desk"
I also had to special case "xcl-noun-ն-pl", "xcl-noun-ն-2-pl", "xcl-noun-ն-3-pl", "xcl-noun-ո-ա-pl", "xcl-noun-հայր", "xcl-noun-տէր", "xcl-noun-այր", "xcl-noun-կին", "xcl-noun-collnum-*", "xcl-noun" and "xcl-adj", which don't follow the above odd/even rules. Benwing (talk) 08:23, 7 July 2015 (UTC)Reply
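A minimal sketch of the two filters just described, assuming page titles and template names are plain strings. The function names `should_process` and `is_special_cased` are hypothetical, and the real bot may well organise this differently.

```python
from fnmatch import fnmatchcase

# Page-title prefixes that are skipped entirely.
IGNORED_PREFIXES = (
    "User:", "User talk:", "Talk:", "Appendix talk:", "Template talk:",
    "Wiktionary:Beer parlour", "Wiktionary:Translation requests",
    "Wiktionary:Grease pit", "Wiktionary:Etymology scriptorium",
    "Wiktionary:Information desk",
)

# Templates that do not follow the odd/even parameter rules.
SPECIAL_CASED = (
    "xcl-noun-ն-pl", "xcl-noun-ն-2-pl", "xcl-noun-ն-3-pl", "xcl-noun-ո-ա-pl",
    "xcl-noun-հայր", "xcl-noun-տէր", "xcl-noun-այր", "xcl-noun-կին",
    "xcl-noun-collnum-*", "xcl-noun", "xcl-adj",
)

def should_process(title):
    """Return False for pages under any of the ignored prefixes."""
    return not title.startswith(IGNORED_PREFIXES)

def is_special_cased(template_name):
    """Return True if the template is exempt from the odd/even rules."""
    return any(fnmatchcase(template_name, pat) for pat in SPECIAL_CASED)
```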

That is very impressive, thanks for agreeing to help.

  • Removing sc= is desirable. Please remove sc=Armn from hy, xcl, axm; sc=Grek from el; sc=Geor from ka; sc=polytonic from grc. It doesn't matter that the last two can be written in different scripts. Nowadays the scripts are auto-detected. They can be safely removed.
  • I confirm removing various numbered parameters from various hy and xcl templates as specified above.
  • I have added gmy to the transliteration override list. You don't need a special code.
  • Many templates still use the tr= parameter, passing it to {{head}}. I assume that it will be ignored and hence is safe to remove. Yes, please remove. I will change the templates gradually.
OK, thanks for adding gmy. I've changed things to remove all the scripts as above. It looks like the code is set now but there are an awful lot of pages; it may take a couple of days of running time to run in --save mode (which is a lot slower than not saving changes). BTW I'm impressed at the number of Armenian and Old Armenian lemmas (10,000 of the former, around 5,000 of the latter) and especially the quality of the pages, with the lengthy etymologies and references and all. I don't think I've added more than 1,000 or so Arabic lemmas, maybe 2,000 at the most (not counting around 3,000 auto-generated entries) and that took a whole lot of work, and the etymology sections are really crappy (often nothing more than "derived from such and such a three-consonant root"; it doesn't help that there simply don't exist any real Arabic etymological dictionaries, hard as that is to believe). Benwing (talk) 10:47, 7 July 2015 (UTC)Reply
I have noted your excellent work in Arabic. The poor state of Arabic lexicography is well-known. Do you have Badawi, Elsaid M.; Haleem, Muhammad Abdel, Arabic-English Dictionary of Qur'anic Usage, 2008? It is a recent and relatively good source. --Vahag (talk) 11:11, 7 July 2015 (UTC)Reply
PS. Please check your email. --Vahag (talk) 12:04, 7 July 2015 (UTC)Reply
Regarding the script tags: Nearly all script tags everywhere can be removed because the modules can infer them from the text itself. The only exceptions are when scripts are mixed in one template (which is a bad idea anyway) or when the entire text consists of punctuation marks or other characters that are shared by multiple scripts of that language. --WikiTiki89 11:39, 7 July 2015 (UTC)Reply
@Wikitiki89 Can you cite an example of a translit module that does script auto-detection? Mostly the ones I've looked at either ignore the script entirely, or they check for unusual scripts and pass things off to a different module (e.g. Module:grc-translit does that with Cypriot script). I haven't seen any that auto-detect, although it certainly should be possible. Benwing (talk) 12:06, 7 July 2015 (UTC)Reply
The translit modules only get the script if it is specified as a parameter and they are the only module interface that retains this old-fashioned behavior (see Thread:User_talk:CodeCat/Module:jdt-translit_errors). However, I have used explicit script detection in Module:jdt-translit (using findBestScript from Module:scripts). This is often unnecessary if you don't need script-specific logic in the transliteration. For correctly displaying scripts, script detection is used implicitly everywhere when it is not overridden by sc=; for example, Aramaic in Hebrew script (עלמא) and in Syriac script (ܥܠܡܐ) display correctly without script tags. --WikiTiki89 12:21, 7 July 2015 (UTC)Reply
Thanks. Benwing (talk) 13:04, 7 July 2015 (UTC)Reply
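To illustrate the general idea of inferring a script from the text itself, here is a small Python sketch. It is not how findBestScript in Module:scripts is implemented; `SCRIPT_RANGES` and `detect_script` are hypothetical, and the code-point ranges are simply the main Unicode blocks for each script.

```python
# Main Unicode blocks for a few scripts mentioned in this discussion.
SCRIPT_RANGES = {
    "Grek": [(0x0370, 0x03FF), (0x1F00, 0x1FFF)],
    "Armn": [(0x0530, 0x058F)],
    "Geor": [(0x10A0, 0x10FF)],
    "Hebr": [(0x0590, 0x05FF)],
    "Syrc": [(0x0700, 0x074F)],
}

def detect_script(text, candidates):
    """Pick the candidate script covering the most characters of `text`.

    Returns None when no candidate matches any character, which is what
    happens for text made up only of shared punctuation.
    """
    if not candidates:
        return None
    counts = {sc: 0 for sc in candidates}
    for ch in text:
        cp = ord(ch)
        for sc in candidates:
            if any(lo <= cp <= hi for lo, hi in SCRIPT_RANGES[sc]):
                counts[sc] += 1
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None
```

With Aramaic allowed in both scripts, `detect_script("עלמא", ["Hebr", "Syrc"])` picks Hebrew and the Syriac spelling picks Syriac, so no explicit sc= is needed in either case.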
Hi. Regarding {{ka-decl-noun}} you mentioned above: it is now obsolete. We use {{ka-infl-noun}} instead, which uses auto-transliteration and adds a postposition table. However, simply replacing all ka-decl-noun with ka-infl-noun is not a good idea, because we may lose contraction and get incorrect declension tables. This script by @Dixtosa correctly replaces the deprecated templates with the newer ones without losing any contraction info (while also ignoring manual transliteration). You might want to check that out. --Simboyd (talk) 12:04, 7 July 2015 (UTC)Reply
Thanks. I don't have time now to look into this; it also looks like that script needs to be rewritten so it can be run in a bot rather than interactively (I assume it's interactive since it's Javascript). However, if you need some translit parameters removed in any templates, let me know. Benwing (talk) 15:38, 7 July 2015 (UTC)Reply
correctDecl function should be working fine.--Dixtosa (talk) 16:43, 7 July 2015 (UTC)Reply
@Dixtosa Can you go ahead and run your bot to convert all the remaining deprecated Georgian declensions, and then delete the deprecated templates? Benwing (talk) 16:50, 7 July 2015 (UTC)Reply



30 changes in a row --Dixtosa (talk) 12:11, 9 August 2015 (UTC)Reply

@Dixtosa I suppose I should batch up the changes .... did it trigger some filter? Benwing (talk) 21:58, 9 August 2015 (UTC)Reply

Ͷ, ͷ

Hello Benwing. Your bot removed manual transliterations from Ͷ and from ͷ; in doing so, it introduced errors: Ͷ, ͷ qua Arcadocypriot tsan should be transliterated Ś, ś, whereas Ͷ, ͷ qua Melian beta should be transliterated B, b. Is there a way to prevent your bot from removing the manually overridden transliterations in those cases? — I.S.M.E.T.A. 18:02, 8 July 2015 (UTC)Reply

Sorry about that! I thought that since grc is on the list of languages with override_translit set, that manual transliterations are always ignored. I know that tr=- is paid attention to and don't remove it. Is there an exception for {{head}} or something? If so, I won't remove them and will have to figure out which other cases need to be undone ... Benwing (talk) 21:48, 8 July 2015 (UTC)Reply
@I'm so meta even this acronym Evidently {{head}} doesn't pay attention to override_translit. I tried to look through the other cases where |tr= was removed from {{head}}. The only case I found that arguably shouldn't have been removed was , which has one form with a long vowel and one with a short, and the translit reflected the vowel length even though Greek translit normally doesn't do this, e.g. μακρά has one pronunciation with a long final vowel and another with a short final vowel, and they're both translitted the same (and were even before my bot attacked the page). Benwing (talk) 22:56, 8 July 2015 (UTC)Reply
@Vahagn Petrosyan I found a few cases with boldface or italics in the |head=, which is now reflected in the translit; should we remove the boldface from the head?
Page 7468 Κύριλλος: Removed tr=Cýrillos: {{head|el|proper noun|head='''Κύριλλος'''|tr=Cýrillos|g=m|sc=Grek}}
Page 7562 δράματα: Removed tr=''drāmata'': {{head|grc|noun form|head=δρᾱ́μᾰτᾰ|tr=''drāmata''|g=n}}
Page 5096 Հայաստանի Հանրապետություն: Removed tr=Hayastani Hanrapetut'yun: {{head|hy|proper noun|head='''[[Հայաստան|Հայաստանի]]''' '''[[հանրապետություն|Հանրապետություն]]'''|tr=Hayastani Hanrapetut'yun|sc=Armn}}
Page 8030 Ռուսաստանի Դաշնություն: Removed tr=Ṙusastani Dašnut’yun: {{head|hy|proper noun|head='''[[Ռուսաստան|Ռուսաստանի]]''' '''[[դաշնություն|Դաշնություն]]'''|tr=Ṙusastani Dašnut’yun|sc=Armn}}
Page 3338 ვახშამი: Removed tr=vaxšami: {{head|ka|noun|head='''ვახშამი'''|tr=vaxšami|sc=Geor}}
Page 10509 ფათერაკი: Removed tr=p'at'eraki: {{head|ka|noun|head='''ფათერაკი'''|sc=Geor|tr=p'at'eraki}}
Benwing (talk) 23:00, 8 July 2015 (UTC)Reply
I don't know about the mechanics here; I just saw an isolated error. I fixed μᾰκρά; ObsequiousNewt did the same for . — I.S.M.E.T.A. 23:33, 8 July 2015 (UTC)Reply
It's probably worth noting that the manual transliterations don't usually include vowel length of α/ι/υ (or, to be honest, any useful information that automatic transliteration gives.) Which is why override_translit is on. Which is why it's probably safe to remove manual transliterations. If you want to be sure, though, check if the manual transliteration contains one of ā, ī, ū, and leave it alone if it does. —ObsequiousNewt (εἴρηκα|πεποίηκα) 01:58, 9 July 2015 (UTC)Reply
@ObsequiousNewt OK, I added that check and did a run; it output 250 warnings about long a/i/u on pages with the manual translit different from the automatic one. They're listed here: User:Benwing/grc-long-vowel-warnings If you want to fix them up, please do. Note that I already removed manual translit from all the Ancient Greek lemma pages, so there are surely many more cases that have already disappeared (I can create a list of those pages if you want). Benwing (talk) 04:24, 9 July 2015 (UTC)Reply
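The check discussed here could look something like the sketch below. The function name `should_keep_manual_translit` is hypothetical, and the real bot emits warnings rather than returning a boolean.

```python
import unicodedata

LONG_VOWELS = set("āīūĀĪŪ")

def should_keep_manual_translit(manual, auto):
    """Keep the manual transliteration if it differs from the automatic one
    and carries a macron on a, i or u."""
    manual = unicodedata.normalize("NFC", manual)
    auto = unicodedata.normalize("NFC", auto)
    return manual != auto and any(ch in LONG_VOWELS for ch in manual)
```

Normalising to NFC first means a macron typed as a combining character (a + U+0304) is still detected.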
User:Benwing/grc-more-long-vowel-warnings Benwing (talk) 04:55, 9 July 2015 (UTC)Reply
Those were formatting errors. I fixed them. The bot is doing a great job, thanks for running it. --Vahag (talk) 09:40, 9 July 2015 (UTC)Reply
You're welcome! Benwing (talk) 05:32, 10 July 2015 (UTC)Reply

Ancient Greek transcriptions

Why is your bot reädding the 4th parameter to {{grc-noun}} & {{grc-proper noun}}? Neither template uses the 4th parameter at all, and AG has a sorting function. —JohnC5 05:41, 15 July 2015 (UTC)Reply

The 4th parameter here is an old transliteration parameter, not a sorting parameter, and the cases I'm re-adding are exactly those where the Latin has a macron over a, i or u. I have other code that will transfer the macron to the Greek in the appropriate place and then remove the translit again. Benwing (talk) 06:55, 15 July 2015 (UTC)Reply
Except that the templates don't do anything with the 4th parameter: see diff and diff. Chuck Entz (talk) 08:03, 15 July 2015 (UTC)Reply
My mistake: it's the bot, not the template, that's supposed to be using it- never mind... Chuck Entz (talk) 08:11, 15 July 2015 (UTC)Reply
Fair enough. Sorry to pester. —JohnC5 13:22, 15 July 2015 (UTC)Reply
Hi, ιυ isn't a diphthong but a sequence of two vowels across a syllable break, so the first edit here was wrong: the smooth breathing should be over the ι and the acute over the υ. —Aɴɢʀ (talk) 09:49, 16 July 2015 (UTC)Reply

Hello Benwing. Could you explain why, in this edit, your bot replaced four coronides: ⟨᾽⟩ with 4 × ⟨ ̓⟩ (4 × [U+0020 SPACE, U+0313 COMBINING COMMA ABOVE])? That shouldn't happen, and I've rolled back its change. — I.S.M.E.T.A. 16:17, 16 July 2015 (UTC)Reply

@Angr, I'm so meta even this acronym Oops. I'm going to look through the changelogs and see if there are any other cases, and fix them. Benwing (talk) 06:13, 17 July 2015 (UTC)Reply
@I'm so meta even this acronym The coronides got changed because as part of the matching I first decomposed using Unicode NFKD form and then recomposed using NFKC form. The "K" in both of these does compatibility transformations, and it apparently includes transforming those coronides into the comma-above sequences. This compatibility transformation had the positive side effect of fixing a number of cases where Unicode U+00B5 was incorrectly being used to represent a Greek mu, but it also changed the coronides as above.

The only other error I found was in Ἀσία and Ἀσιανός, which had both a macron and a breve and my code wasn't expecting that, so the order ended up not quite right; fixed. Benwing (talk) 07:41, 17 July 2015 (UTC)Reply
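The effect described above can be reproduced with Python's standard `unicodedata` module. This is a small demonstration only; `roundtrip` is a hypothetical helper, not the bot's matching code.

```python
import unicodedata

KORONIS = "\u1fbd"   # GREEK KORONIS
PSILI = "\u1fbf"     # GREEK PSILI
MICRO = "\u00b5"     # MICRO SIGN

def roundtrip(text, compat):
    """Decompose then recompose, with or without compatibility mappings."""
    decomposed = unicodedata.normalize("NFKD" if compat else "NFD", text)
    return unicodedata.normalize("NFKC" if compat else "NFC", decomposed)

# Compatibility forms rewrite the spacing koronis/psili as SPACE + U+0313
# and turn the micro sign into a real Greek mu.
compat_koronis = roundtrip(KORONIS, compat=True)    # "\u0020\u0313"
compat_micro = roundtrip(MICRO, compat=True)        # "\u03bc"

# Canonical forms leave all three characters untouched.
canon_koronis = roundtrip(KORONIS, compat=False)    # "\u1fbd"
canon_micro = roundtrip(MICRO, compat=False)        # "\u00b5"
```

One possible design, then, would be to use the canonical forms (NFD/NFC) for matching and handle the micro sign as an explicit replacement, so that the koronis survives.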

Thank you, Benwing. My guess is that [U+0020, U+0313] is the NFKC form for ʼ (U+02BC MODIFIER LETTER APOSTROPHE), ᾽ (U+1FBD GREEK KORONIS), and ᾿ (U+1FBF GREEK PSILI); however, IMO, we should never use [SPACE + COMBINING CHARACTER] where there exists a standalone character for what we want. — I.S.M.E.T.A. 12:03, 17 July 2015 (UTC)Reply

hajrr > هَجْر

Just curious, in this diff, how did your bot know to ignore the double r? --WikiTiki89 15:26, 20 July 2015 (UTC)Reply

One of the many, many things my code does is to remove double consonants in the Latin when next to another consonant, except for certain cases I could think of where this might legitimately occur (e.g. mudhhib, a non-canonical representation for muḏhib). This occasionally removes double consonants when it maybe shouldn't, in some weird situations that are mostly errors anyway (one that arguably isn't is dunḡḡwān = Arabic rendering of the city of Dongguan) but it fixes a lot more errors than it creates. The code that does the matching-up of Arabic and Latin is about 1,500 lines of Python (this includes a Python version of the Lua code in Module:ar-translit, and a couple hundred test cases); it got this way because I kept going through the errors and accumulating test cases that I thought "should" work, and adding more code to get them to work. Not sure why I put so much effort into this; I guess it seemed an interesting problem. An early version of the matching code sits between lines 303 and 590 of Module:ar-translit; I will probably remove it because it's out of date and not used anywhere in Wiktionary itself. Benwing (talk) 11:04, 21 July 2015 (UTC)Reply
I'm curious about what causes these double consonant errors in the first place. What kinds of situations were you seeing with consonants erroneously doubled in the transliteration? hajrr I could ascribe to the syllabicity of the r in this position. --WikiTiki89 13:10, 21 July 2015 (UTC)Reply
Some other examples:
  • ʔela al-xalff
  • ḵalff
  • ḡusll
  • qawss
  • ʿalaa al-ʿakss
  • ʿabrr
  • ṭawdd
All of these occur at the end of a short word, but not all involve syllabic consonants. There are also some cases with the doubled consonant before another consonant:
  • ḵallf
  • ṣaffrāʿ
  • ḵassm
  • ḥarrba
  • tellk
Also some cases like fiyyyā with a triple consonant, which I also fix. I don't know where these errors originated. Benwing (talk) 13:04, 22 July 2015 (UTC)Reply
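A rough sketch of the doubled-consonant cleanup described in this thread, written against the examples above. It is not the actual 1,500-line matching code: `collapse_doubles`, the consonant inventory and the `DIGRAPHS` exception set are all assumptions made for illustration, as is the choice to reduce a triple consonant to a double.

```python
import re
import unicodedata

# Assumed consonant inventory for the Latin transliteration.
CONS = unicodedata.normalize("NFC", "bcdfghjklmnpqrstvwxyzʔʾʿḏḡḥḵṣṭẓḍšṯ")
C = f"[{CONS}]"
# Hypothetical exceptions: a digraph followed by its second letter again,
# as in mudhhib (a non-canonical spelling of muḏhib), is left alone.
DIGRAPHS = {"dh", "th", "sh", "kh", "gh"}

def collapse_doubles(translit):
    s = unicodedata.normalize("NFC", translit)
    # Three or more identical consonants become two: fiyyyā -> fiyyā.
    s = re.sub(rf"({C})\1{{2,}}", r"\1\1", s)

    def after_consonant(m):
        if m.group(1) + m.group(2) in DIGRAPHS:
            return m.group(0)
        return m.group(1) + m.group(2)

    # Doubled consonant after a different consonant: hajrr -> hajr.
    s = re.sub(rf"({C})(?!\1)({C})\2", after_consonant, s)
    # Doubled consonant before a different consonant: ḵallf -> ḵalf.
    s = re.sub(rf"({C})\1(?!\1)({C})", r"\1\2", s)
    return s
```

As noted above, a rule like this also collapses the rare legitimate case such as dunḡḡwān, which is the trade-off being accepted.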

Adminship

Hi,

Are you interested in becoming a Wiktionary administrator? Pls ping me and reply here. You'll need to allow to be contacted by email address (which you have) and set your time zone, if I'm not mistaken. --Anatoli T. (обсудить/вклад) 10:06, 22 July 2015 (UTC)Reply

I'd support your nomination, Benwing. — I.S.M.E.T.A. 12:22, 22 July 2015 (UTC)Reply
@Atitarev I'd be interested. I set my time zone; let me know if there's anything else I need to do. Benwing (talk) 12:48, 22 July 2015 (UTC)Reply
Thanks. Please accept the nomination at Wiktionary:Votes/sy-2015-07/User:Benwing for admin, check the Babel list and fix the start/end dates - usually two weeks from the acceptance date and the vote can begin, I guess :). Good luck and happy editing! --Anatoli T. (обсудить/вклад) 13:02, 22 July 2015 (UTC)Reply
@Atitarev Done. Benwing (talk) 13:20, 22 July 2015 (UTC)Reply
Your vote has passed, you are an Admin. Please add your name to WT:Admin. Also, see Help:Sysop tools. —Stephen (Talk) 21:17, 11 August 2015 (UTC)Reply
@Stephen G. Brown Thank you. Benwing (talk) 06:48, 12 August 2015 (UTC)Reply

FYI

diff. Already fixed. Chuck Entz (talk) 04:07, 25 July 2015 (UTC)Reply

Oops. Thanks for fixing it. Those changes marked as "manual" are cases where I manually edited the template and pushed the changes using my bot, so the mistake is my editing mistake and not a bot screwup. Benwing (talk) 06:26, 25 July 2015 (UTC)Reply

Arabic terms with irregular pronunciations

Hi,

I've started Category:Arabic terms with irregular pronunciations primarily for loanwords, similar to Russian, Thai categories and a Japanese category with a different name. Hopefully it's OK with you. --Anatoli T. (обсудить/вклад) 23:45, 10 August 2015 (UTC)Reply

BTW, what should be done with indeclinable terms, like إنجلترا? Does it need a "Declension" section at all? Russian nouns are automatically made indeclinable by providing "-" where genitive form should be, e.g. Алматы. --Anatoli T. (обсудить/вклад) 23:49, 10 August 2015 (UTC)Reply
In Arabic I've been creating declension sections for all nouns, even ones that are indeclinable. In this case there's some useful information in the table, e.g. the fact that it is definite even without preceding al-.
As for the category you've created, I have no problem with it, although I think words should be put there automatically when possible (e.g. by Module:ar-nominals; it's more complex than simply looking for manual translit, because of non-final tāʾ marbūṭa, but that should be fixable with a bit of work). Benwing (talk) 01:58, 11 August 2015 (UTC)Reply
Yeah, I also thought about that. Category:Russian terms with irregular pronunciations is also populated automatically, except for terms, which use generic {{head}}, not {{ru-noun}} and similar. --Anatoli T. (обсудить/вклад) 02:01, 11 August 2015 (UTC)Reply
@Atitarev I implemented this. Benwing (talk) 09:58, 11 August 2015 (UTC)Reply

داود

I originally suspected that the "u" in داود was long because it's annotated as such in the tajweed Korans, but it is also listed as such in Wehr; it looks like the spelling داوود (the standard in Persian) also exists in Arabic, and that داود was retained for whatever reason, whether because it's the Koranic spelling or because داوود looks strange or something like that. In any event, I think we can say fairly confidently that dāwūd should be our favored transliteration. Aperiarcam (talk) 18:54, 11 August 2015 (UTC)Reply

If you go by the Quran, then you would have to transliterate ـه (the 3rd person masculine singular suffix pronoun) as -hū/-hī as well. What really matters is how it is pronounced today. --WikiTiki89 19:00, 11 August 2015 (UTC)Reply
Right, I recognize that (and there are plenty of other oddities in Koranic pronunciation and spelling); I just meant that that's what first made me think it was a long "u." But my suspicion is that the Koran is at least partly responsible for the un-orthographic spelling داود (interestingly enough ar:w:داود uses one waw in the header and both spellings in the text, often with the spelling داوُد (with damma) even when vowels are not otherwise annotated.) Aperiarcam (talk) 19:08, 11 August 2015 (UTC)Reply
That also reminds me of the spelling إبرهيم (or perhaps ابرهيم), which we should probably have since it seems to be common (or to have been common). --WikiTiki89 19:15, 11 August 2015 (UTC)Reply
Yes, I think we should include these spellings at least for the prophets, but I think (could be wrong) داود is a little unusual in its currency in modern Arabic; the absence of medial alif is a much more predictable feature of Koranic spelling and I've never seen it in modern writing, but again I may just be wrong on this count. We have an entry for صلوة so I figure we should include any of these peculiar Koranic spellings we think may prove useful to somebody. Aperiarcam (talk) 19:24, 11 August 2015 (UTC)Reply
That also reminds me of the spelling تورية, which I will add. --WikiTiki89 19:37, 11 August 2015 (UTC)Reply
@Benwing: Can we add a feature to the transliteration module to ignore an unvocalized yāʾ or wāw following a dagger ʾalif? --WikiTiki89 19:49, 11 August 2015 (UTC)Reply
I stand corrected about داود. Sorry about that, I thought it was a random mistake, like so many others I've fixed. @Wikitiki89 How many cases are there with dagger alif followed by unvocalized yāʾ or wāw? Benwing (talk) 21:38, 11 August 2015 (UTC)Reply
I don't know how many, but it is a rather common classical spelling of many nouns (mostly derived from Aramaic, I believe) that are now standardized to ـَاة (-āh). I don't know what other situations this occurs in. --WikiTiki89 02:46, 12 August 2015 (UTC)Reply
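The rule requested above (ignore an unvocalized yāʾ or wāw after a dagger ʾalif) might be sketched as a preprocessing step like the one below. This is only one possible reading of the request and not anything from Module:ar-translit; `drop_silent_glides` is a hypothetical name, and "unvocalized" is taken to mean "not followed by a short-vowel mark, sukūn, shadda or another dagger ʾalif".

```python
import re

DAGGER_ALIF = "\u0670"
WAW, YA = "\u0648", "\u064A"
DIACRITICS = "\u064B-\u0652"  # tanwīn, short vowels, shadda, sukūn

# Dagger alif + wāw/yāʾ that carries no diacritic of its own.
SILENT_GLIDE_RE = re.compile(
    f"{DAGGER_ALIF}[{WAW}{YA}](?![{DIACRITICS}{DAGGER_ALIF}])"
)

def drop_silent_glides(text):
    """Remove an unvocalized wāw/yāʾ directly after a dagger alif,
    so that the transliteration step never sees it."""
    return SILENT_GLIDE_RE.sub(DAGGER_ALIF, text)
```

A wāw that carries its own vowel, as in داوُد, is left untouched by this rule.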

Root categories for Arabic

(Notifying Benwing, Atitarev, Mahmudmasri): Check out Category:Hebrew terms by root and {{HE root}}. Do you think we should do the same for Arabic? --WikiTiki89 12:19, 12 August 2015 (UTC)Reply

Sure, why not. This should be easy enough to implement in {{ar-root}}. Benwing (talk) 12:22, 12 August 2015 (UTC)Reply
The idea is to replace the etymologies that say "From the root XXX" because most of the time, the root is not the actual etymological source of the word, but an after-the-fact classification. This will associate the word with the root without implying it is derived from the root. It may be easier to implement it as part of {{ar-root}}, but it may be better to have a separate template like we do for Hebrew and leave {{ar-root}} for simply creating links to roots. This would also allow us to use the syntax {{AR root|ء ه ل|س ه ل}} in entries like أهلا وسهلا, which {{ar-root}} does not support. --WikiTiki89 12:34, 12 August 2015 (UTC)Reply
Currently, {{ar-root}} supports either the syntax ك ت ب (k-t-b) or ك ت ب (k-t-b). The syntax with separate params is older. I could rewrite all the latter uses to the former ones; then we could use multiple params for multiple roots. I guess what you're referring to by separate templates is that one displays the root box on the right side and one would inline the root link; I could imagine implementing that with params to {{ar-root}} (e.g. presumably when you inline a root link you also want the box on the right side by default; you could e.g. have params |nobox= and |nolink= to turn off one or the other or both). Or separately named templates ... Benwing (talk) 12:44, 12 August 2015 (UTC)Reply
I forgot to mention that {{ar-root}} as a linking template could be useful even in entries that do not belong to that root (such as "See also {{ar-root|XXX}}"), which is not meant to categorize. So I still think it would be better to keep them separate. --WikiTiki89 12:47, 12 August 2015 (UTC)Reply
I see. This could be supported by a |nocat= param or a separate {{ar-root-link}}. One thing I'd like to avoid is having duplicate root link and root box templates in the common case where a root is linked in the etym section. (I understand you'd like to eliminate them but that is a long-term project unless it can be done automatically.) Benwing (talk) 13:02, 12 August 2015 (UTC)Reply
There's nothing wrong with having duplicate templates. This isn't any scarier than duplicating inflection information in {{head}} and in a table. My plan was that we could use a bot to add {{AR root}} everywhere where {{ar-root}} appears in an etymology and then work manually to add {{AR root}} everywhere else and remove {{ar-root}} where it is unneeded. The latter process would have to happen either way and would have to be done manually either way. --WikiTiki89 13:37, 12 August 2015 (UTC)Reply

I think we need a better name than {{AR root}}, which doesn't follow normal naming conventions and is very confusable with {{ar-root}}. I know that it parallels {{HE root}} and {{PIE root}} but these are likewise misnamed (esp. {{HE root}}). Perhaps {{ar-root-box}}? Also, I really don't like the idea of duplicating the stuff; I know we do it for headword/etymology but it's a pain in the ass there, and better avoided if possible. I'd rather have e.g. {{ar-root-box}} to insert a root box and a category, and {{ar-root-link}} to insert a root link, and {{ar-root}} to do both, to be rewritten manually as we get around to it (but note, there isn't an Arabic etymological dictionary so it's often not obvious what to replace the etym with). Note that {{ar-root-box}} can be called automatically from {{ar-verb}}, which knows the root of the verb in question and is able to derive it from the lemma and form class (sometimes with a bit of manual help when the auto-derived root would be ambiguous; this occurs principally with 2nd-weak and 3rd-weak roots of verbs of form class II and higher). The same thing applies to {{ar-verbal noun of}} so that verbal nouns get root boxes/cats, with a bit of bot work to propagate those manually-assisted root consonants. Same thing could potentially be done for active and passive participles using (not currently existing) templates {{ar-active participle of}} and {{ar-passive participle of}} if/when I get around to running my bot to create those participle entries (the code is all written and a few entries have already been created -- check the links to {{ar-act-participle}} and {{ar-pass-participle}}). Benwing (talk) 14:06, 12 August 2015 (UTC)Reply

I don't care so much about the name. Having {{ar-verb}} call it, however, may clutter the page with too many boxes when only one is needed, especially when there are two of them on consecutive lines. --WikiTiki89 14:22, 12 August 2015 (UTC)Reply
Support (late reply). Good idea - Category:Arabic terms by root. I thought about it too. --Anatoli T. (обсудить/вклад) 22:59, 12 August 2015 (UTC)Reply
I guess {{ar-verb}} can simply categorize without creating the box. Benwing (talk) 04:50, 13 August 2015 (UTC)Reply

You are an admin now

You no longer need to use this template. Either you use {{rfd}} if you want it to be discussed, or you can speedy delete it yourself. --WikiTiki89 18:03, 16 August 2015 (UTC)Reply

OK. I was doing this out of habit more than anything else. Benwing (talk) 02:13, 17 August 2015 (UTC)Reply

Belarusian and Ukrainian old adjective templates

Can you orphan {{be-adj1}}, {{be-adj2}}, {{be-adj3}}, {{be-adj4}}, {{be-adj5}}, and {{uk-adj1}}? No logic is required, all you need to do is subst: them. --WikiTiki89 18:43, 17 August 2015 (UTC)Reply

OK. Benwing (talk) 23:35, 17 August 2015 (UTC)Reply
@Wikitiki89 Done. Benwing (talk) 11:16, 18 August 2015 (UTC)Reply
Thanks! --WikiTiki89 11:29, 18 August 2015 (UTC)Reply

Category:Pages with Module errors

Your recent edit to Template:ru-decl-noun has added the page отруби to this category. --kc_kennylau (talk) 06:35, 20 August 2015 (UTC)Reply

Removing stress from monosyllabic terms

Hi Benwing #2 :), also @Wikitiki89, Cinemantique

It's something nice to have and not urgent at all (!)

Do you think it would be possible to automatically remove stresses from monosyllabic terms in Module:ru-translit and the Cyrillic forms? The only reason why they exist in declension/conjugation tables is technical (the Russian Wiktionary also uses word stresses in such cases). It doesn't really make sense to add a stress mark for monosyllabic terms.

Having to manually transliterate words with "ё" to change "(j)ó" for "(j)o" like чёрт (čort), лёд (ljod), ёж (jož), лёг (ljog) and supplying "|notrcat=1" is also annoying.

Please note, the word stress on monosyllabic terms is still important in expressions and shouldn't be removed, it's only for standalone terms or forms. --Anatoli T. (обсудить/вклад) 02:12, 21 August 2015 (UTC)Reply

@Benwing. --Anatoli T. (обсудить/вклад) 02:13, 21 August 2015 (UTC)Reply
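What is being asked for could be sketched like this in Python (the real change would be in the Lua modules, and `remove_monosyllabic_stress` is a hypothetical name). The sketch assumes stress is marked with a combining acute (U+0301), treats anything containing a space or hyphen as an expression whose stresses must stay, and ignores pre-reform vowel letters.

```python
import unicodedata

ACUTE = "\u0301"
VOWELS = set("аеёиоуыэюяАЕЁИОУЫЭЮЯ")

def remove_monosyllabic_stress(term):
    """Strip the stress mark from a standalone one-syllable term."""
    term = unicodedata.normalize("NFC", term)
    # Multi-word expressions and hyphenated terms keep their stresses.
    if " " in term or "-" in term:
        return term
    syllables = sum(1 for ch in term if ch in VOWELS)
    if syllables == 1:
        return term.replace(ACUTE, "")
    return term
```

Under these assumptions зу́б becomes зуб, while зу́ба and the words of an expression such as кре́м для бритья́ are left as they are.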
I actually find them helpful in declension/conjugation tables and would be opposed to removing them there. In headword lines and links, I don't care. --WikiTiki89 02:16, 21 August 2015 (UTC)Reply
How is a word stress helpful when there's only one syllable and only one word? I asked you recently the same question about Hebrew and you said you opposed the stress mark on monosyllabic terms. What's the difference? --Anatoli T. (обсудить/вклад) 03:20, 21 August 2015 (UTC)Reply
In a declension table, the monosyllable forms are there together with polysyllabic forms. Having stress marks on all of them helps you see the pattern in the declension. For Hebrew, the situation is different. Almost all words have final stress, so the stress mark is really only helpful when the stress is non-final. Also, Hebrew doesn't have declensions, or at least not the way Russian does and not with unpredictable stress patterns. --WikiTiki89 04:53, 21 August 2015 (UTC)Reply
I don't think that seeing "зу́б" in the table, as opposed to "зуб" helps you see the declension pattern any better - regular or irregular. If it does, then how? --Anatoli T. (обсудить/вклад) 04:59, 21 August 2015 (UTC)Reply
As for Hebrew, even if we do have tables, I still don't want to see the stress on monosyllabic forms in any language but I want to see it on inflected forms, derivations, feminine forms, etc. --Anatoli T. (обсудить/вклад) 05:04, 21 August 2015 (UTC)Reply
When you put one word by itself, it doesn't help much, but when it's in a table, you can more easily see that "зу́б" and "зу́ба" have stress on the same syllable than if it said "зуб" and "зу́ба". --WikiTiki89 05:07, 21 August 2015 (UTC)Reply
No, it gives nothing. Also, it's absolutely nontraditional, at least for Russian.--Cinemantique (talk) 06:37, 22 August 2015 (UTC)Reply

make_ending_stressed

Wouldn't it be better to just re-create the dative ending with attach_stressed than to artificially move the stress? --WikiTiki89 20:05, 21 August 2015 (UTC)Reply

I did it that way in case the dative singular has an override, on the theory that the locative singular should always be stress-moved dative singular. Benwing2 (talk) 20:10, 21 August 2015 (UTC)Reply
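The "stress-moved dative" idea can be illustrated with a short Python sketch. It is not the module's Lua code: this `make_ending_stressed` is a simplified stand-in that assumes stress is a combining acute (U+0301) and does not deal with ё earlier in the word.

```python
ACUTE = "\u0301"
VOWELS = "аеёиоуыэюя"

def make_ending_stressed(form):
    """Move the stress mark onto the last vowel of `form`."""
    bare = form.replace(ACUTE, "")
    for i in range(len(bare) - 1, -1, -1):
        ch = bare[i].lower()
        if ch == "ё":
            # ё is inherently stressed; no acute is added.
            return bare
        if ch in VOWELS:
            return bare[: i + 1] + ACUTE + bare[i + 1 :]
    return bare  # no vowel found; nothing to stress
```

Applied to a dative singular such as ле́су this yields лесу́, and because it works on whatever dative form it is given, it also applies to an overridden dative.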
But theoretically, if the dative were overridden, you wouldn't know that the last vowel is stressable. In such cases, it wouldn't be unreasonable to expect the locative to also be overridden. But I don't think there are very many irregular datives (or any at all). --WikiTiki89 20:15, 21 August 2015 (UTC)Reply
By the way, there seems to be a bug in the noun declension module that pre-reform declensions with loc=+ do not get their endings stressed. --WikiTiki89 14:43, 11 September 2015 (UTC)Reply
@Wikitiki89 Can you point me to an example that fails? Benwing2 (talk) 19:25, 11 September 2015 (UTC)Reply
Couldn't find one, so I made one. --WikiTiki89 20:15, 11 September 2015 (UTC)Reply
@Wikitiki89 Problem was the call to make_unstressed() when calling m_links.full_link() in make_table() in old-style declensions, when there's already a link present (which is the case with locatives). This is theoretically a bug in full_link() in that it ignores the alt text in these circumstances, but since I think the purpose of make_unstressed() here is just to convert ё to е, I fixed it by only making that change. Benwing2 (talk) 23:50, 11 September 2015 (UTC)Reply
But then there would still be a bug if the word had a ё in it, even though it is unlikely (but still possible) that such a case exists. The correct solution would be to not already have the link in the entry. --WikiTiki89 19:41, 12 September 2015 (UTC)Reply

invariable declension

Do we really need a whole declension table for invariable nouns? --WikiTiki89 21:39, 21 August 2015 (UTC)Reply

Well, I used it in a noun that was listed as invariable in its attested cases, but only attested for 4 out of 6 cases in the singular; so it's useful to have a table. Also, I'm planning on extending the module to handle multi-word expressions where each word is declined individually, and in that scenario some of the words have to be treated as invariable, e.g. in крем для бритья, where I'm thinking of a syntax like {{ru-decl-noun-multi|кре́м|для*|бритья́*}}; or in Сент-Винсент и Гренадины, it would be {{ru-decl-noun-multi|Сент-Ви́нсент|и*|Гренади́н:а|n=pl|n1=sg}}; or in Красная площадь it would be {{ru-decl-noun-multi|Кра́сн*ая|пло́щад:ь-f}}. (What I'm doing here is bunching up the separate arguments of the current template into one, with colons separating arguments for nouns in the order STEM:DECL:ACCENT:BARE:PL:BAREPL and taking advantage of default values, and adjectival declensions of the form STEM*DECL, and invariable words of the form WORD*. I think this is less ugly in these circumstances than the alternatives.) Benwing2 (talk) 22:01, 21 August 2015 (UTC)Reply
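The per-word syntax proposed above could be parsed along these lines. This is a sketch of the proposal as described, not of any existing template or module: `parse_word_spec` and the dictionary layout are hypothetical, and omitted noun fields are simply represented as empty strings standing in for the defaults.

```python
NOUN_FIELDS = ("stem", "decl", "accent", "bare", "pl", "barepl")

def parse_word_spec(spec):
    """Parse one word argument of the proposed {{ru-decl-noun-multi}}.

    WORD*       -> invariable word
    STEM*DECL   -> adjectival declension
    STEM:DECL:ACCENT:BARE:PL:BAREPL -> noun (trailing fields optional)
    """
    if spec.endswith("*"):
        return {"kind": "invariable", "word": spec[:-1]}
    if "*" in spec:
        stem, decl = spec.split("*", 1)
        return {"kind": "adjective", "stem": stem, "decl": decl}
    parts = spec.split(":")
    if len(parts) > len(NOUN_FIELDS):
        raise ValueError(f"too many fields in {spec!r}")
    parts += [""] * (len(NOUN_FIELDS) - len(parts))
    return {"kind": "noun", **dict(zip(NOUN_FIELDS, parts))}
```

So `для*` comes out as an invariable word, `Кра́сн*ая` as an adjectival stem with declension `ая`, and `пло́щад:ь-f` as a noun with stem `пло́щад` and declension `ь-f`.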
BTW the partly-attested invariable word in question is полпути. Benwing2 (talk) 22:08, 21 August 2015 (UTC)Reply
Well actually, in words with пол-, the second part is always in the genitive and they are really more adverbial than nouns. --WikiTiki89 22:18, 21 August 2015 (UTC)Reply
Some of them, like полсотни, seem to have full declensions. Benwing2 (talk) 22:25, 21 August 2015 (UTC)Reply
Well that's only because it replaces the other cases with полу-, which creates normal declinable nouns. полусотня is actually a word by itself too. --WikiTiki89 22:39, 21 August 2015 (UTC)Reply
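The per-word argument syntax proposed earlier in this thread (colon-separated noun fields in the order STEM:DECL:ACCENT:BARE:PL:BAREPL, adjectives as STEM*DECL, invariable words as WORD*) could be parsed roughly as follows. This is only a sketch in Python, not the module's code: the function name and the returned dict layout are hypothetical, and named parameters such as n= and n1= are not handled.

```python
# Hypothetical sketch of parsing one word argument of the proposed
# {{ru-decl-noun-multi}} syntax. Names and return shape are assumptions.

# Field order for nouns, as described in the proposal.
NOUN_FIELDS = ("stem", "decl", "accent", "bare", "pl", "barepl")


def parse_word_spec(spec):
    """Classify one word argument as invariable, adjective, or noun."""
    # WORD* : a trailing asterisk marks an invariable word.
    if spec.endswith("*"):
        return {"kind": "invariable", "word": spec[:-1]}
    # STEM*DECL : an asterisk inside the argument marks an adjectival declension.
    if "*" in spec:
        stem, decl = spec.split("*", 1)
        return {"kind": "adjective", "stem": stem, "decl": decl}
    # Otherwise a noun: colon-separated fields, missing ones left empty
    # so that default values can be applied later.
    parts = spec.split(":")
    if len(parts) > len(NOUN_FIELDS):
        raise ValueError("too many noun fields in %r" % spec)
    parts += [""] * (len(NOUN_FIELDS) - len(parts))
    result = {"kind": "noun"}
    result.update(zip(NOUN_FIELDS, parts))
    return result


if __name__ == "__main__":
    for arg in ["Кра́сн*ая", "пло́щад:ь-f", "для*", "кре́м"]:
        print(arg, "->", parse_word_spec(arg))
```

Checking for the trailing asterisk before the embedded one keeps WORD* and STEM*DECL from being confused, since both contain the same marker.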

Pre-reform declension of дерево

Why doesn't this work: {{ru-noun-old|де́рево|-ья||дере́в|or|c|gen_pl=дере́вьевъ,дере́въ,дерёвъ}}. Also interestingly, {{ru-noun-old|де́рево|ъ-ья||дере́в|or|c|gen_pl=дере́вьевъ,дере́въ,дерёвъ}} does something strange. --WikiTiki89 21:53, 15 October 2015 (UTC)Reply

@Wikitiki89 Oops, fixed the bug. The second one is doing something strange because ъ-ья is a declension for nouns ending in -ъ rather than -о. Benwing2 (talk) 22:14, 15 October 2015 (UTC)Reply
Thanks. But you can say the same for -ья in the modern declension, that it is for nouns ending in - rather than -о. --WikiTiki89 00:32, 16 October 2015 (UTC)Reply
Well, -ья is overloaded in meaning; when it stands alone or following a gender, it's recognized specially and considered a "declension variant", otherwise it's considered an explicit declension. Perhaps I should have chosen a different signal for the declension variant, since -ья was already being used as an explicit declension; but it seemed to make the most sense that way. Benwing2 (talk) 05:10, 16 October 2015 (UTC)Reply
No, you're right. It's just I was confused by the bug and wanted clarification. --WikiTiki89 17:21, 16 October 2015 (UTC)Reply

паук-волк‎

The declension has errors in the plurals. For example, current genitive plural пауко́в-во́лков (paukóv-vólkov) should be пауко́в-волко́в (paukóv-volkóv) instead. —Stephen (Talk) 05:29, 20 October 2015 (UTC)Reply

@Stephen G. Brown Fixed. Thanks for noticing it. Benwing2 (talk) 05:44, 20 October 2015 (UTC)Reply

Your bot edits regarding pre-reform entries

I don't think pre-reform entries should have pronunciation sections, since they are just duplicates of the modern entries. I also think pre-reform declensions should use the categories for modern entries or should not be placed in categories at all. --WikiTiki89 17:58, 27 October 2015 (UTC)Reply

@Atitarev, Cinemantique What do you think? It is not very hard to change the categories and put the pre-reform entries in modern categories. It is a bit trickier to eliminate the pronunciation sections since it means writing a bot to undo the previous changes, and it's not clear to me it's a good idea in any case -- I think it might be useful to have pronunciation sections since otherwise people may be confused by the old characters. Benwing2 (talk) 21:47, 27 October 2015 (UTC)Reply
I support centralisation of contents. No need to duplicate. Soft-redirects don't need pronunciations. Actually the same should apply to Category:Russian spellings with е instead of ё. --Anatoli T. (обсудить/вклад) 22:03, 27 October 2015 (UTC)Reply

отсос пограничного слоя

Error in the header line. I am not familiar with that encoding, so I can’t fix it. —Stephen (Talk) 05:07, 30 October 2015 (UTC)Reply

Thanks. Benwing2 (talk) 05:09, 30 October 2015 (UTC)Reply

Screenshot request

Hi Benwing. Sorry to pester you, but would you mind responding to this request of mine, please? — I.S.M.E.T.A. 22:02, 10 November 2015 (UTC)Reply

@I'm so meta even this acronym Sorry! I seem to have problems reading. I saw your response but didn't manage to read it carefully and so missed your request. I'll post the screenshot in a second. Benwing2 (talk) 22:16, 10 November 2015 (UTC)Reply
Wonderful! Thank you so much. — I.S.M.E.T.A. 03:31, 11 November 2015 (UTC)Reply

WingerBot source

The GitHub link on User:WingerBot does not work anymore. Is the source code no longer available? Jberkel (talk) 11:08, 11 November 2015 (UTC)Reply

отпереть

Hi. The verb отпереть has a bad conjugation. The present tense and the imperatives should be like отопру́, etc., with отопр- at the beginning. —Stephen (Talk) 21:47, 14 November 2015 (UTC)Reply

@Atitarev, Cinemantique, Wikitiki89 I'm not really sure how the verbal templates work; Anatoli or others, can you fix it? Benwing2 (talk) 21:58, 14 November 2015 (UTC)Reply
Mostly done but I'll have to force |past_f=отперла́, like with some other verb types. --Anatoli T. (обсудить/вклад) 22:09, 14 November 2015 (UTC)Reply
Fixed. --Anatoli T. (обсудить/вклад) 23:30, 14 November 2015 (UTC)Reply
@Atitarev: please check Category:Pages with module errors. Chuck Entz (talk) 01:48, 15 November 2015 (UTC)Reply
@Atitarev: Are there also colloquial forms отпёр/отпёрло/отпёрла/отпёрли, or am I imagining them? Also, the module errors were caused by your recent edit to the module. --WikiTiki89 02:43, 15 November 2015 (UTC)Reply
Fixed the fix. Yes, отпёр/отпёрло/отпёрла/отпёрли are colloquial forms, especially in some vulgar senses. --Anatoli T. (обсудить/вклад) 08:11, 15 November 2015 (UTC)Reply

About the voice recordings

Hi,

Our renovation is in full swing. I have to apologise again for still not having made the recordings. I just don't have a quiet moment at home at the moment. I will do it as soon as I have some time when it's convenient. --Anatoli T. (обсудить/вклад) 01:45, 3 December 2015 (UTC)Reply

OK, sounds good. Good luck with your renovation. Benwing2 (talk) 06:46, 3 December 2015 (UTC)Reply

I am not sure if you got my ping at User talk:Atitarev/recording.--Anatoli T. (обсудить/вклад) 21:01, 5 December 2015 (UTC)Reply

@Atitarev I didn't get your ping, not sure why. Thanks very much for the recordings! I'll check them out ASAP. Benwing2 (talk) 05:31, 6 December 2015 (UTC)Reply
Hi. I am not 100% sure if it is a great idea to change the test cases and pronunciation rules based on my recordings and your findings right now. I've provided one way of pronouncing those words, which is not the only correct way in some cases, besides, perhaps it's better to check with some references as well and have a discussion. What do you think? @Cinemantique, Wikitiki89, Wanjuscha, Stephen G. Brown, you're welcome to comment on my recordings and pronunciation rules (final -е). Cinemantique will probably oppose some new test cases in Module:User:Benwing2/ru-pron/testcases. E.g. "дава́йте" can be both [dɐˈvajtʲe] and [dɐˈvajtʲɪ] but only the latter is referenced. --Anatoli T. (обсудить/вклад) 23:19, 7 December 2015 (UTC)Reply
@Atitarev Are you referring to the test cases in Module:User:Benwing2/ru-pron/testcases? I put them in about a week ago. They are almost all directly based on Cinemantique's references except for interpreting Avanesov's -ь as [-e] instead of [-ɪ]. They're not based off of your recordings, which I'm still going through. The code in Module:ru-pron is based off of the same thing as the testcases, but that code isn't enabled and I won't enable it until I get some sort of consensus. BTW I've asked both User:Cinemantique and User:Wikitiki89 for comments about final -е and haven't received any recently. I'm still hoping someone can look up and post the text of the references that Wikipedia uses to justify the pronunciation of final -е in жи́теле as [ˈʐɨtʲɪlʲɛ]: They are Avanesov 1975 pages 121-125 ("Фонетика современного русского литературного языка") and Avanesov 1985 page 666 ("Сведения о произношении и ударении", in Borunova, C.N.; Vorontsova, V.L.; Yes'kova, N.A., Орфоэпический словарь русского языка. Произношение. Ударение. Грамматические формы).
If you have time to do any more recordings, take a look at the new comments in Module:User:Benwing2/ru-pron/testcases. I've tried to create a bunch of minimal pairs or near-minimal pairs that should make it much clearer whether final -е in various circumstances is pronounced the same as -я, the same as -и, or neither. Without the direct comparisons, it's harder to say whether for example "дава́йте" is [dɐˈvajtʲe] or [dɐˈvajtʲɪ] or both.
BTW almost anything would be an improvement to what we have now, where most final -е's are rendered as [ʲə]. Benwing2 (talk) 23:44, 7 December 2015 (UTC)Reply
I listened to the recordings at User talk:Atitarev/recording and I thought they were excellent. They are clear and correct. —Stephen (Talk) 23:52, 7 December 2015 (UTC)Reply
Thank you. Benwing2 (talk) 23:53, 7 December 2015 (UTC)Reply

@Atitarev BTW, comments on your recordings: First, thank you very much for making them. In general, they are quite fast for me, so I'm probably not the best person to be making comments on them. But for the most part, in most words I hear something like final [ɛ], or at least it's lower than cardinal [e]; the exceptions are primarily -ое, where (at least in some words) I clearly hear [-ə], and after hard шжц, where in some words it sounds like [-ə] and in some words [-ɨ] and in some words in between. For example, in your pronunciation of дунове́ние near the beginning, the stressed е sounds clearly like cardinal [e] and the unstressed е is clearly lower. Similarly for other words in -е́ние, and also in дружне́е near the end (and for that matter, I hear cardinal [e] rather than [ɛ] in the э́ in Тайбэ́е). In general, I don't hear [-ɪ], although it's hard to say for sure because the vowels are so short and because [ɪ] can mean various different things, e.g. English [ɪ] in words like bit and pin is quite different from [i] and is probably much lower and more central than what is intended by Russian [ɪ]. If Russian [ɪ] is intended as something that's not hugely different from [i], then the only times I can recall that I heard that sound are in девяно́сто четы́ре, два́дцать четы́ре and in да́йте in да́йте мне (in the latter case, it may be because the -е isn't phrase-final). Benwing2 (talk) 00:17, 8 December 2015 (UTC)Reply

линя́ет

Should be [lʲɪˈnʲæ(j)ɪt], not [lʲɪˈnʲaɪt]. --WikiTiki89 18:53, 7 December 2015 (UTC)Reply

@Wikitiki89 I assume that in place of [lʲɪˈnʲæ(j)ɪt] you should have either [lʲɪˈnʲæjɪt] with /j/ and fronted [æ], or [lʲɪˈnʲaɪt] without /j/ and without fronted [æ], since fronting of [æ] occurs only between palatal consonants. Both versions are indicated. Is this not correct? Benwing2 (talk) 19:09, 7 December 2015 (UTC)Reply
Oh, I didn't even realize that that word is the reason that there were two transcriptions at волк каждый год линяет, а всё сер бывает. Anyway, it would be [lʲɪˈnʲæɪt], because the /j/ is phonemically there even if it is dropped phonetically. Same goes for -я́и-. --WikiTiki89 19:19, 7 December 2015 (UTC)Reply
@Wikitiki89 OK, that will simplify some things, although we will still need the double transcription for words like счастливо where there's optional palatal assimilation, unless something similar with underlying phonemic /sʲ/ or /s/ is going on there as well. Benwing2 (talk) 19:27, 7 December 2015 (UTC)Reply
Yes, счастливо will need two transcriptions. --WikiTiki89 19:29, 7 December 2015 (UTC)Reply
@Wikitiki89, Cinemantique I'm pretty sure the conjunction а is not reduced in волк каждый год линяет, а всё сер бывает and other cases. --Anatoli T. (обсудить/вклад) 03:41, 8 December 2015 (UTC)Reply
I'm not sure. The difference between [ɐ] and [a] is fairly small, and since it is at the beginning of a word, it would never be [ə] in either case. But I tried saying a few phrases to myself, and I think that it is in fact reduced. --WikiTiki89 16:58, 8 December 2015 (UTC)Reply

Some pronunciation issues I came across

  1. In Комите́т госуда́рственной безопа́сности (Komitét gosudárstvennoj bezopásnosti), the /t/ should not assimilate to the /g/. Perhaps this sort of assimilation only occurs when the next syllable is stressed?
  2. In ни пу́ха ни пера́ (ni púxa ni perá), the ни (ni) should be reduced to [nʲɪ]. This also applies to many short function words.
  3. I think пролива́ть све́т (prolivátʹ svét) should produce [prəlʲɪˈvatʲ͡s ˈsvʲet]. That is, the onset of the [tʲ] is still palatalized, but the rest is not; even more precisely, the depalatalization occurs before the transition from plosive to fricative, but I don't know how to incorporate such details into IPA.
  4. I think that Сме́рть шпио́нам! (Smértʹ špiónam!), should be [ˈsmʲerʈ͡ʂ ʂpʲɪˈonəm]. There may or may not still be a trace of palatalization at the onset of the [ʈ], but even if there is, there is much less than in the case above.

I honestly think we should stop trying to provide this kind of overdetailed assimilation detail. --WikiTiki89 22:12, 8 December 2015 (UTC)Reply

For #1, as it happens I recently added a feature to disable assimilations, an underscore between consonants. #2 is easily fixable with a tie bar. For #4 we can disable the code that converts palatalized retroflexes to alveolopalatals, if Anatoli agrees. #3 will require a bit more work (although not that much). I don't mind including detailed assimilation info but we can take some of it out if Anatoli agrees, eg maybe we don't need to show assimilation only of the first half of an affricate, maybe it's enough to show full palatalization. Benwing2 (talk) 22:46, 8 December 2015 (UTC)Reply
@Wikitiki89 Does vowel reduction occur in all instances of unstressed ни? The other expressions are ни к селу, ни к городу, ни крошки, ни о ком, ни пуха, ни рыба ни мясо, ни с того ни с сего, ни в пизду, ни в Красную Армию, ни хуя, ни за что, сколько волка ни корми, он всё в лес смотрит, несмотря ни на что, как ни в чём не бывало, во что бы то ни стало. If so, I can add ни to the list of unstressed particles and it will automatically be reduced. Note that this will make it sound like не. Benwing2 (talk) 23:09, 8 December 2015 (UTC)Reply
@Atitarev, Wikitiki89 As for #1, Anatoli do you agree that the /t/ doesn't assimilate to [d], and is there a rule for this (e.g. what Wikitiki suggested) or is it irregular? Benwing2 (talk) 23:12, 8 December 2015 (UTC)Reply
I think it can assimilate in fast to normal speech (@Wikitiki89, it's no different from "брат Ге́ны"), but I understand the concern that maybe we are going too far with "overdetailed assimilation details". --Anatoli T. (обсудить/вклад) 23:22, 8 December 2015 (UTC)Reply
In that case I think it's OK to indicate assimilation. I think maybe #3 above is beyond the level of what we need to worry about but the others can be handled without difficulty. Benwing2 (talk) 23:24, 8 December 2015 (UTC)Reply
@Atitarev: You're right, I would not usually say [brad ˈgʲenɨ] either. @Benwing: I cannot think of a situation where ни is not reduced, unless it is explicitly stressed (as in что́ бы ни́ было). --WikiTiki89 15:53, 9 December 2015 (UTC)Reply
@Wikitiki89 I added ni to the list of accentless particles. Benwing2 (talk) 13:14, 10 December 2015 (UTC)Reply
Is there a way to force reduction for other particles? Personally, I think that any word without an accent mark should be interpreted as accentless in the pronunciation module. That would make the behavior more consistent. I also found another bug: "ото все́х" should produce [ɐtɐ ˈfsʲex], not [ɐtə ˈfsʲex]. --WikiTiki89 16:23, 10 December 2015 (UTC)Reply
Yes, you can force reduction by putting a dot over the vowel. Benwing2 (talk) 21:52, 10 December 2015 (UTC)Reply
Oh, I was using the wrong Unicode character when I tried that. The problem is "да̇ здра́вствует" produces [də ˈzdrafstvʊ(j)ɪt], but how can I force it to produce [dɐ ˈzdrafstvʊ(j)ɪt]? (Interestingly, in most other phrases where да is followed by a stressed syllable, such as да ну́, it for some reason remains [də ˈ-].) --WikiTiki89 23:42, 10 December 2015 (UTC)Reply
In most cases да is stressed but not as a particle. Wikitiki89 is right, "да здравствует" should give [dɐ ˈzdrastvʊ(j)ɪt] (the first "в" is silent but there are cases when "вств" is [fstv]). --Anatoli T. (обсудить/вклад) 00:02, 11 December 2015 (UTC)Reply
You're right, I wasn't paying attention to the word здравствует there. I think that in the cases where да is stressed, even if it is a majority of cases, it should have a stress mark. --WikiTiki89 15:19, 11 December 2015 (UTC)Reply

There isn't currently a way to do that. Is there a way to predict which of the two unstressed а variants should occur? If not then I'll have to add some way to specify which one is needed. Benwing2 (talk) 00:42, 11 December 2015 (UTC)Reply

The pronunciation of the unstressed vowel is the same as in other cases: ɐ when immediately pretonic, ə otherwise. Maybe да should be added to the list of prefixes and an accent would be required when it's stressed? --Anatoli T. (обсудить/вклад) 00:53, 11 December 2015 (UTC)Reply
@Atitarev, Wikitiki89 I'll add да to the list of unstressed particles. This will affect вот это да, да здравствует, лакома кошка до рыбки, да в воду лезть не хочется, ходить вокруг да около. But as for the pronunciation of unstressed а, Wikitiki says that it's not always predictable, e.g. да здравствует with [dɐ] but да ну with [də]. Do you agree? Benwing2 (talk) 21:33, 11 December 2015 (UTC)Reply
It's [ɐ], especially in Moscow pronunciation.--Anatoli T. (обсудить/вклад) 01:28, 12 December 2015 (UTC)Reply
The [də] could just be a me-ism. I'll try to investigate. --WikiTiki89 16:13, 14 December 2015 (UTC)Reply
@Atitarev Can you check the pronunciation of слышал звон, да не знает, где он? Not sure that да should be stressed here. Benwing2 (talk) 21:40, 11 December 2015 (UTC)Reply
Checked. Could you please hide (i.e. display as a space) the tie bar from the Cyrillic spelling when "phon=" is used, as in только что? --Anatoli T. (обсудить/вклад) 06:11, 12 December 2015 (UTC)Reply
@Atitarev Done. Benwing2 (talk) 09:51, 12 December 2015 (UTC)Reply
@Wikitiki89 OK, there is in fact a way of controlling whether you get [ɐ] or [ə], at least in some cases. If the word with unstressed а is considered part of the phonological word with the following stressed syllable, you'll get [ɐ], otherwise you'll get [ə]. So if you add a tie bar, e.g. ото‿все́х, you'll get [ɐ]. In this particular phrase it isn't necessary any more because I added ото to the list of unstressed particles, so it will automatically link to the next word. Benwing2 (talk) 21:47, 11 December 2015 (UTC)Reply
Great, thanks! --WikiTiki89 16:13, 14 December 2015 (UTC)Reply
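The behaviour discussed in this thread (accentless particles and tie bars joining a word to the following one, and unstressed а/о coming out as [ɐ] when immediately pretonic or word-initial and as [ə] otherwise) can be illustrated with a small Python sketch. It is not the code of Module:ru-pron, which is written in Lua; the particle set below is an illustrative subset drawn from this discussion, and the function names are hypothetical.

```python
# Illustrative sketch only; not the actual Module:ru-pron logic.

TIE_BAR = "\u203f"  # the tie bar character (‿) used to join words manually

# Assumed subset of accentless particles mentioned in the discussion.
ACCENTLESS_PARTICLES = {"ни", "да", "ото"}


def group_phonological_words(phrase, particles=ACCENTLESS_PARTICLES):
    """Group written words into phonological words.

    A tie bar joins words explicitly; an accentless particle is attached
    automatically to the word that follows it.
    """
    groups = []
    attach_next = False
    for token in phrase.split():
        parts = token.split(TIE_BAR)
        if attach_next:
            groups[-1].extend(parts)
        else:
            groups.append(parts)
        # A particle written with a stress mark is not in the set,
        # so it is treated as an ordinary stressed word.
        attach_next = parts[-1].lower() in particles
    return groups


def reduced_a(distance_to_stress, word_initial=False):
    """Quality of unstressed а/о within one phonological word.

    distance_to_stress is the number of syllables from this vowel to the
    stressed one (1 = immediately pretonic; negative = after the stress).
    """
    if word_initial or distance_to_stress == 1:
        return "ɐ"
    return "ə"


if __name__ == "__main__":
    print(group_phonological_words("ото все́х"))
    print(reduced_a(2, word_initial=True), reduced_a(1))
```

Under these assumptions, both vowels of ото in ото все́х come out as [ɐ]: the first because it is word-initial, the second because the particle is grouped with все́х and so stands immediately before the stress.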

муравей

I wonder why you marked as "inherited" a term that is apparently not even supposed to have an entry? —CodeCat 20:29, 9 December 2015 (UTC)Reply

@CodeCat, Wikitiki89 This entry was done with a bot. I think what Wikitiki did was reasonable. Benwing2 (talk) 12:15, 10 December 2015 (UTC)Reply

Module error at Template:ru-verb

This looks like a side effect to an edit at Module:ru-headword. Please check into it. Thanks! Chuck Entz (talk) 00:19, 11 January 2016 (UTC)Reply

@Chuck Entz Fixed. Benwing2 (talk) 04:26, 11 January 2016 (UTC)Reply

Auto-accenting

I have explicitly opposed accenting, especially auto-accenting, direct quotations in discussions with you in the past. Was there any recent discussion about auto-accenting that I missed before you ran your auto-accent bot? --WikiTiki89 22:22, 11 January 2016 (UTC)Reply

@Wikitiki89 Why do you oppose? --Anatoli T. (обсудить/вклад) 22:27, 11 January 2016 (UTC)Reply
@Atitarev: I've already explained this in many discussions in the past: We shouldn't modify direct quotations. But the point is that Benwing should have discussed this before doing a bot run, especially (but not only) because he should have known that I expressed my opposition to this in the past. --WikiTiki89 22:31, 11 January 2016 (UTC)Reply
Ah, you're talking about quotations. I thought you oppose links to lemmas in inflected forms. --Anatoli T. (обсудить/вклад) 22:39, 11 January 2016 (UTC)Reply
Yes, I said "direct quotations". Here's an example edit: diff. But even ordinary links I would have liked to discuss before the bot run. --WikiTiki89 22:59, 11 January 2016 (UTC)Reply
@Wikitiki89 Apologies! I don't remember those discussions, otherwise I wouldn't have done it. I did an auto-accenting run once before (including quotations, I think; at the very least, my old auto-accenting script didn't have provisions to avoid direct quotations). I remember discussing it with you guys and getting assent to do it before I did it last time, that's why I didn't ask again. But it's quite possible that I somehow overlooked your opposition to auto-accenting direct quotes last time I did it. In any case, if you want, I'll write a script to undo the auto-accenting of direct quotes. If you want this done, it should probably be done to {{ux}}, {{usex}} and {{lang}}, because that's generally how direct quotes are formatted. Benwing2 (talk) 00:23, 12 January 2016 (UTC)Reply
BTW, note that quite a lot of existing quotations have accents in them (i.e. before my auto-accenting run), which almost certainly weren't in the original. I'm not quite sure why you oppose adding accents to direct quotes (we're a dictionary, after all), but I'm fine with reverting to the status quo ante. (Do you also oppose adding ё where it should be?) Benwing2 (talk) 00:27, 12 January 2016 (UTC)Reply
Benwing2, if you revert the accents, please restore any manual (accented) transliterations there may be.
I still think it's silly to try to preserve unaccented original text in dictionaries. In texts for foreigners and children accents and "ё" are used. Besides, nobody quotes Pushkin's or Lermontov's original orthography. Chinese book publishers would have to cancel character simplification if they were to preserve the original spellings. --Anatoli T. (обсудить/вклад) 00:48, 12 January 2016 (UTC)Reply
@Benwing2: It would be great if you could revert your bot's additions, you don't have to remove them if it wasn't your bot that added them. Keep in mind that bot edits should not be controversial, even if manual edits can be. And yes, I also oppose adding ё when it was not in the source. I thought I had mentioned that accents should not be added to quotations when you first brought up this script here, but I guess my memory must be failing me. Anyway, here is a recent BP discussion we had on the topic, in which there was no strong consensus, but overall opinion was seemingly leaning toward not modifying quotations. @Atitarev: You made those same arguments in the BP discussion I just linked to, and I already responded to them there. I suggest you re-read my comments there. --WikiTiki89 01:14, 12 January 2016 (UTC)Reply
@Wikitiki89, Atitarev Wikitiki, as you might surmise, I agree with Anatoli, but I will revert, per bot policy. Anatoli, it will revert exactly to the state it was before my bot changed it, including any manual transliterations. (It textually substitutes the old text for the new one.) Benwing2 (talk) 04:05, 12 January 2016 (UTC)Reply
@Wikitiki89, Atitarev There were over 26,000 substitutions made by my auto-accenting script. Of those, about 1,000 involve one of the following four templates: {{ux}}, {{lang}}, {{usex}}, {{ru-ux}}. The large majority of these 1,000 aren't direct quotes, but other sorts of illustrating phrases or sentences. I'd like to only revert the actual direct quotes. The full set of 1,000 or so replacements involving those four templates is in User:Benwing2/ru-maybe-direct-quote. Can the two of you help go through this? The thing to do is to find and delete the lines that are direct quotes. Any remaining line will be left alone and not reverted. (It might be more logical to reverse things and have you delete the lines that aren't direct quotes, but there are many more of those.) Benwing2 (talk) 07:15, 12 January 2016 (UTC)Reply
@Wikitiki89, Atitarev BTW, if you put a line at the end of where you've gotten, it will help you and others not duplicate work. Benwing2 (talk) 07:17, 12 January 2016 (UTC)Reply
@Atitarev, Cinemantique, Wikitiki89 I've gotten through page 7502 (end of е section), and put ?? in a couple of places where I wasn't sure whether they were direct quotes or not. There are only a few direct quotes here, so going through them is fast. Benwing2 (talk) 10:22, 12 January 2016 (UTC)Reply
Thanks. Somehow I don't feel like going through big lists right now. I had a hard day. If there's no hurry I'll do some over time. Even if you revert them all I won't complain. It's too much work :).--Anatoli T. (обсудить/вклад) 10:48, 12 January 2016 (UTC)Reply
Instead of manually going through the lists, you can distinguish usage examples from citations by their context:
# definition
#: usage example
# definition
#* citation
#*: direct quotation
I can live with any errors remaining after that distinction is made. --WikiTiki89 14:02, 12 January 2016 (UTC)Reply
@Atitarev: I think a good win-win solution would be to develop a template (or additional feature in the {{usex}}/{{ux}} template) that (with the help of JavaScript) would allow readers to switch between the original text and the fully accented text (perhaps even with links). But this would mean having to manually input both versions of the text (because the original text may or may not already have sporadic accents). It could even allow showing and hiding the transliteration. --WikiTiki89 22:16, 12 January 2016 (UTC)Reply
@Wikitiki89: Accented original texts are extremely rare. I think fully accented text should be added and converted to unaccented + ё -> е to get the original text. That method should also work for accented Arabic and the Japanese furigana. Accented Hebrew maybe not. Chinese already uses a semi-automated conversion. --Anatoli T. (обсудить/вклад) 22:37, 12 January 2016 (UTC)Reply
@Atitarev: Fully accented texts are of course rare, but sporadic accents are not so rare. A text that otherwise does not use "ё" might still occasionally use it in a few places for disambiguation. This is a bit rarer with stress marks, but I have seen it. In Arabic, texts very frequently add fatḥatān or šadda where applicable, or ḍamma to mark the passive. In Hebrew, I have seen רוצָה to make clear that it is feminine (see the first quotation at קומבינה). But because there isn't much consistency to this, there is no way that an automated accent remover can predict which accents were in the text and which ones weren't. But I believe I've already mentioned these things to you before. Do you still have anything against this idea? --WikiTiki89 22:54, 12 January 2016 (UTC)Reply
@Wikitiki89 I think we can work on the assumption that the original text is unaccented. Otherwise, some overrides should be required - more complexity. As for Hebrew, I meant cases when basic letters in the accented text don't match the number of letters in the unaccented variant. I can't think of an example but I hope you know what I mean. --Anatoli T. (обсудить/вклад) 00:18, 13 January 2016 (UTC)Reply
That's exactly the problem. I don't think we should work on any assumptions at all about the original text. There's no need for overrides either, I don't know why you're making this more complex than it was. It's pretty simple to just provide both the original text and the fully accented and linked text. --WikiTiki89 00:25, 13 January 2016 (UTC)Reply
IMO it's a pain in the ass to have to provide two versions of every text. And it's not necessary either in most cases. It should be set up so you provide the fully accented text, and it automatically derives the unaccented text unless you provide it yourself. I think this is what Anatoli is saying too, and it's compatible with having both versions available. It's similar to what's done now where the linked version of the text is derived from the accented version by default, but you can supply both if you want. Benwing2 (talk) 02:45, 13 January 2016 (UTC)Reply
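The default proposed in the comment above (supply the fully accented text, and derive the unaccented text from it unless an original is given explicitly) might look like the following Python sketch. The function names are hypothetical and no such template feature exists in this discussion; the explicit override is there for originals that carry sporadic accents or ё of their own, which cannot be recovered automatically.

```python
# Sketch of deriving an "original" (unaccented) text from accented text.
# Function names are hypothetical; this is not an existing template feature.
import unicodedata

ACUTE = "\u0301"  # combining acute accent, used as the stress mark
GRAVE = "\u0300"  # combining grave accent, used for secondary stress


def strip_stress_marks(text):
    """Remove stress marks while keeping letters such as й and ё intact."""
    # Decompose so that any precomposed accented letters are split up,
    # drop only the acute and grave marks, then recompose the rest.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = decomposed.replace(ACUTE, "").replace(GRAVE, "")
    return unicodedata.normalize("NFC", stripped)


def derive_original(accented, original=None):
    """Return the explicit original if supplied, else derive a plain one."""
    if original is not None:
        return original
    plain = strip_stress_marks(accented)
    # Assumption: the source text did not use ё.
    return plain.replace("ё", "е").replace("Ё", "Е")


if __name__ == "__main__":
    print(derive_original("дава́йте"))
    print(derive_original("всё", original="всё"))
```

Only the two stress diacritics are removed, so the breve of й survives the round trip through decomposition.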
Quotations take a good deal of work to prepare anyway. The only extra step here is to copy and paste the text before adding accents, which is insignificant compared to the work involved in transcribing, citing, and adding accents. Or if you were to do it by bot, the bot would do that part for you. The concern with duplication is not the effort required to press Ctrl+C and Ctrl+V, but that the cost of maintenance is doubled. Luckily, quotations don't need much maintenance. Anyway, if this feature were to be implemented, no one's gonna force you to use it if you don't want to. But more importantly, did you read my comment above (14:02 UTC)? --WikiTiki89 03:25, 13 January 2016 (UTC)Reply
I still disagree; there's no point in duplication if it can be avoided. I did see your comment above (14:02 UTC) and I've been thinking today about how to implement it. It might end up taking less time for me to just go through the remainder of the list manually than to write the code to locate the quotations; on the other hand doing so will make life easier if I ever do another auto-accenting run. Benwing2 (talk) 03:28, 13 January 2016 (UTC)Reply
If it helps, I think you can simplify the rule to simply check whether the line begins with #* (with or without a subsequent string of colons). I'm also not a fan of duplication, but I see this as the only viable solution for the general case (in specific cases, we may be able to avoid the duplication). Keep in mind that I still see the original text as the main focus of the quotation and the default display, the accented version is only a bonus, and does not have to be added for every quotation. --WikiTiki89 03:53, 13 January 2016 (UTC)Reply
Well, we disagree also in whether the focus should be on the original or accented version, so it's not clear we can find a template that will satisfy both of us. As for checking for quotations, I think I can just use a capturing regex split and it will make it easy to check the context and also snarf the template. I don't think just checking for #* + colons is necessarily enough because some direct quotes are made to be always visible, but it might catch the large majority of them. I'll also look for ref= in the {{ux}} or {{usex}}. Benwing2 (talk) 04:00, 13 January 2016 (UTC)Reply
@Wikitiki89 I undid auto-accenting of quotes. It ended up a lot easier than I thought. I just looked for #* + possible colons, and this seemed to have caught most everything. If you see any other quotes that need fixing, let me know. Benwing2 (talk) 21:00, 13 January 2016 (UTC)Reply
I think that's good enough as a general rule for future auto-accenting bot runs. If a quote is not correctly placed under a bullet, that is the problem of whoever added it. --WikiTiki89 21:32, 13 January 2016 (UTC)Reply
I just remembered about subsenses. If you ever plan on running your auto-translit bot again, make sure it takes into account that there could be more than one hash mark before the asterisk. --WikiTiki89 21:44, 21 January 2016 (UTC)Reply
@Wikitiki89 OK. Benwing2 (talk) 21:58, 21 January 2016 (UTC)Reply
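The rule settled on in this thread, treating a line as a quotation when it starts with one or more hash marks followed by an asterisk and optional colons, can be sketched in Python as below. This is not the bot's actual code; the function names are hypothetical, and only the line prefix is checked, so quotations placed outside a #* bullet are not detected.

```python
# Sketch of the "#* plus optional colons" rule; not the bot's actual code.
import re

# One or more '#' (to allow for subsenses), then '*', then any colons.
QUOTATION_PREFIX = re.compile(r"^#+\*:*")


def is_quotation_line(line):
    """True for citation and direct-quotation lines such as '#*' or '##*:'."""
    return QUOTATION_PREFIX.match(line) is not None


def lines_safe_to_accent(wikitext):
    """Return the lines an auto-accenting run would be allowed to touch."""
    return [line for line in wikitext.splitlines()
            if not is_quotation_line(line)]


if __name__ == "__main__":
    sample = "\n".join([
        "# definition",
        "#: usage example",
        "#* citation",
        "#*: direct quotation",
        "##*: quotation under a subsense",
    ])
    for line in lines_safe_to_accent(sample):
        print(line)
```

Usage examples (lines beginning with #:) pass through, while everything under a #* bullet, including subsense quotations with extra hash marks, is left untouched.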

игого and огого

Hi,

Could you add two more exceptions to [ɡ] pronunciations (translit as "g"), please - interjections игого́ (igogó) (imitates the sound of horse neighing) and огого́ (ogogó) (variant of ого́ (ogó))? --Anatoli T. (обсудить/вклад) 01:45, 20 January 2016 (UTC)Reply

Automatic vocalization

Automatic vocalization and transliteration of loanwords in Arabic is usually wrong and unnecessary. I advise you to exclude those from your robot. --Mahmudmasri (talk) 07:36, 26 January 2016 (UTC)Reply

@Mahmudmasri The automatic transliteration may be incorrect in some instances (because there can be a variant pronunciation) but it may be overridden manually with "tr=". Manual takes precedence over automatic. I totally disagree about the vocalisation. It's the native method of showing vocalisation, e.g. مِتْرُو m (metru, metro). Arabic sources don't provide either IPA or a Roman transliteration. The only verifiable source may be recorded evidence from a media report. --Anatoli T. (обсудить/вклад) 10:06, 26 January 2016 (UTC)Reply
@Benwing2 This message was meant for you but you probably didn't get a notification on this account.--Anatoli T. (обсудить/вклад) 20:45, 26 January 2016 (UTC)Reply
Anatoli, thanks for the ping. @Mahmudmasri I did that run 6 months ago and I probably won't do another such run, so don't worry. I agree with Anatoli that we should try to provide transliteration where possible. In any case, those transliterations were already there. I don't think the vocalization of loanwords is necessarily a problem; I agree that sometimes there are variant pronunciations but those mostly concern issues that won't be reflected in the vocalization (e vs. i, o vs. u, long vs. short vowels), and didn't prevent Hans Wehr from giving pronunciations for loanwords. Benwing2 (talk) 22:22, 26 January 2016 (UTC)Reply

Wiktionary:Beer parlour/2016/January#Arabic loanwords and vocalisations. --Mahmudmasri (talk) 20:36, 6 February 2016 (UTC)Reply

@Benwing2 --Anatoli T. (обсудить/вклад) 00:50, 7 February 2016 (UTC)Reply

New challenges?

I would be bored if I only had to work with Russian, besides, I hate Russian politicians and the way things are going there but I work with Russian because I can. Anyway, are you interested in working with other complex languages? You've already done some amazing work with Arabic and Russian. Pity you stopped working with Arabic. Loanwords could be improved for words in Hans Wehr. What do you think of partially automating Thai transliteration? It's much less predictable than its relative Lao and is more complicated (more letters and consonant clusters). The challenge is not only to transliterate but also to provide tones, which the Lao module doesn't do (but could). I've got only basic Thai but I am keen to improve it. We also have a couple of active or semi-active editors who know Thai. Resources are not great but can be used to some extent.--Anatoli T. (обсудить/вклад) 06:51, 27 January 2016 (UTC)Reply

I stopped working with Arabic for various reasons. I didn't really have enough knowledge of the language to feel competent to add new entries, and there are tons of missing entries, and I was concerned about copyright violation if I just use what Hans Wehr's dictionary says (there are out-of-copyright Arabic dictionaries but they are unreliable or hard-to-use, full of outdated senses and missing many modern senses). But also, there weren't any other active native-speaker editors working on the language, so I felt I was working more-or-less blindly. And there wasn't anything like ruwikt that I could find -- the Arabic Wiktionary is terrible. As for Thai, I'm about to start a new job so I'm wary of diving into a new language at the moment, but it's something I'll definitely consider for the future. Benwing2 (talk) 08:13, 27 January 2016 (UTC)Reply
Yes, sure. I can't afford too much time either and for this I would need some concentration and my books. Before you even consider this project (if you decide to) you would need to do a feasibility study. It may turn out too hard or impossible for objective reasons. --Anatoli T. (обсудить/вклад) 09:29, 27 January 2016 (UTC)Reply

algazarra, algazara

Do we have entries for the Arabic origins? Transliteration is usually given as alḡazara, pertaining from ḡazārah (abundance). Thanks. – Jberkel (talk) 23:01, 27 January 2016 (UTC)Reply

We don't currently have an entry for غزارة. It's been a long time since I've created any Arabic entries; I might be able to manage this though. Benwing2 (talk) 00:40, 28 January 2016 (UTC)Reply
No rush. It already helps to have the correct lemma to link to. Jberkel (talk) 12:19, 28 January 2016 (UTC)Reply
@Jberkel I've started غَزَارَة (ḡazāra) using Benwing's templates and modules. الغَزَارَة (al-ḡazāra) is its definite form. The entry can be checked and improved but I think it's correct. --Anatoli T. (обсудить/вклад) 12:45, 28 January 2016 (UTC)Reply
Thanks! About the definite form, this is the one which should be displayed (3rd parameter to {{m}}) in etymologies, when al- is part of the derived word? Jberkel (talk) 13:38, 28 January 2016 (UTC)Reply
Thanks, Anatoli! Benwing2 (talk) 13:47, 28 January 2016 (UTC)Reply
You're welcome. @Jberkel It's up to you, I don't think we have a policy on that. Same with azáfama, definite Arabic form is الزَحْمَة (az-zaḥma). --Anatoli T. (обсудить/вклад) 22:13, 28 January 2016 (UTC)Reply
Yeah, I agree with Anatoli, we don't seem to have a policy, but I think it's a good idea. Benwing2 (talk) 22:20, 28 January 2016 (UTC)Reply
It depends what you want to show and how much information is reasonable to provide. In case of غَزَارَة (ḡazāra), users might also want to know that "ال" is just a definite article, not part of the word. Interesting enough, English borrowing from Arabic Riyadh is without the article, but Russian Эр-Рия́д (Er-Rijád) matches closer the original اَلرِّيَاض (ar-riyāḍ). --Anatoli T. (обсудить/вклад) 00:39, 29 January 2016 (UTC)Reply

Hello, Benwing2, all right? About "egl" and "eml" abbreviations...

I noticed that in the page "", since I wanted to write about the Emilian-Romagnol language, I had to write the abbreviation "egl" (which usually refers only to the Emilian language) instead of "eml" (the Emilian-Romagnol language). Do you think I'll have to write "egl" again in the future, or is there something to fix upstream? Thank you in advance, --Gloria sah (talk) 16:39, 2 February 2016 (UTC)Reply

Appendix:Russian pronunciation

Hi,

Are you interested in expanding this a bit and describing what you learned about the Russian phonology? I meant to do it but procrastinated. Otherwise we just link it to Wikipedia. --Anatoli T. (обсудить/вклад) 12:09, 3 February 2016 (UTC)Reply

I'll try to add to it but I think linking to Wikipedia is a good idea, looks like it's already done in fact. Benwing2 (talk) 17:12, 9 February 2016 (UTC)Reply

Russian 5a verbs that start with вы́-

It looks like the imperative verb forms should have -и(те) instead of -ь(те). See ru-wikt versions of предвидеть and вылететь. --KoreanQuoter (talk) 15:03, 9 February 2016 (UTC)Reply

@Atitarev Also summoning our prijatel'. --KoreanQuoter (talk) 15:04, 9 February 2016 (UTC)Reply
Thanks! (предвидеть is already correct, maybe you meant another verb?) Luckily there are only 5 verbs involved here (выглядеть, выгнать, выдержать, вылететь, выстоять) and one of them (выгнать) is already correct. I think this is fixable by adding the и argument to the other verbs. We'll have to delete the bad forms but it should only be 8 pages. Benwing2 (talk) 17:03, 9 February 2016 (UTC)Reply
Thanks for spotting! Good job, Taeho! Fixed the tables, sorry. Some forms need to be deleted and some regenerated, some have more senses - indicative and imperative.--Anatoli T. (обсудить/вклад) 20:01, 9 February 2016 (UTC)Reply
Also fixed выкипеть. This really bugged me for several days. --KoreanQuoter (talk) 00:11, 10 February 2016 (UTC)Reply

pings

Hi,

I just want to make sure you got the answers to all your latest questions (and making changes accordingly). I didn't ping you in my answers but I hope you're keeping track of your questions and answers to them. :) --Anatoli T. (обсудить/вклад) 02:04, 4 March 2016 (UTC)Reply

Yup, I saw all of them and fixed things up. Thanks! Benwing2 (talk) 02:56, 4 March 2016 (UTC)Reply

Fun With Parameter 6

Please take a look at Cat:E. Chuck Entz (talk) 04:45, 9 March 2016 (UTC)Reply

Thanks, I'm fixing them right now. Benwing2 (talk) 04:46, 9 March 2016 (UTC)Reply

Clean up

I think there is a good need to clean up User:Benwing2/russian-freq-redlinks. I think I have to contribute russian-freq-redlinks even more. --KoreanQuoter (talk) 05:11, 25 March 2016 (UTC)Reply

I'll rerun it soon. BTW if you want to contribute entries you should do it from the top. Benwing2 (talk) 05:14, 25 March 2016 (UTC)Reply
I haven't touched the list for a while, so I'll remind myself of that. Thank you. --KoreanQuoter (talk) 05:19, 25 March 2016 (UTC)Reply
@KoreanQuoter I have to ask you ... why do you add obscure and/or obsolete terms, like претвори́ться (pretvorítʹsja)? Are you finding them among stuff you're reading? (My dictionary says претворить is obsolete and means "change, transform"; претвориться is presumably the intransitive equivalent.) IMO it would be much more helpful to add common terms. The obscure terms are filling up Category:Russian entries needing definition, making it rather less useful. Benwing2 (talk) 06:30, 25 March 2016 (UTC)Reply
If you're looking for common terms, start at entry 8000 in User:Benwing2/russian-freq-redlinks and go through the adjectives that are red. Benwing2 (talk) 06:31, 25 March 2016 (UTC)Reply
I've been very busy in real life for the past several months and I just can't concentrate distinguishing whether they're common or not. I've been thinking of 2-3 months of Wikibreak or just contributing only in the weekends, but I declined. So my editing style is literally messed up at this point. --KoreanQuoter (talk) 08:15, 25 March 2016 (UTC)Reply
@KoreanQuoter OK. Sorry to be short with you. It seems in any case that претвори́ться (pretvorítʹsja) might not be obsolete after all. But I'd still recommend choosing terms based on the top of the frequency list. Benwing2 (talk) 08:16, 25 March 2016 (UTC)Reply
Even if I don't make new lemma entries, I would often "organize" related terms of a single lemma entry. --KoreanQuoter (talk) 08:20, 25 March 2016 (UTC)Reply

Request

Hello, I've seen that you started a new discussion on the Beer parlour, so I think you're quite an expert user of Wiktionary and like to discuss. Would you like to join this discussion? It's about the improper use of asterisks for Italian words with "syntactic gemination", introduced by an Italian user without asking anyone's opinion and without a consensus, but admins say that now a consensus is needed to remove them since nobody noticed them and said anything about them during the last months. So far, the few users who commented agreed that the asterisk symbol shouldn't be used, but I think that we need more users to say that the community reached a consensus... If you want to give your opinion, you're welcome to the talk!

Hi. You should create a user account to make it easier to respond to you and such; you'll also probably get more respect that way. I did see the discussion. I'm not sure what the correct answer is; I don't work on Italian. Benwing2 (talk) 00:56, 12 April 2016 (UTC)Reply

Okay, no problem!

In case you weren't already aware of this...

See 6 Russian entries in Cat:E. Thanks! Chuck Entz (talk) 03:20, 15 April 2016 (UTC)Reply

@Chuck Entz Oops. Somehow I haven't been checking for errors lately. How long have they been there? Must have been awhile ... Benwing2 (talk) 03:28, 15 April 2016 (UTC)Reply
A couple of days. With only 6, I figured I could wait to see if they might get fixed in the course of your ongoing edits to the module. Chuck Entz (talk) 03:34, 15 April 2016 (UTC)Reply
ОК thanks. They're all real errors of mine, fixed now. Benwing2 (talk) 03:35, 15 April 2016 (UTC)Reply

template error in Arabic jayb

In the Related Terms of Arabic jayb (I don't know how to link directly to the Arabic script: it's linked from English sine), the template has two square brackets before the word 'sinuses' but only one after, so it's not showing up correctly. This was an edit done by your little AnthroPC, so it might need some retraining on this kind of link. -46.226.49.233 10:38, 15 April 2016 (UTC)Reply

Thanks. I fixed it. This was actually an error in a list I compiled by hand, which the bot then propagated; so it's not a programming error. Benwing2 (talk) 18:48, 15 April 2016 (UTC)Reply

Relocation of "was wotd"

Hi, hope you're well! I was wondering if you could do a bot run to make the following simple correction, changing:

{{was wotd|2016|May|11}}
==English==

to:

==English==
{{was wotd|2016|May|11}}

(The date is just an example.) There are a number of occurrences where the {{was wotd}} template was placed outside rather than within the "English" section, where it should be. Thanks. — SMUconlaw (talk) 10:49, 11 May 2016 (UTC)Reply

@Smuconlaw I'll try to get to this soon, shouldn't be too hard. Benwing2 (talk) 06:08, 13 May 2016 (UTC)Reply
Sure, no rush! Thanks. — SMUconlaw (talk) 09:15, 13 May 2016 (UTC)Reply
Have you got some time to work on this? — SMUconlaw (talk) 17:22, 8 September 2016 (UTC)Reply
@Smuconlaw Oops, sorry! I totally forgot about this. I'll try to get to it tonight or tomorrow, it should be pretty easy. Benwing2 (talk) 17:45, 8 September 2016 (UTC)Reply
No worries. Thanks! — SMUconlaw (talk) 19:10, 8 September 2016 (UTC)Reply
@Smuconlaw Sorry again, I got sidetracked. Will do ASAP. Benwing2 (talk) 18:58, 30 September 2016 (UTC)Reply
@Smuconlaw Done. Benwing2 (talk) 03:03, 1 October 2016 (UTC)Reply
Thank you! — SMUconlaw (talk) 10:32, 1 October 2016 (UTC)Reply
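
For reference, the relocation requested in this thread can be sketched as a single substitution. This is a minimal illustration and not the bot code that was actually run: the function name `move_was_wotd` is hypothetical, and it assumes the template sits on the line immediately before the `==English==` header with no nested templates among its parameters.

```python
import re

# A {{was wotd|...}} line directly followed by the ==English== header line.
WOTD_BEFORE_ENGLISH = re.compile(
    r"^(\{\{was wotd\|[^{}\n]*\}\})[ \t]*\n(==English==)[ \t]*$",
    re.MULTILINE,
)

def move_was_wotd(text):
    """Swap the two lines so that the template ends up inside the English section."""
    return WOTD_BEFORE_ENGLISH.sub(r"\2\n\1", text)
```

Pages where the template is already below the header do not match the pattern and are returned unchanged.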

I was off the grid

By the way, I hadn't been contributing a lot in Wiktionary due to real life issues. Anyways, how is Anatoli? --KoreanQuoter (talk) 01:48, 19 May 2016 (UTC)Reply

He's been away for awhile. I've been pretty busy, with Wanjuscha's help. We've finished all the verbs in the 20,000-word frequency list and are working on the adjectives in the 12,000-12,999 range. Benwing2 (talk) 02:07, 19 May 2016 (UTC)Reply

Module:ru-pron

Please see CAT:E. I notice that all of them have at least one space in them. Thanks! Chuck Entz (talk) 03:43, 2 July 2016 (UTC)Reply

@Chuck Entz Oops. My recent code is problematic with multiword expressions. I'm fixing it now. Benwing2 (talk) 05:32, 2 July 2016 (UTC)Reply

Elision of hamzat al-wasl after vowels

I added these test cases to Module:ar-translit. I think it would be a good idea for it to work that way. If elision is desired, the vowel is not written. This solves the issues mentioned in the comments of the module. Could you make it work if you have time? --WikiTiki89 15:43, 6 July 2016 (UTC)Reply

OK, I'll take a look at it. Benwing2 (talk) 15:54, 6 July 2016 (UTC)Reply

Сан-Диего

We have a problem with Сан-Диего. The г automatically becomes v. --KoreanQuoter (talk) 15:38, 7 July 2016 (UTC)Reply

@KoreanQuoter Good pickup! Диего should also be added as an exception.--Anatoli T. (обсудить/вклад) 20:57, 7 July 2016 (UTC)Reply
Fixed. Benwing2 (talk) 02:35, 8 July 2016 (UTC)Reply
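
The kind of exception list discussed here can be sketched as follows. This is an illustration only, not the logic of the actual Russian modules: the names `G_EXCEPTIONS`, `g_is_v` and `check_phrase` are hypothetical, the set holds only words named on this page (игого, огого, Диего), and the rule is simplified to "a final -ого/-его is read with [v] unless the word is listed".

```python
import re
import unicodedata

# Words named on this page whose final -ого/-его keeps [g]; not a complete list.
G_EXCEPTIONS = {"игого", "огого", "диего"}

def strip_stress(word):
    """Remove combining acute accents (stress marks) and return NFC text."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch != "\u0301")
    return unicodedata.normalize("NFC", kept)

def g_is_v(word):
    """True if the final -ого/-его of this single word should be read with [v]."""
    plain = strip_stress(word).lower()
    if plain in G_EXCEPTIONS:
        return False
    return plain.endswith(("ого", "его"))

def check_phrase(phrase):
    """Apply the rule to each part of a hyphenated or multiword term."""
    return {part: g_is_v(part) for part in re.split(r"[\s-]+", phrase) if part}
```

Splitting on hyphens first matters for a case like Сан-Диего, where the exception applies to only one part of the term.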

French testcases module

(moved to Module talk:fr-pron)

-ille

(moved to Module talk:fr-pron)

опер1шийся

I could be wrong, but this doesn't quite look like a form of опере́ться (operétʹsja)... Chuck Entz (talk) 03:03, 6 September 2016 (UTC)Reply

No, you couldn't be wrong. @Benwing2 --WikiTiki89 03:26, 6 September 2016 (UTC)Reply
Oops. I deleted this earlier. There's a module bug in generating this participle of опереться and запереться. I've been putting off fixing it for a while but it needs fixing now. Benwing2 (talk) 03:33, 6 September 2016 (UTC)Reply
@Chuck Entz, Wikitiki89 Benwing2 (talk) 03:33, 6 September 2016 (UTC)Reply
I fixed the forms, and the module fix is in the pipe. Benwing2 (talk) 04:55, 6 September 2016 (UTC)Reply
опереться still shows the incorrect form. --Anatoli T. (обсудить/вклад) 05:17, 6 September 2016 (UTC)Reply
Right, that's because I haven't copied over the module in Module:User:Benwing2/ru-verb to Module:ru-verb. I turned on the test-new-module feature, which compares the output of the two for all verbs, and I'm waiting till all the verbs get refreshed, which should be done soon. Benwing2 (talk) 05:23, 6 September 2016 (UTC)Reply
Thank you. --Anatoli T. (обсудить/вклад) 05:28, 6 September 2016 (UTC)Reply

газировать (gazirovatʹ), пузыриться (puzyritʹsja)

These Russian terms are marked as alternative forms of the same title. Is this correct? The only thing I notice that's different is the pronunciation. DTLHS (talk) 00:24, 24 September 2016 (UTC)Reply

I did this intentionally because the stress patterns and conjugations of the two alternants differ. Logically, there's no reason they would need to be spelled the same way. Benwing2 (talk) 00:28, 24 September 2016 (UTC)Reply
They share the etymology. Just different pronunciations. I think they should be split by pronunciation sections only.--Anatoli T. (обсудить/вклад) 00:58, 24 September 2016 (UTC)Reply
I think the way I've handled it now is better (diff, diff). Incidentally, I think it's a bit redundant in the conjugation template to show "2a // 2a". Perhaps it should be smart enough to figure out that they're the same. --WikiTiki89 15:25, 26 September 2016 (UTC)Reply
Looks OK to me and I fixed the "2a // 2a" bug. Benwing2 (talk) 13:38, 27 September 2016 (UTC)Reply

No-noschwa module errors

You removed the noschwa function from Module:fr-pron without checking to see what invoked it, leading to a number of module errors in entries using {{fr-IPA}}. Please fix. Thanks! Chuck Entz (talk) 21:56, 30 September 2016 (UTC)Reply

Oops, sorry! My bad. Fixed now. Benwing2 (talk) 22:08, 30 September 2016 (UTC)Reply
Quickly, too! Thanks! Chuck Entz (talk) 22:32, 30 September 2016 (UTC)Reply
There's a bunch of French headword module errors- something to do with sort keys. DTLHS (talk) 22:46, 22 October 2016 (UTC)Reply
@DTLHS Oops! Fixed. Benwing2 (talk) 22:49, 22 October 2016 (UTC)Reply

Splitting etymologies for nouns and verbs

Just so you know, I agree with you that etymologies for nouns and verbs should not be split as long as they are almost the same. Thus, in these cases, I find it unfortunate to create two etymology sections in an entry layout. What I do find acceptable is to create a separate bullet "(noun) From ..." in the same etymology section in addition to "(verb) From ..." bullet, but even that could be overkill. --Dan Polansky (talk) 18:27, 8 October 2016 (UTC)Reply

Sorry, what is this in reference to? Benwing2 (talk) 18:29, 8 October 2016 (UTC)Reply
This is in reference to a note I thought I saw you make in a Beer parlour discussion, although I cannot quickly find it any more. No action is required; I just wanted to say I agree with you. --Dan Polansky (talk) 19:42, 8 October 2016 (UTC)Reply
OK. I presume you're referring to English in particular. I'm not sure if I actually made that comment because I think we should often have two etym sections for related nouns vs. verbs, esp. if they go back to separate words in Old English. I agree it's more arguable if one of the two is a recent formation from the other, e.g. noun invite vs. verb invite. Benwing2 (talk) 19:51, 8 October 2016 (UTC)Reply
Well, maybe it was someone else. Next time around, I should have a specific link handy. --Dan Polansky (talk) 20:39, 8 October 2016 (UTC)Reply

More pronunciation modules

Are you interested in making more of these? Specifically, I'm interested in working on MOD:ny-IPA for Chichewa, as part of my infrastructural preparation for creating a large number of entries in the language. I just don't have the skill to create it myself, but I can write out a series of rules to be executed. If you're too busy, however, I understand that. Thanks! —Μετάknowledgediscuss/deeds 22:46, 16 October 2016 (UTC)Reply

If you write out the rules I can look into implementing them. Benwing2 (talk) 01:28, 17 October 2016 (UTC)Reply
I realised that I can write them out if that's helpful, but the table at w:Chewa language#Consonants pretty much covers it all (ignore the placeholder vowels after each consonant's orthographic form in the table). The template will have to be provided with a respelled version of the word; in most cases, it will be the same as what is on the headword-line, so with acute accents /á é í ó ú ḿ/ for high tone (which can just be given thus in the IPA); one extra thing is <m'>, which should be syllabic /m/ (rather than forming a digraph with whatever comes after it). The vowels <a e i o u> are the same in IPA. The only other things to note are that <zy> should be /ʒ/, <ŵ> should be /w(ᵝ)/, and <w> in the combinations <awu ewu iwu owa uwa> should be /(w)/. —Μετάknowledgediscuss/deeds 03:37, 17 October 2016 (UTC)Reply
Oh, and syllables always end in a vowel, except for syllabic m (which will always be respelt as <m'> or <ḿ> when fed to the template). Stress is always penultimate, and no secondary stress needs to be marked. —Μετάknowledgediscuss/deeds 03:40, 17 October 2016 (UTC)Reply
OK, thanks. I'll look into it when I have a chance. Benwing2 (talk) 03:44, 17 October 2016 (UTC)Reply
I've had a start at Module:ny-IPA but I'm feeling a bit out of my depth- do you think you could finish it off? DTLHS (talk) 00:13, 13 November 2016 (UTC)Reply
@DTLHS Sorry for the delay! I will take a look at this. Benwing2 (talk) 15:19, 30 January 2017 (UTC)Reply
It's finished now- thanks though! DTLHS (talk) 16:50, 30 January 2017 (UTC)Reply
@Metaknowledge Likewise. Benwing2 (talk) 15:19, 30 January 2017 (UTC)Reply
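
The rules spelled out in this thread can be sketched in a few lines. This is not the finished Module:ny-IPA (which is written in Lua) and it deliberately leaves out the consonant table referred to above; it covers only the rules stated here: `m'` as syllabic m, `zy` as /ʒ/, `ŵ` as /w(ᵝ)/, the optional `w` in the five listed combinations, vowel-final syllables, and penultimate stress. The function name `respelling_to_ipa` is hypothetical, and only the ASCII apostrophe is recognised in `m'`.

```python
import re
import unicodedata

VOWELS = unicodedata.normalize("NFC", "aeiouáéíóú")
SYLLABIC_M = "m\u0329"   # m with the IPA syllabicity mark
HIGH_M = "\u1e3f"        # precomposed m with acute: high-tone syllabic m

# A syllable is a syllabic m, or any run of non-vowels followed by one vowel.
SYLLABLE = re.compile(f"{SYLLABIC_M}|{HIGH_M}|[^{VOWELS}]*[{VOWELS}]")

def respelling_to_ipa(respelling):
    s = unicodedata.normalize("NFC", respelling.lower())
    s = s.replace("m'", SYLLABIC_M)
    s = s.replace("zy", "ʒ")
    # w is optional in the combinations awu, ewu, iwu, owa, uwa.
    s = re.sub(r"(?<=[aei])w(?=u)|(?<=[ou])w(?=a)", "(w)", s)
    s = s.replace("\u0175", "w(ᵝ)")
    syllables = SYLLABLE.findall(s)
    if len(syllables) >= 2:
        syllables[-2] = "ˈ" + syllables[-2]   # stress is always penultimate
    return "/" + "".join(syllables) + "/"
```

Digraphs such as those in the Wikipedia consonant table would need further replacements before the syllabification step.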

Elative of حي

Just wanted to let you know that your bot has an oversight in generating elative forms (see diff) and forgets to take into account the rule that alif maqsuura becomes a plain alif after yaa'. I only noticed because an anon corrected it today. It's probably not relevant since this bot run was a long time ago and this is a rather rare scenario, but if it were me, I'd want to know. --WikiTiki89 13:33, 4 November 2016 (UTC)Reply
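
The spelling rule mentioned here can be sketched as a post-processing step. This is an illustration under stated assumptions, not the bot's code: the function name `fix_final_alif` is hypothetical, and it assumes that the only material allowed between the yaa' and the final alif maqsuura is a run of short-vowel or sukuun diacritics (U+064B to U+0652).

```python
import re

YEH = "\u064a"            # yaa'
ALIF_MAQSURA = "\u0649"   # alif maqsuura
ALIF = "\u0627"           # plain alif
DIACRITICS = "\u064b-\u0652"

# A word-final alif maqsuura directly after yaa' (optionally with diacritics between).
FINAL_AFTER_YEH = re.compile(f"({YEH}[{DIACRITICS}]*){ALIF_MAQSURA}$")

def fix_final_alif(form):
    """Write plain alif instead of alif maqsuura when it follows yaa'."""
    return FINAL_AFTER_YEH.sub(lambda m: m.group(1) + ALIF, form)
```

Forms whose final alif maqsuura follows any other letter are left untouched.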

Your note on the pronunciation of French sens

Please see this. --Barytonesis (talk) 20:18, 5 January 2017 (UTC)Reply

@Barytonesis I didn't make up the info about the former pronunciation of sens and its occurrence in sens dessus dessous; I'm pretty sure I read these things in "The Romance Languages" by Harris and Vincent. I don't have the book handy now but if I ever unpack my linguistics boxes I'll look in it. What is your evidence that sens dessus dessous comes from c'en dessus dessous (and similarly for sens devant derrière)? That sounds like a folk etymology to me. Benwing2 (talk) 21:01, 5 January 2017 (UTC)Reply
Fair enough. I noticed after I posted my comment that the TLFi agrees with you about that pronunciation. But it also agrees with me that c'en dessus dessous is probably the original spelling ("Les loc. adv. sous B, sont prob. dues à des altér. graph. d'apr. sens de sen, lui-même altér. de cen, contraction de ce en (cf. sen dessus dessouz, mil. xves., Charles d'Orléans, Rondeaux, 98, éd. Champion, p. 404; c'en dessus dessoubz, 1511, Gringore, Farce à la suite du Jeu du Prince des Sots, éd. D'Héricault et Montaiglon, t. 1, p. 281). En a. fr. et m. fr., on rencontre les formes ce devant derriere (1268, Claris et Laris, 11802 ds T.-L., s.v. devant), ce dessus dessoubs (1342, Jehan Bruyant, Pauvreté et Richesse, 30b, ibid., s.v. desus"); that's why your mentioning it to make your point sounded ad hoc to me. --Barytonesis (talk) 22:05, 5 January 2017 (UTC)Reply
@Barytonesis OK. Thanks for the reference. I still think the comment is fairly relevant as the sen -> sens respelling wouldn't have happened if sens had a pronounced final /s/ at the time. Benwing2 (talk) 00:32, 6 January 2017 (UTC)Reply
This might also be relevant: https://offqc.com/2011/02/02/informal-french-expression-pas-de-bon-sens/ Andrew Sheedy (talk) 02:55, 10 January 2017 (UTC)Reply

Share your experience and feedback as a Wikimedian in this global survey

  1. ^ This survey is primarily meant to get feedback on the Wikimedia Foundation's current work, not long-term strategy.
  2. ^ Legal stuff: No purchase necessary. Must be the age of majority to participate. Sponsored by the Wikimedia Foundation located at 149 New Montgomery, San Francisco, CA, USA, 94105. Ends January 31, 2017. Void where prohibited. Click here for contest rules.

Ogoneks in Slovene

What would these be used for? —CodeCat 17:22, 22 January 2017 (UTC)Reply

@CodeCat The vowels indicated e and o in Slovenian are ambiguously either low-mid or high-mid. Derksen 2008 writes ę ǫ for low-mid, ẹ ọ for high-mid. Wikipedia Slovenian phonology appears to use ẹ ọ for high-mid and leave the low-mid unmarked, although that's potentially ambiguous since it might indicate an unspecified quality. I have seen the same notation as Derksen's used elsewhere; I forget where but maybe for Romance languages. Benwing2 (talk) 17:26, 22 January 2017 (UTC)Reply
The practice of Slovene is to indicate high-mid vowels as é ó and low-mid as ê ô. We don't need ogoneks. —CodeCat 17:28, 22 January 2017 (UTC)Reply
According to who is this the practice? This is not consistent with anything I've seen. Also, that notation is problematic in that é and ê are tone marks, not quality marks, and potentially either quality could appear with either tone. Benwing2 (talk) 17:56, 22 January 2017 (UTC)Reply
Not in Slovene. The quality distinction is restricted to accented syllables. See WT:ASL. —CodeCat 18:03, 22 January 2017 (UTC)Reply
Please read that more carefully. You are confusing the stress-based and tonal-based systems. Benwing2 (talk) 18:05, 22 January 2017 (UTC)Reply
Wiktionary exclusively uses the stress-based orthography in links. é indicates long high-mid and ê indicates long low-mid. See WT:ASL. —CodeCat 18:10, 22 January 2017 (UTC)Reply
I disagree with that. In Proto-Slavic entries, the tone of daughter languages is extremely important and must be shown. Benwing2 (talk) 18:20, 22 January 2017 (UTC)Reply
Well, this is the practice for regular dictionary entries, and it follows that of other Slovene dictionaries, so it's not going to change. If you want to use a different notation on Proto-Slavic pages that would only lead to confusion, so you should discuss that first. Moreover, ogoneks are not a part of the standard tonal orthography anyway, and when they are used, they indicate nasal vowels in the few Slovene dialects that retain them. They should absolutely not be used to indicate a height distinction, the underdot serves that purpose. —CodeCat 18:24, 22 January 2017 (UTC)Reply
"Not going to change" according to who? You don't own Wiktionary. Also, you yourself wrote WT:ASL, so quoting it as a supposedly authoritative source is completely disingenuous. I'm going to continue using tonal notation in Proto-Slavic pages, but I will write (tonal) by it to avoid confusion. Your point about ogoneks and nasality is well-taken so I will consider not using them in the future. Benwing2 (talk) 18:47, 22 January 2017 (UTC)Reply
My point is that it's a standard adopted by more dictionaries than Wiktionary alone. It's unlikely that many people will accept the use of a different standard when this is not the common practice for Slovene dictionaries. Also, WT:ASL does document the current practice for Slovene, regardless of whether I wrote it or not. So to change that practice requires a consensus to do so. —CodeCat 18:51, 22 January 2017 (UTC)Reply

──────────────────────────────────────────────────────────────────────────────────────────────────── @CodeCat I was wrong above in my interpretation of Slovene ogonek. I don't actually know what this symbol indicates. Derksen uses all three of ẹ́, é and ę́. Generally ę corresponds to close-mid ẹ in Standard Slovene (not open-mid, as I suggested above). But it clearly indicates something different from the other two in at least some dialect. The same notation is found in the Pleteršnik dictionary; for example, see [4], which has mlẹ́ti, 1sg. mę́ljem. Proto-Slavic ę pretty consistently becomes ę in Derksen, which suggests it might indicate a dialectal nasal vowel, but ę also occurs in many words that did not have ę in Proto-Slavic, so I don't really know what it means. Derksen also has close-mid ẹ and ọ in unstressed syllables. Another unexpected symbol found in many dictionaries is ā ū etc., which seems to indicate that both long tones are possible. Benwing2 (talk) 07:56, 29 January 2017 (UTC)Reply

For example, sę́dəm "seven" < PSl. *sedmь. Benwing2 (talk) 08:33, 29 January 2017 (UTC)Reply
I would like to see an actual source on Slovene phonology that describes a height contrast in unstressed syllables. Everything I've seen so far indicates that it doesn't exist, the contrast occurs only on stressed syllables. In fact, it was a tonal retraction that created the contrast. —CodeCat 14:36, 29 January 2017 (UTC)Reply

Incorrect date formats

Hi! When you have time, could you run your bot to fix the following problem? Dates in the format "1 January, 2012" (with a comma after the month) in all citation and quotation templates ({{cite-book}}, {{cite-journal}}, {{quote-book}}, {{quote-journal}}, etc.) are wrong and will be incorrectly interpreted as lacking a year. They need to be changed to "1 January 2012". (Dates like "January 1, 2012" are fine.) See the discussion at "Wiktionary:Grease pit/2017/January#Template:quote-journal". Thanks. — SMUconlaw (talk) 17:52, 22 January 2017 (UTC)Reply

I'll let you know when I get to this. Benwing2 (talk) 17:57, 22 January 2017 (UTC)Reply
OK, thanks! — SMUconlaw (talk) 18:54, 22 January 2017 (UTC)Reply
@Smuconlaw Should be done now. Benwing2 (talk) 15:18, 30 January 2017 (UTC)Reply
Great, thanks! — SMUconlaw (talk) 15:33, 30 January 2017 (UTC)Reply
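
The date fix requested in this thread amounts to one substitution. The following is a sketch rather than the bot code that was run: the function name `fix_dates` is hypothetical, and it assumes the date is passed in a `date=` parameter, which the request above does not state explicitly.

```python
import re

MONTHS = "|".join([
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
])

# "|date=1 January, 2012": day first, then a comma after the month name.
BAD_DATE = re.compile(rf"(\|\s*date\s*=\s*)(\d{{1,2}}) ({MONTHS}),\s*(\d{{4}})")

def fix_dates(text):
    """Drop the comma in day-month-year dates; month-day-year dates are left alone."""
    return BAD_DATE.sub(r"\1\2 \3 \4", text)
```

Because the pattern requires the day to come before the month name, "January 1, 2012" does not match and keeps its comma.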

Updates to Template:RQ:RBrtn AntmyMlncly

Hello! The template {{RQ:RBrtn AntmyMlncly}} has been updated. However, there are quite a number of uses of the template in this form:

#* {{RQ:RBrtn AntmyMlncly}}, I.2.4.vi:
#*: In Japonia 'tis a common thing to stifle their children if they be poor, or to make an '''abort''', which Aristotle commends.

Could you do a bot run and change them to the following?

#* {{RQ:RBrtn AntmyMlncly|part=I|section=2|member=4|subsection=vi|passage=In Japonia 'tis a common thing to stifle their children if they be poor, or to make an '''abort''', which Aristotle commends.}}

If there are only three characters after the template (e.g., "I.2.4"), this means there is only a part, section and member but no subsection. Thanks! — SMUconlaw (talk) 12:08, 7 March 2017 (UTC)Reply

OK, I'll get to this shortly. Benwing2 (talk) 15:31, 7 March 2017 (UTC)Reply
@Smuconlaw I finally got to this one and the one below. However, there are a lot of cases where the characters after the template don't match the formats you give above and below. I listed those pages in User:Benwing2/rq-templates-unable-to-parse (there are 267 cases). Some of them are probably parsable (e.g. II.ii.1.2 should maybe be treated as II.1.2.ii, and II.ii.3 should maybe be treated as II.3.ii) but I'm not sure what to do with the remainder. Can you look over them and let me know which ones can be bot-handled and how to handle them? The remainder you'll probably have to do by hand. Benwing2 (talk) 15:37, 23 April 2017 (UTC)Reply
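
A sketch of the straightforward case may make the "unable to parse" remainder clearer. This is an illustration, not the bot's code: the function name `convert_burton` is hypothetical, and it accepts only the layout shown in the request (capital Roman part, numeric section and member, optional lowercase Roman subsection, passage on the next line).

```python
import re

# Line 1: "#* {{RQ:RBrtn AntmyMlncly}}, I.2.4.vi:"   Line 2: "#*: passage text"
OLD_QUOTE = re.compile(
    r"^(#+\*) \{\{RQ:RBrtn AntmyMlncly\}\}, "
    r"([IVX]+)\.(\d+)\.(\d+)(?:\.([ivx]+))?:\n"
    r"\1: (.*)$",
    re.MULTILINE,
)

def convert_burton(text):
    """Fold the reference and the passage line into named template parameters."""
    def repl(m):
        prefix, part, section, member, subsection, passage = m.groups()
        params = [f"part={part}", f"section={section}", f"member={member}"]
        if subsection:
            params.append(f"subsection={subsection}")
        params.append(f"passage={passage}")
        return prefix + " {{RQ:RBrtn AntmyMlncly|" + "|".join(params) + "}}"
    return OLD_QUOTE.sub(repl, text)
```

A reference such as II.ii.1.2 does not fit the pattern, so the two lines are left as they are and would end up on a list for manual review.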

──────────────────────────────────────────────────────────────────────────────────────────────────── Hi, @Benwing2, I think we need to revisit the template {{RQ:RBrtn AntmyMlncly}} (now renamed {{RQ:Burton Melancholy}}) and see if we can resolve the outstanding issues, as some other editors have highlighted it. The text in the work is subdivided as follows: first into partitions, then sections, then members, and finally subsections. Not all parts of the text use all the levels of subdivision; for example, some end at the member level. Thus, could you please do a bot run and convert occurrences in the following form:

#* {{RQ:RBrtn AntmyMlncly}}, I.2.4.vii:
#*: Next to sorrow still I may annex such accidents as procure {{l|en|fear}}; for besides those terrors which I have before '''touched''', {{...}} there is a superstitious fear {{...}} which much trouble many of us.

to the following?

#* {{RQ:Burton Melancholy|partition=I|section=2|member=4|subsection=vii|passage=Next to sorrow still I may annex such accidents as procure {{l|en|fear}}; for besides those terrors which I have before '''touched''', {{...}} there is a superstitious fear {{...}} which much trouble many of us.}}

As mentioned before, if there are only three characters after the template (e.g., "I.2.4"), this means there is a partition, section and member, but no subsection. I realize there will be uses of the old template that don't conform to the above (and which may eventually have to be dealt with by hand), but let's try to tackle the obvious cases first. Thanks! — SGconlaw (talk) 03:47, 30 October 2018 (UTC)Reply

@Sgconlaw Sure, I'll get to this tomorrow or the next day. Benwing2 (talk) 06:27, 30 October 2018 (UTC)Reply
Thanks! — SGconlaw (talk) 06:29, 30 October 2018 (UTC)Reply
@Sgconlaw Done. I also did Template:RQ:Flr Mntgn Essays, as you had previously requested. But, as I mentioned previously, there are over 200 instances that can't be parsed, which I've put in User:Benwing2/rq-templates-unable-to-parse. Some of them could potentially be parsed automatically, but I'm not sure how. Can you take a look at the examples and let me know which ones can be automatically parsed, and how to do it? Benwing2 (talk) 13:59, 1 November 2018 (UTC)Reply
Hi, and thanks! Looking at the problematic entries, I think not much can be done with those which refer to later reprints like "New York 2001". These will have to be manually corrected. However, would it be possible to fix the following?
  • stomach: II.ii.1.2 → |partition=2|section=2|member=1|subsection=2
  • chaos: II.ii.3 → |partition=2|section=2|member=3
  • form: I.iii.1.2 → |partition=I|section=3|member=1|subsection=2
(And so on.) If it is too difficult to convert the Roman numerals to Arabic ones, I think it would be acceptable to leave them as Roman numerals, e.g., |partition=II|section=ii|member=1|subsection=2. — SGconlaw (talk) 15:32, 1 November 2018 (UTC)Reply
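
Converting the Roman numerals is not hard in principle. A minimal Python sketch follows, with the function names `roman_to_int` and `to_params` invented for this example. It converts every level to Arabic numerals, as in the first two examples above, and does not validate malformed numerals.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

LEVELS = ("partition", "section", "member", "subsection")


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral (upper or lower case) to an integer.

    Raises KeyError on characters that are not Roman digits.
    """
    total = 0
    prev = 0
    # Read right to left: a smaller value before a larger one is subtracted.
    for ch in reversed(numeral.upper()):
        value = ROMAN_VALUES[ch]
        if value < prev:
            total -= value
        else:
            total += value
            prev = value
    return total


def to_params(ref: str) -> str:
    """Turn a reference such as 'II.ii.1.2' into named template parameters."""
    params = []
    for name, part in zip(LEVELS, ref.split(".")):
        value = int(part) if part.isdigit() else roman_to_int(part)
        params.append(f"|{name}={value}")
    return "".join(params)
```

If the Roman numerals are to be kept instead, `to_params` would simply pass each part through unchanged.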

Updates to Template:RQ:Flr Mntgn Essays

I've also updated {{RQ:Flr Mntgn Essays}}. When you have time, kindly convert quotations in this format:

#* {{RQ:Flr Mntgn Essays}}, II.12:
#*: I saw him at ''Ferrara'', in so pitteous a plight{{nb...}}; '''misacknowledging''' {{transterm|mesconnoissant|lang=frm}} both himselfe and his labours, which unwitting to him, and even to his face, have been published both uncorrected and maimed.

to

#* {{RQ:Flr Mntgn Essays|chapter=12|book=II|passage=I saw him at ''Ferrara'', in so pitteous a plight{{nb...}}; '''misacknowledging''' {{transterm|mesconnoissant|lang=frm}} both himselfe and his labours, which unwitting to him, and even to his face, have been published both uncorrected and maimed.}}

Also (and this is not at all urgent), I've noticed that {{citations}} now redirects to {{citation}}, so we could convert all occurrences of {{citations|lang=???}} to {{citation|lang=???}}. Thanks for all your help! — SMUconlaw (talk) 17:38, 26 March 2017 (UTC)Reply

Sure. Sorry about not getting to your previous request! I'll prioritize both of them. Benwing2 (talk) 03:05, 27 March 2017 (UTC)Reply
No worries. Take your time! — SMUconlaw (talk) 16:23, 27 March 2017 (UTC)Reply

──────────────────────────────────────────────────────────────────────────────────────────────────── Hi, thanks also for working on {{RQ:Florio Montaigne Essayes}}! As for the problematic entries, I wonder if it is possible to convert some of them as follows:

  • conjunction: vol.1. ch.29|book=I|chapter=29
  • pedant: vol. 1 ch. 24|book=I|chapter=24
  • foe: vol.1, ch.23|book=I|chapter=23
  • artist: vol.1, ch.24|book=I|chapter=24
  • ay me: II. 8|book=II|chapter=8

One difficulty is that the original spacing and punctuation, as indicated above, is a bit erratic. All the other entries which refer to modern editions like "Folio Society" will have to be manually replaced, I think. — SGconlaw (talk) 15:56, 1 November 2018 (UTC)Reply
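
The erratic spacing and punctuation can be absorbed by a tolerant pattern. The following Python sketch is illustrative only: the patterns and the `montaigne_params` function are assumptions based solely on the five examples listed above, and anything that does not match (such as "Folio Society" references) returns `None` so it can be left for manual replacement.

```python
import re
from typing import Optional

# "vol.1. ch.29", "vol. 1 ch. 24", "vol.1, ch.23": optional spaces and punctuation.
VOL_CH = re.compile(r"^vol\.\s*(\d+)[.,]?\s*ch\.\s*(\d+)$", re.IGNORECASE)
# "II. 8" or "II.12": Roman book number, then chapter.
BOOK_CH = re.compile(r"^([IVXLC]+)\.\s*(\d+)$")

# The Essays have three books, so only these volume numbers are mapped.
ARABIC_TO_ROMAN = {1: "I", 2: "II", 3: "III"}


def montaigne_params(ref: str) -> Optional[str]:
    """Return '|book=...|chapter=...' for a recognised reference, else None."""
    ref = ref.strip()
    match = VOL_CH.match(ref)
    if match:
        book = ARABIC_TO_ROMAN.get(int(match.group(1)))
        if book is None:
            return None  # unexpected volume number: leave for manual review
        return f"|book={book}|chapter={match.group(2)}"
    match = BOOK_CH.match(ref)
    if match:
        return f"|book={match.group(1)}|chapter={match.group(2)}"
    return None
```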

Cite-web problems

Your recent change has made a lot of pages display this template wrongly, as well as give (hidden) module errors. Example diff: [5] @Smuconlaw Μετάknowledgediscuss/deeds 18:50, 12 April 2017 (UTC)Reply

It looks like you are referring to changes I did at the behest of Smuconlaw and presumably he should fix up the template appropriately to eliminate the bug. However I'm on vacation now on a ship with very little bandwidth and not able to fix anything myself for another week till I get home. Benwing2 (talk) 21:59, 12 April 2017 (UTC)Reply
Looking at the old code, these appear to be situations in which the original template was incorrectly used. The solution would be as follows: in {{cite-web}} templates where |title= is used but |work= is absent, change |title= to |work=. — SMUconlaw (talk) 22:13, 12 April 2017 (UTC)Reply
OK, I'll fix these soon. Benwing2 (talk) 22:28, 16 April 2017 (UTC)Reply
@Smuconlaw I tried to implement this but it looks like this has to be done by hand. According to the defn, title= is the "web page name" and work= is the "web site name". In some cases with title= but no work=, title= is in fact the web site name, but in some cases it's the web page name and the site name is missing, and in yet other cases it's a combination of the two. I put all the places that need fixing here: User:Benwing2/cite-web-replace. Can you fix them up by hand? This will take a while, as there are 856 cases. To speed things up, only fix the cases where bot-changing title= to work= would *not* work; leave the remainder alone and when you're done I'll run my bot on the remainder. Note also that there were some uses of trans_title= that needed to be changed to trans-title=; I did those by hand. Some of these will need to be changed to trans-work= in turn (please do all these by hand; there are only maybe 4 of them). Thanks! Benwing2 (talk) 13:38, 23 April 2017 (UTC)Reply
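
Since the rename cannot be applied blindly, a script can at least list the candidates for review. The Python sketch below is a rough illustration, not the code used to build the list linked above: the regular expression and both function names are assumptions, and it only handles `{{cite-web}}` calls that contain no nested templates.

```python
import re
from typing import List, Set

# Matches {{cite-web|...}} calls whose body contains no nested braces.
CITE_WEB = re.compile(r"\{\{cite-web\s*\|([^{}]*)\}\}")


def param_names(body: str) -> Set[str]:
    """Collect the names of the named parameters in a template body."""
    names = set()
    for field in body.split("|"):
        if "=" in field:
            names.add(field.split("=", 1)[0].strip())
    return names


def needs_review(wikitext: str) -> List[str]:
    """Return every cite-web call that has title= but no work=."""
    hits = []
    for match in CITE_WEB.finditer(wikitext):
        names = param_names(match.group(1))
        if "title" in names and "work" not in names:
            hits.append(match.group(0))
    return hits
```

Whether `title=` in each hit is really a site name, a page name, or a mixture of the two still has to be judged by a person.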
Oh, that's annoying. You don't think it's a good idea to convert all templates that indicate |title= but not |work= to |work=? — SMUconlaw (talk) 15:01, 23 April 2017 (UTC)Reply
@Smuconlaw Take a look at the examples on the page and you'll see what I mean. Benwing2 (talk) 15:26, 23 April 2017 (UTC)Reply
There are still hidden module errors in the Template:quote-book/testcases and Template:quote-journal/testcases. These are the result of feeding output directly into a parser function without testing it first. If such an error happens in a large mainspace page it's impossible to tell where the error is without going through all the template usage on the page. Oh, and while I'm talking about module errors: @Smuconlaw, would you be so kind as to modify your comment in the Grease Pit to get rid of the module error? If you think about it, demonstrating that something incorrectly doesn't cause a module error with a live example is guaranteed to cause a module error later when the problem is fixed. By then everyone has gone on to other things. The module error then hangs around forever in CAT:E because no one wants to modify other people's comments. Thanks! Chuck Entz (talk) 15:42, 23 April 2017 (UTC)Reply
Not sure what the hidden module errors are – can you point them out? (Or was this comment directed at Benwing?) I have updated my comment in the Grease Pit, but do wonder how one is to explain problems without giving examples! I guess we just have to remember to edit our comments to remove the errors afterwards? — SMUconlaw (talk) 16:38, 23 April 2017 (UTC)Reply
The comment about live examples was prompted by the Grease Pit comment that you fixed, so that was the only example I had in mind. I spoke in a general way in hopes of persuading you not to do that in the future. My philosophy is that setting up a situation where you have to check back on something later is setting yourself up to either have one more thing to keep track of, or one more thing to go wrong.
As for the hidden module errors: that's addressed to whoever can fix the problem, probably Benwing. I can't give an example, because they're hidden. The problem with feeding output directly to parser functions is that they're designed to quietly take any input without complaint, so error messages are just so much bad input for the parser function to ignore. The page as a whole goes in CAT:E, but finding where the error is on that page will probably require working through the inputs and the code that processes them to figure out where a module error ought to be (there may be debugging tools that would show it, I suppose). I know this because I had to deal with an earlier error in another template that took a lot of staring at wikitext to figure out why the page was in CAT:E when there was nothing visibly wrong. Chuck Entz (talk) 21:26, 23 April 2017 (UTC)Reply
Unfortunately I have no idea how to proceed because I didn't write any of the code in question. Which parser function are you referring to that has output fed to it? Benwing2 (talk) 21:36, 23 April 2017 (UTC)Reply
I think I found it: there's an instance of |chapter=Beauty, where the code in {{quote-meta/source/sandbox}} is looking for a chapter number. If I'm right, R2A is feeding module-error text to #expr, which produces an error, but the expression it's in just eats the error. In case you're wondering, the output of {{#if:{{#expr:{{R2A|Beauty}}*0}}|A|B}} is A, with no error message visible (I just tried it in preview). Now I think you can see what I was talking about. Chuck Entz (talk) 23:35, 23 April 2017 (UTC)Reply
Hmm. Yes. That's very subtle and rather unfortunate. Benwing2 (talk) 01:24, 24 April 2017 (UTC)Reply
Perhaps @Erutuon and @JohnC5 can comment on this, as the error that is turning up in Category:Pages with module errors is probably due to the new Module:roman numerals and {{R2A}}. In any case, the sandbox {{quote-meta/source/sandbox}} was using an older version of the template used for testing purposes, so I have pasted the new version of {{quote-meta/source}} into the sandbox which may solve the problem. — SMUconlaw (talk) 07:54, 24 April 2017 (UTC)Reply
The Roman-numeral part was working as designed: it gave an error message when fed input ("Beauty") that wasn't readable as Roman numerals. The problem is that {{#if: absorbs error messages. That's okay if your code is perfect, but it makes debugging very difficult (with this particular bug, you had to notice that the chapter name didn't display). You should make a habit of checking for the hidden Category:Pages with module errors while you're testing code.
At any rate, replacing the code fixed the module error, so I'm good. Thanks! Chuck Entz (talk) 12:44, 24 April 2017 (UTC)Reply
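
The way a conditional can swallow an error message can be illustrated with a toy model in Python. This is not how MediaWiki or the modules involved are implemented: every function here is a hypothetical stand-in, and the error markup is assumed for the example. The point is only that a test for "non-empty output" cannot tell an error string from a valid result.

```python
ERROR_MARKER = '<strong class="error">'  # assumed error markup, for illustration

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def r2a(text: str) -> str:
    """Toy stand-in for {{R2A}}: Roman to Arabic, or an error string."""
    upper = text.upper()
    if not upper or any(ch not in ROMAN_VALUES for ch in upper):
        return ERROR_MARKER + "Module error: not a Roman numeral</strong>"
    total, prev = 0, 0
    for ch in reversed(upper):
        value = ROMAN_VALUES[ch]
        if value < prev:
            total -= value
        else:
            total += value
            prev = value
    return str(total)


def expr_times_zero(operand: str) -> str:
    """Toy stand-in for {{#expr:...*0}}: returns an error string on bad input."""
    try:
        return str(int(operand) * 0)
    except ValueError:
        return ERROR_MARKER + "Expression error</strong>"


def if_nonempty(test: str, then: str, otherwise: str) -> str:
    """Toy stand-in for {{#if:}}: only checks whether the test is non-empty."""
    return then if test.strip() else otherwise


def if_checked(test: str, then: str, otherwise: str) -> str:
    """Safer variant: surface the error text instead of silently absorbing it."""
    if ERROR_MARKER in test:
        return test
    return if_nonempty(test, then, otherwise)
```

In this model, `if_nonempty(expr_times_zero(r2a("Beauty")), "A", "B")` and the same call with `"XII"` both return `"A"`, so the bad input is invisible, while `if_checked` returns the error text for `"Beauty"`.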
I have fixed {{R:WebMineral}}. Looking at the history of {{cite-web}}, I appear to have revised it along with other citation and quotation templates in January and February last year, while {{R:WebMineral}} was created on 30 August 2016, so it appears that the problem was due to the creator of the latter template not having fully followed the documentation at {{cite-web}}. — SMUconlaw (talk) 07:54, 24 April 2017 (UTC)Reply

Updates to Template:R:MED Online

Hello, I have updated {{R:MED Online}} using {{R:Reference-meta}}. When you have time, could you make two minor changes?

  • Replace |title= with |entry=, thus bringing the use of the template in line with other reference templates.
  • The template now automatically adds a full stop to the end. Thus, if there are any manually added full stops (e.g., {{R:MED Online|entry=traunce, n.|id=MED46844|accessdate=1 January 2017}}.), please remove them.

Thanks! — SMUconlaw (talk) 18:00, 16 April 2017 (UTC)Reply
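
Both changes requested above can be sketched in a single pass. The Python below is an illustrative sketch under stated assumptions, not the bot's actual code: the pattern and the `update_med` function are invented for the example, and only template calls without nested templates are matched.

```python
import re

# A {{R:MED Online|...}} call (no nested templates), plus an optional
# manually added full stop directly after it.
MED_ONLINE = re.compile(r"(\{\{R:MED Online\|[^{}]*\}\})\.?")


def update_med(wikitext: str) -> str:
    """Rename |title= to |entry= and drop a trailing full stop."""

    def repl(match: re.Match) -> str:
        # Only the template itself is returned, so the matched "." is dropped.
        return re.sub(r"\|\s*title\s*=", "|entry=", match.group(1))

    return MED_ONLINE.sub(repl, wikitext)
```

A template that is immediately followed by an ellipsis or sits in the middle of a sentence would lose its first full stop too, so the resulting diffs would be worth skimming.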

Will do. Benwing2 (talk) 22:29, 16 April 2017 (UTC)Reply
@Smuconlaw This one at least should be done. Benwing2 (talk) 06:22, 23 April 2017 (UTC)Reply
Thanks! — SMUconlaw (talk) 13:29, 23 April 2017 (UTC)Reply

утешительный and -тельный

Russian grammar doesn't include every single possible ending of the adjective.

masculine forms picked because most words are formed from m forms. ("красив" "красивая")

But it has nothing to do with actual morphology (крас-ив; крас-ив-ая)

Please don't claim every sequence -ив and -ивая as a separate morph. d1g (talk) 06:30, 18 April 2017 (UTC)Reply

@D1gggg I don't know what you're referring to. I reverted you because you derived it from утешать not утешить. BTW -тельный is reasonably considered a single morpheme, since -ый is the lemma form. There's no need to analyze it into -тельн- + -ый. Benwing2 (talk) 02:35, 19 April 2017 (UTC)Reply
> -тельный is reasonably considered a single morpheme
Is this a Wiktionary agreement to treat them as morphemes? I never saw such an agreement.
What grammar book says -тельный is a morpheme? d1g (talk) 03:42, 19 April 2017 (UTC)Reply

-тель is for nouns (стро-и-тель) -тельн- is for adjectives стро-и-тельн-ый

"н" is always included in adjective and never in noun, therefore 2 distinct morphs. d1g (talk) 06:54, 18 April 2017 (UTC)Reply

@D1gggg See above. Your analysis makes no sense. Either you analyze -тель as -тел- + nominative -ь and analyze -тельный as -тельн- + -ый, or you analyze both as -тель and -тельный, which is what I'd prefer. -тель and -тельн- make no sense. Please stop for the moment making wholesale changes to grammatical analyses, since there clearly isn't consensus to do so. Thanks. Benwing2 (talk) 02:37, 19 April 2017 (UTC)Reply
It makes perfect sense for a native speaker, analysis by Тихонов is exactly as mine.
> Please stop for the moment making wholesale changes to grammatical analyses, since there clearly isn't consensus to do so.
This is not English grammar.
I don't need consensus or opinion of a single editor on codified and regulated grammar.
-тельн- is a morph of adjectives because the reference says so.
Please consider this. d1g (talk) 03:45, 19 April 2017 (UTC)Reply

العين بالعين والسن بالسن

Hi,

Do you know why الْعَيْن بِالْعَيْن وَالسِّنّ بِالسِّنّ is not transliterated automatically, although we have a working case for بِالتَّأْكِيد (bi-t-taʔkīd)? --Anatoli T. (обсудить/вклад)

It's وَالسِّنّ that isn't working because of the wa-. Hopefully it can be fixed. --WikiTiki89 14:21, 18 April 2017 (UTC)Reply
This is very hard to get right. For example, something like وَالْقَمَر (wālqamar) is transliterated "incorrectly", without the hyphen in it, because there conceivably could in fact be a single word wālqamar (no wa- prefix or article). Similarly, وَالسِّنّ could in fact be a partly vocalized version of وَالَسِّنّ (wālassinn). I felt more comfortable introducing special casing for بِالتَّأْكِيد (bi-t-taʔkīd) because there's no other possible interpretation given the kasra followed directly by alif. Benwing2 (talk) 02:45, 19 April 2017 (UTC)Reply

красноречивый - красн+-о-+речь+-ив-+-ый

кра́сный is a Slavic root with "beautiful" meaning among others: "красна девица". I don't know a complete form of красн in modern Russian yet.

Same issue as above, -ив-/-лив- are suffixes.

d1g (talk) 07:02, 19 April 2017 (UTC)Reply

@Wikitiki89, Atitarev Can you weigh in here? D1g is a native Russian speaker without much linguistic experience who has been going through making wholesale changes to the Russian etymologies and suffixes and such. I believe many of these are unhelpful and contrary to a proper linguistic analysis but he claims (see further up on this page) that he doesn't need consensus. He edit-wars whenever I revert his brokenness. Benwing2 (talk) 09:05, 19 April 2017 (UTC)Reply
His analysis of красноречивый is typical: He segments out all the possible morphemes without respecting the linguistic structure, and gets confused with nouns vs. verbs. In my analysis, this is красно- (a combining form of красный) + -речь (a verb that is no longer attested as such in modern Russian but still found in prefixed form; note that -ивый is always added to verbs) + -ивый, an adjective-forming suffix. You could further analyze e.g. -ивый into -ив- + -ый, but I don't think it's helpful to do so at the top level; if this is to be done at all, do it on the -ивый page. D1g doesn't seem aware that morphemes can be analyzed into smaller morphemes and wants to do it all at the top level, and confuses the noun речь with the verb -речь. Benwing2 (talk) 09:06, 19 April 2017 (UTC)Reply
The question was about -ивый, not about whether речь/красный is a verb or a noun.
речь is a noun that describes process, not adverb, not adjective, not a verb.
the only? verb with "речь" sequence is a изречь. d1g (talk) 10:53, 19 April 2017 (UTC)Reply
> he claims (see further up on this page) that he doesn't need consensus
@Benwing2 be watchful in what you state. How about to answer questions before you claim about my incompetency or my unresponsiveness?
1. I never saw such agreement as you claim repeatedly
2. I will stop whatever it means
3. I don't need to ask anyone whether -тельн- is a morph of adjectives or not
4.
> -ивый, an adjective-forming suffix
Again, what is your source for that claim? d1g (talk)

@Benwing2 Sorry to see your sudden silence.

I wrote a page (морф (morf)) about not 3 (prefix, interfix, suffix) but 5 classes of suffixes.

флексийный морф (fleksijnyj morf) ("-ый") represents grammatical categories, nothing else.

This means that every grammatical category will multiply all possible endings: -тельный has over 20 "флексийный морф" glued together with only 1 real adjective-forming suffix -тельн (-telʹn)/-тельн- (-telʹn-).

Now you need to copy rules at 20 pages or make 20 redirects... You also need to disambiguate 19 prefixes from that 1.

To me, your actions are equal to creation of -ings -ing (* 10 times).

@Benwing2 Any single reason to do this? d1g (talk) 21:24, 19 April 2017 (UTC)Reply

@D1gggg The reason I am being "silent" is simply that I have limited time to work on Wiktionary, and that doesn't include when I'm working my normal job. I'll try to respond to your responses shortly. Benwing2 (talk) 01:09, 20 April 2017 (UTC)Reply
@D1gggg You'd be wise to try to learn the ways of Wiktionary and try to seek consensus before continuing any more wholesale changes. Note that User:Wikitiki89 and User:CodeCat both apparently disagree with your approach, and both are established editors who have been contributing for several years. Wikitiki is furthermore a native Russian speaker (as are User:Atitarev and User:Cinemantique). Benwing2 (talk) 01:20, 20 April 2017 (UTC)Reply
@D1gggg Hi. I suggest you become more cooperative, reach consensus for your controversial edits before starting mass editing. --Anatoli T. (обсудить/вклад) 02:10, 20 April 2017 (UTC)Reply
The only way for me to break something is to break an old consensus. I was never pointed to the prior discussion I apparently broke. d1g (talk) 11:42, 20 April 2017 (UTC)Reply
I'm not questioning the competency of Wikitiki89 or CodeCat.
I'm asking why this is done for Russian -тельный (-telʹnyj)/-тельные (-telʹnyje) and not for English -ings
@Benwing2, can you answer that question? d1g (talk) 11:42, 20 April 2017 (UTC)Reply
It's not about creating random combinations of suffixes. It's about how derivation actually works. A Russian speaker will take the word продо́лжить (prodólžitʹ) and create продолжи́тельный (prodolžítelʹnyj), even though the word продолжи́тель (prodolžítelʹ) doesn't exist (even though it could exist). This means the whole suffix -тельный (-telʹnyj) is added as a whole. --WikiTiki89 12:03, 20 April 2017 (UTC)Reply
@Wikitiki89 The problem I see here is that very few readers have an interest in real word derivation.
1. The only sequence of characters that has non-grammatical definition is "-тельн"
2. They are not "suffixes" (according to the grammar) and it will confuse language-learners (wiktionary claims them as suffixes).
"sequence" would be a better word, isn't? d1g (talk) 12:18, 20 April 2017 (UTC)Reply
What you call -тельн-, we call -тельный, that is just a matter of using different lemma forms. The -ый is not interesting, but since the suffix creates adjectives we include it as the lemma form of adjectival suffixes. How do you know what our readers are interested in? --WikiTiki89 12:21, 20 April 2017 (UTC)Reply
@D1gggg Wikitiki's point is that, by convention, Wiktionary includes suffixes in lemma form. See for example the verbal suffix -ывать. We don't by convention analyse them into an infix -ыва- plus the infinitive -ть, although that would be possible, but in our view it's not terribly useful to do so, as Wikitiki notes.
Overall, the problem with many of your changes is that
  • They go against normal Wiktionary conventions.
  • The quality is poor; e.g. you often leave out English definitions and translations.
Many established editors are trying to make the same point, that you need to follow existing Wiktionary conventions and learn from the way that other editors do things. No one wants to block you, but people also have limited amount of time to work on Wiktionary, and limited time to clean up after other editors. Blocking is a last resort to deal with editors whose changes are judged to be more harmful than helpful, and who refuse to work with other editors to improve their contributions. In a collaborative project like Wiktionary, everyone has to work with everyone else. Part of this is following existing conventions and seeking consensus before changing conventions. You seem somewhat unwilling to do that, which is a problem. The various edits that I made a year or more ago that added many suffixes and etymological analyses followed Wiktionary conventions, and I tried to keep the quality high, which is why no one complained about them. The "consensus" I'm referring to is that they were accepted by the other main editors who work on Russian, for this exact reason. Your edits are clearly not accepted, which means that you should (1) stop making more of them, (2) figure out what changes you need to make that will be acceptable to the other editors, and (3) make them. If you don't do all three of those, then eventually I will have to go through and revert all your changes, because I don't have time to figure out what portions are good and what portions are bad. I'd rather not do that; I'd rather instead that you do the work. But you have to be willing to cooperate. Benwing2 (talk) 13:04, 20 April 2017 (UTC)Reply

──────────────────────────────────────────────────────────────────────────────────────────────────── @D1gggg Let me give you a specific example. Your page флексийный морф (fleksijnyj morf) has a lot of issues:

  1. The term itself, which appears to mean "inflectional morph" or "inflectional morpheme", is probably what we call "sum of parts" (see WT:SOP), which means it likely shouldn't be included at all.
  2. The entry lacks a {{ru-noun+}} headword template as well as a {{ru-noun-table}} declensional template. As a result it won't be properly categorized into CAT:Russian nouns.
  3. The definition in English is lacking.
  4. Instead you include a whole lot of untranslated Russian text that appears to explain (in too much detail) what this term means. Keep in mind that Wiktionary is a dictionary, not an encyclopedia, so you don't need this much detail, and it needs to be in English.
  5. The explanatory Russian text is in the form of usage examples, but you seem to be confusing usage examples with explanatory text. We don't need so many usage examples, and they absolutely need to have English translations accompanying them.
  6. You put a whole bunch of sample morphemes under a "Further reading" header, but this isn't a standard header in Wiktionary. Instead, you need to use "See also".

In general, Wiktionary pages are highly structured, much more so than a Wikipedia page, and you need to follow that structure. Benwing2 (talk) 13:12, 20 April 2017 (UTC)Reply

"Further reading" is standard per a recent vote. It replaces "External links". So the header is at least ok, but not the way that D1g used it. —CodeCat 13:20, 20 April 2017 (UTC)Reply
OK thanks ... didn't realize that. Benwing2 (talk) 13:22, 20 April 2017 (UTC)Reply
@Benwing2
1. 5 morphs are defined at морф (morf). I had no time to check it, but IMO флексийный (fleksijnyj) is not used outside флексийный морф (fleksijnyj morf).
2. I'm not adding templates when they produce an error with default settings. I tried many combinations and none of them work. I was told to never add module errors to the pages at my talk page. I saw that message.
3. because it is untranslatable to English?
4-5. Some users claim that wiktionary should include enough information to understand the word. If 2 words were defined with 3 paragraphs, we could and should include there 3 paragraphs or have links to them. d1g (talk) 13:29, 20 April 2017 (UTC)Reply
@D1gggg This is precisely the problem. You "had no time to check" something that I checked in about 2 minutes, which revealed that you can say флексийный способ, флексийный язык, флексийный класс, etc. You also apparently had no time to learn how the templates in question work, so you could add them properly. They are well-documented if you look at Template:ru-noun-table/documentation (I know because I wrote that documentation). Of course this term is translatable; I already gave you the translation ("inflectional morph" or "inflectional morpheme"; Russian appears to conflate the two concepts of "morph" and "morpheme"). флексийный means "inflectional"; you could figure this out if you took a bit of time to Google around. In general you seem to have lots of time to add problematic entries to Wiktionary but no time to fix up the entries properly, which means IMO that you don't want to bother to fix them up properly and expect someone else to do it. Benwing2 (talk) 13:42, 20 April 2017 (UTC)Reply
@Benwing2 I did the right thing when not created флексийный (fleksijnyj). I'm not sure if you're serious:
  • "флексийный класс" - 1 hit
  • "флексийный язык" - 3 hits
  • "флексийный морф" - 143
I have to add that "флексийные формы" - 607 hits
But still, this adjective is used only with 2 strictly linguistic words.
It indicates professional terminology, not a general adjective "флексийный" "able to flex". d1g (talk) 14:54, 20 April 2017 (UTC)Reply
> They are well-documented
As I said, none of examples work User:D1gggg/sandbox флекси́йныЕ морфы" plural sg
It is too much hassle for any editor to use these error-prone templates.
The best I could do is to put one stress in Template:ru-IPA.
Wrong declensions are terrible, I won't add these. d1g (talk) 14:28, 20 April 2017 (UTC)Reply
If you don't know how to use the templates, that's fine, just note that every entry needs to have a headword line, and at a minimum you can add just {{head|ru|noun}}. --WikiTiki89 15:11, 20 April 2017 (UTC)Reply
If he starts using these primitive templates en masse, then someone will have to fix his entries. I oppose. He's got time to observe and learn from existing entries. --Anatoli T. (обсудить/вклад) 22:00, 20 April 2017 (UTC)Reply
There are many other problems with his entries, so I still oppose him making edits. But in theory if the rest of the content is good, we can't discourage people from editing just because they can't figure out our rather complicated inflection templates. --WikiTiki89 22:04, 20 April 2017 (UTC)Reply
@D1gggg I added the correct usage to your sandbox page. But this is still a SOP usage, which will eventually get deleted for this reason. Benwing2 (talk) 00:14, 21 April 2017 (UTC)Reply
@D1gggg I am trying to clean up some of your changes. It's a lot of work to fix things up. Please don't make any more changes unless you're sure they're correct. Some comments:
  1. -тельный, not -тельн- + -ый.
  2. Please don't remove entries from "Related terms".
  3. Please don't remove glosses in etymologies, e.g. in сногсшибательный.
  4. Etymologies should be formatted as readable text, rather than with bullet points.
  5. Pay careful attention to parts of speech. For example, -тельный is always added to verbs, not to nouns (contra your etymologies in обаятельный and образовательный).
  6. Pay careful attention to verbal aspect. E.g. your etymology for окончательный derived it from окончить rather than окончать (which is a synonym of оканчивать and is attested in Dal's dictionary). This and the previous point indicate that you tend to be sloppy, which is a bad tendency to have when working on a dictionary. Benwing2 (talk) 03:19, 21 April 2017 (UTC)Reply
  7. Yet more sloppiness: in оправдательный you added an extra -а- for no obvious reason. Benwing2 (talk) 03:33, 21 April 2017 (UTC)Reply
оправдать: оправд is a root, not a оправда; -а is a suffix. Trivial russian morphology. d1g (talk) 02:49, 25 April 2017 (UTC)Reply
oправда- is a stem. оправдательный is correctly analyzed as оправда(ть) + -тельный. You need to learn the way we do things here at Wiktionary. Benwing2 (talk) 04:45, 25 April 2017 (UTC)Reply


(because I was repeatedly accused of something I never did, text will be in red)
Benwing2: Yet more sloppiness: in оправдательный you added an extra -а- for no obvious reason. @Benwing2 show us where it is at the page. @Benwing2 collaboration with you is simply disgusting. d1g (talk) 07:09, 2 May 2017 (UTC)Reply

To add more context why I was sure about -тельный, but not about rest of the word, ГР-80 says:
  • in 784: оправдать + -тельн(ый) = оправдательный
  • in 652.1: оправдать + -тельн(ый) = оправдательный
I don't understand why I got blame for unfinished parts
Page оправдательный was created by me in 18/04/17 not by any single editor here. d1g (talk) 07:26, 2 May 2017 (UTC)Reply

полвторого is a пол+второго

1. Only with genitive case , not infinitive. 2. Declension is variable.

Fixed mess left after your mass edits again. d1g (talk) 10:05, 19 April 2017 (UTC)Reply

@D1gggg I have no idea what you're referring to. Benwing2 (talk) 01:12, 20 April 2017 (UTC)Reply

"-ивый, an adjective-forming suffix"

I will reset discussion because you ignored question.

You said @19 April 2017: > -ивый, an adjective-forming suffix

My question: Again, what is your source for that claim? d1g (talk) 02:57, 25 April 2017 (UTC)Reply

@D1gggg Words in -ивый are adjectives, and it's obviously a suffix, so it's obviously an adjective-forming suffix. I don't see how this can be remotely controversial. Benwing2 (talk) 02:59, 25 April 2017 (UTC)Reply
words with "-ings" are something, therefore "-ings" is a suffix.
The ending -ивый corresponds to many rules (which you are unable to cite), even though I gave a source for them.
  • -ив(ый)
  • -лив(ый)
  • -овлив(ый)
  • -елив(ый)
  • -члив(ый)
  • -чив(ый)
@Benwing2, stop inventing rules in the codified language. d1g (talk) 03:09, 25 April 2017 (UTC)Reply
@D1gggg -ивый is the suffix in words like правдивый. -ливый, -чивый, etc. are separate suffixes. Not all words ending in -ивый should be morphologically analyzed as having a suffix -ивый. Benwing2 (talk) 03:15, 25 April 2017 (UTC)Reply

@Benwing2, given that now you aware of 6 rules, analyze these 2 words, how they were derived:

d1g (talk) 03:31, 25 April 2017 (UTC)Reply

@D1gggg уда́чливый (udáčlivyj) is obviously уда́ча + -ливый. драчли́вый (dračlívyj, pugnacious) is probably дра́ка + -ливый, with palatalization of the final -к in дра́ка. Benwing2 (talk) 04:04, 25 April 2017 (UTC)Reply
You said "note that -ивый is always added to verbs)"
What should I think when you say that? d1g (talk) 04:35, 25 April 2017 (UTC)Reply
I may be mistaken in this regard. Benwing2 (talk) 04:50, 25 April 2017 (UTC)Reply
You would make fewer mistakes if you based your claims not just on intuition, but on the actual rules for what can and cannot be done in a language. d1g (talk) 07:34, 2 May 2017 (UTC)Reply

Unbelievable

you have been warned not to create rules in Russian 20 minutes ago, yet you invent grammar on the fly.

Which rule defines suffix -ный?

According to grammar, -тельный is a -тельн(ый) (-тельн-). d1g (talk) 04:22, 25 April 2017 (UTC)Reply

You are getting on my nerves. I and others have tried to explain to you the way things are done here at Wiktionary but you refuse to listen. You need to keep in mind that there are multiple ways to analyze words. We do things one way, your particular grammar book might do things another way, and another grammar book might do things a third way. That doesn't make any one of them right or wrong. Benwing2 (talk) 04:49, 25 April 2017 (UTC)Reply
You're the one insisting on -тель + -ный, not me.
Different?
Illiterate.
I'm holding a grammatical reference in hands and -ный is written nowhere here.
d1g (talk) 06:24, 2 May 2017 (UTC)Reply

Module:ru-verb

There are a couple of module errors in CAT:E that seem to be connected with your edits to this module. Please fix. Thanks! Chuck Entz (talk) 01:29, 7 May 2017 (UTC)Reply

@Chuck Entz Fixed. Benwing2 (talk) 03:58, 7 May 2017 (UTC)Reply
Thanks! Chuck Entz (talk) 04:24, 7 May 2017 (UTC)Reply
@Metaknowledge Fixed. Benwing2 (talk) 08:23, 13 May 2017 (UTC)Reply

A messy Russian verbs appendix

Hi,

Are you interested in cleaning up/fixing this messy appendix- Appendix:Common Russian verbs? It can use any improvement, starting from duplicated transliterations. --Anatoli T. (обсудить/вклад) 02:20, 19 May 2017 (UTC)Reply

I can write a script to remove the duplicated transliterations. With some more work I might be able to clean it up more. Let me think about it. If you have any suggestions for how it should look, go ahead and format a few entries and I'll try to make the script copy it. Benwing2 (talk) 02:29, 19 May 2017 (UTC)Reply
User:Erutuon has fixed a lot of it. Thanks! I'm not planning to do much work on it, so I'll leave it up to you, if you want to improve it or make it useful. I imported it from a list of Russian verbs on the web. User:Cinemantique made a great list of frequently used nouns at User:Cinemantique/2, which is good after the main frequency lists are all filled (Appendix:Russian Frequency lists and the 20,000 in Appendix:Frequency dictionary of the modern Russian language (the Russian National Corpus)), but there is no list of similar quality for other parts of speech. --Anatoli T. (обсудить/вклад) 03:05, 19 May 2017 (UTC)Reply
You're welcome. It was quite simple with gedit and search-and-replace. Now it needs accent marks. Those will be quite tedious to add, unless it can be done by bot or with JavaScript. — Eru·tuon 03:39, 19 May 2017 (UTC)Reply
I have a bot script to add accents to words. It needs a bit of work to handle some corner cases (e.g. constructions of the sort до́ смерти (dó smerti)) but I can run it on this page no problem. Benwing2 (talk) 03:42, 19 May 2017 (UTC)Reply

Old French noier

I don't know much about Old French, but I'm sceptical about the first etymology, which was added by Renard Migrant. Could you confirm it? --Barytonesis (talk) 23:55, 23 May 2017 (UTC)Reply

@Barytonesis: It looks plausible to me. What about it seems dubious to you? Keep in mind that Renard Migrant is probably our top expert on Old French. And also keep in mind that the lemma form in Old French is the infinitive, while the lemma form in Latin is the first-person singular present tense. The Latin infinitive of this verb is inodīre, and losing intervocalic -d- was a regular change. --WikiTiki89 01:18, 24 May 2017 (UTC)Reply
"sceptical" was a bit strong; but I'm bothered by the absence of an initial vowel, especially in the light of Modern French ennuyer, which seems to be a direct reflex of inodiāre (notice that the entry of inodiō in its current form makes no sense: it says that in Classical Latin, it was a fourth conjugation verb; but it's apparently found only in Late Latin). Also, I'm a bit confused by the various (sometimes overlapping) spellings of several different verbs (Latin negāre, necāre, nocēre, natāre, nōtāre, nōdāre). --Barytonesis (talk) 02:03, 24 May 2017 (UTC)Reply
I think nocēre is a more plausible ancestor for noier, in terms of both phonology and semantics. — Eru·tuon 02:22, 24 May 2017 (UTC)Reply
Or not... nuire is apparently the child of noceo. — Eru·tuon 02:28, 24 May 2017 (UTC)Reply
Actually, the regular reflex is nuisir, as I discovered earlier this evening (nuire is an analogical form). But this link does speak of noceō ("The second sense (‘to harm, hurt’) shows some semantic overlap with (and probably influence of) ⇨nuisir (derived from Latin nocere rather than in odio)"), so... --Barytonesis (talk) 02:33, 24 May 2017 (UTC)Reply
I'm also skeptical that noier comes from inodio. The loss of the initial vowel from putative inodio is not necessarily a problem (it's called aphesis and it's common) but the strong present stem if derived from inodio ought to be nui- and not ni-. More generally I'm skeptical that noier is etymologically distinct from noiier; I think both are probably derived from neco, which would explain the strong stem ni- in both, and is semantically plausible in both cases. Benwing2 (talk) 04:04, 24 May 2017 (UTC)Reply
That is the etymology given at noyer, which must be the Modern French form. The development of ec to oi puzzled me, but I guess it went ec > eg > ei > oi, which is pretty typical. — Eru·tuon 05:38, 24 May 2017 (UTC)Reply
Maybe I'm reading too much into this, but the conjugation table is under the second etymology, not the first. Maybe the first one had a different conjugation. --WikiTiki89 13:56, 24 May 2017 (UTC)Reply
That's a good point. Also, the link to the Anglo-Norman Online Hub gives "to annoy" as the first meaning and "to harm" as the second, which is consistent with being derived from inodiare. Benwing2 (talk) 14:02, 24 May 2017 (UTC)Reply
I noticed we have an entry for enoiier. It is easily possible that aphetic and non-aphetic forms coexisted for the same word. --WikiTiki89 15:08, 24 May 2017 (UTC)Reply
Seems very plausible, but I'd still like confirmation that this is indeed the case. And the Latin entry needs to be corrected; I'm not even sure the verb is attested. Gaffiot and L&S –admittedly not the most comprehensive resources for Late Latin– only mention the past participle inodiātus, and Du Cange nothing! --Barytonesis (talk) 22:12, 27 May 2017 (UTC)Reply

Wingerbot is causing errors

аннексировавший and see CAT:E. —JohnC5 06:49, 4 June 2017 (UTC)Reply

Your Russian edits

Hi,

Your Russian edits are just getting better. Your resources or your knowledge or both are impressive. I hardly need to correct anything. You're quite pedantic too, which pays off and you're asking the right questions. I hope you're also learning it, not just for the sake of Wiktionary. What's your motivation, if it's OK to ask? Well done! @Wikitiki89, Cinemantique, KoreanQuoter, Stephen G. Brown, Vahagn Petrosyan, Wanjuscha. --Anatoli T. (обсудить/вклад) 07:10, 7 July 2017 (UTC)Reply

@Atitarev Thanks! In general I like languages and Wiktionary is an interesting way to learn languages. I used to work more on Arabic but the resources aren't so good and there aren't really any regular native Arabic speakers who contribute to Wiktionary. It has been good to work on Russian because of people like you, as well as the wealth of online resources available. Mostly I've been working off of a combination of the following resources:
  • for definitions:
    1. ruwikt (with Google translate, which has gotten far better recently, at least if you cut and paste the Russian text into the Google translate website, rather than relying on Chrome's auto-translate);
    2. Kenneth Katzner's English-Russian/Russian-English dictionary;
    3. dic.academic.ru (Efremova's dictionary seems the most trustworthy, followed by Ozhegov and Ushakov; Ushakov seems highly trustworthy but is somewhat out-of-date);
    4. sometimes the best way to find the English equivalent of Russian technical words is to find the word in ru.wikipedia.org and look at the correspondingly linked English article.
  • for grammar and pronunciation:
    1. Zaliznyak (see also udarenieru.ru, which is an updated on-line version of Zaliznyak);
    2. Ivanova's "New Russian Orthoepic Dictionary" for pronunciation.
  • for etymology:
    1. Vasmer;
    2. Yepishkin's "Historical dictionary of Russian-language Gallicisms" for borrowed "international" words (also linked from dic.academic.ru).
  • Also noteworthy:
    1. gramota.ru for definitions and stress locations, somewhat for grammar;
    2. Avanesov for pronunciation;
    3. Reznichenko for pronunciation;
    4. Derksen 2008 "Etymological dictionary of the Slavic inherited lexicon" for etymology of inherited, non-derived words;
    5. ESSJa ("Etymological dictionary of the Slavic languages", in c. 40 volumes, by Trubachev and others) for etymology of inherited words, but it only covers A-O;
    6. Chernykh for etymology.

Vasmer, Derksen, ESSJa, and Chernykh are good for getting lists of Slavic cognates of Russian words (which is primarily useful for Proto-Slavic pages on Wiktionary).

Benwing2 (talk) 19:56, 8 July 2017 (UTC)Reply

@Benwing2, Atitarev: What's your opinion on the Большой толковый словарь русского языка of С. А. Кузнецов? Per utramque cavernam 20:19, 7 December 2018 (UTC)Reply
@Per utramque cavernam: I am not familiar with it. --Anatoli T. (обсудить/вклад) 21:47, 7 December 2018 (UTC)Reply
@Per utramque cavernam this is one of my go-to online dictionaries, along with Efremova. Benwing2 (talk) 00:16, 8 December 2018 (UTC)Reply

дитё

Hi, how are you?

I wonder what your sources are for the declension of дитё (ditjó). It is a "просторечие" and is declined as in the ruwikt. --Anatoli T. (обсудить/вклад) 02:46, 2 September 2017 (UTC)Reply

упоительный isn't formed from a verb, but from a noun

Please provide sources when you make changes like this. d1g (talk) 07:18, 12 October 2017 (UTC)Reply

-ный

Please provide sources when you make changes like this. d1g (talk) 07:29, 12 October 2017 (UTC)Reply

or this d1g (talk) 07:38, 12 October 2017 (UTC)Reply

Koran decorative diacritics

If your bot user:WingerBot still adds superscript alef (and connecting alef), please stop it! [6]

That is not a standard way to write even the most diacriticized Arabic. That is only used in decorative Koran. --Mahmudmasri (talk) 20:18, 10 January 2018 (UTC)Reply

@Mahmudmasri: Benwing hasn't been active for several months; he probably won't see your message. I'm not sure his bot is still active, for that matter. --Per utramque cavernam (talk) 20:24, 10 January 2018 (UTC)Reply
That's absurd. That means there's missing information about how to read the text. — Eru·tuon 20:57, 10 January 2018 (UTC)Reply
What's really absurd is wrītĭng līke thăt! --Mahmudmasri (talk) 22:41, 18 January 2018 (UTC)Reply

хотеться

I understand only very little about Russian pronunciation, but this edit apparently erroneously made the s longer and the IPA apparently was already and is still missing /j/. Shouldn't it be xɐˈtʲet͡sjə? --Espoo (talk) 17:48, 27 January 2018 (UTC)Reply

The IPA is correct. 00:12, 28 January 2018 (UTC)
I don't understand. According to я, this letter is pronounced /ja/, which may be reduced to /jə/, and the sound file clearly has something between /ja/ and /jə/. And where does the long /sː/ in the IPA xɐˈtʲet͡sːə come from? It was not there before your edit, and isn't explained in the Russian philology article. Thanks for any info. --Espoo (talk) 06:43, 28 January 2018 (UTC)Reply
@Espoo: Endings -ться and -тся are pronounced irregularly. /t/ and /s/ are fully assimilated and /t͡s/ is very seldom palatalised in native words. In the audio /t͡s/ is pronounced too quickly and it's more natural to pronounce it as /t͡sː/, especially after a stressed syllable. You will find that all verbs with -ться and -тся are pronounced similarly. Besides, the ending -ся (-sja) is often pronounced as /sə/ (without /j/) in other positions as well but this is optional. I am a native Russian speaker and I closely checked Benwing(2)'s work on the Russian phonology. --Anatoli T. (обсудить/вклад) 12:21, 28 January 2018 (UTC)Reply
Thanks! Could you please make a new audio file? --Espoo (talk) 23:38, 28 January 2018 (UTC)Reply
@Espoo: I'm sorry but I normally don't do the audio recordings. I'm afraid you have to trust me on this or get someone else to do it. BTW, the Russian Wiktionary entry doesn't contradict ours and I think Russian entries are in good shape overall; this includes the pronunciation handling. --Anatoli T. (обсудить/вклад) 06:47, 29 January 2018 (UTC)Reply

Gfarnab

If you see a bunch of indiscriminate edits in non-Latin-script languages by an IP, check the geolocation before you spend a lot of time fixing things. We've had a lot of trouble with Gfarnab: after being asked and finally told not to edit in languages they don't know, we finally blocked them, then their IPs. Now they're using proxies to bypass the blocks and leaving grandiose "resistance is futile, I will destroy you all" messages on talk pages alternating with inept obscenities that even an 8-year-old would find lame. Simply put, they're not as good as they think they are at anything (their English is pretty bad, too), and they're adding lots of crap for other people to clean up. The fact that they're churning out stuff in more difficult languages means that the time they're wasting is more valuable, so I've finally decided to delete and revert all their edits- good and bad- to discourage them.

Like I said, if you see a bunch of edits by a single IP in various languages and they geolocate to proxies, just mass revert them and tag new entries for deletion. I haven't had much time to work on this for a while, so they've gotten the impression that everyone has given up- all the more reason to lower the boom now. Thanks! Chuck Entz (talk) 05:53, 20 April 2018 (UTC)Reply

OK, in this case the edits you reverted were actually correct but I understand your reasoning. Benwing2 (talk) 05:59, 20 April 2018 (UTC)Reply
@Chuck Entz: We need an administrative tool to mass-undo edits by a user, just like the Nuke for new creations. --Anatoli T. (обсудить/вклад) 06:06, 20 April 2018 (UTC)Reply
If you have your browser set to not switch the focus to new tabs, you can right-click on the rollback link and open it in a new tab, then continue down the contributions list while the rollbacks are running in the other tabs; it only takes a few seconds per rollback. After that, you just close all the tabs and you're done. It won't work on entries that are edited by others after the ones you want to revert, but I don't like to undo third-party edits, anyway. After all, they've already wasted some of their time cleaning up after edits that shouldn't have been there in the first place; why waste the rest of it? Chuck Entz (talk) 08:56, 20 April 2018 (UTC)Reply
If you middle-click (or control-click if you have an ancient mouse without a middle click), it's even faster. --WikiTiki89 10:30, 20 April 2018 (UTC)Reply

пыхать

Any idea what happened there? Sense 3 seems to be missing a definition. This, that and the other (talk) 10:25, 11 May 2018 (UTC)Reply

@This, that and the other Fixed. Benwing2 (talk) 18:57, 11 May 2018 (UTC)Reply

withtext replacement

Hello Benwing --

I noticed Wingerbot doing a lot of replacements of withtext in {{bor}} instances. The replacement is currently just straight text:

Borrowed from ...

I'd like to suggest that you tweak the replacement to also include the link to our Glossary, as generated by withtext, like so:

[[Appendix:Glossary#loanword|Borrowed]] from ...

Cheers, ‑‑ Eiríkr Útlendi │Tala við mig 16:11, 4 June 2018 (UTC)Reply

If I make such a change, what I'd like to do instead is use something shorter and easier to type. What I'm thinking of is this:
{{lg|Borrowed}} from ...
Where {{lg}} (= "link to glossary") is a newly-created alias for {{glossary}}; I'd like to use {{gl}} but that is already an alias for {{gloss}}. (Other short suggestions are welcome, e.g. {{gly}}.) This generates a link like this:
[[Appendix:Glossary#borrowed|Borrowed]] from ...
And we can ensure this works by adding the appropriate anchor to the glossary. But I'd like to see what others have to say before doing this. Benwing2 (talk) 01:02, 5 June 2018 (UTC)Reply
I'd used the Appendix:Glossary#loanword link above as that's what withtext previously generated; and, actually, now that I look at [[Appendix:Glossary#borrowed]], I see it's just a soft redirect: See loanword.
Template shorthand and other implementation details I'm happy to leave to you.  :) ‑‑ Eiríkr Útlendi │Tala við mig 04:43, 5 June 2018 (UTC)Reply
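
The replacement discussed in this thread could be sketched roughly as below. This is only an illustration, not WingerBot's actual code: the function name `expand_withtext` is hypothetical, the parameter is assumed to be written as `|withtext=...`, and the pattern assumes no nested templates inside the {{bor}} call. It uses the Appendix:Glossary#loanword target mentioned above.

```python
import re

# Matches a {{bor|...}} call that contains no nested templates
# (an assumption of this sketch; real entries may need a proper parser).
BOR_RE = re.compile(r"\{\{bor\|([^{}]*)\}\}")

# Link text suggested in the thread, pointing at the glossary's "loanword" anchor.
GLOSSARY_LINK = "[[Appendix:Glossary#loanword|Borrowed]]"


def expand_withtext(wikitext: str) -> str:
    """Drop |withtext=... from {{bor}} and spell out the linked lead-in text."""

    def repl(match: re.Match) -> str:
        params = match.group(1).split("|")
        kept = [p for p in params if not p.strip().startswith("withtext=")]
        if len(kept) == len(params):
            # No withtext parameter present: leave the call untouched.
            return match.group(0)
        return GLOSSARY_LINK + " from {{bor|" + "|".join(kept) + "}}"

    return BOR_RE.sub(repl, wikitext)
```

Calls without the parameter pass through unchanged, so the function can be run over a whole page more than once.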

hi

Thanks for creating words with your bot. Can you also create Persian plurals, third-person forms and other forms? For example: رفت رفتم رفتی رفتیم رفتید رفتند میروم میرود میرویم خواهد رفت خواهیم رفت میرفت میرفتیم میرفتند رفته ام رفته اند رفته ایم and more.

I did make my bot create Arabic non-lemma forms before but it's a lot of work to do it for each language, and I don't have any support for Persian. Benwing2 (talk) 18:06, 10 June 2018 (UTC)Reply

Odd Edit

I'm not sure what it was supposed to be, but this just causes a module error. Chuck Entz (talk) 02:08, 28 June 2018 (UTC)Reply

@Chuck Entz That ought to have worked; not sure why it didn't, I'll have to investigate. Benwing2 (talk) 02:09, 28 June 2018 (UTC)Reply

Replacement of {{quote-Fanny Hill}} with {{RQ:Cleland Fanny Hill}}

Hi, when you are free, could you please run your bot and do the following replacement?

{{quote-Fanny Hill|part=2|[passage]}} → {{RQ:Cleland Fanny Hill|passage=[passage]}}

The original template was renamed to bring it in line with other quotation templates, and there is a change in the order of the parameters (like other quotation templates, |passage= is no longer the first parameter). The |part= parameter is no longer needed as it referred to an apparently incomplete version of the work at Wikisource. Once the replacement is complete, the original template can be deleted. Thanks! — SGconlaw (talk) 08:53, 29 June 2018 (UTC)Reply
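
A minimal sketch of how such a replacement might be scripted, assuming the old calls contain no nested templates and that |part= is the only named parameter in use. The function name `convert_fanny_hill` is made up for illustration, and this is not the bot code that was actually run.

```python
import re

# Old-style call; the body is assumed to contain no nested {{...}} templates.
OLD_RE = re.compile(r"\{\{quote-Fanny Hill\|([^{}]*)\}\}")


def convert_fanny_hill(wikitext: str) -> str:
    """Rename the template, drop |part=, and turn the positional passage into |passage=."""

    def repl(match: re.Match) -> str:
        pieces = []
        for param in match.group(1).split("|"):
            if param.startswith("part="):
                continue  # no longer needed, per the request above
            pieces.append(param)
        if not pieces:
            # Nothing that looks like a passage: leave the call for manual review.
            return match.group(0)
        # Re-join with "|" so that piped links inside the passage survive.
        return "{{RQ:Cleland Fanny Hill|passage=" + "|".join(pieces) + "}}"

    return OLD_RE.sub(repl, wikitext)
```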

@Sgconlaw No prob, I'll try to do this tonight. Benwing2 (talk) 13:14, 29 June 2018 (UTC)Reply
@Sgconlaw Done. It should now be possible to delete the old template. Benwing2 (talk) 12:59, 30 June 2018 (UTC)Reply
Thank you! — SGconlaw (talk) 14:21, 30 June 2018 (UTC)Reply

More replacements

Hi, another job for you when you are free. Please replace:

#* {{RQ:Birmingham Gossamer|chapter=I|passage=It is never possible to settle down to the ordinary routine of life at sea until the screw begins to revolve. There is an '''hour''' or two, after the passengers have embarked, which is disquieting and fussy.}}

with:

#* {{RQ:Birmingham Gossamer|chapter=I|passage=It is never possible to settle down to the ordinary routine of life at sea until the screw begins to revolve. There is an '''hour''' or two, after the passengers have embarked, which is disquieting and fussy.}}

The second parameter ("01" in the example above) is no longer used and should be deleted.

Also, please replace:

#* {{RQ:Fielding Tom Jones|book=IV|chapter=I|passage=That our work, therefore, might be in no danger of being likened to the labours of these historians, we have taken every '''occasion''' of interspersing through the whole sundry similes, descriptions, and other kind of poetical embellishments.}}

with:

#* {{RQ:Fielding Tom Jones|volume=[TO BE INSERTED]|book=IV|chapter=I|passage=That our work, therefore, might be in no danger of being likened to the labours of these historians, we have taken every '''occasion''' of interspersing through the whole sundry similes, descriptions, and other kind of poetical embellishments.}}

The work is published in six volumes, and the volume numbers can be determined based on the book number: see "Template:RQ:Fielding Tom Jones/documentation". Thanks in advance. — SGconlaw (talk) 17:46, 2 July 2018 (UTC)Reply

@Sgconlaw I will do it, no problem. But maybe you should fix the {{RQ:Fielding Tom Jones}} template to automatically infer the volume based on the book ... it's a simple switch statement in the template. Benwing2 (talk) 01:33, 3 July 2018 (UTC)Reply
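
In the template itself this would be a wikitext #switch; the same lookup is sketched here in Python only to show the idea. The book ranges in `PLACEHOLDER_LAST_BOOK` are NOT taken from the template documentation: they are an evenly spaced placeholder (three books per volume) and must be replaced with the real ranges before use. All names in the block are hypothetical.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral such as 'IV' or 'XVIII' to an integer."""
    values = [ROMAN_VALUES[ch] for ch in numeral.strip().upper()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted (IV = 4, IX = 9).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


# PLACEHOLDER DATA: volume -> last book in that volume.
# Illustrative only; see Template:RQ:Fielding Tom Jones/documentation for the real split.
PLACEHOLDER_LAST_BOOK = {1: 3, 2: 6, 3: 9, 4: 12, 5: 15, 6: 18}


def volume_for_book(book: str, last_book_of_volume: dict):
    """Return the volume whose range contains the given book, or None if out of range."""
    number = roman_to_int(book)
    for volume in sorted(last_book_of_volume):
        if number <= last_book_of_volume[volume]:
            return volume
    return None
```

With a lookup like this inside the template, editors would only need to supply |book= and the volume would follow from it.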
That's a good point. In that case, leave out |volume= and I will deal with that in the template. Thanks! — SGconlaw (talk) 02:20, 3 July 2018 (UTC)Reply
@Sgconlaw Should be done. There may be unconverted templates that my code couldn't match. Benwing2 (talk) 02:52, 3 July 2018 (UTC)Reply
Thank you! — SGconlaw (talk) 07:09, 3 July 2018 (UTC)Reply

reading of айе etc

In майя I see ма́йе máje (while ма́йя is májja). I think майе, мийе, муйе, мейе, мойе, мыйе should be májje etc., with a double j. The reason is that the letters я, ю, е are pairs for а, у, э, so мае would be maje and майе should be majje, just as with мая and майя. (By the way, I also thought for a long time that things like айе are read as aje with a single j; I came to this idea only 1-2 years ago.) If you do not believe this argument, let's ask about it somewhere... --Qdinar (talk) 10:26, 29 August 2018 (UTC)Reply

I have asked about this here: https://russian.stackexchange.com/questions/17725/should-%d0%b9%d0%b5-after-vowel-be-pronounced-jje --Qdinar (talk) 07:25, 23 December 2018 (UTC)Reply
People pronounce different cases of vowel + й + е differently. In the case of майя they pronounce majja: https://www.youtube.com/watch?v=KkJ21e3Qt5c , https://forvo.com/word/%D0%BC%D0%B0%D0%B9%D1%8F/ . --Qdinar (talk) 18:41, 15 June 2019 (UTC)Reply
I have come to the conclusion that writing vowel + й + я/е/ю/ё should be avoided because it is ambiguous; it should be replaced with vowel + я/е/ю/ё for a single j, and vowel + й + ь + я/е/ю/ё for a double j. I have written this up as an answer at https://rus.stackexchange.com/a/449850 . --Qdinar (talk) 18:48, 15 June 2019 (UTC)Reply

RQ:Wodehouse Offing replacements

Hello, could you please run a bot to make the following replacement? There's been a change in the template to bring it into line with other templates, which means that the second parameter is no longer the passage quoted.

{{RQ:Wodehouse Offing|I|“I'll tell you what you're going to do. Have you a clean shirt?” “Several.” “And a toothbrush?” “Two, both of the finest '''quality'''.” “Then pack them. You're coming to Brinkley tomorrow.”}}
{{RQ:Wodehouse Offing|chapter=I|passage=“I'll tell you what you're going to do. Have you a clean shirt?” “Several.” “And a toothbrush?” “Two, both of the finest '''quality'''.” “Then pack them. You're coming to Brinkley tomorrow.”}}

Thanks very much! — SGconlaw (talk) 03:58, 2 September 2018 (UTC)Reply
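
The positional-to-named conversion requested here might look something like the sketch below. It assumes the old calls have exactly two positional parameters and no nested templates; the function name `name_offing_params` is hypothetical and this is not the script that was eventually run.

```python
import re

# First positional parameter = chapter (no "=" allowed, so already-converted
# calls such as |chapter=I are not matched); second = the quoted passage.
OFFING_RE = re.compile(r"\{\{RQ:Wodehouse Offing\|([^|{}=]*)\|([^{}]*)\}\}")


def name_offing_params(wikitext: str) -> str:
    """Rewrite {{RQ:Wodehouse Offing|I|text}} as |chapter=I|passage=text."""

    def repl(match: re.Match) -> str:
        chapter, passage = match.group(1), match.group(2)
        return "{{RQ:Wodehouse Offing|chapter=" + chapter + "|passage=" + passage + "}}"

    return OFFING_RE.sub(repl, wikitext)
```

Because converted calls no longer match the pattern, running the function twice gives the same result as running it once.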

@Sgconlaw Apologies, I've been gone for awhile. I'll take care of this soon. Benwing2 (talk) 14:12, 28 September 2018 (UTC)Reply
Thanks! — SGconlaw (talk) 15:09, 28 September 2018 (UTC)Reply
@Sgconlaw Done. Sorry it took so long for me to get to it. If you have any more subs, I will get to them much faster. Benwing2 (talk) 02:00, 13 October 2018 (UTC)Reply
Thanks, and no worries. — SGconlaw (talk) 05:24, 13 October 2018 (UTC)Reply

FYI

diff Chuck Entz (talk) 02:32, 14 October 2018 (UTC)Reply

@Chuck Entz Oops. That was due to an error in the specification file that I used to generate those etymologies. Thanks for fixing it. Benwing2 (talk) 02:50, 14 October 2018 (UTC)Reply
And thanks to @Atitarev for fixing my fix... Chuck Entz (talk) 02:55, 14 October 2018 (UTC)Reply

Affix entries

Hi. Please don't forget that we usually use {{suffixusex}}, {{prefixusex}} for those; an example. Per utramque cavernam 16:43, 14 October 2018 (UTC)Reply

@Per utramque cavernam What about cases like this:
  1. за́яц (zájac, hare) (oblique stem зайц- (zajc-)) → зайчи́ха (zajčíxa, female hare)
Benwing2 (talk) 19:29, 14 October 2018 (UTC)Reply
Also see -ать (-atʹ) for more examples of things that are currently difficult to express using {{suffixusex}} (e.g. they have more than one arrow and/or a parenthetical expression). Benwing2 (talk) 22:12, 14 October 2018 (UTC)Reply
Indeed, my bad. Per utramque cavernam 21:45, 12 November 2018 (UTC)Reply

Template:BWpingsla

Loving your personal template...--XY3999 (talk) 00:17, 22 November 2018 (UTC)Reply

Replacement of uses of {{seeSynonyms}}

Hello! Following a discussion at "Wiktionary:Requests for deletion/Others#Template:seeSynonyms", it was decided that that template should be deleted. (I was not involved in the discussion; I merely closed the discussion.) Could you please do a bot run to replace uses of this template? I haven't examined many examples of how the template has been used, but the basic one would be:

  • {{seeSynonyms|hate|sense=1}} → ''See'' [[Thesaurus:hate]]

I'm not sure how sophisticated the replacements can be. For example, if {{seeSynonyms}} appears after a list of other synonyms, it would be preferable to use see also instead of See, like this:

  • ABC, DEF, GHI; {{seeSynonyms|JKL}} → ABC, DEF, GHI; ''see also'' [[Thesaurus:JKL]]

But if that proves too complicated, I think the basic replacement would be sufficient for now. Once the replacements have been done, I will delete the template. Thanks in advance. — SGconlaw (talk) 03:26, 10 December 2018 (UTC)Reply
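
Both variants of the replacement could be handled in one pass, as in the rough sketch below. It works line by line and treats anything before the template (other than list markers) as "other synonyms", which is an assumption of this sketch; the function name `replace_see_synonyms` is hypothetical.

```python
import re

# First positional parameter is the thesaurus page; any further
# parameters (such as |sense=1) are discarded.
SEE_SYN_RE = re.compile(r"\{\{seeSynonyms\|([^|{}]+)(?:\|[^{}]*)?\}\}")


def replace_see_synonyms(line: str) -> str:
    """Replace {{seeSynonyms|X}} with ''See'' or ''see also'' plus a Thesaurus link."""

    def repl(match: re.Match) -> str:
        # Text on the same line before the template, ignoring list markers.
        preceding = line[: match.start()].strip().lstrip("*#: ")
        phrase = "''see also''" if preceding else "''See''"
        return phrase + " [[Thesaurus:" + match.group(1) + "]]"

    return SEE_SYN_RE.sub(repl, line)
```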

Update: no need to do anything at the moment. It seems that the discussion concerning {{seeSynonyms}} has reopened. — SGconlaw (talk) 03:56, 10 December 2018 (UTC)Reply

ѣть and Module:ru-verb

See CAT:E. As I'm fond of saying: there's always one more detail... Chuck Entz (talk) 00:06, 16 December 2018 (UTC)Reply

@Chuck Entz Thanks. Yeah, I always forget about those pre-reform verbs. Benwing2 (talk) 03:00, 16 December 2018 (UTC)Reply
Thanks. BTW, I couldn't read the italic form of "ѣ" in the Watchlist and I thought I'm fluent in Cyrillic :) --Anatoli T. (обсудить/вклад) 03:16, 16 December 2018 (UTC)Reply

Accentuation of *sьrdьce

In diff you changed the accent of this word. The form that was given before is found in Derksen's dictionary. What is your form based on? —Rua (mew) 16:45, 31 December 2018 (UTC)Reply

@Rua: Boy, it's been so long since then and Proto-Slavic accentuation is so damn complex that I've forgotten my reason, but I'm sure it wasn't a random change. I have a feeling that I was normalizing to a more "underlying" accentuation; a sequence like *ьr should be underlyingly long, as with all diphthongs, and I think (I'm guessing here) that Derksen assumes a rule that shortens long accents (maybe just long circumflexes?) in the 3rd-to-last syllable. Evidently I disagreed with that, although I'm not quite sure why now. Maybe Derksen assumes that sequences like *ьr are always short? One thing I do remember that may be apropos: I remember adding length marks to some pretonic syllables in accent b, I think of *ьr sequences or maybe *er sequences, because this is paradigmatically expected in accent b and some daughter languages (e.g. Polish?) clearly show long-vowel reflexes; in whichever cases these were, Derksen systematically failed to mark the liquid diphthongs as long. So maybe he's assuming that all *ьr sequences were short, despite the evidence from daughter languages, and maybe I changed it because of the daughter language evidence. All of this however is pure speculation.
I'd have to go back and look at my contributions right around that period and see what else I changed that is similar. Given how long ago that is, it won't be easy ... do you know of any easy way to look at one's contributions around a given date? Benwing2 (talk) 00:54, 1 January 2019 (UTC)Reply
You can add from and to dates in the search. --Anatoli T. (обсудить/вклад) 02:29, 1 January 2019 (UTC)Reply
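
Besides the date fields on the contributions page, the same date-bounded listing can be requested from the MediaWiki API. The sketch below only builds the query URL; it is an alternative to the search form mentioned above, not a description of it. The helper name `contribs_query_url` and the example dates are made up for illustration.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wiktionary.org/w/api.php"


def contribs_query_url(user: str, newest: str, oldest: str, limit: int = 50) -> str:
    """Build a list=usercontribs query for one user's edits between two timestamps."""
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,
        # By default the listing runs from newer to older edits,
        # so ucstart is the later timestamp and ucend the earlier one.
        "ucstart": newest,
        "ucend": oldest,
        "uclimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


# Hypothetical date window, for illustration only.
example_url = contribs_query_url("Benwing2", "2016-01-31T00:00:00Z", "2016-01-01T00:00:00Z")
```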

Bot request - Category:French words suffixed with -ment

Hello, and Happy New Year.

I have a bot request for you (see this old BP topic). As there are two different suffixes -ment (verb → noun vs. adjective → adverb), I've split this category in two (I'm not terribly satisfied with the naming, so I'm open to suggestions):

Would it be possible to run a bot that would assign all the entries using the POS header ===Noun=== to the first one, and the ones using ===Adverb=== to the second? There are almost 3,000 entries concerned, and it would be extremely tedious to do that by hand.

The changes needed would look either like this or like this, depending on the name chosen for the categories.

Please let me know what you think. Per utramque cavernam 19:08, 2 January 2019 (UTC)Reply
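
The sorting step of this request could be sketched as follows: look only at the French section of an entry and pick a category from the part-of-speech header found there. The two category names are passed in as arguments because they are not given here; every function name is hypothetical, and entries with both headers or neither are left for manual review.

```python
import re


def french_section(wikitext: str) -> str:
    """Return the text of the ==French== section, or '' if there is none."""
    match = re.search(
        r"^==French==\s*$(.*?)(?=^==[^=]|\Z)", wikitext, re.MULTILINE | re.DOTALL
    )
    return match.group(1) if match else ""


def pos_headers(section: str) -> set:
    """Collect the Noun/Adverb headers (at any level from === down) in a section."""
    return set(
        re.findall(r"^={3,}\s*(Noun|Adverb)\s*={3,}\s*$", section, re.MULTILINE)
    )


def pick_category(wikitext: str, noun_category: str, adverb_category: str):
    """Choose the target category from the POS header; None means 'review by hand'."""
    headers = pos_headers(french_section(wikitext))
    if headers == {"Noun"}:
        return noun_category
    if headers == {"Adverb"}:
        return adverb_category
    return None
```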

@Per utramque cavernam This can be done, I think. Benwing2 (talk) 04:08, 3 January 2019 (UTC)Reply

Thanks for help with quotations

It looks easier to format quotations the way you've shown me. Sorry about using the ux template inappropriately. I tried redoing another one in стать. I also realized I'd forgotten to add the translated title of the story for a quote added to загривок, which I've now tried to rectify. I couldn't find a separate field for that so I appended it in square brackets. Does that seem ok? For stories originally published in journals, if you use the story title in the title field, it appears in italics, which doesn't seem right. I suppose I should give the journal title in the title field, and move the story title to the chapter field, but perhaps that wouldn't be quite honest since in most cases I wouldn't be able to get the text directly from the journal. Anyway, I'll try to learn as much as I can from the quotes you've redone, and will be grateful for any comments or instruction you'd care to provide. — Mudbringer (talk) 17:23, 4 February 2019 (UTC)Reply

@Mudbringer Cool. The reformatting of quotations is something I'm doing as part of a more general cleanup of Russian terms. I didn't realize you had added many of them; thanks for doing it! There is in fact a separate field for translated title, it's trans-title. As for italics vs. quotes, there's a current discussion in Template talk:quote-meta that references this issue among others; you might want to contribute. Keep in mind also that I'm myself just learning how to use these templates; I can provide more guidance in a little while as I figure them out more, and User:Sgconlaw can definitely also help. Benwing2 (talk) 01:09, 5 February 2019 (UTC)Reply
Thanks! I did notice the trans-title field, but there seems to be no field for a translation of the chapter title, which is why I simply appended a story title within the field like this: “Глухая Даша [Deaf Dasha]”. I'll try to get up to speed reading the thread you pointed out. — Mudbringer (talk) 01:44, 5 February 2019 (UTC)Reply
@Mudbringer Sorry, I failed to notice that you wanted the chapter text not the title text. There is in fact a |trans-chapter= param as well; see the documentation to Template:quote-book. Benwing2 (talk) 01:46, 5 February 2019 (UTC)Reply
I left a comment on Template talk:quote-meta in which I put an example of a short story quote for which I have full data, done in two different ways, neither of them ideal. Unfortunately I pinged you by the wrong username, sorry. ... It's a lot simpler when I do my own translation, but I prefer to use existing translations when possible. — Mudbringer (talk) 11:05, 5 February 2019 (UTC)Reply
Perhaps you noticed that Sgconlaw added translator2 for us, which is a big help. I made a few adjustments to the quote in как у тёщи на печи. For the typical case in Russian novels where chapter numbers start from 1 with each new part, perhaps the best way to show the chapter is in a form like part=part 2 chapter 1. How does that seem to you? — Mudbringer (talk) 16:13, 6 February 2019 (UTC)Reply
@Mudbringer Cool. I'm not sure the best way to specify chapters/parts, I've been using volume_plain=part 2|chapter=1 but if you think using part= is better, that's fine with me. Benwing2 (talk) 16:15, 6 February 2019 (UTC)Reply
I think using |volume_plain= to specify parts of a work is fine. Personally, I have been using |section=, like this: |section=part 2 ([name of part, if any]). — SGconlaw (talk) 18:18, 6 February 2019 (UTC)Reply

Auto-accenting inside interwiki templates

Please don't auto-accent inside interwiki templates like {{w}} and {{w2}}; it breaks the links. Tetromino (talk) 02:33, 8 February 2019 (UTC)Reply

@Tetromino Sorry. The issue is only with {{w2}}; {{w}} is handled correctly. I read the docs and thought it was safe to auto-accent |2= because it appeared to work like {{l}}. I see from the defn that it's not, and only |3= is safe to auto-accent. Benwing2 (talk) 02:48, 8 February 2019 (UTC)Reply
@Tetromino BTW, I checked and the only place where w2 got auto-accented wrongly was the one place you just cited. Benwing2 (talk) 02:53, 8 February 2019 (UTC)Reply
[edit]

Hi. I think we've already had some disagreement about those before.

My view is the following: that generally speaking, only base words should sport "related terms" sections, that entries for derived terms don't need "related terms" sections and should make use of etymology sections instead.

IMO, having related terms on derivative entries is generally pointless.

1) по́весть is mentioned as a related term of повествова́ть; what I think when I see this is "duh, of course it's related! повествова́ть is directly derived from it! I haven't learnt anything new; why am I being fed that information a second time?". Same deal with колыбе́льная: of course it's related to колыбе́льный, and further on to колыбе́ль, since it's derived from it.

2) The etymology section does a better job of explaining how words are related.

3) Through the etymology section, the reader can easily go up the chain of derivation, find the base word, and all the related/derived terms in that entry.

4) Having long lists of derived terms everywhere can even be confusing. Take перегово́рный for example; we learn that говору́н is a related term. Well, yes it is, but is it supposed to help me in some way?

Sorry, my thoughts are a bit scattered at the moment.

Per utramque cavernam 09:50, 10 February 2019 (UTC)Reply

I think morphological derivations should be made explicit in lists of related terms/derived terms by the way. Here's what I have in mind: User:Per utramque cavernam/Related terms:Russian/говорить. Per utramque cavernam 10:02, 10 February 2019 (UTC)Reply
@Per utramque cavernam My general principle is as follows: (1) for all terms, include all the "closely related" terms, e.g. for перегово́рный include перегова́ривать/переговори́ть, перегова́риваться, перегово́ры, (2) for verbs, also include all the "closely related" terms of the base verb. I don't think base verbs should in general include indirectly derived terms, e.g. нагова́ривание shouldn't be included under говори́ть, because otherwise the list becomes overwhelming; if you want to see the derived/related terms of наговори́ть, look there. In the case of перегово́рный I did include the "closely related" terms of the base verb говори́ть but I'm not sure why; that's not my general principle and I wouldn't have any problem deleting them. In terms of ordering and grouping, I first put the closely related terms of the term itself, followed by the closely related terms of the base verb if they're included; I put more closely related terms before less closely related terms and more common terms before less common ones; and on a single line I group impf/pf pairs, adj/adv/noun groups of the form FOO-ый, FOO-о, FOO-ость, nouns with their derived adjectives, diminutives and/or feminines, and/or a few other combinations where the terms are especially closely related. I do think that a term in the etymology should also appear in the related-terms section because otherwise it's confusing; someone who looks in the related-terms section and sees various closely-related terms may not think to also look in the etymology. I'm not opposed to showing the chain of morphological derivations the way you suggest, but if you follow my principle of not including indirectly derived terms, it's not normally needed. I'm not convinced that it's sufficient to trace up the chain of etymology, for various reasons: one is that the etymology section may not exist all the way up the chain, and another is that it's not always clear which way the derivation proceeds (e.g. is a noun derived from a verb or vice-versa?). Note that ruwikt has the same concept of listing "closely related" terms as I do; they call it "Ближайшее родство". Benwing2 (talk) 23:09, 10 February 2019 (UTC)Reply
I still think the result looks a bit silly or over the top in certain cases, but all right; I'll try to keep that in mind and refrain from deleting those from now on. Per utramque cavernam 15:42, 16 February 2019 (UTC)Reply

adding 'to' to verbal definitions

The bot messed up here: "or thing" is not a verb... Tetromino (talk) 21:50, 23 February 2019 (UTC)Reply

@Tetromino Thanks. I forgot about the case involving "or" after a comma. I fixed a few other similar places. Benwing2 (talk) 01:57, 24 February 2019 (UTC)Reply

Template:quote-news-preload

Hi Mr Wing. I believe you deleted the above template too hastily. It is used on loads of Tracking pages (always substed). Please consider restoring --Wonderfool early February 2019 (talk) 12:12, 7 March 2019 (UTC)Reply

OK, so I made a redirect, but that isn't sufficient. There was some information lost (date). See the entry for conspirólogo - with the old template the date was automatically included but not anymore. --Wonderfool early February 2019 (talk) 12:17, 7 March 2019 (UTC)Reply
A couple of things: (1) I did a bot run to eliminate random redirecting templates. I didn't change people's user pages, because they're mostly junk. If User:DTLHS wants those tracking pages fixed up, to use {{quote-journal}}, I can do so. (2) I'm not sure how the date was automatically inserted; where would that info have come from? Benwing2 (talk) 15:32, 7 March 2019 (UTC)Reply
@Wonderfool early February 2019 Also, please don't create pages that directly call {{quote-meta}} and directly #invoke quote|source_t. Mainspace pages should never #invoke directly. Instead use {{quote-book}}, {{quote-journal}}, etc. You did this on at least five pages: biofarmacéutico, chaineadita, chaviza, chingadazo and conspirólogo. Benwing2 (talk) 15:39, 7 March 2019 (UTC)Reply
Yes, you should probably fix my pages so that they function as intended again before the preload template was deleted. DTLHS (talk) 20:23, 7 March 2019 (UTC)Reply
@DTLHS I took a look at your pages, and I have no idea what Wonderfool is talking about. Template:quote-news-preload is used absolutely nowhere. Your pages use User:DTLHS/quote-news-es, which appears to be working fine still. What sort of work would you like done? If you're referring to the "please enter a language" message, that can be fixed by simply editing User:DTLHS/quote-news-es to add |lang=es. Benwing2 (talk) 23:56, 7 March 2019 (UTC)Reply
It's "not used" anywhere because it's meant to be substituted. Please refer to the source of User:DTLHS/quote-news-es. DTLHS (talk) 00:03, 8 March 2019 (UTC)Reply
@DTLHS OK. You'll need to explain its workings to me as I'm not super familiar with how substing works, and I've never seen the <sub>...</sub> tags (unless this is just subscripting?). Why do you need to subst, and why do you need a special template to do so? If this is only used in your userspace, can the special template not also live in your userspace? It seems non-optimal to have a special, undocumented template like this living in the mainspace. Benwing2 (talk) 00:11, 8 March 2019 (UTC)Reply

Error: When archiveurl= is specified, url= must also be included

Doesn't the archiveurl always contain the original url? Why have the module stop everything and throw an error when it can figure it out for itself? Chuck Entz (talk) 15:54, 10 March 2019 (UTC)Reply

@Chuck Entz That is a good point. If the archiveurl is specifically from web.archive.org, the original URL is indeed embedded in it. I'll fix the module so it handles those cases automatically. Benwing2 (talk) 19:16, 10 March 2019 (UTC)Reply
Please also cater for the situation where some other archive site is used, though. Thanks. — SGconlaw (talk) 19:29, 10 March 2019 (UTC)Reply
@Sgconlaw I did. If the original URL isn't specified, the code looks in the archive URL for an embedded URL of the form ".../http:..." or ".../https:...". If so, it pulls it out, otherwise it throws an error. It will work for any archive site that embeds the original URL the way that web.archive.org does it. Benwing2 (talk) 19:31, 10 March 2019 (UTC)Reply
SGconlaw (talk) 01:59, 11 March 2019 (UTC)Reply
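
The fallback described above (look inside the archive URL for an embedded ".../http:..." or ".../https:..." and pull it out, otherwise raise the error) can be sketched as follows. This is only an illustration in Python, not the module's actual code, which lives in Lua; the function name original_url_from_archive is hypothetical, and the error text is borrowed from the section heading.

```python
import re

# Hypothetical helper for illustration; the real logic is in a Lua module.
# Matches a slash followed by an embedded http: or https: URL running to the end.
_EMBEDDED_URL = re.compile(r"/(https?:.*)$")


def original_url_from_archive(archive_url):
    """Return the original URL embedded in archive_url, or raise ValueError."""
    # Start searching after the archive URL's own scheme ("https://"),
    # so that only a URL embedded further along the path can match.
    scheme_end = archive_url.find("://")
    start = scheme_end + 3 if scheme_end != -1 else 0
    match = _EMBEDDED_URL.search(archive_url, start)
    if match is None:
        # No embedded URL found: the editor has to supply url= explicitly.
        raise ValueError(
            "When archiveurl= is specified, url= must also be included"
        )
    return match.group(1)
```

Under these assumptions, any archive site that embeds the original URL the way web.archive.org does is handled, and every other archive URL still produces the error.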

nodot on robo-signer

Hello. WingerBot removed nodot from a back-form template on robo-signer, then added a period a few minutes later, citing "template without nodot". Is this the result of some kind of conflicting tasks? Cnilep (talk) 03:37, 12 March 2019 (UTC)Reply

@Cnilep Ugh. Thanks for letting me know. I'll have to back out my changes on several pages, I think. The problem is that sometimes the same page gets processed multiple times. Benwing2 (talk) 03:42, 12 March 2019 (UTC)Reply
@Cnilep I've undone the damage on all the pages where it happened, which was about 189 pages (AFAIK). Benwing2 (talk) 04:18, 12 March 2019 (UTC)Reply

|2= in {{unk}}

Benwing, why did you remove |2= from {{unk}}? It was very convenient. Can you please add it back? --{{victar|talk}} 03:30, 15 March 2019 (UTC)Reply

@Victar For one thing, it wasn't documented. For another it's not really necessary; neither is |title= for that matter, you can use |notext=1 and just write the text next to the template. Yet another reason is that half the time it was abused and contained a language code, not the actual text. But my biggest concern is that if we ever want to convert one of these templates to have a link attached to it (as I did with various other templates), it's impossible if we've coopted |2= to override the text of the template. Benwing2 (talk) 03:42, 15 March 2019 (UTC)Reply
Who are you to judge that's "not really necessary"? And instead of asking the people that use it on the regular, you just removed it without discussion? That's not cool, Benwing. I'm going to ask again that you restore it, and if you want to start a discussion to remove it, go ahead. --{{victar|talk}} 03:47, 15 March 2019 (UTC)Reply
@Victar I really don't appreciate your hostility and I'd suggest in the future you take a different approach when making requests -- rather than listening to my reasons for removing a parameter that in reality was undocumented and not much used, you started throwing unfounded accusations at me. I will restore this param for {{unknown}} only but don't expect any favors from me in the future. Benwing2 (talk) 03:59, 15 March 2019 (UTC)Reply
Sorry, but when you said "not really necessary", that really set me off. Typing in |title= or |notext=1 is a pain in the ass I'd rather avoid, and if we ever develop the template into something more, we can always use a bot. --{{victar|talk}} 04:19, 15 March 2019 (UTC)Reply

WingerBot error

Note the change to the neighboring parameter here. Chuck Entz (talk) 18:53, 16 March 2019 (UTC)Reply

@Chuck Entz It didn't change the neighboring param, it just looks that way. The error is because of a separate change I made to the Lua code to check for unrecognized params. Benwing2 (talk) 20:36, 16 March 2019 (UTC)Reply

Module:form of

I noticed you working on this, and while browsing through the code I noticed that there's a mix of template-callable and Lua-callable functions there, while we normally separate these. Do you think there should be a separate Module:form of/templates? —Rua (mew) 21:14, 19 March 2019 (UTC)Reply

@Rua I agree that they should probably be separated. The thing is that Module:form of/templates already exists and already has a form_of_t function in it, which is different from the form_of_t function in Module:form of (the former is used for {{form of}}, and the latter for more specific form-of templates). We should clean this up. Benwing2 (talk) 21:17, 19 March 2019 (UTC)Reply
Hmm, that's odd. What are those two form_of_t's used for, and why are they different? —Rua (mew) 21:19, 19 March 2019 (UTC)Reply
@Rua See comment just above, they are used by different templates. Benwing2 (talk) 23:45, 19 March 2019 (UTC)Reply

{{IPA}}

If you're not too busy with other things, could you have a look at making this template accept the language as the first parameter as well? We might as well do it if we're already doing it with everything else. —Rua (mew) 17:42, 28 March 2019 (UTC)Reply

@Rua Agreed, I'll work on this tonight. Benwing2 (talk) 00:13, 29 March 2019 (UTC)Reply
@Rua Done. Benwing2 (talk) 00:35, 29 March 2019 (UTC)Reply
@Metaknowledge OK, I'll work on that too. It will take a bit because I need to first convert it to Lua. In the meantime, could you help by clearing the entries in Category:Language code missing/&lit? There are only 15 or so. Benwing2 (talk) 00:58, 29 March 2019 (UTC)Reply
Thanks. I cleared them all out except the discussion link; maybe it should be substed? —Μετάknowledgediscuss/deeds 01:26, 29 March 2019 (UTC)Reply
@Metaknowledge Thanks! Substing sounds fine. Benwing2 (talk) 01:28, 29 March 2019 (UTC)Reply
@Metaknowledge Done. It's converted to Lua and supports both 1= and lang= for the language code. Benwing2 (talk) 02:38, 29 March 2019 (UTC)Reply
Thank you so much! —Μετάknowledgediscuss/deeds 02:47, 29 March 2019 (UTC)Reply
I'm basically the only one who uses it, but switching over {{was fwotd}} would be nice. —Μετάknowledgediscuss/deeds 04:36, 31 March 2019 (UTC)Reply
@Metaknowledge Done. Benwing2 (talk) 04:46, 31 March 2019 (UTC)Reply

Removing categories from inflectional form-of templates

This has been a slow process that has been taking place over the years. Now that you're working on cleaning up all the form-of templates, maybe we can work on this. The process is really quite simple:

  • Make the template categorise/track based on whether nocat=1 is present or not. Modify WT:ACCEL where necessary so that generated entries always have the parameter.
  • Probably with a bot, go through all the tracked instances of nocat=1 being missing, and edit the entry so that the category is included in {{head}}, then add nocat=1.
  • Once all instances have the parameter, change the template to never categorise, then go through them again to remove the parameter. Update WT:ACCEL as well.

Some to start on could be the degree-of-comparison templates and the various participle templates. —Rua (mew) 21:49, 30 March 2019 (UTC)Reply

Floodflag

It would make my watchlist less cluttered. DCDuring (talk) 18:35, 1 April 2019 (UTC)Reply

You mean to set this on my bot? Sure. Not sure how to do this but I'll figure it out. Benwing2 (talk) 22:23, 1 April 2019 (UTC)Reply

Why did you learn arabic?

...

Replacement of uses of {{RQ:Schuster Hepaticae V}}

Guess you are pretty busy at the moment, but if you have some time could you please do a bot run and replace uses along the lines of:

##* {{RQ:Schuster Hepaticae V|7}}
##*: Furthermore, the '''free''' anterior margin of the lobule is arched toward the lobe and is often involute{{...}}

with {{RQ:Schuster Hepaticae|volume=V|page=7|text=Furthermore, the '''free''' anterior margin of the lobule is arched toward the lobe and is often involute{{...}}}}?

Thereafter, {{RQ:Schuster Hepaticae V}} may be deleted as superseded by {{RQ:Schuster Hepaticae}}. Thanks. — SGconlaw (talk) 17:34, 4 April 2019 (UTC)Reply

@Sgconlaw This is done, but there are a few remaining cases that need to be handled manually. Benwing2 (talk) 12:55, 5 April 2019 (UTC)Reply
Thanks; I've fixed those cases and deleted the orphaned template. — SGconlaw (talk) 14:17, 6 April 2019 (UTC)Reply
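
For reference, the replacement requested in this thread can be sketched as a single regular-expression substitution. This is a minimal Python illustration and not the bot's actual code: the function name convert_schuster is hypothetical, and it assumes the list prefix (##*) is kept in front of the new template call and that the quoted text sits on one line directly below the old template.

```python
import re

# Hypothetical sketch; the real bot run may have worked differently.
# Line 1: "<prefix> {{RQ:Schuster Hepaticae V|<page>}}"
# Line 2: "<prefix>: <quoted text>"
_OLD_QUOTE = re.compile(
    r"^(?P<prefix>#+\*) \{\{RQ:Schuster Hepaticae V\|(?P<page>[^|{}]+)\}\}\n"
    r"(?P=prefix): (?P<text>.+)$",
    re.MULTILINE,
)


def convert_schuster(wikitext):
    """Fold the old two-line quotation into one {{RQ:Schuster Hepaticae}} call."""

    def repl(match):
        return "%s {{RQ:Schuster Hepaticae|volume=V|page=%s|text=%s}}" % (
            match.group("prefix"),
            match.group("page"),
            match.group("text"),
        )

    return _OLD_QUOTE.sub(repl, wikitext)
```

A pattern this strict leaves anything unusual untouched (extra parameters, multi-line passages, a bare pipe in the text), which fits the need to handle a few remaining cases manually.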

Replacement of uses of {{RQ:Harry Potter}}

I have also rewritten {{RQ:Harry Potter}} so that it uses {{quote-book}}, and renamed it {{RQ:mul:Rowling Harry Potter}}. If you have time, could you do a bot run and replace uses along the lines of:

#* {{RQ:Harry Potter|7|pt-BR|page=508}}
#*: O senhor realizou extraordinária '''magia''' com essa varinha.
#*:: You, sir, have realized extraordinary '''magic''' with that wand.

with {{RQ:mul:Rowling Harry Potter|pt-BR|book=7|page=508|text=O senhor realizou extraordinária '''magia''' com essa varinha.|t=You, sir, have realized extraordinary '''magic''' with that wand.}}?

That will also enable me to assign |1= to the language code and |2= to |book= to align the template with {{quote-book}} generally. Currently |1= = |book= and |2= = the language code. Thanks. — SGconlaw (talk) 18:58, 4 April 2019 (UTC)Reply

@Sgconlaw This is also done, but there are a few remaining cases that need to be handled manually. I had to split the definitions of {{RQ:Harry Potter}} from {{RQ:mul:Rowling Harry Potter}} and fix the param order on the latter to avoid a lot of errors. Benwing2 (talk) 12:56, 5 April 2019 (UTC)Reply
So can {{RQ:Harry Potter}} be deleted once cases that need manual handling are dealt with? — SGconlaw (talk) 14:27, 5 April 2019 (UTC)Reply
@Sgconlaw Yes. Benwing2 (talk) 17:08, 5 April 2019 (UTC)Reply
Thanks, I've deleted the orphaned template. — SGconlaw (talk) 14:17, 6 April 2019 (UTC)Reply

Replacement of uses of {{RQ:Don Quixote}}

Hi, I have split {{RQ:Don Quixote}} into two templates, {{RQ:Cervantes Ormsby Don Quixote}} (English) and {{RQ:Cervantes Viardot Don Quichotte}} (French). Kindly do a bot run and replace uses of the French version as follows:

{{RQ:Don Quixote|lang=fr
|passage=Bien que la faim et le dénûment nous tourmentassent quelquefois, et même à peu près toujours, rien ne nous causait autant de tourment que d’être témoins des cruautés inouïes que mon maître exerçait sur les chrétiens. Chaque jour il en faisait pendre quelqu’un; on empalait celui-là, on coupait les oreilles '''à''' celui-ci{{...}}.
|translation=Even though hunger and destitution tormented us sometimes, and even almost always, nothing caused us as much torment as being witnesses to the unheard-of cruelties that my master exercised on the Christians. Every day, he made someone hang; they impaled that one, they cut the ears '''off of''' this one{{...}}.}}

should be changed to {{RQ:Cervantes Viardot Don Quichotte|volume=I|chapter=[if specified]|text=Bien que la faim et le dénûment nous tourmentassent quelquefois, et même à peu près toujours, rien ne nous causait autant de tourment que d’être témoins des cruautés inouïes que mon maître exerçait sur les chrétiens. Chaque jour il en faisait pendre quelqu’un; on empalait celui-là, on coupait les oreilles '''à''' celui-ci{{...}}.|t=Even though hunger and destitution tormented us sometimes, and even almost always, nothing caused us as much torment as being witnesses to the unheard-of cruelties that my master exercised on the Christians. Every day, he made someone hang; they impaled that one, they cut the ears '''off of''' this one{{...}}.}}

If no volume is specified, add |volume=I. If |volume=2 is specified, change it to |volume=II.

As there are only a few uses of the English version, I will fix them manually. Thereafter, {{RQ:Don Quixote}} can be deleted. Thanks! — SGconlaw (talk) 08:00, 5 April 2019 (UTC)Reply

@Sgconlaw Done. Benwing2 (talk) 00:07, 6 April 2019 (UTC)Reply
Thanks; I've deleted the orphaned template. — SGconlaw (talk) 14:17, 6 April 2019 (UTC)Reply
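
The parameter mapping requested in this thread can be written down compactly. The following Python sketch is illustrative only and is not the bot's code: the function name convert_don_quixote_params is hypothetical, and it assumes the template call has already been parsed into a dictionary of named parameters. Dropping |lang=fr is inferred from the example above, where the new call has no lang parameter.

```python
def convert_don_quixote_params(old):
    """Map {{RQ:Don Quixote|lang=fr|...}} parameters to the new French template."""
    new = {}
    volume = old.get("volume", "").strip()
    if not volume:
        new["volume"] = "I"     # no volume specified: add |volume=I
    elif volume == "2":
        new["volume"] = "II"    # |volume=2 becomes |volume=II
    else:
        new["volume"] = volume  # anything else is left as written
    if "chapter" in old:
        new["chapter"] = old["chapter"]  # kept only if it was specified
    if "passage" in old:
        new["text"] = old["passage"]     # |passage= becomes |text=
    if "translation" in old:
        new["t"] = old["translation"]    # |translation= becomes |t=
    # |lang=fr is not copied over (assumption based on the example call).
    return new
```

The request does not say what to do with any other volume value, so the sketch passes those through unchanged.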

The template {{Q}}

Sorry to bombard you. Any thoughts on what should be done about {{Q}}, which I just came across? Should it be deprecated? — SGconlaw (talk) 11:17, 5 April 2019 (UTC)Reply

@Sgconlaw Not sure about {{Q}}. I've been aware of it for some time but in some ways it's fundamentally different as it keeps a database of works and auto-fills the appropriate info. This appears intended specifically for Greek and Latin and works fairly well for these languages but less so for modern languages. Maybe User:JohnC5 or User:Erutuon can comment. Benwing2 (talk) 12:59, 5 April 2019 (UTC)Reply
{{Q}} is quite different from {{quote-book}}. It does fancy stuff interpreting abbreviations like "Il. 8.409" (which are used in LSJ, a common reference for Ancient Greek entries) into a reference to book 8, line 409 of the Iliad linking to Greek Wikisource. If it were deprecated, what would it be replaced with? — Eru·tuon 20:58, 5 April 2019 (UTC)Reply
OK. I was just wondering whether it was redundant to the other system. — SGconlaw (talk) 21:43, 5 April 2019 (UTC)Reply
At the very least, it needs a better name. We have both {{q}} and {{Q}}, which is really confusing. —Rua (mew) 21:56, 5 April 2019 (UTC)Reply

Possible bot task

Heya, I just made this edit (diff) and then thought it might be something you would like to set your bot on? No pressure if you'd rather not. —Rua (mew) 14:49, 6 April 2019 (UTC)Reply

@Rua Yeah, this is a good thing to do, I'll get to it soon. Benwing2 (talk) 14:51, 6 April 2019 (UTC)Reply
@Rua What do you think about my WT:RFDO idea of getting rid of the less-used multicolumn templates? I'd like to do that at the same time as your change. Benwing2 (talk) 15:15, 6 April 2019 (UTC)Reply
I definitely agree. There should really only be one column template, and there shouldn't even be a need to specify the number of columns, because that is entirely down to personal preference and there is no guideline regarding how to choose it. Thus, everyone picks it based on their own screen size and such, leaving it unoptimal for everyone else. —Rua (mew) 15:16, 6 April 2019 (UTC)Reply
Note there are some cases like “{{l|en|whammel}}, {{l|en|whammle}}” which should be changed to “[[whammel]], [[whammle]]” so that they still sort properly. — SGconlaw (talk) 15:23, 6 April 2019 (UTC)Reply
No, both are incorrect. It's incorrect to include multiple distinct terms in a single parameter, not just for this template but for all of them. The meaning of {{l|en|whammel, whammle}} is a single term consisting of two words and a comma in between. Screen readers will interpret and read it as a single term in whatever the language is, and not as two separately mentioned terms. The only correct way is to specify them as two separate parameters, so that they are individually wrapped in language spans. However, in this case, listing a non-lemma form is superfluous. Only the lemma should be listed, alternative forms should not. —Rua (mew) 15:27, 6 April 2019 (UTC)Reply
Hmmm, I've frequently seen cases where alternative forms, for example, hotdog, hot-dog, and hot dog are all listed. If there is a rule that only the main lemma should be listed, perhaps we should record that somewhere (presumably "Wiktionary:Entry layout"). — SGconlaw (talk) 15:50, 6 April 2019 (UTC)Reply
It's not a rule yet, but it seems like a sensible rule to me. After all, the link to the alternative form doesn't give the user anything interesting other than a link to the lemma. And some terms have too many alternative forms to list them all anyway. The rule about not having more than one individual term per template parameter is more of a technical/correctness principle for websites in general. —Rua (mew) 16:45, 6 April 2019 (UTC)Reply
@Rua I implemented your request, and at the same time orphaned/obsoleted the lesser-used col* synonyms and moved lang= to 1=. Benwing2 (talk) 02:32, 8 April 2019 (UTC)Reply

تسوحو

Can you take a look at this entry? Your bot made it and an anon flagged it for deletion. - TheDaveRoss 01:43, 8 April 2019 (UTC)Reply

@TheDaveRoss Ugh, the anon is right. This should be تسوحوا, with a silent alif at the end (a weird quirk of Arabic spelling). It looks like there was a bug in the 2nd person masculine plural subjunctive/jussive that was generated by Module:ar-verb, and since my bot relied on that, it created a whole host of incorrect forms. The bug was fixed (by me in fact) on Jan 2, 2017, but the erroneous entries were never cleaned up. It will take some work to clean this up. If the verb form is the only entry on the page, I can rename or delete/recreate the page, but if there are other entries on the page, I have to extract the erroneous entry out of the page and recreate it on a separate page. I will need to do some exploring to see how often the latter case occurs. This is made trickier by the fact that I wrote all this code years ago and haven't looked at it in a long time, but it should be doable. Benwing2 (talk) 02:17, 8 April 2019 (UTC)Reply
@TheDaveRoss I have removed/deleted/renamed all the bad entries that my bot previously created. Benwing2 (talk) 02:20, 20 May 2019 (UTC)Reply
There are a few more words from WingerBot in Category:Candidates_for_speedy_deletion, can you have a look? - TheDaveRoss 20:19, 29 May 2019 (UTC)Reply
@TheDaveRoss All of those words were misspellings (plurals and verbal nouns) in the original headword templates, since corrected; my bot just created the entries based on those misspellings. Benwing2 (talk) 23:24, 29 May 2019 (UTC)Reply

"Text" content model

Could you try changing the content model of my page of incorrect header data to the "text" content model? I've never seen this content model used but noticed it in mw:Content handlers, so I'm curious what it looks like and whether it would be a good choice for that page (and maybe the data pages for Jberkel's lists of wanted entries). At the moment the page content has to be enclosed in pre and nowiki tags to get it to display as plain text. mw:Help:ChangeContentModel gives more information on changing content models (which on Wiktionary can only be done by administrators). — Eru·tuon 02:26, 11 April 2019 (UTC)Reply

@Erutuon Give me a sec and I'll read the docs and see if I can set it up. Benwing2 (talk) 02:45, 11 April 2019 (UTC)Reply
@Erutuon It's changed. Might not be what you want, though. Benwing2 (talk) 02:49, 11 April 2019 (UTC)Reply
Thanks! Yeah, I was hoping it would display in a preformatted monospace style like Lua, JavaScript, or CSS pages so that each header name would be at the beginning of a line. But I'll work with it for now. — Eru·tuon 03:20, 11 April 2019 (UTC)Reply
There's also a "Scribunto" content model which might do what you're after (the content type defaults to text/plain). All pages in the module namespace have it automatically set, perhaps it's possible to set it for user pages as well. – Jberkel 13:56, 12 April 2019 (UTC)Reply
I think that wouldn't work for the data on my page because I can't save invalid Lua on a page with the "Scribunto" content model. But I added styles to my page to make it look like a pre tag, so it's fine now for me (though not for other people viewing it). Surprisingly, there doesn't seem to be a class that signals the "plain text" content model to use in the CSS though, so I have to use the class for the page name. — Eru·tuon 17:31, 12 April 2019 (UTC)Reply

Module Errors Associated With Template:affix

At Russian визуальный (vizualʹnyj), it's choking on |lang1=LL. (changing it to |lang1=la clears the error). The only other use of |lang1=LL., Russian полярный (poljarnyj), also displays a module error, but it's not currently in CAT:E. It looks like there's some kind of problem with the usual replacement of etymology-only languages with their primary languages, but I'm not very good at reading code, so I'm not sure where this is (not) happening. Thanks! Chuck Entz (talk) 01:38, 13 April 2019 (UTC)Reply

@Chuck Entz Sure, I'll take a look. Benwing2 (talk) 01:43, 13 April 2019 (UTC)Reply
This can probably be remedied with getNonEtymological from Module:etymology. — Eru·tuon 01:55, 13 April 2019 (UTC)Reply
Indeed, but I want to move that function to Module:languages. Benwing2 (talk) 02:04, 13 April 2019 (UTC)Reply
@Chuck Entz, Erutuon Fixed. Benwing2 (talk) 02:10, 13 April 2019 (UTC)Reply
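
The replacement of an etymology-only language code with its primary language, which is what was failing here, can be illustrated with a short sketch. It is written in Python for readability and is not the code of Module:etymology or Module:languages; the one-entry table is a stand-in for the real language data, and the function is only named after the getNonEtymological mentioned above.

```python
# Stand-in data for illustration only: the etymology-only code "LL."
# (Late Latin) resolves to the full language code "la" (Latin).
ETYMOLOGY_ONLY_PARENTS = {"LL.": "la"}


def get_non_etymological(code):
    """Follow parent links until reaching a code that is not etymology-only."""
    seen = set()
    while code in ETYMOLOGY_ONLY_PARENTS:
        if code in seen:
            # Guard against a cycle in the (hypothetical) data table.
            raise ValueError("cycle in etymology-only language data: " + code)
        seen.add(code)
        code = ETYMOLOGY_ONLY_PARENTS[code]
    return code
```

With a step like this applied before the language lookup, |lang1=LL. would be treated the same as |lang1=la, which matches the workaround described at the top of the thread.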

sock?

Hey. Why have you got two accounts? I mean, nothing wrong with it, but why stop at 2? --I learned some phrases (talk) 22:41, 17 April 2019 (UTC)Reply

Stop and consider what you're doing, please

I appreciate all the work you're doing to get rid of unnecessary templates, but you are really going too far in it now. So please stop and think, discuss more, before making such huge sweeping changes to global templates.

  • Adding categories to {{inflection of}} is a big no. The template never categorised, and that was by design. This change also changed the behaviour of existing entries, which is clearly not desired.
  • You added other, completely unnecessary things like {{verb form of}}. Again, nobody was asking for these.
  • The way the templates work now is so complex that it's nearly impossible to understand or change anything. You should look into making smaller, more incremental changes instead of huge overhauls. I used to know how Module:form of worked. Now I don't.
  • Code should not be so strongly tied together that they have all kinds of unwritten requirements between functions. Loose coupling is the key. The interface of functions should be simple and well documented. If I remove a parameter from a template, that should not trigger all kinds of weird errors from code down the line that expects the parameter to be there.
  • You immediately set your bot to change entries to conform to whatever new format you created, so that it was impossible to revert anything without resulting in either redlinked templates or module errors. Which other people now have to deal with. Again, incremental changes allow others to revert edits without breaking anything.

Rua (mew) 14:23, 24 April 2019 (UTC)Reply

@Rua Hi Rua. Sorry to see your objections. All of your objections seem to boil down to one thing, which is that I made {{inflection of}} categorize in exactly the circumstances where the corresponding lang-specific templates used to categorize. That is all. The new templates {{verb form of}}, {{noun form of}} and {{adj form of}} are nothing but thin wrappers around {{inflection of}} that set the part of speech appropriately for categorization purposes. This is fully in keeping with incrementality and loose coupling (believe me, I know what these are, I am a software engineer by trade); I am trying to keep the existing behavior until it's clear it should be changed. I don't understand your concerns about removing a parameter; if you remove the |p=/|POS= param, or use {{inflection of}} instead of {{verb form of}} etc., nothing will break but you'll probably get no categorization. In fact, if you want to eliminate categorization completely from a specific language all you need to do is remove that language's entry from Module:form of/cats, and if you want to remove it from all languages, just make that entire file return an empty table. I am not opposed to eliminating categorization in this fashion, in fact I think it would make everything simpler; I just think it should be discussed first. Please note, also, the templates I have converted so far were the ones that didn't categorize in any case. Benwing2 (talk) 17:15, 24 April 2019 (UTC)Reply
My aim was not to eliminate all categorisation right away, but just to eliminate it from {{inflection of}}, because it was causing entries to be categorised where they weren't before. So restoring the original behaviour was all that I was after. I do think that it's a good idea to eliminate categorization, but I think a better way to go about this is to eliminate the categories themselves. Most of them are quite pointless after all.
What I would like is for categorization to be limited to specific templates, such as {{comparative of}}, rather than being a feature of {{inflection of}}. That way, the categorization can be narrowed down to such "specialised" templates and you don't need to understand {{inflection of}}'s internal logic. The way you did it before, with cat= being passed to Module:form of/templates, was very clear to anyone editing the template, so I would like to preserve that.
Moreover, by sectioning off the categorization into specific templates, the use of a template other than {{inflection of}} becomes a very clear and obvious anomaly, and the process of eliminating the template and eliminating the categories becomes one and the same. This is what my original plan was, to gradually add nocat=1 to cases where it was possible, discuss deleting categories through WT:RFM and WT:RFDO, and slowly work towards a situation where all uses of a particular template use nocat=1. At that point, a bot can then convert all the uses to {{inflection of}}.
I understand that what you did is essentially the same process but in reverse, first eliminating the special templates and then eliminating the categorisation. But to be able to do this, you had to add categorization support to {{inflection of}}, which I really do not like. I'd rather have {{inflection of}} in a kind of minimalized ideal state right from the start, and keep it that way as a sort of rigid standard. Then we slowly convert everything to it, rather than extending it to support all the idiosyncrasies of all the other templates. I see that you have done this as well in Module:form of/data, by adding all sorts of tags that are probably not needed, if only you didn't feel like you needed to have {{inflection of}} support all of it. The result is that the once simple and generic template is now rather bloated and not very easy for me to understand due to all the additional factors you incorporated. By keeping those factors clearly separated from the main, standard template, the code stayed clean and in addition the anomalies were much more clearly visible as such. —Rua (mew) 17:28, 24 April 2019 (UTC)Reply
@Rua Some comments:
  1. You took a hacksaw to my code without asking me, and left it in a bad state (see CAT:E). This is not an improvement. I would have much preferred you discussed the reversions with me *before* doing them, then we could have avoided this situation. Making unilateral reversions like this is an aggressive and rather hostile move.
  2. I personally think having all the categorization logic centralized is an improvement over having it scattered across many different templates, each doing it in an idiosyncratic way. By centralizing the logic, we can make it clear how the categorization works on a macro scale, and decide as a whole what to eliminate and what to keep.
  3. It's not obvious to me that having specialized templates like {{comparative of}} categorize one way while having the equivalent written using {{inflection of}} not categorize is the right thing. It means the categories are incomplete, which is wrong.
  4. You'll have to point out places where categorization was being added that wasn't before; that was not my intent other than what's mentioned in #3.
  5. In general, the best way to do things is a matter of opinion. Please keep in mind, no one (including you, including me) owns the code in Wiktionary. You have specific opinions about how to structure templates like {{inflection of}}, which is fine, but I would advise you to avoid saying things like "the once simple and generic template is now rather bloated" that suggest that your code was better than anyone else's. If you find the code difficult to understand, I can explain how it works, and in general it's much better documented than it was before. I could equally well point out that by centralizing the categorization I can eliminate a ton of nasty and hard-to-understand template code.
Benwing2 (talk) 05:04, 25 April 2019 (UTC)Reply
My reversions were not intended to be hostile, although I can see how you interpreted it that way. My aim was to restore the original non-categorising behaviour so that I could empty out the English categories for specific verb forms, which failed RFD. Since I didn't know how else to do it, and your new code was clearly the culprit, I initially tried to disable the categorisation. When that only led to more errors further in the module, I had no recourse but to revert the entire thing. A specific example is absorpt, which was being categorised by {{inflection of}} when it shouldn't have been.
Let's just say we disagree on the rest. I don't think it's inconsistent at all if some templates categorise and others don't. In the past, that has been the exact reason for having the templates. More recently, I've been trying to get the templates replaced with {{inflection of}}, which was only possible if there was consensus to not have the categories anymore. Adding categorization to {{inflection of}} was never an option to me, I wanted to keep it pure. Instead, languages that wanted to categorise specific inflections should use other means to do so.
And yes, I do think the state of {{inflection of}}, before part-of-speech parameters and categories were added, was better. At that point the template did one thing: show definitions and mark up inflections for their grammatical properties. If you have programming experience, are you perhaps a Windows programmer? ;) I'm from the Unix school myself. I wanted it to do one thing only, and do it well, after the Unix philosophy. Now it tries to do everything: not just grammatical tagging, but also categories on a language-specific and POS-specific basis. I never liked it that some form-of templates sneakily categorised; I liked it more when I could assume that {{inflection of}} would never do that. —Rua (mew) 10:01, 25 April 2019 (UTC)Reply
@Rua If you're asking about my programming experience, I have several decades' worth of programming experience and was one of the primary maintainers of XEmacs back in the 1990s. At this point I work exclusively on Unix systems; my personal and work laptops are both MacBook Pros and all the servers I use run Linux. As for the "Unix philosophy" you refer to, I think you are totally misinterpreting its intent, but that is a discussion for another day.
As for your reversions, any time you adopt a "revert first, ask questions later" attitude, it *will* be interpreted as hostile. This should not come as a surprise to you. It's definitely incorrect to say you "had no recourse but to revert the entire thing"; your first instinct should have been instead to ask me what was going on and how to fix it. Instead, your hasty reversion led to lots of errors, which I see are still present several days later. To fix this I am going to restore my code in Module:form of/templates and Template:participle of. The categorization you refer to in absorpt is due to the language-independent entries under cats["und"]. (Note that this module contains no code, only data, and the format is well documented at the top of the module, so it should be easy to understand.) I personally don't see a problem in categorizing participles in a language-independent fashion, but since you are objecting, I will comment out the entries for participles under cats["und"]. I am also planning on changing the Template:inflection of documentation so it (a) clearly indicates how the categorization works, and (b) automatically documents exactly what the categories are for which languages, similar to the way that it now automatically documents the current recognized inflection tags. This should be done within 24 hours or so. This should address your concerns about not understanding how things work; if you still have questions, feel free to ask. In the slightly longer run, I will start a discussion in WT:GP or WT:BP about removing most of the language-specific categories; this should be much easier and quicker when they are centralized than spread over a bunch of badly-written and idiosyncratic templates. (The centralization is actually similar to Module:languages/data2 etc.; since I think you wrote that code, I'm a bit surprised you so strongly object to similar centralization here.) Benwing2 (talk) 00:02, 28 April 2019 (UTC)Reply
There isn't a consensus for adding categories to {{inflection of}}, so please do not restore it. —Rua (mew) 09:57, 28 April 2019 (UTC)Reply
@Rua OK, you personally don't like it; that's not the same. BTW are you willing to compromise? You seem to have objected primarily to language-independent categories, so if you want we can disable them and instead have the categories set by {{comparative of}} and similar, while maintaining the language-dependent categories so that I can deprecate the remaining language-specific form-of templates. I don't much like this either but at least we can meet in the middle somewhere. Benwing2 (talk) 13:51, 28 April 2019 (UTC)Reply
That would at least be more acceptable, but you should make sure that there are no existing cases that currently do not have a category, but have one added by your changes. There needs to be nocat=1 to prevent the categories. —Rua (mew) 13:55, 28 April 2019 (UTC)Reply
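For readers following the design question, here is a minimal sketch of what a centralized, data-only category table with a language-independent "und" fallback and a nocat switch could look like. It is written in Python for illustration only; the table layout, the tag names, and the category names are assumptions made up for this example and are not the actual contents or format of the form-of data module.

```python
# Hypothetical data table: language code -> {required tags: category names}.
# "und" holds language-independent entries, as described in the thread.
# All entries below are invented for illustration.
CATS = {
    "und": {("comd",): ["comparative forms"]},
    "xx": {("past", "part"): ["past participles"]},
}


def categories_for(lang, tags, nocat=False):
    """Return the categories a form-of call would add, or [] if nocat is set."""
    if nocat:
        return []
    found = []
    # Language-specific entries first, then the language-independent ones.
    for table in (CATS.get(lang, {}), CATS.get("und", {})):
        for required, names in table.items():
            if set(required) <= set(tags):
                found.extend(f"{lang}:{name}" for name in names)
    return found
```

With a layout like this, checking that a rewrite adds no category that was not there before amounts to comparing the output of the old and new logic over existing entries, and nocat=1 stays a single early return.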
I agree with Rua -- you really should be taking many of your changes to a vote before implementing them. Categorization is a hot button issue, and often incites heated votes. --{{victar|talk}} 14:19, 28 April 2019 (UTC)Reply
While I agree with Rua (albeit with some amusement at seeing her on the receiving end of some of her less-desirable tactics), I vehemently disagree with her methods. Having CAT:E full of entries for the better part of a week is to be avoided if at all possible, and in this case it seems to have been done strictly for strategic reasons.
This push to redo everything reminds me of Daniel Carrero's flood of WT:EL votes: individually, some (maybe most) have merit, but the sheer volume makes it hard to give them proper consideration, and I'm sure there are some interested parties who missed out on the chance to comment before these were acted on. I fully expect Benwing2 to do what it takes so that any of his actions that fail to be approved are reversed completely and without disruption. Chuck Entz (talk) 16:15, 28 April 2019 (UTC)Reply
@Chuck Entz I learned from the best Wiktionary has seen fit to throw at me in the past, obviously. :) If I did disagreeable things to others in the past, it's probably because someone did it to me before and wasn't reprimanded for it. You can't hold me responsible for trying to stay afloat in a hostile culture that teaches you that such things are ok to do in order to get what you want. I gave up long ago trying to be the better person when it clearly wasn't paying off. We could have better personal accountability on Wiktionary by following the Wikipedia model, but of course nobody cares, or people vote against any measures needed to change it. —Rua (mew) 16:59, 28 April 2019 (UTC)Reply
To come to Rua's defence, all she essentially did was roll back a feature that was implemented prematurely. If those changes are eventually approved through a vote or discussion, then her edits can be reverted. --{{victar|talk}} 17:50, 28 April 2019 (UTC)Reply
@Victar, Chuck Entz OK, I will start a discussion to see what others think. I didn't think rewriting code in a way that fundamentally doesn't change categorization would be objectionable but I see that this isn't the case. Chuck, my apologies if I have started to resort to some of Rua's methods. This is certainly not my intent and I'll be more careful in the future. I certainly don't believe (contrary to what Rua just stated explicitly) that behaving like an asshole is justified just because others may have done it in the past. Benwing2 (talk) 19:21, 28 April 2019 (UTC)Reply

Thank you!


Thank you so much for running WingerBot to update the non-standard multi-word tags in Hungarian possessive forms! I should have checked {{inflection of}} for the spos and mpos parameters. Sorry for the extra work I created. :( Panda10 (talk) 21:11, 2 May 2019 (UTC)Reply

@Panda10 You're welcome! The spos and mpos parameters were just added recently. Benwing2 (talk) 22:55, 2 May 2019 (UTC)Reply

Changes in Module:accel/hu and {{hu-decl-table}}


Hi Benwing, I've noticed that you modified Module:accel/hu and {{hu-decl-table}} and now the definition line has a new format for non-lemma entries. The case name is no longer linked and there is a dash between the case and singular. The old and new formats can be seen at hideget. I kind of like the old format. What is the reason for the change? Is the dash necessary between accusative and singular? Panda10 (talk) 17:28, 12 May 2019 (UTC)Reply

@Panda10: It looks like Module:hu-nominals is the cause of this. It hasn't yet been updated to the new format. — Eru·tuon 17:35, 12 May 2019 (UTC)Reply
@Erutuon: But what is the reason for the new format with the dash and unlinked case name? Panda10 (talk) 17:40, 12 May 2019 (UTC)Reply
@Panda10: That's not the new format. It's just an error. The new and old formats should be identical, with no hyphen and the case and number both linked. — Eru·tuon 17:43, 12 May 2019 (UTC)Reply
Okay, I've updated Module:hu-nominals. I reverted the change in Module:accel/hu; it should be restored in a day or two when we can assume that most of the entries have been regenerated by the server. — Eru·tuon 17:50, 12 May 2019 (UTC)Reply
@Panda10 Sorry, I've been trying to clean up the accelerators to use the standard rules but I missed updating Module:hu-nominals. Benwing2 (talk) 18:56, 12 May 2019 (UTC)Reply
It's okay. Thank you and User:Erutuon for bringing the Hungarian templates up to standard. Panda10 (talk) 21:14, 13 May 2019 (UTC)Reply

Updating parameters of T:doublet


Hi, WingerBot updated the parameters of {{doublet}}, but I encountered two instances that hadn't been fixed and had module errors: diff1, diff2. There might not be any more, but I am a little curious what might have happened. — Eru·tuon 05:15, 13 May 2019 (UTC)Reply

@Erutuon OK. What I did was first do a dry run on everything, and then a day later to speed up the actual run I extracted the list of terms needing fixing and only ran the script on those. diff2 above got missed because it was changed in the meantime. diff1 got missed because it used the g= param and my script at the time didn't know how to handle that. I had already fixed this latter issue when I reran it but forgot to include that word (κουτός). I should have redone the dry run to deal with cases like diff2, but didn't manage to do that. I don't think there are any other missed terms. Benwing2 (talk) 13:24, 13 May 2019 (UTC)Reply
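The failure mode described above (a page list extracted from a dry run going stale before the real run) can be sketched as follows. This is an illustrative Python outline, not the actual bot script: the function name `run`, the `needs_fix`/`fix` callables, and the in-memory `pages` dict are all assumptions made for the example.

```python
def run(pages, needs_fix, fix, dry_run_hits=None):
    """Re-check every page at run time instead of trusting a stale list.

    pages: dict mapping title -> current wikitext.
    needs_fix / fix: callables supplied by the caller (hypothetical).
    dry_run_hits: titles flagged by an earlier dry run; may be out of date.
    Returns (changed pages, titles the dry run had missed).
    """
    changed = {}
    missed_by_dry_run = []
    for title, text in pages.items():
        if not needs_fix(text):
            continue
        if dry_run_hits is not None and title not in dry_run_hits:
            # Edited after the dry run, or a case the earlier script did not handle.
            missed_by_dry_run.append(title)
        changed[title] = fix(text)
    return changed, missed_by_dry_run
```

Re-scanning costs more time than running only on the extracted list, which is the trade-off mentioned above; the second return value at least reports how far the two have drifted apart.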

kjentes


Your edit didn't look right at all, so I revised it. What do you think? DonnanZ (talk) 21:10, 18 May 2019 (UTC)Reply

@Donnanz What looked wrong? The only difference is putting the two inflections on one line vs. two. I've been generally trying to eliminate the use of |and| because it's ambiguous about how much is being joined — either using // to join two individual tags, or separating the inflections onto two lines. I did add syntax to allow the template code to unambiguously join two sets of tags rather than two individual tags, like this:
{{inflection of|da|kjennes||sim:past//past:part}}

which displays as

simple past/past participle of kjennes

but I'm leaning towards not using this, and instead putting the inflections on separate lines. Benwing2 (talk) 21:18, 18 May 2019 (UTC)Reply

Inflections on separate lines is exactly what I'm trying to avoid (they appear on one line in English), but there are many older verb inflections that have two lines; I have modified the newer inflection entries to one line. I could try |sim|past//past|part next time I do one. DonnanZ (talk) 21:33, 18 May 2019 (UTC)Reply
@Donnanz Sorry, I'm confused, what do you mean "they appear on one line in English"? Also why exactly are you trying to avoid inflections on separate lines? Benwing2 (talk) 22:02, 18 May 2019 (UTC)Reply
@Donnanz Also, if you want to use // to construct "simple past and past participle", you should write |sim:past//past:part| with a colon rather than |sim|past//past|part|. The colon binds more tightly than //, so |sim:past//past:part| means "(simple past) and (past participle)", while |sim|past//past|part| means "simple (past and past) participle", which isn't right. Both forms currently display the same, but this might change in the future. Benwing2 (talk) 22:05, 18 May 2019 (UTC)Reply
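The difference in binding can be made concrete with a small sketch. The Python below is only an illustration of the precedence described above (pipe separates parameters, // joins alternatives within one parameter, colon groups tags within one alternative); the function names are invented here and this is not the module's actual parser.

```python
def parse_tag_params(params):
    """Split each pipe-separated parameter on '//' first, then on ':'.

    Because ':' is split last, it binds more tightly than '//'.
    """
    return [[alt.split(":") for alt in param.split("//")] for param in params]


def describe(params):
    """Render the grouping with parentheses so the reading is explicit."""
    parts = []
    for alternatives in parse_tag_params(params):
        groups = [
            " ".join(tags) if len(tags) == 1 else "(" + " ".join(tags) + ")"
            for tags in alternatives
        ]
        text = " and ".join(groups)
        if len(alternatives) > 1 and len(params) > 1:
            text = "(" + text + ")"
        parts.append(text)
    return " ".join(parts)


print(describe(["sim:past//past:part"]))      # (sim past) and (past part)
print(describe(["sim", "past//past", "part"]))  # sim (past and past) part
```

The second call shows why the pipe-separated spelling gives the unintended reading: the // only joins the two neighbouring "past" tags.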
In English "simple past tense and past participle" is used, see sought for example. By using "simple past" I can see that I'm economising on words a little. I have been trying to obtain a neater appearance for inflections, using the English pattern. This doesn't apply to all verbs, verbs ending in -ere have different spellings for the simple past and past participle, and irregular verbs are also different. All noted about the code. DonnanZ (talk) 22:32, 18 May 2019 (UTC)Reply

Russian adjectives - frequency list?


Hi,

Is there a way to get an adjective list similar to Appendix:Russian_Verbs_-_Frequency_List. Also @Cinemantique? --Anatoli T. (обсудить/вклад) 01:59, 20 May 2019 (UTC)Reply

@Atitarev I have such a list. I swear I already had it on Wiktionary but couldn't find it, so I put it here: Appendix:Russian adjective frequency list. Benwing2 (talk) 02:06, 20 May 2019 (UTC)Reply

Compound word categories


Hi,

I noticed that many Mongolian compound words use "|cat2=compound words" in the header to add to Category:Compound words by language. I picked up that practice as well, e.g. газрын тос (gazryn tos). Do you think it's a good idea to add this automatically to compound words? I would find it useful for two reasons in Russian: find potential SoP's and look up complex inflections. Can this be added to Russian terms by default or by a bot? What about other languages? --Anatoli T. (обсудить/вклад) 06:49, 22 May 2019 (UTC)Reply

@Atitarev I can make Module:ru-headword apply this categorization automatically. Since Mongolian also has Module:mn-headword I'm pretty sure I can do that for Mongolian as well. It would be possible to do it for all languages by modifying Module:headword but I imagine some people would object for some languages. We could also do it selectively by sticking a list of languages getting the compound-word category in Module:headword/data; that might be less objectionable. Benwing2 (talk) 08:49, 22 May 2019 (UTC)Reply
But are these compound words or are they multiword idioms? Those two cases should be distinguished. —Rua (mew) 09:40, 22 May 2019 (UTC)Reply
Yes, I figured you'd find a reason to object. Benwing2 (talk) 11:52, 22 May 2019 (UTC)Reply
Thanks, Benwing2. It's OK to separate the two cats but that will require a manual intervention. The Mongolian category currently contains all words with spaces, which may be considered multiword idioms. Can we have an acceptable category, which won't get resistance? Perhaps, multiword terms? --Anatoli T. (обсудить/вклад) 12:30, 22 May 2019 (UTC)Reply
Sorry if I caused any stress. Please let me know if you still want to do it. @Rua: do you have any suggestions? This would be an important enhancement. --Anatoli T. (обсудить/вклад) 03:12, 23 May 2019 (UTC)Reply
Something like Category:Mongolian multiword terms? I don't think including "idioms" in the name is necessary, because they are idiomatic by the nature of CFI. —Rua (mew) 09:02, 23 May 2019 (UTC)Reply
@Atitarev This is implemented for ru and mn. Benwing2 (talk) 10:18, 23 May 2019 (UTC)Reply
Thank you both! —Anatoli T. (обсудить/вклад) 10:46, 23 May 2019 (UTC)Reply
I've added it to the category data modules so you can now use {{auto cat}} for it. —Rua (mew) 10:53, 23 May 2019 (UTC)Reply
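As a rough picture of the opt-in approach discussed in this thread, the sketch below categorizes a headword as a multiword term only for languages on an explicit list. It is Python for illustration; the dict name, the function name, and the "contains a space" test are assumptions, and the Russian category name is inferred by analogy with Category:Mongolian multiword terms rather than taken from the thread.

```python
# Languages opted in to the automatic category (illustrative list).
MULTIWORD_LANGS = {"ru": "Russian", "mn": "Mongolian"}


def multiword_category(lang, headword):
    """Return a category name for headwords containing a space, else None."""
    if lang in MULTIWORD_LANGS and " " in headword.strip():
        return f"Category:{MULTIWORD_LANGS[lang]} multiword terms"
    return None


print(multiword_category("mn", "газрын тос"))  # Category:Mongolian multiword terms
```

Keeping the language list as data means another language can be added or dropped without touching the logic, which is the point of putting such a list in a shared data module.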

Mongolian pronunciation


Hi,

Would you like to contribute to the slow-paced work on the Mongolian pronunciation module, which is kind of stalled? The pronunciation based on the Cyrillic is mostly predictable and phonetic but there are many irregular pronunciations as well. Russian loanwords often imitate the Russian stress and other features. The module does a lot of work already but it lacks vowel reductions, vowel harmony and palatalisations. Do you think you can assess the feasibility of occasionally continuing development of Module:mn-IPA? We have no native speakers, and I don't know if any books are downloadable, but there are plenty of online audio resources, word by word, in two online dictionaries. --Anatoli T. (обсудить/вклад) 06:49, 4 June 2019 (UTC)Reply

@Atitarev Sure, I'll take a look. The lack of native speakers is problematic, though. I did find a book on Mongolian phonology [7] that indicates certain things, e.g. a long vowel in the second syllable is pronounced short, and a short vowel in the second syllable becomes a schwa. Benwing2 (talk) 00:42, 5 June 2019 (UTC)Reply
Thank you very much. I'll check that book. Sorry, I didn't get back to you earlier. I'm preparing to move jobs, which will impact my ability to work at Wiktionary. However, I'm still very interested. I can share a good popular older Mongolian textbook in PDF, which also has clear and slow audio. I can email them to you. Cyrillic is not searchable and can't be copy/pasted from the book, unfortunately. --Anatoli T. (обсудить/вклад) 23:19, 5 June 2019 (UTC)Reply
@Atitarev Please do share the textbook and audio by email, thanks. Good luck on your new job and hopefully you'll still be able to contribute somewhat. Benwing2 (talk) 01:01, 6 June 2019 (UTC)Reply
Hi. I have emailed you the textbook and the audio. Not sure if you received it. Please let me know if you're still interested at some stage. --Anatoli T. (обсудить/вклад) 03:22, 18 June 2019 (UTC)Reply
@Atitarev Hi Anatoli. Sorry not to respond earlier. I received part 1 and 4 of your emails (chapters 1-3 and 12-14, along with the book and alphabet), but not parts 2 or 3 (chapters 4-7 and 8-11). You might want to resend as more emails with fewer parts per email, as they may have bounced. I'm definitely interested in helping you, just let me know what you want me to tackle first and I'll look into it. Benwing2 (talk) 03:47, 18 June 2019 (UTC)Reply
I've just resent the two emails (part 2 and 3) from work, pls confirm when you receive any. I will resend smaller emails later, if you don't get them. --Anatoli T. (обсудить/вклад) 03:56, 18 June 2019 (UTC)Reply
@Atitarev Got the remainder, thanks. Benwing2 (talk) 07:35, 18 June 2019 (UTC)Reply

Care to explain?


Special:Diff/53250487 Born2bgratis (talk) 10:33, 7 June 2019 (UTC)Reply

The vowel length is different, the revert was right. Canonicalization (talk) 12:11, 7 June 2019 (UTC)Reply
Right. I didn’t notice that back then. Please make use of edit summaries in every revert, though. —Born2bgratis (talk) 16:33, 9 June 2019 (UTC)Reply
Sure, will do. Benwing2 (talk) 16:38, 9 June 2019 (UTC)Reply

"transitive hence passive"


Hello. Re this edit: I won't comment on this verb specifically, but I'm not sure there's a rule that says a transitive verb always has passive forms. Transitivity is a necessary condition for passivisation, but not a sufficient one. Canonicalization (talk) 00:27, 10 June 2019 (UTC)Reply

@Canonicalization You are right but I'm not sure how to determine whether a transitive verb is missing passive forms as it's not indicated in any dictionaries that I can find, and I don't trust the state of existing entries on Wiktionary very much. Note also that many intransitive verbs have impersonal passives; I am assuming that any verb that takes an object through a preposition or a case other than the accusative can potentially have impersonal passives, and checking those by looking at Google Books. One issue doing this though is that a lot of the cites are post-Classical and I'm not sure whether those "count". Benwing2 (talk) 00:32, 10 June 2019 (UTC)Reply
@Canonicalization An example of impersonal passives is with colluceo, which takes an ablative object and where a search for collucetur turns up a lot of cites, but they may be mostly or entirely post-Classical (not sure). Benwing2 (talk) 00:36, 10 June 2019 (UTC)Reply

Slavic l-participles in {{inflection of}}?


You probably know that Slavic languages have a thing called the "l-participle", for lack of a better word. It's known by various names, but across the Slavic languages names with "l" in some form appear to be the most universal. I'm trying to think of how to denote it in our definitions using {{inflection of}}. If I use {{inflection of|xx|foo||l-|ptcp}} then there's a space between the - and the word "participle", which is not so nice. Making an entirely new tag l-participle/l-ptcp would also work, I guess. Do you have any ideas? —Rua (mew) 18:44, 14 June 2019 (UTC)Reply

@Rua We could hack the inflection-of code to special-case tags ending in a hyphen. But unless there are other examples where this treatment would make sense, I think it's better to just add a new tag to Module:form of/data2. Benwing2 (talk) 06:43, 15 June 2019 (UTC)Reply
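The special-casing mentioned above would amount to something like the following when joining displayed tag texts. This is a Python illustration of the idea only (the function name is made up), not code from the inflection-of module.

```python
def join_tags(words):
    """Join display words with spaces, except after a word ending in a hyphen."""
    out = ""
    for word in words:
        if out and not out.endswith("-"):
            out += " "
        out += word
    return out


print(join_tags(["l-", "participle"]))  # l-participle
print(join_tags(["simple", "past"]))    # simple past
```

A dedicated l-participle tag avoids this extra rule entirely, which is the argument for adding a new tag instead.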

Replacement of uses of {{R:ODO}} and {{R:Oxford Dictionaries Online}}


Hi! Quite a number of entries use {{R:ODO}}, which was a shortcut to {{R:Oxford Dictionaries Online}}. The latter has now been renamed to {{R:Lexico}} which is the new website now holding the content of OxfordDictionaries.com. Could you please do a bot run and replace occurrences of {{R:ODO}} and {{R:Oxford Dictionaries Online}} with {{R:Lexico}}? ({{R:ODO}} may be deleted thereafter.) Thanks. — SGconlaw (talk) 18:38, 16 June 2019 (UTC)Reply

@Sgconlaw Sure. Benwing2 (talk) 19:18, 16 June 2019 (UTC)Reply
SGconlaw (talk) 19:33, 16 June 2019 (UTC)Reply
@Sgconlaw Done. Benwing2 (talk) 06:34, 17 June 2019 (UTC)Reply
Thanks! Looking at the old and new templates, I'm thinking I'd like to align it with other reference templates by making |2= a synonym of |pos=. Would it be possible to make the following changes with your bot?
  • Remove all occurrences of |lang=; this parameter is no longer used.
  • If any entries use |entry_uk= or |url_uk=, change these to |entry= or |url= (then |entry_uk= and |url_uk= can be removed from the template).
  • If a URL appears anywhere in the template, add |url= in front of it (then |2= can be reassigned to |pos=).
  • If a template has a passage as |4=, add |passage= or |text= in front of it (|passage= and |text= have been reassigned to |3=).
Thanks again! — SGconlaw (talk) 14:57, 17 June 2019 (UTC)Reply
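The four requested changes can be sketched as one pass over a template's parameters. The Python below is a hedged illustration, not the bot script that was actually run: it assumes the parameters have already been parsed into (key, value) pairs with explicit positional keys ("1", "2", ...), and the names `migrate`, `RENAMES` and `URL_RE` are invented for the example.

```python
import re

URL_RE = re.compile(r"^https?://", re.IGNORECASE)
RENAMES = {"entry_uk": "entry", "url_uk": "url"}


def migrate(params):
    """Apply the requested parameter changes to a list of (key, value) pairs."""
    out = []
    for key, value in params:
        if key == "lang":
            continue  # |lang= is no longer used
        key = RENAMES.get(key, key)  # |entry_uk= -> |entry=, |url_uk= -> |url=
        if key.isdigit() and URL_RE.match(value.strip()):
            key = "url"  # a bare URL in a positional slot becomes |url=
        elif key == "4":
            key = "passage"  # a passage in |4= becomes |passage=
        out.append((key, value))
    return out
```

The sketch keeps explicit keys and does not renumber the remaining positional parameters; a real run would also need to decide how to handle implicit positional parameters that follow a converted one.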
OK I'll get to this. It might take a bit though. Benwing2 (talk) 01:18, 18 June 2019 (UTC)Reply
@Sgconlaw I wrote the script. However, none of the occurrences of 2= look like URLs, so I didn't convert them. You might want to do them by hand; there were < 15 of them. See User:Benwing2/lexico. Benwing2 (talk) 02:10, 18 June 2019 (UTC)Reply
Hey, thanks again. I've done the manual conversions, so feel free to delete your subpage. — SGconlaw (talk) 07:11, 18 June 2019 (UTC)Reply

Templatising Semper's old Italian entries


Semper created quite a few plaintext inflected entries that haven't been caught or updated, and I'm wondering if it would be possible to find and fix these. What do you think? Examples: altri, acceso. —Μετάknowledgediscuss/deeds 19:51, 22 June 2019 (UTC)Reply

@Metaknowledge It depends on how consistent the plaintext inflections are. I already wrote a script that converted all plaintext entries it could find that were italicized so it might not be too hard to find the non-italicized versions. If you could gather a few more examples, it would be very helpful. BTW I'm also working through the Latin errors he created, which are legion. Benwing2 (talk) 19:55, 22 June 2019 (UTC)Reply
I'm not sure how to find more types besides coming upon them as I do other cleanup — I can tell you if I find more types, but this is all I've got for now. Thanks for helping out with that; I find it very frustrating that he now refuses to clean up after himself. —Μετάknowledgediscuss/deeds 04:52, 23 June 2019 (UTC)Reply
@Metaknowledge OK no problem. I found several thousand through some strategic grepping through the 06-01 dump. This found acceso but not altri, although I think I can find examples of that type by changing the grep command slightly. Benwing2 (talk) 13:15, 23 June 2019 (UTC)Reply
@Metaknowledge I'm halfway through a bot run to fix up about 5,000 pages with plaintext inflected entries similar to acceso. I'll do another run tomorrow for altri-type inflected entries, which should clean up around 8,000 pages. Benwing2 (talk) 05:53, 24 June 2019 (UTC)Reply
Excellent, thank you. I'll tell you if I find anything more like it after you finish the runs. —Μετάknowledgediscuss/deeds 06:18, 24 June 2019 (UTC)Reply
@Metaknowledge I finished running my bot; it fixed about 8000 pages the second time around, most of them Italian. Benwing2 (talk) 02:19, 26 June 2019 (UTC)Reply
Look at this format found at curatela: Compound of imperative (tu form) of ''[[curare]]'' and ''[[la]]''.. Fay Freak (talk) 21:19, 27 June 2019 (UTC)Reply
@Fay Freak Not sure how to handle that; {{inflection of}} isn't set up for that currently. Benwing2 (talk) 01:36, 28 June 2019 (UTC)Reply
We use {{es-compound of}} for this in Spanish. Do we want an Italian version of that template, or can we make the universal templates handle these cases? —Μετάknowledgediscuss/deeds 04:45, 28 June 2019 (UTC)Reply
I am not even sure what would even be the correct language to describe such formations. dímelo is a compound, really? Also we can have such “multiple objects written together with a verb” in many languages. As I remember for example in Nāhuatl. (Arabic binds not more than one pronominal suffix at least in the standard language, else on إِيَّا (ʔiyyā).) Maybe we add something like |obj= to {{inflection of}}. I have very crude thoughts on this though. Fay Freak (talk) 17:14, 28 June 2019 (UTC)Reply

Bot Fails...


Here, here and here. Chuck Entz (talk) 13:26, 25 June 2019 (UTC)Reply

@Chuck Entz Oops, thanks. I'll check to see if any more such errors occurred. Benwing2 (talk) 14:58, 25 June 2019 (UTC)Reply
@Chuck Entz I only found those three when looking for similarly-formed mistakes. BTW how did you find them? Benwing2 (talk) 02:19, 26 June 2019 (UTC)Reply
They showed up in CAT:E because the parameters were messed up. Chuck Entz (talk) 02:56, 26 June 2019 (UTC)Reply
@Chuck Entz Ah, OK, makes sense. Benwing2 (talk) 02:56, 26 June 2019 (UTC)Reply
Also diff, where your bot changed {{passive past tense of}} to {{passive past of}}; you had deleted the latter template the day before.__Gamren (talk) 17:11, 3 July 2019 (UTC)Reply

Wingerbot is adding templates with "<3>" in them. Chuck Entz (talk) 02:20, 5 August 2019 (UTC)Reply

@Chuck Entz Can you give me an example? Those <3>'s are probably intentional; they are the declension class of the noun. Benwing2 (talk) 02:24, 5 August 2019 (UTC)Reply
I've reverted aeon, Deus, Moyses and syllepsis because Wingerbot introduced module errors. If that syntax is correct, then look at missing genders. Chuck Entz (talk) 02:29, 5 August 2019 (UTC)Reply
@Chuck Entz Sorry about that. I've fixed them all. All of these errors were actually my fault, not my bot's fault, despite the change appearing to be made by my bot; I made a list of manual changes and then pushed them all at once using my bot, to save time. Such changes have the word "manually" in the commit message, indicating that they're actually manually-derived changes. In general, I've been pretty careful about testing the actual bot code so as not to introduce errors, but it's more difficult when I make a bunch of manual fixups. Let me see if I can figure out a way to determine if a page is in error, and alert immediately on that. Benwing2 (talk) 02:38, 5 August 2019 (UTC)Reply

Something has gone wrong on this entry for respublica as there are now two declension tables and there is no headword line. Tulros (talk) 11:33, 14 August 2019 (UTC)Reply

@Tulros Thanks for alerting me, I fixed it. This entry was changed by me manually but I used my bot to push the change, and I made a mistake editing the entry. Benwing2 (talk) 19:10, 14 August 2019 (UTC)Reply

Newly awkward wording on some Latin nouns with multiple possible declensions


For most nouns, I'm indifferent about consolidating previously separate declension tables into one (it saves space, but that's not a vital priority in an online dictionary). However, I think the wording now used before the declension table for nouns like equa is markedly worse than the previous version: "First declension or first declension, dative/ablative plural in -ābus [table]" vs. the older "First declension. [table1] Sometimes: First declension, dative/ablative plural in -ābus. [2nd table]" I think it's redundant and potentially confusing (because of the awkward wording) to repeat "first declension". The wording I would prefer is "First declension, dative/ablative plural in -īs or -ābus". There are similar issues with awkward wording for other nouns in the category Category:Latin nouns with multiple variants of a single declension, such as "Third declension, Greek type or third declension."--Urszag (talk) 11:14, 8 August 2019 (UTC)Reply

@Urszag Yeah, I agree that wording isn't so good and I'll clean it up. Benwing2 (talk) 15:26, 8 August 2019 (UTC)Reply
@Urszag This is somewhat cleaned up now. Cases like equa still aren't quite right unless you put the -ābus variant first, but I'll see if I can fix that. Benwing2 (talk) 03:53, 13 August 2019 (UTC)Reply
Thanks, that sounds better.--Urszag (talk) 18:31, 13 August 2019 (UTC)Reply

A few Latin conjugation tables needing cleanup


Hi. Seeing as you've been cleaning up our conjugation tables, you might be interested in this post. I see some of them have already been fixed, but:

A question about alternate genitive forms for 2nd-declension nouns with stems ending in -io-


Hi again! By the way, I'm wondering whether there's a better place to leave comments related to the recent project of changing the Latin inflection tables and the associated templates. I'm not sure where any previous discussions about this topic have been located, and I don't want to spam your talk page with a bunch of comments, but I do think that there are a few other things that could be improved. Right now, the thing I'm thinking about is the way multiple forms are displayed for nouns like beneficium, beneficī/beneficiī. The current situation seems to be that all nouns of this form automatically display both kinds of genitives in the declension table, with a footnote for the contracted form saying "Found in older Latin (until the Augustan Age)".

There are two things that seem problematic to me about this. First, I'm not sure that this is a very accurate description of when the variant forms were in use. Here is a quote from a Latin Stack Exchange post: "Tronskii 1960 adds that the -ī form was still used in the so called post-Augustan Latin (Silver Latin); he argues that Horace (i.e. Augustan) and Persius (post-Augustan) don't use the -iī genitive singular. So, there was a lot of variation." ("Forms of 2nd Declension Neuter Nouns ending in -ium", answer by Alex B.).

Second, and I think worse, is that this is displayed even for words that didn't exist in older/pre-Augustan Latin. It seems rather nonsensical to me to use this wording in the entry for Nagasacium "Nagasaki"; even if the form Nagasacī exists, it's clear that it cannot be "found in older Latin".--Urszag (talk) 05:10, 9 August 2019 (UTC)Reply

@Urszag Hi. Feel free to leave comments. If you have better wording for the "found in older Latin" message I'll definitely add it. As for cases like Nagasacium, you can fix this by changing <2> to <2.-ium>. My choice was either to not include this by default and require that all nouns with this variant (which I think is most) specify <2.ium>, or include it by default and require that nouns without the variant disable it using <2.-ium>, which is what I've done (the minus sign before 'ium' means to disable this variant). Benwing2 (talk) 10:17, 9 August 2019 (UTC)Reply
Thanks for the explanation on how to use the code. I will think about alternative wordings that could be used. I found another small issue that was caused by the template changes: the category Latin neuter nouns in the first declension now includes epulum, which the linked entry says is a neuter second-declension noun in the singular and a feminine first-declension noun in the plural. I think it makes more sense for that category to include only nouns that can be neuter and first-declension at the same time (pascha is the only such example that I know of).--Urszag (talk) 18:31, 13 August 2019 (UTC)Reply

Replacing uses of {{R:COED}} with {{R:Lexico}}

Hi. As @Gotitbro points out, {{R:COED}} should be deprecated since it links to the old OxfordDictionaries.com website. Could you run a bot and replace all uses with {{R:Lexico}}? Then I'll delete the template. Thanks. — SGconlaw (talk) 18:38, 13 August 2019 (UTC)Reply

@Sgconlaw Sure, will do. Benwing2 (talk) 19:10, 13 August 2019 (UTC)Reply
SGconlaw (talk) 19:15, 13 August 2019 (UTC)Reply
@Sgconlaw Done. Benwing2 (talk) 19:16, 13 August 2019 (UTC)Reply
Thanks! — SGconlaw (talk) 07:57, 14 August 2019 (UTC)Reply

cydonius, cotoneus

While the adjectives were lacking, your removal of the noun “quince tree” is not correct. As with any tree, one uses cydōnius f, without arbor. Palladius 25, 20: Amant cydonii locum frigidum, humectum. An alternative form ends in -a. It also occurs elsewhere in the text, not to speak of New Latin authors. The meaning “(relational) quince” of the adjective cydōnius is of very frequent occurrence and easy to find; it isn’t exclusive to cotōneus. (qudenaea in the Edictum Diocletiani we can probably ignore.) Fay Freak (talk) 20:19, 13 August 2019 (UTC)

@Fay Freak Apologies, I was reading the Gaffiot and L+S entries too literally. Benwing2 (talk) 23:10, 13 August 2019 (UTC)Reply

Could -matīs variants be added for the plural ablative/dative forms of -ma neuter nouns?

I noticed that Wiktionary currently doesn't seem to record forms like poematīs (dative/ablative plural) which seem to have been commonly used instead of -matibus plurals like poematibus (Source 1, source 2). The only thing I could find was a note in the entry for poema saying "The plural is also declined like 2nd declension neuter", which isn't totally clear. That note suggests that forms like "poematorum" also existed in the genitive plural, which does in fact seem to be true (source 1, source 2), but I don't know whether those forms were common enough to include. In any case, it seems to be worthwhile to add a template specifically for this class of neuter nouns, because as far as I can tell the usage didn't differ much between different -ma neuter nouns.--Urszag (talk) 22:51, 13 August 2019 (UTC)Reply

Replacement of {{RQ:RJfrs AmtrPqr}}

Hi! If you have time, could you do the following bot replacement? Please replace:

#* {{RQ:RJfrs AmtrPqr|II|071}}
#*: Orion hit a rabbit once; [...]

with

#* {{RQ:Jeffries Amateur Poacher|chapter=II|passage=Orion hit a rabbit once; [...]}}

The number in the second parameter ("071" in the example above) was used by a previous version of the quotation template but now no longer serves any purpose. Thanks. — SGconlaw (talk) 16:29, 15 August 2019 (UTC)Reply
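For illustration, the replacement requested above could be sketched in Python roughly as follows. This is only a sketch, not the script that was actually run: the names `OLD_QUOTE` and `convert_quotes` are hypothetical, and it assumes the passage always sits on the single `#*:` line directly after the template line.

```python
import re

# Matches a quotation line using the old template, followed by its passage line.
# The second positional parameter (e.g. "071") is matched but discarded.
OLD_QUOTE = re.compile(
    r"^#\* \{\{RQ:RJfrs AmtrPqr\|(?P<chapter>[^|{}]*)\|[^|{}]*\}\}\n"
    r"#\*: (?P<passage>.*)$",
    re.MULTILINE,
)


def convert_quotes(wikitext: str) -> str:
    """Rewrite old-style quotations to the new template with named parameters."""

    def repl(match: re.Match) -> str:
        return "#* {{RQ:Jeffries Amateur Poacher|chapter=%s|passage=%s}}" % (
            match.group("chapter"),
            match.group("passage"),
        )

    return OLD_QUOTE.sub(repl, wikitext)


if __name__ == "__main__":
    old = "#* {{RQ:RJfrs AmtrPqr|II|071}}\n#*: Orion hit a rabbit once; [...]"
    print(convert_quotes(old))
```

A passage containing a bare `|` or `=` would need escaping before being placed in `passage=`; the sketch does not attempt that, so such pages would have to be handled by hand.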

@Sgconlaw Sure, will do. Benwing2 (talk) 01:04, 16 August 2019 (UTC)Reply
@Sgconlaw I wrote the code to do this, but are you sure that you want the page number (second param) removed? The new template says it won't link to the online edition unless this is supplied (which may not be true per the definition, but it does appear to serve a purpose). Benwing2 (talk) 02:43, 18 August 2019 (UTC)Reply
The old number in the second parameter isn’t a page number, from what I can tell. But let me check. — SGconlaw (talk)
Yup, I checked. The old number isn't a page number. It appears to be some arbitrary code used by the Gutenberg Project website to divide up the text. — SGconlaw (talk) 19:10, 18 August 2019 (UTC)Reply
@Sgconlaw OK, I'll go ahead and run the script then. Benwing2 (talk) 19:13, 18 August 2019 (UTC)Reply
@Sgconlaw Done. There's one instance left (avail). Benwing2 (talk) 19:27, 18 August 2019 (UTC)Reply

La-IPA module changes

(moved to Module talk:la-pronunc) Benwing2 (talk) 17:16, 18 August 2019 (UTC)Reply

Cūstōs or custōs; cūstōdiō or custōdiō

A question regarding the words "custōs" and "custōdiō": First of all, I am not a Latin expert; I only refer to trustworthy dictionaries. Here are the links for each word: https://logeion.uchicago.edu/custos http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.04.0060%3Aentry%3Dcustos https://logeion.uchicago.edu/custodio http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.04.0060%3Aentry%3Dcustodio These dictionaries (which are used by most Latin beginners and amateurs, and perhaps by experts and scholars too; since I am not an expert, I cannot know whether they use them) have an established consensus on the spelling, i.e. the length of the vowel "u". When I learnt Latin, I was taught "custōs". Maybe my teacher and textbook(s) were wrong. :/ Please let me know what you think. :) Cheers, Genper

@Genper The second link is to Lewis+Short, which is from the 1870's and whose length marks before two consonants aren't very accurate. Note in the first link, for example, that there's a header labeled FriezeDennisonVergil that writes cūstōs and cūstōdiō. That means that this particular dictionary believes the u is long. Various authorities differ on the vowel length, so we write it as ū̆. Benwing2 (talk) 00:01, 20 August 2019 (UTC)Reply
@Benwing2 Thanks! My bad!

First-declension adjectives should distinguish masc/fem and neuter accusative forms

I noticed that the declension tables generated for first-declension adjectives (e.g. those in -cola or -gena, like indigena or rūricola) currently show neuter accusative forms in -am and -ās, which are impossible: neuters always have the same form for nominative and accusative. The accusative singular should be -a instead. I'm not sure whether neuter plural nominative or accusative forms are even attested for adjectives of this type (it feels strange to me to use -ae for a neuter plural, although that ending exists for another reason in quae), but if a certain form is given for the neuter nominative plural, it must also be the form of the neuter accusative plural. Would you be able to change the behavior of the relevant template?--Urszag (talk) 00:52, 5 September 2019 (UTC)Reply

@Urszag I'll fix that. There do exist neuter versions of adjectives of this type, e.g. vīnum aliēnigena. Benwing2 (talk) 01:29, 5 September 2019 (UTC)Reply
@Urszag Fixed. Benwing2 (talk) 01:38, 5 September 2019 (UTC)Reply
Thanks for the quick response! My concern/uncertainty is specifically about the existence and form of such adjectives in the neuter nominative/accusative plural: we currently give the plural as "ae", as in "vina alienigenae", but I'm wondering whether there are any attestations of that. In comparison, for third-declension adjectives of one termination, the masculine/feminine plural forms in -es are not used for the neuter: the neuter plural either ends in -ia, or more rarely just -a, or is simply unattested. For this reason, I feel like first-declension adjectives of one termination might either have neuter plurals in -a, or simply be defective/unattested in the neuter plural nom/acc. I'm currently looking for examples of usage either way, and so far I've found one text that uses "vina indigena" (Opuscula academica, medica et philologica: collecta, aucta et emendata, Volume 2, by Carl Gottlob Kühn, p. 378) (it could just be a 2nd-declension form, since a 1/2 adjective indigenus, indigeni also exists), but no text that uses "vina indigenae".--Urszag (talk) 01:58, 5 September 2019 (UTC)

Community Insights Survey

RMaung (WMF) 14:31, 9 September 2019 (UTC)Reply

Community Insights Survey

RMaung (WMF) 14:34, 9 September 2019 (UTC)Reply

Pages with no language headers

diff DTLHS (talk) 16:28, 14 September 2019 (UTC)Reply

@DTLHS Oops. I moved this page to gigasporus, the correct spelling, and accidentally recreated it. Benwing2 (talk) 17:07, 14 September 2019 (UTC)Reply
Also Upsalas, Upsalis, Upsalarum. DTLHS (talk) 17:13, 14 September 2019 (UTC)Reply
@DTLHS Thanks. None of these pages should exist and all were accidentally created by my script. I changed it so it won't create pages unless I specifically tell it to. Benwing2 (talk) 17:22, 14 September 2019 (UTC)Reply
@DTLHS BTW if you see any more let me know. Benwing2 (talk) 17:23, 14 September 2019 (UTC)Reply

Reminder: Community Insights Survey

RMaung (WMF) 19:12, 20 September 2019 (UTC)Reply

Reminder: Community Insights Survey

RMaung (WMF) 19:14, 20 September 2019 (UTC)Reply

Non-cognates

Hi. Just wanted to say that you should make provisions to make sure your bot never does things like this: diff. In this particular case, the Latin word is not a cognate; it was given for comparison (of similar semantic formation), but your bot replaced LANG {{m|CODE|...}} with {{cog}}, where it should actually have been {{noncog}}. Now, if your bot has made similar replacements in the etymology sections of other entries, then you will have to check all of those! I hope you will take care henceforth when doing operations such as this. Thanks! —Lbdñk (talk) 17:43, 21 September 2019 (UTC)

@Lbdñk Hmmm. The bot keys off of a few phrases, one of which is "Compare". I looked beforehand at this, and I didn't see any cases that weren't cognates. Indeed, from examining the changes, it looks like probably at least 99% of such uses are actual cognates, but there may be very occasional cases where they aren't, as you discovered. The number of changes made by the bot involving the word "Compare" or "compare" is quite large (18438) so it's not really possible to manually check them all. I personally don't think this is a huge deal: there are already occasional manually-entered mistakes as well where people use {{cog}} for non-cognates, and {{cog}} doesn't actually create any cognate-related categories, and there was a previous bot run by User:MewBot that replaced entries of the form {{etyl|FOO|-}} {{m|FOO|...}} with {{cog|FOO|...}}, which may have also introduced occasional false positives. If you can think of a way to identify such cases I'll gladly implement it. Alternatively, one could potentially do something like create a template {{pcog}} ("probably cognate") that is used for such cases and works like {{cog}} except that it indicates bot-made changes and needs to be manually converted to {{cog}}. Failing that, if you really want I'll revert all the changes involving "Compare"/"compare". Benwing2 (talk) 21:30, 21 September 2019 (UTC)Reply
@Benwing2: Thank you for your fullheartedness to this issue. Firstly, I am just curious to know, why are you actually doing the replacements with {{cog}}— I mean— what is the potential behoof or advantage thereof? Now, here comes the complicated matter: while there are many words which have been given only for comparison of similar formation (and manually their templates can be corrected to {{noncog}}), there are also quite a few words that are not cognates in that they have no common descent from a proto-language etc, but their individual components actually are cognates (e.g.: thereof versus Swedish därav), and for which, your bot has done the said operation. Only for this latter case, I think we should include no templates at all, leaving them as {{m|LANG|term}}, though I would like to know your opinion first. That's all. —Lbdñk (talk) 16:18, 22 September 2019 (UTC)Reply
@Lbdñk The reason for doing the replacement is first that it automatically links the language properly to Wikipedia, and also that it indicates extra etymological information that (a) may one day be inserted into categories like "Cognates of LANG words" and (b) is useful for automatic etymology-relationship extractors (a few years ago, someone actually created a tool to analyze the relationships encoded by {{cog}}, {{inh}}, {{bor}}, etc. and display such relationships graphically, and wrote a post about this). As for thereof vs. Swedish därav, I would consider that a cognate, and I think many others here would too; for example, I've often noticed {{cog}} being used for words that share a common PIE root but aren't what I would call "exact cognates" in that they both descend from exactly the same word. In many cases, in fact, it's impossible to know whether pairs of derived words that appear to be exact cognates are in fact actually descended from the same word, or whether the derivations were independently constructed at some later stage. Benwing2 (talk) 21:44, 22 September 2019 (UTC)Reply

Hi! In WingerBot's edits, I noticed some links with manually formatted annotations after them, for instance a transliteration and gloss here: [[شهبانو]] (šahbânu, “queen (consort)”) → {{l|fa|شهبانو}} (šahbânu, “queen (consort)”). I wonder if these annotations can be correctly identified by the bot and moved into the template: {{l|fa|شهبانو|tr=šahbânu|t=queen (consort)}}. A double-quoted annotation (“queen (consort)”) is probably a gloss, but maybe unquoted words (šahbânu) aren't always a transliteration. It might be worth making a list and looking it over. If it's not simple enough for a bot, it might be a task for AWB or a Pywikibot-based editing script that lets someone look over the changes before submitting them. — Eru·tuon 21:44, 21 September 2019 (UTC)

@Erutuon Hey. The code actually already tries to implement this, by looking for parenthesized words following a raw link and using edit distance (Levenshtein distance) to compare the parenthesized word to the automatic transliteration. It's currently not smart enough to handle cases where the transliteration is followed by a gloss, as in the example above, but this can be fixed. The specific case of Farsi, however, is trickier because there's no auto-translit module for it, but code could be implemented to do this (at least for this bot's purposes) using the standard established in Wiktionary:Persian transliteration. Benwing2 (talk) 21:53, 21 September 2019 (UTC)Reply
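The edit-distance check described above might look something like the following sketch. The function names, the 0.34 threshold, and the idea of dividing by the longer string's length are all assumptions made for illustration; the actual bot code is not shown in this thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(
                min(
                    prev[j] + 1,               # delete ca
                    cur[j - 1] + 1,            # insert cb
                    prev[j - 1] + (ca != cb),  # substitute (free if equal)
                )
            )
        prev = cur
    return prev[-1]


def looks_like_translit(candidate: str, auto_translit: str, max_ratio: float = 0.34) -> bool:
    """Guess whether a parenthesized word is a transliteration.

    `auto_translit` stands for whatever an automatic transliteration module
    would produce for the linked term; `max_ratio` is an arbitrary cutoff.
    """
    if not candidate or not auto_translit:
        return False
    distance = levenshtein(candidate.lower(), auto_translit.lower())
    return distance / max(len(candidate), len(auto_translit)) <= max_ratio
```

With these assumptions, a manual transliteration that differs from the automatic one by a diacritic or a single letter is accepted, while an English gloss such as "queen" is rejected because almost every character differs.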
Oh, sorry for assuming. That sounds like a good method. I think you mentioned something about it before, probably in WT:BP. Maybe it could be used to detect bad transliterations too. Thanks for the work you've done on this.
This edit caught my eye, in which [[مردم‌سالار]] wasn't modified. Is that because ZWNJ isn't in the list of Arabic characters in Module:scripts/data? — Eru·tuon 17:09, 22 September 2019 (UTC)Reply
@Erutuon In my script I manually copied the list of characters for several languages since I haven't yet modified the script to use the data from Module:scripts/data, but yes, the root of the issue is the lack of ZWNJ in the list of characters for Arabic-language scripts. Benwing2 (talk) 20:13, 22 September 2019 (UTC)Reply
I suppose the ZWNJ could be added, but using the Unicode script property might be better. ZWNJ is an Inherited (Zinh)-script character, and several other scripts also frequently use Zinh and Zyyy (Common) characters, such as many combining characters and punctuation marks. So if Wiktionary scripts are mapped to Unicode scripts, we can use regexes that match strings that contain at least one character from the relevant script, as well as Zinh and Zyyy characters. (In Python, this requires the regex package.) For Arabic-script terms, the regex would be ^[\p{Zinh}\p{Zyyy}]*[\p{Arab}][\p{Arab}\p{Zinh}\p{Zyyy}]*$.
This regex would match مردم‌سالار, because the letters are Arab and ZWNJ is Zinh. A similar regex for Cyrillic would match Cyrillic terms that contain Zinh characters, which are more numerous than plain Cyrillic terms, probably because of combining diacritics. (Cyrl, Zinh has 311,636 terms and Cyrl 231,099 in this census of the total number of terms in link templates using various script combinations, generated from User:Erutuon/scripts in link templates.)
I imagine this would be an improvement in general, but I should try actually mapping Wiktionary scripts to Unicode ones and comparing the characters that they include and the number of terms matched by each. — Eru·tuon 21:51, 22 September 2019 (UTC)Reply
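As a rough, standard-library-only illustration of the pattern shape proposed above: Python's built-in `re` has no `\p{...}` support, so the sketch below substitutes hand-picked code point ranges for the real script property. Those ranges (`ARAB`, `NEUTRAL`) are assumptions and only approximate Arab, Zinh and Zyyy; a real implementation would use the third-party regex package or proper Unicode script data.

```python
import re

# Stand-ins for Unicode script classes (approximations, not real property data).
ARAB = "\u0600-\u06ff\u0750-\u077f"  # Arabic and Arabic Supplement blocks
# ZWNJ, ZWJ, combining diacritical marks, ASCII space/punctuation/digits
NEUTRAL = "\u200c\u200d\u0300-\u036f\u0020-\u0040"

# Same shape as the proposed regex: optional neutral characters, at least one
# Arabic character, then any mix of Arabic and neutral characters.
ARABIC_TERM = re.compile(f"[{NEUTRAL}]*[{ARAB}][{ARAB}{NEUTRAL}]*")


def is_arabic_script(term: str) -> bool:
    """True if the whole term fits the 'script plus neutral characters' shape."""
    return ARABIC_TERM.fullmatch(term) is not None


if __name__ == "__main__":
    # مردم‌سالار, written with escapes so the ZWNJ (U+200C) is visible
    term = "\u0645\u0631\u062f\u0645\u200c\u0633\u0627\u0644\u0627\u0631"
    print(is_arabic_script(term))
```

The point of the shape is that a term made up only of neutral characters does not count as Arabic, while a ZWNJ between Arabic letters no longer causes a mismatch.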
@Erutuon I'm not super familiar with Unicode charset categories but I agree that the Cyrillic terms containing Zinh characters are actually containing accents (e.g. acute) that are automatically stripped when generating the entry name. My script does handle that correctly; I manually wrote functions to do the accent stripping for the 12 out of 88 languages being handled that need such accent stripping, but I really should parse the entry_name entries and do this automatically. I actually think it's better to use the Wiktionary-specified characters and entry_name entries because they're more likely to reflect the actual conventions used in Wiktionary entries, although I imagine they can be mapped to Unicode character sets pretty closely. Benwing2 (talk) 22:04, 22 September 2019 (UTC)Reply
To clarify, this just involves the characters field in Module:scripts/data, not entry_name. Hmm, I hadn't thought of all the stress marks in Russian and other Slavic languages; they probably make up most of the "Cyrl, Zinh" category. But some Cyrillic-script languages do have Zinh characters in their entry names. Here are mainspace titles with a Cyrl character and a Zinh character. All of the Zinh characters are diacritics: macron (U+0304), breve (U+0306), dot above (U+0307), ring above (U+030A), caron (U+030C), ogonek (U+0328). I looked at a few of the entries, and some of the languages represented here actually strip the diacritic in the entry name (Nanai, а̄пон), but not others (Mansi, а̄ви). But based on this test most of the titles on the list are valid. That isn't very many, but the Unicode-based method wouldn't choke on them. — Eru·tuon 23:27, 22 September 2019 (UTC)
I'm not aware of any cases in which the Wiktionary characters fields would be more accurate at recognizing the script than regex based on the Unicode script property, but I'll work on a comparison of the two using entry names or link template contents, and maybe a comparison of the code points contained in each. — Eru·tuon 02:14, 23 September 2019 (UTC)Reply
@Erutuon Thanks for looking into this. As for your comments above, my script effectively uses both entry_name and characters: First it strips accents from links using entry_name, then it checks the resulting page name using characters to see if all characters in the page name match the characters in characters. Note that the countCharacters function in Module:scripts doesn't currently strip accents before counting characters (although perhaps it should); it does in fact make use only of characters. Benwing2 (talk) 04:39, 23 September 2019 (UTC)Reply
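The two-step check described above (strip accents to get the page name, then test the page name against the allowed characters) can be sketched as follows. The tables `STRIP_MARKS` and `ALLOWED` are hypothetical stand-ins loosely modelled on the entry_name and characters data; the real values live in the Wiktionary modules and are not reproduced here.

```python
import re
import unicodedata

# Hypothetical per-language data (assumptions for this sketch only).
STRIP_MARKS = {"ru": {"\u0301", "\u0300"}}             # combining acute and grave
ALLOWED = {"ru": re.compile("[\u0400-\u04ff' \\-]+")}  # Cyrillic block plus a little punctuation


def page_name(lang: str, link_text: str) -> str:
    """Remove the language's stress marks, leaving other diacritics intact."""
    decomposed = unicodedata.normalize("NFD", link_text)
    marks = STRIP_MARKS.get(lang, set())
    stripped = "".join(ch for ch in decomposed if ch not in marks)
    # Recompose so that letters such as й are not left as base + breve.
    return unicodedata.normalize("NFC", stripped)


def in_script(lang: str, link_text: str) -> bool:
    """True if every character of the resulting page name is allowed."""
    return ALLOWED[lang].fullmatch(page_name(lang, link_text)) is not None
```

Decomposing first means precomposed and combining forms of an accent are handled the same way, and recomposing afterwards keeps letters whose diacritic is part of the spelling.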
I've gotten sidetracked into moving Wiktionary script–related stuff out of Module:Unicode data, but I will get to this eventually. Since it's relevant given the method you're using, I can also check script recognition after entry-name replacements are processed. Though entry-name replacements aren't designed to remove all characters that are ambiguous script-wise, it would be interesting to see how they affect the results. — Eru·tuon 01:50, 25 September 2019 (UTC)Reply
@Erutuon Cool, sounds good, thanks for looking into this. Benwing2 (talk) 01:55, 25 September 2019 (UTC)Reply

Okay, so I generated three sets each containing the list of scripts found in link parameters in link templates. The sets were for Unicode script, Wiktionary script (language-agnostic, using a function equivalent to the char_to_script in Module:Unicode data), and Wiktionary script after applying entry-name replacements. The templates were all the instances of {{affix}}, {{back-formation}}, {{borrowed}}, {{cognate}}, {{confix}}, {{derived}}, {{desc}}, {{inherited}}, {{l}}, {{l-self}}, {{langname-mention}}, {{m}}, {{m-self}}, {{noncognate}}, {{prefix}}, {{rebracketing}}, {{semantic_loan}}, {{suffix}}, {{t}}, {{t+}}, {{t+check}}, {{t-check}} from the latest dump, ignoring those that had - in the link parameter. When at least two of the script sets differed, I printed the term, language code, and the three script sets.

Superficially, about 25% of the link parameters processed had differences in script set (2,019,024 out of 7,845,296). This is quite a lot. However, it looks like many of the results are not relevant to script recognition, because they involve scripts that don't represent writing systems of actual languages. Some results involve Wiktionary scripts like None and the Unicode scripts Zinh, Zyyy, or Zzzz, or Wiktionary scripts like Zmth and Zsym, which aren't Unicode scripts and are only used by Translingual. For instance, Russian зо́нтик generated the Unicode script set Cyrl, Zinh, the Wiktionary script set Cyrl, None, and the entry-name script set Cyrl; and Translingual generated the script sets Zyyy, Zmth, and Zmth. Both of these don't differ in the scripts that matter for recognizing that text is in a particular language, so they can be treated as equivalent. — Eru·tuon 16:49, 25 September 2019 (UTC)Reply

Okay, here are the results after some transformations: replacing some Wiktionary script codes with similar Unicode ones, and removing the not-really-a-script script codes. After I removed duplicates from the initial list of script set differences, there were 828,890 results, and after I applied the transformations only 1,061 (0.128%) still had differences.

Some of these involve punctuation characters that Wiktionary assigns to a script and Unicode does not ( and ، are Hani and Arab in Wiktionary and Zyyy in Unicode). Others involve characters that for some reason aren't in a Wiktionary script pattern, like ʒ and ə and . There are quite a few differences in whether a character is assigned to Hani, Hira, or Kana, but those won't affect script recognition when the languages in question use Jpan, which contains all three. — Eru·tuon 22:34, 25 September 2019 (UTC)Reply
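The filtering step described in this comparison, dropping script codes that do not identify a writing system before comparing the sets, can be sketched like this. The function names are hypothetical, and the further step of mapping Wiktionary script codes to similar Unicode ones is not modelled because the mapping is not given in the thread.

```python
# Script codes named above as not identifying the writing system of a language.
PSEUDO_SCRIPTS = frozenset({"None", "Zinh", "Zyyy", "Zzzz", "Zmth", "Zsym"})


def significant_scripts(scripts) -> frozenset:
    """Drop pseudo-scripts so that only 'real' script codes remain."""
    return frozenset(scripts) - PSEUDO_SCRIPTS


def script_sets_differ(*script_sets) -> bool:
    """True if the sets still disagree once pseudo-scripts are ignored."""
    return len({significant_scripts(s) for s in script_sets}) > 1


if __name__ == "__main__":
    # The Russian example above: Unicode, Wiktionary, and entry-name script sets.
    print(script_sets_differ({"Cyrl", "Zinh"}, {"Cyrl", "None"}, {"Cyrl"}))
    # Share of de-duplicated results that still differed after the transformations.
    print(f"{1061 / 828890:.3%}")
```

Under this filter, the Russian and Translingual examples given above no longer count as differences, while a Hani versus Hira disagreement still does.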

WingerBot mistakes

Hi, WingerBot is making mistakes sometimes, e.g. in the Romanian line here. —Mahāgaja · talk 12:11, 25 September 2019 (UTC)Reply

@Mahagaja Thanks for alerting me, I'll go through the output logs and fix the cases like this. Benwing2 (talk) 14:28, 25 September 2019 (UTC)
@Mahagaja There were 35 cases of this sort I could find; they should all be fixed now. It looks like you fixed some of them already; thanks for doing this! Benwing2 (talk) 14:53, 25 September 2019 (UTC)Reply
I just fixed what I found at CAT:E. —Mahāgaja · talk 15:52, 25 September 2019 (UTC)Reply

Conversion of |lang= to first parameter

Hi. I noticed you have a script to convert |lang= to the first parameter, like what you did for {{confix}}. Can this script be applied to other templates such as {{quote-book}}, {{quote-journal}}, {{synonym of}}, {{form of}} and {{IPA}}, {{audio}}, {{rhymes}}, {{hyphenation}}? This is because some users are doing the conversion manually but have made some mistakes, like the removal of |lang= from {{wikipedia}}, which resulted in an incorrect page. There might also be some cases of an incorrect language code being used for these templates, so this is worth checking. KevinUp (talk) 18:44, 26 September 2019 (UTC)

@KevinUp Yes, I can do that. Is the above list the exact list of templates you want converted? There are definitely others that could be converted. Also when you say "cases of incorrect language code", what do you mean exactly and how do you want them checked/corrected? Do you mean disagreement between the language of the header and the language code? Benwing2 (talk) 00:21, 27 September 2019 (UTC)Reply
Note also that it will take a while to do some of these conversions in full. The current count of pages referencing these templates (some of these references could be through other templates):
Template:quote-book 44,574
Template:quote-journal 42,832
Template:synonym of 14,775
Template:form of 4,023
Template:IPA 569,580
Template:audio 238,149
Template:rhymes 88,416
Template:hyphenation 184,244

Wiktionary limits me to save at most one page a second (less frequently if the page takes more than a second to recompute, which happens when you save and can take several seconds for long pages). There are 86,400 seconds in a day so converting Template:IPA would take about a week and Template:audio about 2.75 days. So we might want to start on the smaller ones first (although I'll write the script so it converts all of a list of templates on each page it does, for efficiency). Benwing2 (talk) 00:54, 27 September 2019 (UTC)Reply
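The time estimates above follow directly from the one-save-per-second limit; a small sketch of the arithmetic, using the page counts listed in this comment (the function name is just for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def days_to_convert(pages: int, saves_per_second: float = 1.0) -> float:
    """Lower bound on run time in days, ignoring slow recomputation of long pages."""
    return pages / (saves_per_second * SECONDS_PER_DAY)


if __name__ == "__main__":
    for name, pages in [("Template:IPA", 569_580), ("Template:audio", 238_149)]:
        print(f"{name}: about {days_to_convert(pages):.2f} days")
```

These are lower bounds, since pages that take longer than a second to recompute slow the run further.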

The above list contains some of the templates that I've encountered more frequently. Yes, there are definitely other templates that also need to be converted. By "cases of incorrect language code", I meant some rare cases of "en" instead of "enm" in Middle English, for example. Yes, the language code should agree with the language of the header.
Since it will take time to convert all of these, I suggest starting with the smaller ones first, such as {{form of}}. It might be better to deal with this template by template rather than all at once, so that the parameter |lang= can be deprecated for that particular template and display a warning if someone tries to use it.
As for {{IPA}}, {{audio}}, {{rhymes}}, {{hyphenation}} which are used for pronunciation, I think these can be done together, starting with languages with the least amount of usages until it reaches languages such as English which has many entries with these templates.
Thank you so much for running the bot. Of course, if there are other priorities those should be handled first. KevinUp (talk) 14:54, 27 September 2019 (UTC)Reply
@KevinUp I ran the bot on {{form of}}, {{synonym of}} (and short form {{syn of}}), and all the quote templates except for {{quote-text}}, which is running now (this template was generated automatically from manually formatted quotes and needs to be manually converted to one of the other templates). I'll do {{rhymes}} after that. What the script actually does is look for pages containing the template in question, but once found, it converts all templates on the page that use |lang= (or at least, all the ones that I could find and list, using Wiktionary:Templates with current language parameter). The reason for this is that it speeds up later runs on those other templates (the script can read about 7 pages per second but only save 1 per second; these are built-in server restrictions). The script doesn't correct incorrect language parameters; I'll do that afterwards by looking through the latest dump. Your idea of going language-by-language is a good one but it requires looking through the latest dump and making a note of all pages that use the template with a given language, which will miss any changes made after the latest dump was created (which happens twice a month), so instead I'm just iterating through all references to each template. Benwing2 (talk) 18:01, 29 September 2019 (UTC)Reply
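A minimal sketch of the per-page conversion described above, moving `|lang=` into the first positional slot for every listed template found on a page. This is not the actual bot code: the names and the short `TEMPLATES` set are hypothetical, and the regex only handles "flat" templates with no nested templates inside them (a real script would use a proper wikitext parser).

```python
import re

# Hypothetical subset of the templates to convert.
TEMPLATES = {"form of", "synonym of", "syn of", "rhymes"}

# A template call with no nested braces: {{name|param|param...}}
FLAT_TEMPLATE = re.compile(r"\{\{([^{}|]+)((?:\|[^{}|]*)*)\}\}")


def move_lang_to_first(wikitext: str, templates=TEMPLATES) -> str:
    """Rewrite {{name|...|lang=xx}} as {{name|xx|...}} for the listed templates."""

    def repl(match: re.Match) -> str:
        name = match.group(1)
        if name.strip() not in templates:
            return match.group(0)  # e.g. {{wikipedia|lang=fr}} is left alone
        params = match.group(2).split("|")[1:]
        lang = None
        rest = []
        for param in params:
            key, sep, value = param.partition("=")
            if sep and key.strip() == "lang" and lang is None:
                lang = value.strip()
            else:
                rest.append(param)
        if lang is None:
            return match.group(0)  # already converted, or no lang given
        return "{{" + "|".join([name, lang] + rest) + "}}"

    return FLAT_TEMPLATE.sub(repl, wikitext)
```

Templates outside the list are deliberately left untouched, which avoids the kind of mistake mentioned earlier in this thread with {{wikipedia}}, where |lang= has a different meaning.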
That's a very smart approach. I was wondering why the script appears to be doing various things at the same time. So will the lang= parameter be deprecated for {{form of}}, {{synonym of}} and other templates that have been fully converted? KevinUp (talk) 18:45, 29 September 2019 (UTC)Reply
@KevinUp Yes. I'm thinking of adding a call to {{error}} to the page when |lang= is used so that these uses trigger an error, but also display the actual output (to make page histories legible). An alternative is what I've done e.g. on Template:de-inflected form of, where you don't get an error but you do get the text in green with the word "deprecated" before it. Benwing2 (talk) 18:54, 29 September 2019 (UTC)Reply
@KevinUp For now I took the latter approach in {{form of}}, which is less intrusive. Benwing2 (talk) 19:10, 29 September 2019 (UTC)Reply
I would prefer the call to {{error}} which displays the full output along with some form of warning such as "The parameter lang= has been deprecated. Please use the language code without lang= as the first parameter instead" compared to the "deprecated" text solution because the latter solution would cause some users to think that the template itself has been deprecated. Perhaps you can discuss this at the Beer Parlour to see what other editors think of it. KevinUp (talk) 19:22, 29 September 2019 (UTC)Reply
@KevinUp OK, I changed the deprecation text to default to "deprecated template usage" but controllable by the individual template. {{form of}} and {{synonym of}} use the text "deprecated use of |lang= parameter"; see User:Benwing2/test-deprecated-lang-param for examples. This is implemented by the {{deprecated code}} template; we can change that at some point to call {{error}} if so desired. Benwing2 (talk) 19:37, 29 September 2019 (UTC)Reply
Great solution. If someone removes lang= without moving it to the first parameter it would display "Please enter a language code in the first parameter" so this solution works. Thanks! KevinUp (talk) 19:53, 29 September 2019 (UTC)Reply

Hi. I just remembered about the following two templates:

Template:given name 28,275
Template:surname 52,574

I hope it's not too late to add these two to the script. KevinUp (talk) 02:52, 30 September 2019 (UTC)Reply

@KevinUp I'll add them. Keep in mind there are tons of other templates to convert, see Wiktionary:Templates with current language parameter. Benwing2 (talk) 02:54, 30 September 2019 (UTC)Reply
@KevinUp I stopped the {{rhymes}} run about 20% of the way through and instead I'm doing all (or most) of the form-of templates, from least frequent to most frequent. There are currently 148 of them (not counting aliases). Of these, 99 have < 1,000 uses; 22 have 1,000-5,000 uses; 18 have 5,000-15,000 uses; 6 have 15,000-50,000 uses; and 3 have more than that ({{alternative form of}} has about 80,000 uses, {{plural of}} has about 425,000 uses, and {{inflection of}} has nearly 2,000,000 uses). I'm going to go at least up to the 15,000-use templates, maybe also up to the 50,000-use templates, before switching back to the above high-use templates ({{rhymes}} etc.). Benwing2 (talk) 07:46, 30 September 2019 (UTC)Reply
Yes, go ahead. Templates that are used less frequently can be prioritized first. Your strategy is working, because {{IPA}} now has 448,600 uses (reduction of 100,000). KevinUp (talk) 11:38, 30 September 2019 (UTC)Reply

Note: You have bricked {{rfquote-sense}}; see زِمَام (zimām), which now only shows {{{1}}}. Maybe that’s temporary, I dunno; I am just telling you in case. Fay Freak (talk) 13:58, 1 October 2019 (UTC)Reply

@Fay Freak Thanks for letting me know. I fixed it and a few similar templates I accidentally broke. Benwing2 (talk) 14:16, 1 October 2019 (UTC)Reply
One you haven’t fixed: {{historical given name}}, as on أَرْبَد (ʔarbad). Fay Freak (talk) 21:18, 10 October 2019 (UTC)Reply
@Fay Freak Apologies. It's fixed now. Benwing2 (talk) 03:13, 11 October 2019 (UTC)Reply

Template:RQ:Milton Paradise Lost

When you have time, could you kindly do a bot run and do the following conversions:

#* {{RQ:Milton Lost|VII}}, line 94
#*: and the work begun, how soon '''absolv’d''',

#* {{RQ:Milton Paradise Lost|book=VII|line=94|passage=and the work begun, how soon '''absolv’d''',}}

In general, {{RQ:Milton Lost}} should be replaced by {{RQ:Milton Paradise Lost}}.

If a range of lines is stated, then convert that to, for example, |lines=709–714 (with an en dash in between). Thanks! — SGconlaw (talk) 18:06, 28 September 2019 (UTC)Reply
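The requested conversion can be sketched in Python as follows. This is only a minimal illustration, assuming the old quotations always follow the exact two-line layout shown above; the pattern and the function name `convert_milton` are hypothetical, and real entries use many other layouts that it would not catch.

```python
import re

# Hypothetical pattern for the two-line layout shown in the request.
# It accepts "line N" or "lines N-M" (hyphen or en dash) after the template.
OLD_QUOTE = re.compile(
    r"^(?P<prefix>#+\*) \{\{RQ:Milton Lost\|(?P<book>[IVXLC]+)\}\}, "
    r"lines? (?P<first>\d+)(?:\s*[-–]\s*(?P<last>\d+))?\n"
    r"(?P=prefix): (?P<passage>.+)$",
    re.MULTILINE,
)


def convert_milton(wikitext: str) -> str:
    """Fold the quotation line into {{RQ:Milton Paradise Lost}} parameters."""

    def repl(m: re.Match) -> str:
        if m.group("last"):
            # A range becomes |lines= with an en dash between the numbers.
            line_param = f"lines={m.group('first')}–{m.group('last')}"
        else:
            line_param = f"line={m.group('first')}"
        return (
            f"{m.group('prefix')} {{{{RQ:Milton Paradise Lost"
            f"|book={m.group('book')}|{line_param}"
            f"|passage={m.group('passage')}}}}}"
        )

    return OLD_QUOTE.sub(repl, wikitext)
```

A passage that itself contains a pipe character would break the resulting template call, so lines like that would still need manual review.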

@Sgconlaw These sorts of things are hard to do automatically. In this case there were only 58 pages involved and lots of different formats being used so I did the changes manually using some scripts I have to speed up such changes. Note that Citations:dire and Citations:Sophi have bad template invocations (missing book= in the former, missing passage= in the latter); perhaps you could fix those? Benwing2 (talk) 17:37, 29 September 2019 (UTC)Reply
OK, thanks! — SGconlaw (talk) 17:42, 29 September 2019 (UTC)Reply

Reminder: Community Insights Survey

RMaung (WMF) 17:02, 4 October 2019 (UTC)Reply

Reminder: Community Insights Survey

RMaung (WMF) 17:04, 4 October 2019 (UTC)Reply

Whitespace

Apparently there are some templates that can't handle language codes with leading whitespace as a first parameter (Exhibit A). I don't know whether we should change the templates/modules, WingerBot, or both to trim leading/trailing whitespace, but we can't have working templates converted to broken ones every time you do a language-code parameter run. This isn't the first time this has happened, but the last time I thought it was an isolated fluke. Chuck Entz (talk) 00:31, 6 October 2019 (UTC)Reply

@Chuck Entz Sorry, I didn't realize this was an issue. I'll fix it. Benwing2 (talk) 00:33, 6 October 2019 (UTC)Reply
@Chuck Entz The problem occurs only with templates that don't use Module:parameters to process parameters. For weird reasons, MediaWiki templates trim whitespace automatically from named but not numbered params. Module:parameters goes ahead and trims the remaining whitespace. The culprit here was Module:labels/templates, which processes the |from= parameter of {{alternative form of}}, {{eye dialect of}}, and a few similar templates (the full list is in Category:Form-of templates). I fixed this module. Benwing2 (talk) 01:27, 6 October 2019 (UTC)Reply
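On the bot side, the same problem can be avoided by trimming the value when it is moved from the named |lang= parameter to the first numbered parameter. The following Python sketch illustrates the idea; it is not WingerBot's actual code, the function name `move_lang_to_first` is hypothetical, and the parsing is deliberately naive (it assumes no nested templates or links containing a pipe).

```python
def move_lang_to_first(template: str) -> str:
    """Rewrite {{name|...|lang=xx}} as {{name|xx|...}}, trimming the code.

    MediaWiki trims whitespace around named parameters but not numbered
    ones, so the value must be stripped explicitly when it changes position.
    """
    if not (template.startswith("{{") and template.endswith("}}")):
        raise ValueError("not a template call")
    name, *params = template[2:-2].split("|")
    lang = None
    rest = []
    for param in params:
        key, sep, value = param.partition("=")
        if sep and key.strip() == "lang":
            lang = value.strip()  # drop leading/trailing whitespace
        else:
            rest.append(param)
    if lang is None:
        return template  # nothing to move
    return "{{" + "|".join([name, lang, *rest]) + "}}"
```

For example, `{{alternative form of|colour|lang= en }}` would come out as `{{alternative form of|en|colour}}` instead of carrying the stray spaces into the first parameter.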

Languages not using their own language code

Hi. Replacing LANG {{CODE|...}} with {{desc}} was too bold an operation, and it yielded results like this. East Frisian is not a recognised language here, so it uses the code for Saterland Frisian (which is its only surviving dialect) in our Proto-Germanic entries. The name 'East Frisian' should therefore have been kept as it was in the descendants list. Now the grave problem is that there are many gem-pro entries that show East Frisian words with Saterland Frisian's language code, and I think you have no choice but to revert all the changes your bot has made. Also, you should have noted that User:Mewbot had done a similar operation before your bot did, but was always careful in this regard; see: diff. Furthermore, there are a few other languages without their own language code (though I cannot remember which), so you should have been wary about this. —Lbdñk (talk) 15:11, 6 October 2019 (UTC)Reply

@Lbdñk I **was** careful. In general, I don't do bot replacements without careful thought, and I keep a record of everything that was replaced, so that mistakes can be fixed. In this case, there was a long discussion in Wiktionary:Beer parlour#September 2019 (Automatically replacing "Foolang {{m|bar|...}}" with "{{cog|bar|...}}") about how to fix up Descendants lists. We touched on the exact issue you mentioned, what to do about various non-canonical language names showing up in Descendants lists ("East Frisian" is given as an other, i.e. non-canonical, name for Saterland Frisian in Module:languages/data3/s). I suggested replacing East Frisian with Saterland Frisian's language code, and User:KevinUp agreed, hence the replacement. As you can see from that discussion, there were 35 such cases replaced. I suggest you take a look at the list of non-canonical replacements made per that discussion and comment about any of them that you don't agree with, and I can fix them up. Benwing2 (talk) 15:28, 6 October 2019 (UTC)Reply
@Benwing2: So does it look good to have the same language name repeated twice in the descendants list (as has now happened in the gem-pro entries)? East Frisian and Saterland Frisian are not the same language; the latter is a dialect of East Frisian: therefore these two should have separate codes. —Lbdñk (talk) 15:57, 6 October 2019 (UTC)Reply
I think I agree here. Saterland Frisian is indeed the only surviving dialect of East Frisian, but is it the only attested dialect? What code do we use for terms in East Frisian dialects that are now extinct? We can't call them Saterland Frisian. —Rua (mew) 16:00, 6 October 2019 (UTC)Reply
@Rua: So I deem it fair that we have a separate lang. code for East Frisian, or else the reconstructed page would be looking awkward. Do you want to start a proposal for this?—Lbdñk (talk) 16:12, 6 October 2019 (UTC)Reply
@Lbdñk Hi. I think you missed my point, which was not to disagree with fixing up the entries where East Frisian was replaced with Saterland Frisian, but to dispute your assertion that my bot replacements aren't careful. As I said at the end, feel free to review the entire list of non-canonical replacements and propose fixups, and I will implement them. See User:Benwing2/east-frisian-to-stq-replacements for the replacements made involving East Frisian. As for Reconstruction:Proto-Germanic/mangijaną, I don't understand what language "manga" is in; is this Old East Frisian (see East Frisian language)? Also, is Saterland Frisian properly a separate language or (as Wikipedia asserts) a dialect (albeit the only surviving one) of the East Frisian language? In the latter case, we just need to create a code (e.g. gem-efr) for East Frisian, and make Saterland Frisian an etymology-only language with code stq. Benwing2 (talk) 16:29, 6 October 2019 (UTC)Reply
Actually, switching a regular language to an etymology-only language is tricky to do; User:Rua may have a better idea how to handle this. Benwing2 (talk) 16:31, 6 October 2019 (UTC)Reply
@Benwing2: Well, here we recognise Saterland Frisian as a full-fledged language, and not East Frisian, so we need to have the latter as an etymology-only language; the status of Saterland Frisian is fine. —Lbdñk (talk) 16:52, 6 October 2019 (UTC)Reply
@Lbdñk I don't think that will work; etymology-only languages are supposed to be children of regular languages, not parents. I think we'd have to create a family code, see WT:List of families and WT:Families. Benwing2 (talk) 16:56, 6 October 2019 (UTC)Reply
@Lbdñk Also, per my comment above, can you explain what language "manga" is in? Benwing2 (talk) 16:57, 6 October 2019 (UTC)Reply
@Benwing2: Yes, "manga" must be in Old East Frisian— and as such Saterland Frisian is a modern East Frisian dialect. But it is to be borne in mind that we do have Saterland Frisian lemmas here. Now, you were saying that etymology-only languages are meant to be children of regular languages, but hey, to give an example, Middle Scots, the immediate parent of Modern Scots, is itself only an etymology-only language. —Lbdñk (talk) 17:54, 6 October 2019 (UTC)Reply
@Lbdñk Why don't we then just create a language code for Old East Frisian, with Saterland Frisian as a descendant? As for Middle Scots vs. Modern Scots, I'm not super-familiar with how the language-tree system works but I suspect this will cause lots of problems. User:Rua, can you comment? Benwing2 (talk) 18:02, 6 October 2019 (UTC)Reply
Is Old East Frisian distinct at all from Old Frisian in general? According to Wikipedia, Old Frisian ends in the renaissance. —Rua (mew) 19:00, 6 October 2019 (UTC)Reply
@Benwing2: My apologies for recommending the replacement of East Frisian with stq (Saterland Frisian). Previously, I had read the discussions at
  1. Wiktionary talk:About Saterland Frisian#RFM discussion: December 2012
  2. Wiktionary:Beer parlour/2013/January#Saterlandic and East Frisian again
  3. Wiktionary:Grease pit/2015/March#Language code frs is not valid
I saw the statement "used mistakenly for Saterland Frisian (also called East Frisian), in which case it should be changed to stq" in the 2015 discussion, but it appears that things have changed since 2016, when "Saterland Frisian" was added below "East Frisian" in edits such as Special:Diff/37566808/37566819.
Anyway, someone will have to manually check through all of the 35 cases. I think it is a good idea to create a language code for "Old East Frisian" to disambiguate it from "East Frisian". KevinUp (talk) 19:09, 6 October 2019 (UTC)Reply
@Lbdñk Currently, it is not possible to place etymology-only languages as an ancestor of a full language. The correct approach is to place "Old East Frisian" (when it is created) as a descendant of some other language. For example, Middle Scots is a descendant of "Early Scots" under Category:Middle English language with no further descendants, even though it is actually an ancestor of Category:Scots language. KevinUp (talk) 19:09, 6 October 2019 (UTC)Reply
@KevinUp Thanks for your response. No worries about your recommendation, mistakes occasionally happen but they can easily be fixed up so it's not a big deal. Benwing2 (talk) 19:12, 6 October 2019 (UTC)Reply
The problem is that there are two "East Frisians": there's a dialect of nds-de, which ISO gave the code frs (though they were far from clear about it), and the genuine Frisian lect, which ISO didn't give a code to. The solution that was arrived at way back when was to use the Saterland Frisian code, stq, for the whole language. This was very much a case of the tail wagging the dog, but it apparently sidestepped a long debate or even a vote about technicalities. I vaguely remember running into a huge mess resulting from use of the wrong code all over the place, and how confusing it all was. I would recommend against diving into that can of worms again. Chuck Entz (talk) 05:03, 8 October 2019 (UTC)Reply
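As a hedged illustration of the more conservative option raised in this thread (leaving a non-canonical name such as "East Frisian" untouched and flagging it for review), here is a small Python sketch. It is not the code either bot used. The `CANONICAL` table is a tiny hypothetical excerpt, the descendant-line layout `* Name: {{l|code|term}}` is an assumption, and "foo" in the usage note is a placeholder term.

```python
import re

# Hypothetical excerpt of language data; the real data lives in Lua modules.
CANONICAL = {"stq": "Saterland Frisian", "nl": "Dutch"}

# Assumed layout of a descendants-list line: "* Name: {{l|code|term}}".
DESC_LINE = re.compile(r"^(\*+) ([^:{]+): \{\{l\|([a-z-]+)\|([^{}]+)\}\}$")


def templatize(line: str):
    """Return (new_line, warning); rewrite only when the name is canonical."""
    m = DESC_LINE.match(line)
    if not m:
        return line, None
    stars, name, code, args = m.groups()
    if CANONICAL.get(code) != name:
        # Keep the hand-written name and report it for manual review.
        return line, f"name {name!r} is not canonical for code {code!r}"
    return f"{stars} {{{{desc|{code}|{args}}}}}", None
```

With this check, `* Saterland Frisian: {{l|stq|foo}}` becomes `* {{desc|stq|foo}}`, while `* East Frisian: {{l|stq|manga}}` is left alone and produces a warning.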

{{lbor}}

Hi! Could you please do a bot operation to change many manually written statements to {{lbor}}? I am the one who wrote long statements like "[[learned borrowing|Learned borrowing]] from ..." and it is just now that I discovered that we actually have a nice template for learned borrowings! However, there is also another format you can find: simply "Learned borrowing from ....", i. e., without the hyperlink, that also needs to be changed. Thanks. —Lbdñk (talk) 19:38, 11 October 2019 (UTC)Reply

@Lbdñk Sure, I'll try to do that tonight or tomorrow. Benwing2 (talk) 05:01, 12 October 2019 (UTC)Reply
@Lbdñk I did this semi-manually; there were only 127 pages worth of learned and semi-learned borrowings not formatted using {{lbor}}. Benwing2 (talk) 19:35, 13 October 2019 (UTC)Reply
Great job! Thank you @Benwing2. —Lbdñk (talk) 15:11, 14 October 2019 (UTC)Reply
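For reference, the two manual formats mentioned in the request could be matched with a pattern along these lines. This is a rough Python sketch, not the script actually used: it assumes the phrase is immediately followed by a {{bor}} or {{der}} call whose parameters can be reused unchanged in {{lbor}}, and the name `templatize_lbor` is hypothetical.

```python
import re

# Matches both the linked and the plain-text phrase, followed by the
# opening of a {{bor|...}} or {{der|...}} template call.
LEARNED = re.compile(
    r"(?:\[\[learned borrowing\|Learned borrowing\]\]|Learned borrowing)"
    r" from \{\{(?:bor|der)\|"
)


def templatize_lbor(text: str) -> str:
    """Replace the hand-written phrase plus template opener with {{lbor|."""
    return LEARNED.sub("{{lbor|", text)
```

Anything that does not fit this shape (semi-learned borrowings, lowercase mid-sentence uses, extra words in between) is left untouched and would need to be handled by hand, which matches the semi-manual approach described above.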

Russian Reference Templates (Ushakov)

Hi! Sorry if this is not a good place for this—I am new to Wiktionary editing. I saw you were talking somewhere about Ushakov's Russian dictionary and was wondering why it's not included in Category:Russian reference templates. I was hoping to use as a reference for a page I made (since it is the source of the ru Wiktionary version of the page). Thanks! AmosJackson (talk) 19:08, 13 October 2019 (UTC)Reply

@AmosJackson Hi. It's fine to ping me directly. Can you give me the bibliographic details on the dictionary? Then I'll put them in a template. Benwing2 (talk) 19:12, 13 October 2019 (UTC)Reply
I am using Template:R:ru:18vek for a reference of the fields I need. Editor: Ушаков Д. Н., Title: Толковый словарь русского языка (Explanatory Dictionary of Russian Language), URL: http://feb-web.ru/feb/ushakov/ush-abc/, Location: Moscow, Publisher: (vol 1) Государственный институт "Советская энциклопедия"; ОГИЗ, (vols 2-4) Государственное издательство иностранных и национальных словарей, Year: (vol 1) 1935, (vol 2) 1938, (vol 3) 1939, (vol 4) 1940. All Russian lang. This info was sourced from the URL above. AmosJackson (talk) 20:03, 13 October 2019 (UTC)Reply
@AmosJackson I created it as Template:R:ru:Ushakov. Benwing2 (talk) 20:21, 13 October 2019 (UTC)Reply
Thanks! I only just realised that there's no way to make a URL template to point directly to a definition, given a certain word. Do you think then it makes more sense for the link to go to the dictionary's main page or the search page? AmosJackson (talk) 21:03, 13 October 2019 (UTC)Reply
@AmosJackson It's definitely possible to make the URL point to the definition; I just need to know what the format of the definition URL is. Benwing2 (talk) 21:52, 13 October 2019 (UTC)Reply
I think, unfortunately, the url format uses a numeric id for each word. I assume this would frustrate any attempts to make a template based on the word itself. AmosJackson (talk) 21:55, 13 October 2019 (UTC)Reply
@AmosJackson I see, yeah there's no obvious way of mapping terms to numbers. It's the same for the link at https://dic.academic.ru/dic.nsf/ushakov/. In that case I think we should point to the search page, ideally with the word already filled in. Benwing2 (talk) 22:01, 13 October 2019 (UTC)Reply
Sadly, I ran into the same issue with all the sites I found. I'll edit the template to point to the search page but I was even unable to find a site that would let me permalink to search results for a given dictionary. AmosJackson (talk) 08:09, 15 October 2019 (UTC)Reply

Move of Template:param to Template:paramref

Are you going to update the template's documentation, which still refers to "param" as the name of the template? There are other such mentions on other template documentation pages, as well. Will you be updating those? - dcljr (talk) 02:05, 14 October 2019 (UTC)Reply

@Dcljr Yes. I'm waiting for the moment in order to leave time for people to comment on the renaming, but as long as I don't end up needing to revert the rename, I'll fix all the references and orphan the old name in a couple of days. Benwing2 (talk) 02:10, 14 October 2019 (UTC)Reply
I see. It was being discussed at Wiktionary:Beer parlour/2019/October#Deprecate {{docparam}}. A courtesy link from Template talk:docparam (and Template talk:param, before the move was done) would have been appreciated. (BTW, it seems to be impossible to link to the section heading you used because of the double-braces markup in the section title, so I inserted a simpler alternative to link to. See also the wikitext source of this comment.) - dcljr (talk) 02:44, 14 October 2019 (UTC)Reply
@Dcljr Apologies, I'll make sure to ping you next time if I change something you appear to have worked on. Benwing2 (talk) 02:52, 14 October 2019 (UTC)Reply
@Dcljr Benwing2 (talk) 02:52, 14 October 2019 (UTC)Reply
You don't need to remember to ping me, personally (or, indeed, any particular individual) in such cases: notifying the corresponding talk page when you nominate something for deprecation, deletion, or similar profound change, should be sufficient. In fact, that seems to be standard practice when nominating something at, say, Wiktionary:Requests for deletion/Others, based on some spot checks I made of recent nominations there. (BTW, you do not have to keep pinging me on this page. I watch talk pages I contribute to, at least for a few days and/or a few subsequent edits.) - dcljr (talk) 06:04, 14 October 2019 (UTC)Reply
OK. I prefer to be pinged personally, but each person is different. Benwing2 (talk) 06:08, 14 October 2019 (UTC)Reply

{{tcx}} ~ {{tlb}}

Hi. A question: do {{tcx}} and {{tlb}} not render exactly the same service (i. e., to label the term as a whole)? If so, then you might wish to "unite" them as a single template, for it's simply redundant to have two of these for the same thing. There are also some entries erroneously using {{lb}} on the headword line, so you may do a bot operation for this as well. Thanks, —Lbdñk (talk) 18:08, 22 October 2019 (UTC)Reply

@Lbdñk In fact I already brought this issue up in the Beer parlour. I want to deprecate {{tcx}}/{{term-context}} in favor of {{tlb}}/{{term-label}}. Benwing2 (talk) 00:33, 23 October 2019 (UTC)Reply

Okinawan applications for your bot.

Do you think you could use your bot to create some Okinawan categories automatically and automate category creation? MiguelX413 (talk) 00:29, 23 October 2019 (UTC)Reply

@MiguelX413 Sure, what categories do you need created? Benwing2 (talk) 00:32, 23 October 2019 (UTC)Reply
Do you see all the categories missing for compound words like 沖縄口 and Kanji like 日#Okinawan and 男#Okinawan? Like those. Also, can it be used to update a lot of Okinawan pages with new templates? MiguelX413 (talk) 00:41, 23 October 2019 (UTC)Reply
@MiguelX413 I looked into this. Does Okinawan work exactly like Japanese w.r.t. categories like Category:Okinawan kanji with kun reading ちゅく-ゆん, Category:Okinawan kanji with historical kun reading ちつぃに, Category:Okinawan kanji with on reading めー and Category:Okinawan terms spelled with kanji read as あち? If so, the corresponding code to handle similarly-named Japanese categories can be expanded to handle Okinawan. However, categories like Category:Okinawan terms spelled with 宮 are trickier because the corresponding Japanese categories don't use {{autocat}} but instead use {{charactercat}} and pass in a sort key consisting of the radical and number of strokes, which I don't know how to derive automatically.
Also, in reference to "can it be used to update a lot of Okinawan pages with new templates?", yes, that can be done if you spell out exactly what changes you want. Benwing2 (talk) 05:34, 27 October 2019 (UTC)Reply
w.r.t.? I believe Module:zh-sortkey is used for the sorting keys, as stated in Template:ja-kanji/documentation. I just made {{ryu-readings-cat}}, which can be invoked for the former categories. Could you implement {{ryu-readings-cat}} in Module:auto cat in the same way that the Japanese version works? MiguelX413 (talk) 18:23, 27 October 2019 (UTC)Reply
@MiguelX413 Sure, I can do that. Note however that Module:ja-kanji-readings and Module:ryu-kanji-readings should really be merged; it's not such a good idea to have two nearly identical modules, because changes made to one generally won't be properly propagated to the other. If you're not able to merge them I'll look into doing it. I'll ask around about whether {{auto cat}} can be made to work with categories like Category:Japanese terms spelled with 愛. Benwing2 (talk) 18:37, 27 October 2019 (UTC)Reply
@MiguelX413 BTW w.r.t. = "with respect to". Benwing2 (talk) 18:38, 27 October 2019 (UTC)Reply
@Benwing2: Thanks for clarifying w.r.t. Unfortunately, I don't know enough Lua or template code to effectively repurpose the infrastructure for the Japonic languages as a whole. MiguelX413 (talk) 18:59, 27 October 2019 (UTC)Reply
@MiguelX413 You may have noticed that I'm auto-creating several of the Okinawan categories. You can see the categories that can't be created in Special:WantedCategories. There are in particular a lot of such categories named e.g. Category:Okinawan terms spelled with 人 read as っちゅ. These can't be auto-created because they need a parameter specifying whether the term is kun, on, nanori, etc., which can't easily be determined automatically. Could you create some of them? There are at least 50 listed in Special:WantedCategories. You'd use {{ryu-readingcat}}, analogously to {{ja-readingcat}}. For example, Category:Japanese terms spelled with 赤 read as あか is defined as {{ja-readingcat|赤|あか|kun|nanori}} because it has both a kun and nanori reading. Benwing2 (talk) 23:09, 24 November 2019 (UTC)Reply
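To make the manual step concrete, here is a small Python sketch that builds the category wikitext once a human has supplied the reading types. It is only an illustration: the helper name `readingcat_wikitext` is hypothetical, and it assumes {{ryu-readingcat}} takes the same positional parameters as {{ja-readingcat}}, as the word "analogously" above suggests.

```python
import re

# Category names of the form "Category:LANG terms spelled with K read as R".
NAME = re.compile(
    r"^Category:(Japanese|Okinawan) terms spelled with (.+) read as (.+)$"
)
TEMPLATE = {"Japanese": "ja-readingcat", "Okinawan": "ryu-readingcat"}


def readingcat_wikitext(category: str, *reading_types: str) -> str:
    """Build the template call that defines a 'spelled with X read as Y' category.

    The reading types (kun, on, nanori, ...) must be supplied by an editor;
    they cannot be derived from the category name.
    """
    m = NAME.match(category)
    if not m:
        raise ValueError(f"unrecognised category name: {category}")
    if not reading_types:
        raise ValueError("at least one reading type is required")
    lang, kanji, reading = m.groups()
    return "{{" + "|".join([TEMPLATE[lang], kanji, reading, *reading_types]) + "}}"
```

Calling it with the Japanese example above and the types `kun` and `nanori` reproduces `{{ja-readingcat|赤|あか|kun|nanori}}`.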

Abuse filter blocking bot

I'm not sure if you noticed, but an abuse filter I added was triggering a warning on and thereby blocking your bot edits on "quark" titles with Latin and Cyrillic. I changed it so it will only warn if the page is blank (which should most often affect page creations), and made the edits. Sorry about that. — Eru·tuon 04:48, 24 October 2019 (UTC)Reply

la-ndecl

Two thoughts: all the old templates and guides are obsolete now, right? Can we just delete them? Secondly, I think this template would be better off as la-decl for a permanent name, and although that suggestion should probably go to RFM, I thought I'd ask you first. —Μετάknowledgediscuss/deeds 18:40, 26 October 2019 (UTC)Reply

@Metaknowledge Yeah, this is the only noun declension template left. I thought I deleted all the old templates already and fixed up the guides; which ones are you referring to? Also, the issue with {{la-decl}} is that there's also {{la-adecl}} for adjectives; {{la-decl}} by itself might be ambiguous. Benwing2 (talk) 19:10, 26 October 2019 (UTC)Reply
My mistake; I saw the guides and hadn't realised they'd been changed (maybe they should explicitly mention what template to use). Do we use ndecl/adecl for other languages? It certainly is less typing, which is generally a benefit. Truth be told, I haven't created many Latin entries for a while, and because my instinct is still to enter in the old templates, I guess the problem is just me. —Μετάknowledgediscuss/deeds 19:47, 26 October 2019 (UTC)Reply
@Metaknowledge For other languages, it's more common to use names like {{la-decl-noun}}, {{la-decl-adj}}. Benwing2 (talk) 19:49, 26 October 2019 (UTC)Reply
@Metaknowledge Also, the guides are still out-of-date in that the explicit subtype is not necessary in most cases, and some of the subtypes are just plain wrong. I need to fix them up or delete them. The documentation for {{la-ndecl}} and {{la-adecl}} should be in better shape. Benwing2 (talk) 19:53, 26 October 2019 (UTC)Reply

Acceleration in {{hu-decl-table}}

Hi, a while ago you updated this template with the new acceleration standard. Today I added four new parameters for "non-attributive possessive" forms, and while acceleration technically does work for them, the inflection line needs some adjustments since the new parameters do not exist in the acceleration module. It's fine if you don't add parameters for them in the acceleration module, but I don't know how to modify this template's script to display the correct case name. Would you help me to correct it? I need np1 and np2 to become "non-attributive possessive".

Example for the current line: # np1 plural of term

The correct inflection line would be: # non-attributive possessive plural of term

Thank you. Panda10 (talk) 21:14, 29 October 2019 (UTC)Reply

Bot request

Can you use your bot to make changes like diff to given names? Ultimateria (talk) 16:34, 31 October 2019 (UTC)Reply

And unrelated, but this kind (diff) is pretty common in Italian entries and I'd like to standardize them. Ultimateria (talk) 16:39, 31 October 2019 (UTC)Reply
@Ultimateria I did both bot substitutions. In the process I did a lot of hacking on {{given name}} to support various new features that were often manually specified. The templatizing of cognates in {{given name}} made 1050 replacements and couldn't replace 232 cases; I fixed up those remaining cases by hand. The templatizing of masculines/feminines in {{it-noun}} made 245 replacements and couldn't replace 29 cases. Those 29 cases are found at User:Benwing2/it-noun-warnings. Can you fix them up? Either edit each entry or just edit the User:Benwing2/it-noun-warnings page, and I'll run a script to apply the changes. In the latter case, make sure to edit the text between <to> and <end> and leave alone the duplicate text between <from> and <to> (the duplicate text is needed so my bot knows what text to look for when making a substitution). Benwing2 (talk) 20:06, 3 November 2019 (UTC)Reply
Thank you so much! A couple of notes on names: at diff, I think Scandinavian is a special case that should be edited manually, and at diff I caught a substitution that the bot can do. I've edited the warnings and I'll take care of the ones that can't be automated. Ultimateria (talk) 00:05, 4 November 2019 (UTC)Reply
I should mention that some of the Italian nouns have no standard formatting - see my recent post at the Grease Pit. I would appreciate any help with that. Ultimateria (talk) 00:07, 4 November 2019 (UTC)Reply
@Ultimateria I'll take a look at {{it-noun}}. In the meantime I modified my script to handle "variant" and "equivalent" text following {{given name}}. It templatized 695 instances involving the word "variant" and was unable to templatize 331, which are found at User:Benwing2/templatize-given-name-variant-warnings. It also templatized 485 instances involving the word "equivalent" and was unable to templatize 174, which are found at User:Benwing2/templatize-given-name-equivalent-warnings. As before, if you could look at these warnings and fix some of them up, I'd be grateful, and as before you can do it by simply editing the page containing the warnings and making fixes between the <to> and <end> markers, and I'll run a script to apply the changes. Benwing2 (talk) 06:30, 4 November 2019 (UTC)Reply
Thanks! I'll let you know when I'm done looking over these. I also wanted to point out another possible fix, like diff in Slovene. So far I've found two instances of links to English names at the end of the definition line (one raw, one using {{l}}). They might use dashes or equals signs in addition to colons/semicolons. Ultimateria (talk) 17:11, 4 November 2019 (UTC)Reply
I've gone through the variants list and made the fixes I knew how to address. Can you run the script on the equivalent pages though? It looks like a lot of them will be updated if you do, and I can focus on the remaining warnings. Ultimateria (talk) 05:21, 5 November 2019 (UTC)Reply
@Ultimateria I will run the script on both pages. I updated most of the equivalent warnings myself. Benwing2 (talk) 05:24, 5 November 2019 (UTC)Reply
@Ultimateria I ran the script on both pages, and I updated the pages so as to remove the lines that were done. It turns out I also missed some cases with 'equivalent' and 'variant' and so I ran my script on those pages; the unprocessable lines from those scripts are found at User:Benwing2/templatize-given-name-variant-warnings-2 and User:Benwing2/templatize-given-name-equivalent-warnings-2. Benwing2 (talk) 05:40, 5 November 2019 (UTC)Reply
Thanks for rerunning that. I've gone through all the lists and fixed the ones that could be standardized to my knowledge. I did come across one more that was a colon and a link to the English name at the end of the definition line. If you can check that, that'll be my last request for this template! Ultimateria (talk) 06:56, 6 November 2019 (UTC)Reply

Category boilerplate names

Thank you for taking the time to rename all these! —Rua (mew) 14:47, 7 November 2019 (UTC)Reply

Aliases of languages and families

At Module:languages/data2 you left several "FIXME" comments pointing out that a language-name alias is the name of a family, but it is really the case that some names are ambiguous between being the name of an individual language and the name of a language family. And not just among aliases/alternative names; some canonical names are duplicated too (e.g. CAT:Albanian languages (sqj) and CAT:Albanian language (sq)). So I don't think that such cases need to be fixed at all. —Mahāgaja · talk 13:17, 13 November 2019 (UTC)Reply

@Mahagaja It looks like this is in reference to "Frisian" for "West Frisian" and "Kartvelian" for "Georgian". The difference here with Albanian is that Albanian is the common name for the language and there's only one modern language in the Albanian languages family. OTOH there are three Frisian languages and four Kartvelian languages, and I can't find any evidence in Wikipedia that "Frisian" is a common name for "West Frisian" or "Kartvelian" a common name for "Georgian". So I suggest removing these from otherNames. Benwing2 (talk) 16:41, 13 November 2019 (UTC)Reply
OK. I thought you were objecting to all such cases, but I want to keep "Gaelic" as an alias both for the Goidelic family (cel-gae) and for Scottish Gaelic (gd). I have no objection to removing Frisian as an alias of West Frisian or Kartvelian as an alias of Georgian. BTW, are you going to rename "otherNames" to "aliases" at Module:families too? —Mahāgaja · talk 16:44, 13 November 2019 (UTC)Reply
@Mahagaja I'm ok with Gaelic for Scottish Gaelic, I suppose; at least, I don't feel strongly against it, as Gaelic is commonly used to refer to Irish, and I imagine for Scottish Gaelic too in Northern Scotland. As for Module:families, yes I will split that into aliases and varieties. I also think we should do the same for Module:scripts. Benwing2 (talk) 17:01, 13 November 2019 (UTC)Reply
Yes, "Gaelic" is also commonly used for Irish. The difference is that "Gaelic" (/ˈɡælɪk/) for Scottish Gaelic is most common inside Scotland, while "Gaelic" (/ˈɡeɪlɪk/) for Irish is most common outside Ireland. In other words, Scottish people are most likely to refer to gd as "Gaelic", while Irish people are very unlikely to refer to ga as "Gaelic". I don't think Manx is ever called "Gaelic" by anyone. —Mahāgaja · talk 17:11, 13 November 2019 (UTC)Reply
Just try and find a "West Frisian" dictionary or grammar- most people are unaware that there's any other Frisian language, and most books only mention "West Frisian" somewhere inside, not in the title. Of course, it doesn't help that the ISO gave the name "East Frisian" to a Low German dialect, but I digress... Chuck Entz (talk) 03:39, 14 November 2019 (UTC)Reply

Italian headwords

I keep finding variations on the messy (Feminine: '''[[x]]''') headword lines in Italian. Can I add them to a page in your bot's format like with the names, or just leave them here? Ultimateria (talk) 16:51, 14 November 2019 (UTC)Reply

@Ultimateria Sure, go ahead. Benwing2 (talk) 01:57, 15 November 2019 (UTC)Reply
Thanks, I'll keep a running list here. You can change it to whatever format is most useful. Ultimateria (talk) 04:58, 15 November 2019 (UTC)Reply
<from> {{it-noun|m}} (''Female'': [[trafilatrice]]) <to> {{it-noun|m|f=trafilatrice}} <end>
<from> {{it-noun|f}} ''Masculine'' '''[[artigiano]]''' <to> {{it-noun|f|m=artigiano}} <end>
<from> {{it-noun|m}} ''Feminine'' '''[[competitrice]]''' <to> {{it-noun|m|f=competitrice}} <end>
<from> {{it-noun|f}} (''Masculine'': [[bancario]]) <to> {{it-noun|f|m=bancario}} <end>
@Ultimateria See User:Benwing2/templatize-it-noun-m-f-warnings. This is all the remaining cases of lines containing {{it-noun}} followed by feminine, masculine, female or male, possibly capitalized. Benwing2 (talk) 05:53, 15 November 2019 (UTC)Reply
Done! Ultimateria (talk) 21:53, 15 November 2019 (UTC)Reply
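The `<from> ... <to> ... <end>` lines used in this thread lend themselves to simple mechanical processing. The Python sketch below shows one way such directives could be parsed and applied to an entry's wikitext; it is an assumption about how a script of this kind might work, not the bot's actual code, and the names `parse_directives` and `apply_directives` are hypothetical.

```python
import re

# One directive per line: "<from> OLD <to> NEW <end>".
DIRECTIVE = re.compile(r"^<from> (.*?) <to> (.*?) <end>$")


def parse_directives(page_text: str):
    """Collect (old, new) pairs from a warnings page."""
    pairs = []
    for line in page_text.splitlines():
        m = DIRECTIVE.match(line.strip())
        if m:
            pairs.append((m.group(1), m.group(2)))
    return pairs


def apply_directives(entry_text: str, pairs):
    """Apply each pair whose old text occurs exactly once in the entry."""
    applied = 0
    for old, new in pairs:
        # Skip unedited directives and ambiguous or missing matches.
        if old != new and entry_text.count(old) == 1:
            entry_text = entry_text.replace(old, new)
            applied += 1
    return entry_text, applied
```

Requiring exactly one occurrence of the old text is a cautious design choice: it is the reason the text between `<from>` and `<to>` has to be left exactly as the bot found it.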

Automatic creation of spelling categories

It's way more useful to not automatically create these (Category:Russian terms spelled with P, etc)- they often serve as warnings when people have used incorrect templates or language codes, or put control characters in page titles, and I regularly search the wanted categories list for new instances. DTLHS (talk) 17:39, 18 November 2019 (UTC)Reply

@DTLHS OK. My bot by default creates all categories in the Special:WantedCategories list where {{auto cat}} doesn't generate any sort of error. Following is the list of all categories containing "terms spelled with" that have been created by my bot:

Note that almost all of them are for Okinawan and Japanese. What remains is 1 Hebrew, 2 Latin and 2 Russian categories. I can delete the Latin and Russian ones if you want, and blacklist "... terms spelled with ..." categories that aren't Okinawan and Japanese. Benwing2 (talk) 01:31, 19 November 2019 (UTC)Reply

Yes, everything that isn't Okinawan and Japanese would be great. DTLHS (talk) 02:56, 19 November 2019 (UTC)Reply
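The blacklist agreed on above might look roughly like the following. This is a minimal sketch, not the bot's actual code: `SPELLED_WITH`, `ALLOWED_LANGS` and `should_auto_create` are hypothetical names, and the check that {{auto cat}} generates no error is assumed to happen elsewhere.

```python
import re

# Captures the language name in "Category:<language> terms spelled with ...".
SPELLED_WITH = re.compile(r"^Category:(?P<lang>.+?) terms spelled with ")

# Languages for which spelling categories may still be created automatically.
ALLOWED_LANGS = {"Japanese", "Okinawan"}


def should_auto_create(category_title):
    """Return False for spelling categories outside the allowed languages."""
    match = SPELLED_WITH.match(category_title)
    if match is None:
        # Not a spelling category; this filter does not apply to it.
        return True
    return match.group("lang") in ALLOWED_LANGS


print(should_auto_create("Category:Russian terms spelled with P"))
print(should_auto_create("Category:Japanese terms spelled with X"))
```

Skipped categories stay in Special:WantedCategories, which keeps them available as warnings about wrong templates or language codes.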

{{reconstructed}}

Hi! I often come upon reconstructed pages wanting this template. Thus far I have added those myself, but I think you can do a bot operation to furnish the missing ones in no time. Thanks. —Lbdñk (talk) 16:36, 28 November 2019 (UTC)Reply

@Lbdñk Done. Benwing2 (talk) 00:36, 29 November 2019 (UTC)Reply
Thank you Benwing2. —Lbdñk (talk) 15:12, 29 November 2019 (UTC)Reply

dynian

Hi! You made an edit to dynian here, leaving the conjugation as weak 2. However, dynian is conjugated as a weak 1 verb (he dyneþ, dynede, gedyned, etc.). Leasnam (talk) 19:20, 2 December 2019 (UTC)Reply

In fact, dynnan has been made the main entry, when perhaps it should actually be dynian, since there are more instances of conjugated forms of dynian (e.g. dynigende, dynigendum, dynegendum) than of dynnan. Like *hafian, *dynnan doesn't really have any attested forms that can solely be assigned to it (i.e. *dynne, *dynnaþ, *dynnende, etc.). Leasnam (talk) 19:40, 2 December 2019 (UTC)Reply
@Leasnam. My apologies; dynnan is the expected form and is attested in Kobler's dictionary so I assumed this was a better place to put things. Benwing2 (talk) 03:01, 3 December 2019 (UTC)Reply
I think moving it all to dynian makes more sense. A headword listing in a dictionary does not count as attestation. Leasnam (talk) 03:05, 3 December 2019 (UTC)Reply
I'm fine with leaving it at dynnan, as that form can be extrapolated from the others although it's not directly attested. However, I did correct the conj template at dynian. Leasnam (talk) 03:15, 3 December 2019 (UTC)Reply

Moving Descendants

Don't forget that Descendants sections are used by {{desctree}}, and removing the Descendants section from an entry will cause a module error in any instance of that template that links to it, so you have to fix it. Chuck Entz (talk) 20:13, 7 December 2019 (UTC)Reply

@Chuck Entz Oops, sorry, I've forgotten about that. Benwing2 (talk) 20:35, 7 December 2019 (UTC)Reply
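One way to guard against this before moving a Descendants section is to scan for {{desctree}} calls and check whether the entry still has the section. The sketch below works on raw wikitext strings only; the helper names are hypothetical, and fetching the pages that link to the entry is assumed to be done separately.

```python
import re

# {{desctree|<lang code>|<term>...}}: capture the first two positional parameters.
DESCTREE = re.compile(r"\{\{desctree\|(?P<lang>[^|{}]+)\|(?P<term>[^|{}]+)")

# A heading line such as ===Descendants=== or ====Descendants====.
DESCENDANTS_HEADING = re.compile(r"^=+\s*Descendants\s*=+\s*$", re.MULTILINE)


def desctree_targets(wikitext):
    """List the (lang, term) pairs that the page pulls descendants from."""
    return [
        (m.group("lang").strip(), m.group("term").strip())
        for m in DESCTREE.finditer(wikitext)
    ]


def has_descendants_section(wikitext):
    """True if the entry text contains a Descendants heading."""
    return DESCENDANTS_HEADING.search(wikitext) is not None


# "*example" is a placeholder term, not a real reconstruction.
print(desctree_targets("* {{desctree|gem-pro|*example}}"))
print(has_descendants_section("====Descendants====\n* {{desc|en|example}}"))
```

Any page whose `desctree_targets` include an entry that fails `has_descendants_section` would need to be fixed up after the move.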

Templates with deprecated lang parameter

Hi Benwing2, I have noticed that the lang parameter has recently been deprecated for a lot of templates, e.g. past/present participle of, masculine/feminine plural of, masculine/feminine singular of, etc. Do you have a list of the affected templates or a link to a discussion? Matthias Buchmeier (talk) 12:06, 8 December 2019 (UTC)Reply

@Matthias Buchmeier See Wiktionary:Templates with current language parameter for the list. The specific templates you're mentioning are form-of templates, which are also documented in more detail at Category:Form-of templates. Benwing2 (talk) 17:09, 8 December 2019 (UTC)Reply
Thanks, exactly what I was looking for. Matthias Buchmeier (talk) 17:34, 8 December 2019 (UTC)Reply

Bot task

Can you please check Wiktionary:Bots/Tasks? I see you’ve done quite a lot of work with German adjectives so I think the two latest requests by me might suit you. Jonteemil (talk) 23:52, 9 December 2019 (UTC)Reply

Old English verbs

I've been looking at the module you wrote for Old English verbs, which is obviously awesome and way better than the previous system, and I'm wondering a couple things about it. First, is there a way to change the conjugation of class II weak verbs, so that the pres.ind.1sg. and the pres.subj.sg&pl. have -iġe(-), while the infl.inf. and pres.part. still have -ie-? This was the normal state of affairs in Old English—for example, if you search the Bosworth-Toller, iċ lufiġe gets 15 results while iċ lufie gets none.

Also, why is the past participle of beran listed without (ġe)-? Ġeboren occurred all the time: e.g., He cwæþ þæt him selre wære þæt he geboren nære. Hundwine (talk) 04:57, 13 December 2019 (UTC)Reply

@Hundwine I will look into fixing class II weak verbs. Currently the module assumes that the same stem is used for all of 1sg. pres. ind., pres. subj., pres. part and 2nd. inf, because in all other cases it is; I'll have to split this into two stems, something like presefin (= "present stem usually-ending-in-e finite") and presenfin (= "present stem usually-ending-in-e non-finite"). This will mostly be under the hood so you won't have to worry about it. As for beran, the module sees that it begins with be- and thinks that's a prefix, hence no (ġe)-. I will make the module smarter about this; ran is not a possible Old English verb. Benwing2 (talk) 05:26, 13 December 2019 (UTC)Reply
@Hundwine I fixed class II weak verbs. However, I notice you made a lot of changes to the verb module, removing a lot of alternative forms and adding extra forms for preterite-present verbs. I'm not sure I agree with this; the alternative forms are those listed in Wright, which is generally Early West Saxon, and we should not add preterite-present forms that aren't attested. Benwing2 (talk) 09:29, 18 December 2019 (UTC)Reply
Thanks for fixing the class II weak verbs! I hope it wasn't too difficult or anything.
The alternative forms I removed were either not Early West Saxon (wiotan, hafu, sagast, etc.) or were the kind of minor variants that you didn't include with 99% of verbs generally. Also, I don't get why we shouldn't infer unattested preterite-present forms from the rest of the conjugations when that's what everyone does with a ton of other words whose paradigms aren't fully attested. Hundwine (talk) 10:00, 18 December 2019 (UTC)Reply
The preterite-present verbs are irregular each in their own way; as a result I think it's important not to guess what the missing forms might have been. User:Urszag ... thoughts? Benwing2 (talk) 10:49, 18 December 2019 (UTC)Reply
I'm not familiar with the conjugation of preterite-present verbs in Old English. Are the gaps/unattested forms that you are discussing things like unattested person/number combinations, or unattested stems for all persons and numbers in certain tenses or moods?--Urszag (talk) 13:42, 19 December 2019 (UTC)Reply
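To illustrate the beran problem discussed above, here is one possible heuristic for deciding whether an initial be- is really a prefix. It is purely a sketch and not the module's logic (the module itself is not written in Python): the prefix list, the vowel test and all function names are assumptions made for the example.

```python
# Assumption: a short, illustrative prefix list; the real module knows many more.
PREFIXES = ("be", "for", "on")

# Old English vowel letters, with and without macrons.
VOWELS = set("aeiouyæāēīōūȳǣ")


def split_prefix(verb):
    """Split off a prefix only if what remains could still be a verb.

    The remainder, minus the infinitive ending -an, must contain a vowel;
    this rules out be- + "ran", since "r" alone is not a possible stem.
    """
    for prefix in PREFIXES:
        if verb.startswith(prefix):
            rest = verb[len(prefix):]
            stem = rest[:-2] if rest.endswith("an") else rest
            if any(ch in VOWELS for ch in stem):
                return prefix, rest
    return None, verb


def past_participle_takes_ge(verb):
    """Unprefixed verbs get (ġe)- in the past participle; prefixed ones do not."""
    prefix, _ = split_prefix(verb)
    return prefix is None


print(past_participle_takes_ge("beran"))
print(past_participle_takes_ge("becuman"))
```

Under this heuristic beran is treated as unprefixed and so keeps (ġe)-, while becuman is still split into be- + cuman.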

ang-IPA

Why did Wingerbot put in an underscore here? It put the entry into CAT:IPA pronunciations with invalid IPA characters since the underscore isn't an IPA character. —Mahāgaja · talk 13:58, 16 December 2019 (UTC)Reply

@Mahagaja This was a mistake I made in the file used to generate the changes. Any time a change is labeled "semi-manual", it means I manually created (or at least manually checked) a file consisting of changes to make, and then used the bot to make the changes. Whenever that happens, there's a chance I'll make a mistake. In this case I meant to put a + sign to prevent in- from being interpreted as a prefix. Benwing2 (talk) 02:38, 17 December 2019 (UTC)Reply

Thoughts about your OE pronunciation module

  • Are you sure /l/ was velarized after /e/ and /i/, as in hwelċ and ċild? I've read that it was velarized between back vowels (/ɑ/, /u/, /o/) or when preceded by a back vowel or a low front vowel (/æ/) and followed by a consonant.
  • [d͡ʒ] wasn't an allophone of /j/. If anything [d͡ʒ] was an allophone of /ɡ/, though imo [d͡ʒ] should be considered a separate phoneme just like /t͡ʃ/.
  • I'm 99% sure Campbell is wrong when he says initial ⟨h⟩ before a consonant (as in ⟨hr⟩, ⟨hl⟩, ⟨hn⟩, ⟨hw⟩) was just a diacritic indicating the next sound was unvoiced. This book gives several very good arguments that this ⟨h⟩ was genuinely pronounced (starting at page 42), including that it alliterated with words beginning in ⟨h⟩ plus a vowel. So these digraphs should be rendered /xr/ [hr̥], /xl/ [hl̥], /xn/ [hn̥], /xw/ [hw̥] in IPA. Hundwine (talk) 09:39, 19 December 2019 (UTC)Reply
Could you elaborate on why you say [d͡ʒ] was not an allophone of /j/? I've seen many different treatments of this area of Old English phonology. Like you, I would prefer to transcribe /h/ + resonant clusters with [h], but on the phonetic level I think the difference between stuff like [hr̥] and [r̥] is negligible. Alliteration seems more relevant as evidence of the phonemic representation than of the phonetic realization, and the module does represent hr, hw, hn, hl as clusters of phonemes rather than as unitary phonemes.--Urszag (talk) 13:55, 19 December 2019 (UTC)Reply
I had thought [d͡ʒ] was an allophone of /j/, as [ɣ] is an allophone of /ɡ/. Then there's neat parallelism: /j/ has two allophones, continuant and non-continuant, just like the phoneme that it partially originated from by palatalization, [ɡ]. But if words ending in -nian have [njɑn], then there is a contrast between that [nj] and the [nd͡ʒ] in senġan, so [d͡ʒ] has to be a phoneme. I don't know if that's the case. (If something similar could be said of the allophones of /ɡ/, it would preserve the parallelism.) — Eru·tuon 17:30, 19 December 2019 (UTC)Reply
@Hundwine I am not certain about the velarization of /l/. If others agree I can implement your suggestion. As for [d͡ʒ], like the other comments I do believe it is an allophone of /j/. This is a natural interpretation based on the fact that [d͡ʒ] occurs only doubly and after /n/, and /j/ doesn't occur in these positions except across strong morpheme boundaries (where [d͡ʒ] doesn't occur). As for [hr̥] vs [r̥], I agree with what Urszag says that alliteration is not an indication of pronunciation because it's based on the phonemic analysis, which unquestionably was /hr/. However, I have no particular issue using [hr̥] (although it looks a bit clumsy to me). Benwing2 (talk) 04:56, 20 December 2019 (UTC)Reply
@Erutuon A Linguistic History of English, Volume II: The Development of Old English, by Don Ringe and Ann Taylor, says "The fricatives g [ɣ] and ġ [j] could come to stand immediately following n by syncope of an intervening short vowel (see 6.7.3), and that makes the written sequences ng and nġ ambiguous. For instance, ng is [ŋg] in bringan ‘to bring’ < PGmc *bringaną because the consonants had always been in contact, but ng is [nɣ] in syngian ‘to sin’ < pre-OE *synʲnʲægōjan because the cluster arose by syncope. Similarly, nġ is [nʤ] in menġan ‘to mix’ < PWGmc *mangijan, but nġ is [nj] in menġu ‘multitude’ < PGmc *managīn-. Fortunately the clusters that arose from syncope are rare. How one analyzes this pattern of facts depends on one’s theory of phonology; I know of no work that addresses this particular issue" (p. 4).--Urszag (talk) 19:52, 20 December 2019 (UTC)Reply
From the examples you give, I think the simplest analysis would be that the four allophones belong to different phonemes, though they only rarely contrast. I suppose another explanation, in a more "Platonic" theory of phonology, might be that there is a phantom vowel in the underlying form, or something else that makes the environments different, that vanishes in the surface form. — Eru·tuon 20:00, 20 December 2019 (UTC)Reply

Module:category tree/topic cat/data/Places

Hey, I just wanted to say thanks for working on making the topic cat data much more parsimonious and easy to work with. It's really nice, I appreciate you taking the time. Julia 19:47, 21 December 2019 (UTC)Reply

Template:lv-decl-noun-5

Problem with the categories at uzpirkstīte. If you edited the other Latvian declension templates, you should check those as well. —Μετάknowledgediscuss/deeds 02:04, 24 December 2019 (UTC)Reply

@Metaknowledge It looks like those templates aren't meant to be called directly. When called through {{lv-decl-noun}}, the categories are correct. Benwing2 (talk) 02:22, 24 December 2019 (UTC)Reply
Thanks for fixing the entry... that seems weird, but whatever works, I guess. —Μετάknowledgediscuss/deeds 04:31, 24 December 2019 (UTC)Reply

Bot request 2

Can you replace all instances of {{head|it|noun|g=X|invariable}} with {{it-noun|X|-}}? Ultimateria (talk) 20:00, 28 December 2019 (UTC)Reply

@Ultimateria Running now. Will be done in 2-3 hours. Apologies that it took a while to get to this. I'm surprised to see that {{head|it|noun|g=X|invariable}} (or variants) occurs on 6,804 pages, a lot more than I thought. Benwing2 (talk) 20:06, 12 January 2020 (UTC)Reply
Thank you. A lot of these entries were made in 2007 and have hardly been touched since. I know you have your hands full wrangling some of our byzantine systems and I commend you for it! Ultimateria (talk) 06:31, 13 January 2020 (UTC)Reply
@Ultimateria You're welcome. Note that a lot of the "invariable" nouns are probably misclassified. The full list is here: User:Benwing2/it-noun-invariable. A lot of them are numbers (which probably shouldn't be counted as nouns at all), and a lot of them are abstract nouns ending in -tà, -genesi, -lisi etc. that should probably be marked as uncountable instead of as invariable (marking them as invariable puts them in Category:Italian countable nouns). If you could, can you go through the list and identify the categories that should be reclassified? Either create a list of the nouns that are uncountable, or indicate how to find them (e.g. by looking at the ending), or do some combination. Then I'll use a bot script to fix up those nouns. Benwing2 (talk) 06:57, 13 January 2020 (UTC)Reply
Wow that's a lot. I'll go through the list, but I don't think there will be any patterns for uncountability. Abstract nouns in Romance languages tend to be countable even when the English equivalent isn't. But if I see anything that can be changed by bot, I'll let you know. Ultimateria (talk) 17:32, 13 January 2020 (UTC)Reply
@Ultimateria If an abstract noun in -tà is countable, is its plural still in -tà, or something else? Also, numbers aren't nouns or adjectives and should be placed under ===Numeral===. Benwing2 (talk) 02:02, 14 January 2020 (UTC)Reply
The plural is in -tà. I'll make sure to fix the numbers. Ultimateria (talk) 03:02, 14 January 2020 (UTC)Reply
I cleaned up some numbers and definitely uncountable cases like sports and languages. I didn't realize that all the numbers from 100-1000 need to be cleaned up. Can you automate edits like diff? The two parameters of {{af}} will always be separated after -cento. I can manually add etymologies to pages without that string. Ultimateria (talk) 19:15, 14 January 2020 (UTC)Reply
Also, please exclude any page with 3 or more definition lines. Ultimateria (talk) 22:13, 14 January 2020 (UTC)Reply
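The basic replacement requested at the top of this thread can be sketched as a single regular-expression substitution. This covers only the exact {{head|it|noun|g=X|invariable}} shape; the "variants" mentioned above are not handled, and the names `HEAD_INVARIABLE` and `convert_invariable` are hypothetical.

```python
import re

# {{head|it|noun|g=X|invariable}}, where X is the gender value.
HEAD_INVARIABLE = re.compile(
    r"\{\{head\|it\|noun\|g=(?P<g>[^|{}]+)\|invariable\}\}"
)


def convert_invariable(wikitext):
    """Rewrite the generic headword call as {{it-noun|X|-}}."""
    return HEAD_INVARIABLE.sub(
        lambda m: "{{it-noun|%s|-}}" % m.group("g"), wikitext
    )


print(convert_invariable("{{head|it|noun|g=f|invariable}}"))
```

Text without a matching call is returned unchanged, so the function is safe to run over whole pages.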

@Benwing2 I was going to wait but I will now bump this request since you archived. Ultimateria (talk) 02:45, 19 March 2020 (UTC)Reply

Apologies, I completely missed this. Let me see if I can do it tonight. Benwing2 (talk) 03:30, 19 March 2020 (UTC)Reply
@Ultimateria I fixed up the numerals. I did all containing 'cento' except for the following, which need to be fixed up by hand:
Benwing2 (talk) 00:51, 22 March 2020 (UTC)Reply
@Ultimateria Oh yeah, also the following two pages:
  • Page 2 centodieci: WARNING: Page has 3 or more definition lines, skipping
  • Page 3 centotredici: WARNING: Page has 3 or more definition lines, skipping
Benwing2 (talk) 00:53, 22 March 2020 (UTC)Reply
Thank you! But I failed to take cases like novecentottantaquattro into account; in {{affix}}, the o is missing at the beginning of ottantaquattro and other pages with the string -centott-. Ultimateria (talk) 05:42, 22 March 2020 (UTC)Reply
@Ultimateria Oops. I excluded cases like settecentotto but not cases involving centottanta. I'll fix them up. Benwing2 (talk) 18:05, 22 March 2020 (UTC)Reply
@Ultimateria Done. Benwing2 (talk) 18:10, 22 March 2020 (UTC)Reply
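The numeral fix-up discussed above, including the -centott- case, can be sketched as follows. This is only an illustration under assumptions: the exact shape of the {{af}} call in the linked diff is not shown in this thread, so the `{{af|it|...}}` output, the definition-line count and all function names here are hypothetical.

```python
import re


def split_cento(word):
    """Split a numeral after 'cento', restoring an elided o- before 'tt'.

    Returns None if the word has no 'cento' or nothing follows it.
    """
    idx = word.find("cento")
    if idx == -1:
        return None
    head, rest = word[: idx + 5], word[idx + 5:]
    if not rest:
        return None
    if rest.startswith("tt"):
        # e.g. novecento + (o)ttantaquattro, settecento + (o)tto
        rest = "o" + rest
    return head, rest


def affix_template(word):
    """Build an {{af}} call for the two parts (assumed output format)."""
    parts = split_cento(word)
    if parts is None:
        return None
    return "{{af|it|%s|%s}}" % parts


def count_definition_lines(wikitext):
    """Count top-level '#' lines, ignoring '#:' and '#*' and '##' lines."""
    return len(re.findall(r"^#(?![#:*])", wikitext, re.MULTILINE))


print(affix_template("novecentottantaquattro"))
print(affix_template("settecentotto"))
```

A page would then be skipped when `count_definition_lines` returns 3 or more, matching the warnings quoted for centodieci and centotredici.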