Wiktionary:Grease pit


Welcome to the Grease pit!

This is an area to complement the Beer parlour and Tea room. Its purpose is specifically for discussing the future development of the English Wiktionary, both as a dictionary and as a website.

The Grease pit is a place to discuss technical issues such as templates, Lua modules, CSS, JavaScript, the MediaWiki software, extensions to it, the toolserver, etc. It is also a place to think in non-technical ways about how to make the best free and open online dictionary of “all words in all languages”.

Others have understood this page to explain the “how” of things, while the Beer parlour addresses the “why”.

Permanent notice

  • Tips and tricks about customization or personalization of CSS and JS files are listed at WT:CUSTOM.
  • Other tips and tricks are at WT:TAT.
  • Find information and helpful links about modules, Lua in general, and the Scribunto extension at WT:LUA.
  • Everyone is encouraged to expand these pages, or to come up with more such stuff. Other known pages with “tips-n-tricks” are to be listed here as well.

Grease pit archives
2006
2007
2008
2009
2010
2011
2012
2013
2014
2015
2016
2017
2018
2019
2020


August 2020

Templates not working with an Old Prussian word

Despite the page kīsman existing, some templates don't link to the word: Old Prussian kīsman, kīsman, kīsman. However, kīsman works. - Sarilho1 (talk) 21:30, 1 August 2020 (UTC)

@Sarilho1: The language data for Old Prussian is currently set up so that module-based link templates will remove macrons (entry_name = {remove_diacritics = MACRON}): {{m|prg|kīsman}} links to kisman. But there are lots of entry names with macrons in Category:Old Prussian lemmas. This is inconsistent, so either the entries need to be moved or the language data needs to leave the macrons. The decision should be made by those who know Prussian (I don't). — Eru·tuon 22:03, 1 August 2020 (UTC)
I see, that explains it, then. Well, I don't know anything about the language either, but it's good to know it wasn't a mistake on my part. Thank you for answering. - Sarilho1 (talk) 22:28, 1 August 2020 (UTC)
Perusing Old Prussian language and the websites listed at the bottom of that page that provide texts, I get the impression that diacritic marks are very rare in Old Prussian. The only letters with diacritics I noticed are ê and ů, though there may be a few more. I didn't notice a single instance of ī. Unfortunately Ivan Štambuk, who created the entry, is no longer active at Wiktionary. —Mahāgaja · talk 00:04, 2 August 2020 (UTC)
This is a matter of normalisation. Original Old Prussian texts spell long i as any of <i y ie>, and scholars are able to reconstruct vowel length as a result. A community decision needs to be made about the presentation of normalised vs original orthography, as must be done with all such languages. —Μετάknowledgediscuss/deeds 00:10, 2 August 2020 (UTC)
If original Old Prussian texts never (or very rarely) use ī, then we shouldn't use it in entry names, though of course it can still be used in headword lines and mentions, since diacritic stripping is already activated. —Mahāgaja · talk 00:20, 2 August 2020 (UTC)
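
The diacritic stripping discussed in this thread can be illustrated with a small sketch. It is written in Python rather than in the Lua of the actual language data modules, and the function name `entry_name` is only a hypothetical stand-in for whatever the modules do internally. It assumes that "removing macrons" means deleting the combining macron (U+0304) after Unicode decomposition.

```python
import unicodedata

MACRON = "\u0304"  # COMBINING MACRON


def entry_name(display_form: str) -> str:
    """Map a display form to a page title by removing macrons only.

    Hypothetical helper: decompose, drop U+0304, recompose.
    Other diacritics (circumflex, ring, ...) are left untouched.
    """
    decomposed = unicodedata.normalize("NFD", display_form)
    without_macrons = decomposed.replace(MACRON, "")
    return unicodedata.normalize("NFC", without_macrons)


print(entry_name("kīsman"))  # kisman
```

Under this assumption a link written with ī points at the macron-less title, which is why entries that were created with the macron in the page name are not found.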

Category:English terms with multiple etymologies

Would it be possible to cause this category to be automatically populated instead of having to be manually added to entry pages? — SGconlaw (talk) 13:54, 2 August 2020 (UTC)

@Sgconlaw It's possible for Lua code to access the entire page text, which is what is required to implement this, but it's considered an "expensive" operation so I'm not sure it would be advisable to add it. Benwing2 (talk) 23:43, 2 August 2020 (UTC)
OK, no worries. Was just wondering if it would save some effort. — SGconlaw (talk) 04:36, 3 August 2020 (UTC)
There are only 11,813 pages with English L2 sections that have "Etymology 2" somewhere on the same page, but are not in your desired category (using search string 'incategory:"English lemmas" -incategory:"English terms with multiple etymologies" insource:/Etymology\ \2/'). So probably around 10,000 pages would need hard categorization. Dump processing could produce the list of English L2s with "Etymology 2" headers and perhaps AWB or some other tool could do the actual updating. It seems like the kind of thing that would be worth doing annually rather than in real time. DCDuring (talk) 10:20, 3 August 2020 (UTC)
I guess we could also ask whether this is a useful category to maintain (I'm not sure). I just happened to notice it had been added to some entries. — SGconlaw (talk) 16:32, 3 August 2020 (UTC)
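
For the dump-processing route mentioned above, a minimal Python sketch of the per-page check might look as follows. It assumes the page wikitext is already available as a string (reading the dump is not shown), that level-2 language headers are written exactly as `==English==`, and that the function name is purely illustrative.

```python
import re

ENGLISH = re.compile(r"^==English==[ \t]*$", re.MULTILINE)
# A level-2 header: exactly two '=' at the start of the line.
L2_HEADER = re.compile(r"^==[^=\n].*?==[ \t]*$", re.MULTILINE)
ETYMOLOGY_2 = re.compile(r"^=+[ \t]*Etymology 2[ \t]*=+[ \t]*$", re.MULTILINE)


def has_multiple_english_etymologies(wikitext: str) -> bool:
    """Return True if the English section contains an 'Etymology 2' header."""
    header = ENGLISH.search(wikitext)
    if header is None:
        return False
    following = L2_HEADER.search(wikitext, header.end())
    end = following.start() if following else len(wikitext)
    english_section = wikitext[header.end():end]
    return ETYMOLOGY_2.search(english_section) is not None
```

Unlike the search string quoted above, this restricts the check to the English section, so a page whose only "Etymology 2" header sits under another language is not counted.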

Assisted rhyme ("Add new rhyme") not working properly

The "+Add new rhyme" feature isn't working properly. See [1] (an older version of this page, which I have now cleaned up), where a user recently added several rhymes using this feature. They were tacked on to the end of their respective sections instead of being put in the right places.

The problem is partly (or wholly) because the feature creates entries of the form {{l|en|word}} while most of the manually entered rhymes are formatted as [[word]]. The updated list of words is then sorted and the new entry ends up getting put at the end of the list instead of in its correct position. Given that the English section is at the top of the pages linked to anyway, is there any benefit in using this formatting rather than just the simple links? (The alternative would be to write a bot to reformat all of the links to the {{l|...}} format, but that would need to be done with great care. Some links are followed by qualifiers, some have variations of the same word on a single line, and some links are to Wikipedia entries (most of which are gradually being changed to link to Wiktionary entries instead).) — Paul G (talk) 15:49, 2 August 2020 (UTC)

That the gadget would not place rhymes in the correct order is not at all related to {{l}} usage. It is rather due to the gadget not knowing about {{rhyme list begin}}. This gadget has always had this kind of problem, and I'm pretty sure it will continue to unless we standardize the rhymes layout. I suggest we require that sections use the {{rhyme list begin}}/end templates if assisted adding of rhymes is desired. Any objections? Dixtosa (talk) 18:05, 2 August 2020 (UTC)
One small objection: that template seems to use a fixed number of columns (two, I think). Some sections of the rhymes pages have very many entries and look better with more columns. At the moment I make a judgement and use the topX and bottom (where X = 2, 3, 4 or 5) templates as appropriate. I'm more than happy to use the "rhyme list begin/end" templates if they can be updated to split a list automatically into more or fewer columns for longer or shorter lists. Would that be possible? — Paul G (talk) 08:47, 16 August 2020 (UTC)
@Paul G: I have now updated the rhymes gadget to only allow adding rhymes to sections marked with the template I mentioned. I have also taken your issue into consideration and added an unnamed parameter to the template. See the example here. Dixtosa (talk) 20:02, 22 August 2020 (UTC)

moving compound nouns to their own category (out of compound words)

I've just finished removing about 1,300 non-noun (and not exclusively noun) entries from Category:Hungarian compound words. The remaining ~4,500 entries are solely nouns. Could you please help me move them to Category:Hungarian compound nouns by replacing "}}" with "|pos=noun}}"? (It doesn't matter if it's at the end of a "{{compound}}" or an "{{af}}" template.) It's usually the very first instance within the "Hungarian" section, given under "Etymology". There are fewer than a dozen cases with a preceding "Alternative forms" section, which should be left intact; I can even handle them manually. Adam78 (talk) 11:22, 3 August 2020 (UTC)

@Benwing2, do you think you might be able to do this with the help of your bot? Adam78 (talk) 22:25, 4 August 2020 (UTC)

@Adam78 I can do this. Benwing2 (talk) 00:35, 5 August 2020 (UTC)
Done. I left the words with "Alternative forms" sections. Note that the pronunciation for pólóing appears wrong, with a stray [ʲ]. Benwing2 (talk) 02:36, 5 August 2020 (UTC)
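
The edit requested at the top of this thread can be sketched roughly as below. This is not Benwing2's bot code, only a Python illustration under stated assumptions: the section header is exactly `==Hungarian==`, the target is the first `{{compound|hu|...}}` or `{{af|hu|...}}` call without nested templates, and pages with an "Alternative forms" header in the Hungarian section are skipped. The function name is hypothetical.

```python
import re

HUNGARIAN = re.compile(r"^==Hungarian==[ \t]*$", re.MULTILINE)
L2_HEADER = re.compile(r"^==[^=\n].*?==[ \t]*$", re.MULTILINE)
# First-level template call only; nested braces are deliberately not handled.
ETYMOLOGY_TEMPLATE = re.compile(r"\{\{(?:compound|af)\|hu\|[^{}]*\}\}")


def add_pos_noun(wikitext: str) -> str:
    """Append |pos=noun to the first compound/af template in the Hungarian section."""
    header = HUNGARIAN.search(wikitext)
    if header is None:
        return wikitext
    start = header.end()
    following = L2_HEADER.search(wikitext, start)
    end = following.start() if following else len(wikitext)
    section = wikitext[start:end]
    if "Alternative forms" in section:
        return wikitext  # left for manual handling
    template = ETYMOLOGY_TEMPLATE.search(section)
    if template is None or "|pos=" in template.group(0):
        return wikitext
    updated = template.group(0)[:-2] + "|pos=noun}}"
    return (wikitext[:start] + section[:template.start()] + updated
            + section[template.end():] + wikitext[end:])
```

Returning the text unchanged in every doubtful case keeps the sketch conservative; a real run would also want to log the skipped pages.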

@Benwing2, thanks a LOT! I'll deal with the remaining handful. The epenthetic, non-phonemic [ʲ] is inserted automatically between a back and a front vowel by the pronunciation template, but we'll think it over if it's actually justified there. I tend to think it is, at least in average (not very formal) pronunciation; I'll confirm with @Panda10. Thank you so much once again! Adam78 (talk) 08:16, 5 August 2020 (UTC)

@Adam78 If [ʲ] is supposed to represent an actual glide between vowels, it should be [j]. [ʲ] is only intended as a consonant modifier to indicate that the consonant is palatalized. Benwing2 (talk) 02:05, 6 August 2020 (UTC)
@Adam78, Benwing2 It is not the same as the Hungarian [j]. It's much weaker. A distinction is necessary to avoid confusion. Appendix:Hungarian pronunciation explains the IPA symbols the way they are used in Hungarian terms in this project. But if there is another IPA symbol that would work better here, we can certainly change it. Panda10 (talk) 16:51, 6 August 2020 (UTC)
@Panda10 The standard IPA symbol for that would be [j̞], i.e. a [j] with a lowering diacritic to indicate the weakened pronunciation. Benwing2 (talk) 02:41, 7 August 2020 (UTC)
@Panda10 Also, I wonder if what you think of as [j] is actually [ʝ], i.e. a palatal fricative. Benwing2 (talk) 02:42, 7 August 2020 (UTC)
Wikipedia indicates the weak glide as [j̆], which appears to be a non-standard use of IPA. Benwing2 (talk) 02:46, 7 August 2020 (UTC)
@Benwing2 Thanks for thinking about this. We are trying to research the options. It is not [ʝ], though. That symbol is already used for [j] at the end of the word after b, v, g, r, m, as in szomj, fürj, etc. Panda10 (talk) 17:47, 7 August 2020 (UTC)
@Benwing2: [j̆] is standard IPA: the breve is used to indicate an ultrashort pronunciation. (Note that the symbol is not [ǰ] with a hacek, which in IPA could only mean [j] with a rising tone.) Using the lowering diacritic wouldn't necessarily indicate a weakened pronunciation, merely a more open one, which would correspond to a nonsyllabic [ɪ̯] or even [e̯]. —Mahāgaja · talk 08:56, 8 August 2020 (UTC)
Here is the conclusion for those of you who followed this thread. Based on research, the decision is to use [j] for both the regular consonant and the glide variant. Even though there is a time difference in pronunciation, it's very short, an average of 10 ms. Thank you all who added comments and thoughts to this topic. Panda10 (talk) 16:31, 8 August 2020 (UTC)

auto cat not working in some categories; or without alphabetical sorting in some others

{{auto cat}} doesn't work in the categories Hungarian compound adverbs, determiners, interjections, numerals, particles, postpositions, pronouns, suffixes, conjunctions, and verbs. Could someone please add these to the right module or template? (I added the necessary categories manually to each, but now they are not listed at the head of the parent category Hungarian compound words.) If you wish, you can also create the relevant global categories like "Category:Compound adverbs/determiners/interjections/numerals/particles/postpositions/pronouns/suffixes/verbs by language".

On the other hand, Category:Hungarian compound adjectives and Hungarian compound nouns work almost all right, except for the alphabetical sorting in Hungarian compound words, since both of them are located at "C", instead of A(djectives) and N(ouns). Could it be fixed? (The above issue should preferably also be fixed with this in mind.) Thank you all in advance. Adam78 (talk) 13:06, 5 August 2020 (UTC)

@Adam78 Fixed. Benwing2 (talk) 02:04, 6 August 2020 (UTC)

@Benwing2 Thank you very much! Adam78 (talk) 11:57, 6 August 2020 (UTC)

Information desk purged

When I went to WT:ID, it finished with WT:ID#Category:Chinese_lemmas, and didn't contain my answer to the previous post. When I went to WT:Information desk/2020/August, it had my reply, and a further two sections. I fixed this by purging the page. Is something broken, or is it normal to have to do this? --ColinFine (talk) 16:48, 5 August 2020 (UTC)

Template:RQ:Undertale

That template should probably use {{quote-video game}} instead. Glades12 (talk) 14:38, 7 August 2020 (UTC)

Done. It was easy. --Kriss Barnes (talk) 23:45, 12 August 2020 (UTC)

{{ca-adj}}

I can't specify a |mpl= parameter, but the masculine plural of adjectives in -gen is -`gens. E.g. for endogen it's endògens and not the automated *endogens. Ultimateria (talk) 07:19, 8 August 2020 (UTC)

@Ultimateria I looked at the code, and the |pl= parameter should work for this purpose, despite the documentation. Benwing2 (talk) 16:57, 8 August 2020 (UTC)

Old Saxon, Old Swedish, and Macedonian links to &mdash;

Some inflection templates are linking to —, the &mdash; entity. If you click the link you are prompted to create a page &mdash without a semicolon. Examples:

If the inflection table entry is — it should not be a link. Vox Sciurorum (talk) 17:05, 8 August 2020 (UTC)

@Vox Sciurorum I fixed Old Saxon. Each of these cases has to be handled individually. Benwing2 (talk) 22:28, 8 August 2020 (UTC)
I fixed the Macedonian templates. — Eru·tuon 18:42, 9 August 2020 (UTC)
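
The general guard being asked for in this thread, namely not linking a cell that only holds a dash placeholder, can be sketched as follows. The real inflection templates are written in wikitext or Lua and were each fixed individually; this Python function, its name, and its set of placeholder values are assumptions made for illustration only.

```python
EM_DASH = "\u2014"
# Values treated as "no form"; this set is an assumption, not taken from any template.
PLACEHOLDERS = {EM_DASH, "&mdash;", "-", ""}


def link_cell(form: str) -> str:
    """Return a wikilink for a real form, or a plain dash for a placeholder."""
    if form.strip() in PLACEHOLDERS:
        return "&mdash;"  # unlinked entity, rendered as a dash
    return f"[[{form}]]"
```

Checking for the placeholder before building the link is what prevents the prompt to create a page named after the entity.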

Edit notice in language data modules

I've added an edit notice to the regular language data modules (Module:languages/data2, Module:languages/data3/a, etc.). It pops up when you edit the modules. It's basically an abbreviated version of the documentation page and it explains the various data items. It's still a bit long, so language data editors, let me know if that's annoying. And any suggestions for improvements would be welcome. — Eru·tuon 23:15, 8 August 2020 (UTC)

It's great! I edit those modules rather frequently, and I didn't even know that otherNames was deprecated. —Μετάknowledgediscuss/deeds 23:34, 8 August 2020 (UTC)

Edit notice in label data modules

I've added an edit notice to Module:labels/data, Module:labels/data/regional, Module:labels/data/subvarieties, Module:labels/data/topical that explains the data keys. These modules provide label data for {{label}} ({{lb}}) and {{term-label}} ({{tlb}}). The notice displays when you edit the module. Suggestions for improvements are welcome. — Eru·tuon 21:12, 10 August 2020 (UTC)

Error message when trying to use advanced search features.

I often search for words within a specific category only, but today, after I had been searching that way for a while, I got this message when going on to the next page of results:

"A warning has occurred while searching: Deep category query returned too many categories"

An absurd message, as I was only searching one category (Category:English lemmas). Moreover, I search that category all of the time.

Now I seem blocked for the time being from using the advanced search features, because if I do (no matter what I put in) it spits out that error message. Tharthan (talk) 05:46, 11 August 2020 (UTC)

@Tharthan: I guess you mean a query like deepcategory:"English lemmas". I get the same error message. (I had never used the feature before.) Maybe someone has added more subcategories of Category:English lemmas recently, but I don't see any in RecentChanges. It's probably something none of us admins can fix (unless by deleting categories?); it would be worth asking the folks at #wikimedia-tech on Freenode though. — Eru·tuon 07:47, 11 August 2020 (UTC)
Oh, good. I'm an IRC person, so I'd be glad to do that. Tharthan (talk) 08:13, 11 August 2020 (UTC)
I submitted a Phabricator task, after a discussion with one of the folks at #wikimedia-tech. I can link that here if desired, or not. It's up to you (plural). Tharthan (talk) 15:43, 11 August 2020 (UTC)
I don't know why one wouldn't link to Phabricator; I always do when I submit a task. But I found your task: phab:T260152. — Eru·tuon 18:39, 11 August 2020 (UTC)

Problem with Tea Room page

The bottom of WT:TR only shows a single discussion in August. Clicking on the August 2020 link takes one to the August 2020 subpage. I'm not one to be trusted to fix that, no matter how trivial it may be. DCDuring (talk) 23:59, 11 August 2020 (UTC)

Fixed now. I must have been looking at the page in the middle of a technical correction. DCDuring (talk) 00:01, 12 August 2020 (UTC)
I had a similar problem on the Info Desk last week, @DCDuring:, and it only went away when I purged the page. I asked (above) whether this was normal or a fault, but nobody replied. --ColinFine (talk) 17:47, 13 August 2020 (UTC)
I may have caused both problems. I was trying to insert missing months headers using the old anybody-who-knows-wikitext-can-do-it method. I didn't realize that most discussion pages with their new, improved structure need monthly updating using new methods, which I wouldn't even know where to look to learn. The upshot is that, when I notice missing months headers, all I have to do and all I can do is whine about it. DCDuring (talk) 01:04, 14 August 2020 (UTC)

Module:th-pron

Generates badly formed HTML. Namely it's putting a DIV outside of the TD that should wrap it.

Suggestions on how a contributor is SUPPOSED to repair it when there's virtually NO documentation on how it SHOULD be done? Thanks? ShakespeareFan00 (talk) 23:13, 12 August 2020 (UTC)

This was mentioned over 2 years ago, at Wiktionary:Grease_pit/2018/December#Template:th-pron, and nothing happened? ShakespeareFan00 (talk) 12:39, 13 August 2020 (UTC)
Looking at the code in a sandbox, it seems the 'problem' is that the code used to generate the rows of the relevant table apparently doesn't generate the tags in the right sequence, or certain code is getting moved by a flawed 'tidy' up. ShakespeareFan00 (talk) 12:42, 13 August 2020 (UTC)
There is now a sandbox version that does not generate the LintErrors. Could someone swap it in? ShakespeareFan00 (talk) 15:13, 13 August 2020 (UTC)

Lua error: not enough memory

Hello, if you look towards the later part of , the templates seem to be erroring out. Opencooper (talk) 02:16, 17 August 2020 (UTC)

Yes, there are currently about 30 entries in Category:Pages with module errors that we haven't been able to do anything about. There may be a few that we can fix, but the Chinese character entries use some very memory-intensive templates that are hard to do without. Chuck Entz (talk) 02:56, 17 August 2020 (UTC)
@Chuck Entz: don't we have a practice of moving some parts of such pages to subpages, with a link on the main page? We could create a template for displaying on the main entry page to explain the situation. — SGconlaw (talk) 07:28, 17 August 2020 (UTC)
@Sgconlaw: So far we only do that for translations of English entries, at least as far as I know. —Mahāgaja · talk 14:30, 17 August 2020 (UTC)
@Chuck Entz, Mahagaja: it seems quite weird to have some entries replete with error messages and do nothing about it. The best stopgap solution seems to be to shift some content to one or more subpages and have a message on the main entry page explaining why this has been done. I would suggest that the language headings (in the above case, "Japanese", "Korean", "Old Japanese", and "Vietnamese") remain on the main page so that readers are aware that there are entries for these languages, and the substantive content be moved to subpages called "紅/Japanese", "紅/Korean", and so on. — SGconlaw (talk) 14:40, 17 August 2020 (UTC)
In fact, just moving the Japanese section to a subpage may suffice. — SGconlaw (talk) 14:43, 17 August 2020 (UTC)
@Sgconlaw: One of the difficulties is that multiple templates rely on the title. So if you move the Japanese section of to 紅/Japanese, 紅/Japanese will be used in multiple places where is expected, most notably displayed in the headword-line templates. (There are other invisible places; for instance, some of the templates insert /Japanese into category sortkeys, as you can see with Special:ExpandTemplates.) There might be a workaround for all the headword templates (a |head= or |pagename= parameter), but I don't know all the places where the title is used, and whether they can get by using the title with /Japanese appended. So it's hard to be sure that something won't break invisibly. — Eru·tuon 19:24, 17 August 2020 (UTC)
@Erutuon: maybe we should try copying the Japanese section to 紅/Japanese (i.e., not deleting it from the entry page yet), and see what issues crop up with the subpage. — SGconlaw (talk) 19:35, 17 August 2020 (UTC)
@Sgconlaw I can guarantee that it will break some things. The problem with our CJKV templates is that they get a lot of information from entry wikitext: basic things like transliteration. If you move the wikitext, you would have to change the code to be able to find it. Also, I know of at least one template that checks header levels on its own page and will throw a module error if anything isn't right, even in other language sections. If you ask me, some of the code is a bit too dependent on programming tricks, but I'm not sure how to change anything without wreaking havoc on a lot more pages than the few that have problems now. If you preview one of these pages with the module errors, you'll see an enormous list of transclusions: data modules, submodules, etc., but also a long list of entry names. Any time a module loads the wikitext from an entry, it shows as a transclusion on the list (it says "templates used in this entry", but it's really any kind of transclusion).
At any rate, I don't edit the CJKV pages except to fix obvious general problems, so it's not for me to say what should be done. You would have to ask people like @Justinrleung, Eirikr or @Suzukaze-c who actually work with all of these templates. These may be memory- and processor-hogs, but they do stuff that the CJKV communities depend on. Chuck Entz (talk) 04:19, 18 August 2020 (UTC)
Oh dear. I don't edit these pages either, but it seems quite strange that nothing can be done to solve this problem. It means that the efforts made by editors in maintaining these entries are fruitless, because most of the entry pages cannot be read. Perhaps those who work on the CJKV templates can look into adjusting them so that they ignore everything after the oblique ("/") in the pagename, which would enable them to be used on subpages? — SGconlaw (talk) 13:01, 18 August 2020 (UTC)
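
The "ignore everything after the oblique" idea can be sketched as below, in Python rather than in the Lua the CJKV modules actually use. Because some entry titles legitimately contain a slash, this hypothetical helper only strips the suffix when it is one of a known set of language subpage names; both the function name and the set are assumptions made for the sketch.

```python
# Subpage names proposed in this thread; a real list would come from language data.
LANGUAGE_SUBPAGES = {"Japanese", "Korean", "Old Japanese", "Vietnamese"}


def entry_title(page_title: str) -> str:
    """Return the entry title a template should use, dropping a language subpage suffix."""
    base, sep, suffix = page_title.rpartition("/")
    if sep and base and suffix in LANGUAGE_SUBPAGES:
        return base
    return page_title


print(entry_title("紅/Japanese"))  # 紅
```

This covers only the title lookup. It does nothing for templates that read wikitext from the main page or check its header levels, which are the harder problems raised above.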
I believe most of the memory consumption comes from the section "Derived terms" (or sometimes named "Compounds"). You can try moving only these sections to a subpage, which should be easy and not error-prone. -- Huhu9001 (talk) 20:00, 21 August 2020 (UTC)
Well, I was wrong. Only a small part of the memory consumption comes from "Derived terms". But my suggestion seems to work. See 一/derived terms. -- Huhu9001 (talk) 09:58, 22 August 2020 (UTC)
i encountered this problem looking for the Spanish word i. i mistakenly thought i had at least skimmed this section of the grease pit, but here it is, a lot of technical talk i don't understand. Nevertheless, here's what i posted to Talk:i, where i thought a "sandbox" might help diagnose the problem and/or trial run a solution.
Posting this to Talk:i and Wiktionary:Grease_pit#Lua_error:_not_enough_memory
i used CTRL+F to search the i article for "Lua error: not enough memory". Besides the "Requests for cleanup" header, the first search result i got was in Rapa Nui. i clicked "Edit" on the i article and used copy-paste on Rapa Nui and everything below it to "mirror" it on Talk:i. Looks like maybe splitting one page into two pages (or two tabs) might give each page enough memory to work? Maybe make i redirect to something like i/languages with names A-P with a hatnote like, "For i in other languages, see i/in languages with names R-Z"? --96.244.220.178 23:06, 19 September 2020 (UTC)

Cognate finder tool

https://www.reddit.com/r/linguistics/comments/ibqwtz/cognate_finder/ Justin (koavf) TCM 10:35, 18 August 2020 (UTC)

Interesting. This is a Python script that creates lists of words derived from a word in a common ancestor, and gets quite a few accurate results, using the HTML (not the most efficient method).
The method yields some false positives. The script looks for a link to the Wikipedia article for the common ancestor of the two languages (as in {{inh}}) and reports the next link as the ancestor word. Sometimes the link immediately following is not an ancestor (as in {{cog}}) or there is text intervening. This is why "Arabic" is reported as the Proto-Semitic ancestor of Amharic አለም (ʾäläm) and of Arabic م(m) and why there is a nonsensical link in the Amharic–Arabic cognates; both pages have a link to the Wikipedia article on Proto-Semitic, then some text, then a link to the Wikipedia article on Arabic. I didn't look for any false positives because of {{cog}}, but there might be some, depending on the languages you're looking at.
The intervening text thing might be fixable, but there will still be false positives because the HTML for {{inh}} and {{cog}} and {{noncog}} and {{der}} and {{bor}} is identical. To weed out {{cog}} and {{noncog}}, which definitely don't indicate ancestors, you have to look at the wikitext instead of the HTML, which requires rewriting the entire script. (Or we could make the HTML output of these templates distinct in some way.) — Eru·tuon 19:13, 18 August 2020 (UTC)
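
To show why looking at the wikitext helps, here is a minimal Python sketch that tells the etymon-bearing templates apart from {{cog}} and {{noncog}} by name. It is not a rewrite of the Reddit script. It assumes the usual positional layout `{{inh|language|source language|term}}` and does not handle nested templates; the language codes and terms in the usage line are placeholders, not real data.

```python
import re

# Templates whose third positional argument names a source-language term.
ETYMON_TEMPLATES = {"inh", "der", "bor"}
TEMPLATE = re.compile(r"\{\{(inh|der|bor|cog|noncog)\|([^{}]*)\}\}")


def etyma(wikitext: str, source_lang: str) -> list[str]:
    """Collect terms given as etyma from source_lang, ignoring cog/noncog."""
    found = []
    for name, args in TEMPLATE.findall(wikitext):
        if name not in ETYMON_TEMPLATES:
            continue  # {{cog}} and {{noncog}} never indicate ancestors
        positional = [a for a in args.split("|") if "=" not in a]
        if len(positional) >= 3 and positional[1] == source_lang and positional[2]:
            found.append(positional[2])
    return found


print(etyma("From {{inh|xx|yy-pro|*foo}}; compare {{cog|zz|bar}}.", "yy-pro"))  # ['*foo']
```

Because the template name is visible in the wikitext, the false positives described above (a cognate or unrelated link being taken for the ancestor) do not arise in this approach.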

#* #*

I believe we have, or at least had, a list of all entries containing #* #*. Such crappy formatting usually came about from Wonderfool errors, but even DCDuring (talk · contribs) has been known to be guilty of such a misdemeanour. Can we generate a list or something? WT:Todo/Double hash star would be a fine name... --Kriss Barnes (talk) 18:16, 18 August 2020 (UTC)

@Kriss Barnes: Created. I regret to report that the list is very short. ☹️ — Eru·tuon 19:56, 18 August 2020 (UTC)
Great. All cleaned up in less than a minute! How about a list of templates that start with RQ: and that include #*, like Template:RQ:Wonder Fool? I have a feeling I may have created a couple of these a while ago... --Kriss Barnes (talk) 20:01, 18 August 2020 (UTC)
Huh, excluding documentation pages, only four results in the whole template namespace: Template:Buyla-inscription, Template:nquote, Template:vote-checkuser-intro, Template:vote-sysop-intro. Apparently you didn't. — Eru·tuon 21:03, 18 August 2020 (UTC)
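
A list like the one created above could be produced from page wikitext with a check along these lines. This is a Python sketch, not the method actually used for WT:Todo/Double hash star; it assumes "double hash star" means a line that opens with a quotation marker (`#*`) immediately followed by a second one, and the function name is hypothetical.

```python
import re

# A quotation marker, optional whitespace, then another quotation marker.
DOUBLE_HASH_STAR = re.compile(r"^#+\*+\s*#+\*")


def double_hash_star_lines(wikitext: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs for lines that start with a doubled '#*'."""
    return [
        (number, line)
        for number, line in enumerate(wikitext.splitlines(), start=1)
        if DOUBLE_HASH_STAR.match(line)
    ]
```

Running it over every page in a dump and keeping the titles with at least one hit would give the to-do list.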

Removing symbols.

I'm fairly new to Wiktionary. I decided to add a definition for the term 'armour plate', as it had none yet. I couldn't seem to remove the visible '[ ]' symbols without getting a warning. —⁠This unsigned comment was added by 197.245.108.209 (talk).

You can see the changes needed to fix formatting at Special:diff/60120778. As for the content, if it is no more than the sum of armour and plate we generally don't create "sums of parts" entries. Vox Sciurorum (talk) 22:52, 21 August 2020 (UTC)

Old English verb conjugation (inflection) template of slēpan 'to sleep'.

I shaped my own verbal inflection template for the Anglian variant of 'to sleep' in Old English, which, when I tried to submit it, was immediately flagged as "abusive". There's a 'request for inflection' on the page, so have I missed something? If so, what? Is that verb somehow barred from receiving an inflection template, and if so, why that verb and not slāpan as well (one can make a case for it being far less common)? As Ringe (2014) points out, long æ was raised to ē everywhere except in Wessex. He also states that in Wessex long æ changed to ā in many words but was quickly reverted to its æ sound by the West Saxons. Our modern form of the word comes from slēpan, so couldn't it be beneficial to have an inflection table for the verb that is the etymon of the contemporary descendant? —⁠This unsigned comment was added by 2602:306:CF6F:520:BC0F:CA63:53D6:B460 (talk) at 16:38, 23 August 2020 (UTC).

The reason an abuse filter rejected your edit was that it contained triple braces ({{{), which would retrieve template parameters. They shouldn't be used in an entry. I see that just copying the conjugation template from slæpan and changing the infinitive doesn't work. User:Benwing2, is there a way to use {{ang-conj}} for slēpan? — Eru·tuon 16:54, 23 August 2020 (UTC)
@Erutuon I can change the module to support slēpan and potentially other Anglian forms. How are they conjugated? The West Saxon conjugation as implemented in the module has past sg/pl slēp or slǣpte, past part slǣpen. Would the Anglian equivalent have an ē throughout? Other possible Anglian variants are e.g. faldan, fallan, walcan etc. in place of WS fealdan, feallan, wealcan etc. WS has past ēo in these verbs but Anglian might smooth it to ē. Wikipedia under Anglian smoothing says it should occur in walcan (wēlc not #wēolc) but not in the others, although I can't verify this. Note that the module handles class 7 strong verbs as a large number of special cases rather than a unified verb class, because there are so many exceptions and variants. Benwing2 (talk) 21:06, 23 August 2020 (UTC)

Regexes in Lua

Are there any regex libraries for Modules that can be used? Or are we constrained by Lua's limited Patterns? Kritixilithos (talk) 15:37, 25 August 2020 (UTC)

@Kritixilithos: No, nothing built into the Scribunto extension at least, which is the only way to get a regular expression library like PCRE written in an efficient language such as C. Someone posted a request on Phabricator, but it was declined. And we haven't used any regular expression libraries written in Lua either, which is probably a good thing because they are likely to be inefficient in memory usage and cause even more module errors. — Eru·tuon 17:05, 25 August 2020 (UTC)
Oh well, can't even use alternation, I will have to come up with a work-around. Thank you for your reply. Kritixilithos (talk) 18:05, 25 August 2020 (UTC)
@Kritixilithos I have gotten used to using Lua patterns for the various conjugation/declension modules I've written. It's true there is no alternation, but I haven't found that to be as limiting as you might think. If you have a case where you really need alternants, you just have to use multiple pattern matches, but most of the time either it doesn't come up or there are ways to rewrite it. Benwing2 (talk) 08:18, 27 August 2020 (UTC)
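
The "multiple pattern matches" workaround can be sketched as follows. The example is in Python for consistency with the other sketches on this page, so it only mirrors the shape of the approach: each pattern is kept free of alternation, as a Lua pattern would have to be, and the patterns are tried in turn. In a module the same loop would call Lua's own matching function on each pattern. The endings and the function name are invented for illustration.

```python
import re

# Each pattern is alternation-free; together they stand in for (ar|er|ir)$.
ENDING_PATTERNS = (r"^(.*)ar$", r"^(.*)er$", r"^(.*)ir$")


def stem(verb: str):
    """Return the stem for the first ending that matches, or None."""
    for pattern in ENDING_PATTERNS:
        match = re.match(pattern, verb)
        if match:
            return match.group(1)
    return None


print(stem("hablar"))  # habl
```

When the alternatives differ by a single character, as here, a character set such as `[aei]r$` does the job in one pattern, which is one of the ways of rewriting the match so that alternation is not needed at all.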

Request change to {{R:pt:Infopédia}}

Hi! I would like to request, if no one opposes it, the addition of a new optional parameter to {{R:pt:Infopédia}}, where, if activated, the root of the url would change from https://www.infopedia.pt/dicionarios/lingua-portuguesa/ to https://www.infopedia.pt/dicionarios/toponimia/. This would be quite useful to source toponyms, and I don't think creating an entirely new template is worth it. - Sarilho1 (talk) 21:02, 28 August 2020 (UTC)
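
The requested switch amounts to choosing between the two URL roots quoted above. A Python sketch of that logic follows; the template itself would be written in wikitext or Lua. The parameter name `toponym` is hypothetical, and the sketch assumes the looked-up term is simply appended to the root, which has not been checked against the site.

```python
from urllib.parse import quote

BASE = "https://www.infopedia.pt/dicionarios/"


def infopedia_url(term: str, toponym: bool = False) -> str:
    """Build a dictionary URL, switching to the toponymy section on request."""
    section = "toponimia" if toponym else "lingua-portuguesa"
    return f"{BASE}{section}/{quote(term)}"


print(infopedia_url("Lisboa", toponym=True))
# https://www.infopedia.pt/dicionarios/toponimia/Lisboa
```

Keeping the default at the general dictionary means existing uses of the template would be unaffected by the new parameter.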

fixing the continent of East Timor

Would someone with access please fix the parent category of East Timor in Module:place/shared-data, changing it from Africa to Asia? (Cf. CIA Factbook for reference.) Thank you. Adam78 (talk) 17:01, 30 August 2020 (UTC)

Done. Μετάknowledgediscuss/deeds 18:47, 30 August 2020 (UTC)

Thank you. One more thing: Kazakhstan is generally considered more of an Asian country than a European one (even if it's transcontinental, see reference), so its category should be moved to Asia in the navigation tree/path (that starts from Fundamental). Adam78 (talk) 00:43, 31 August 2020 (UTC)

I don't know how to do that. I tried flipping the order, in case that works, but I doubt it will. —Μετάknowledgediscuss/deeds 04:01, 31 August 2020 (UTC)

It seems it did work! Category:en:Kazakhstan Thanks! Adam78 (talk) 11:23, 31 August 2020 (UTC)

September 2020

TTT-ejo

Any way to categorize this Esperanto entry as a common noun? {{eo-head}} automatically makes it a proper noun because it's capitalized. Ultimateria (talk) 00:33, 2 September 2020 (UTC)

@Benwing2, could the module get a PoS override function? —Μετάknowledgediscuss/deeds 15:56, 15 September 2020 (UTC)
@Metaknowledge There already looks to be a |pos= argument to allow you to specify the part of speech, so you could just say |pos=noun. There's also a module Module:eo-headword/exceptions that lists a large number of "exceptional" words and specifies their part of speech. (Almost all of the exceptions are nouns and proper nouns in -to, and adjectives in -ta. I guess these are handled as participles by default.) So we could list TTT-ejo and all its non-lemma forms in that module, but this approach doesn't really seem scalable to me. Benwing2 (talk) 02:37, 16 September 2020 (UTC)
@Benwing2: Thanks. I guess the real issue here is that nobody bothered to document the template. —Μετάknowledgediscuss/deeds 05:04, 16 September 2020 (UTC)

No subject prompt for first post

Today, using my cell phone on the mobile version of this site, I created a new discussion page. However, there was no prompt for the subject of my new discussion. The prompt for the subject only occurs for follow-up posts.

There's no structure built in to Wiktionary talk pages, just conventions for how wiki syntax is used to format discussions when one is replying. A Wiktionary talk page isn't a discussion forum in the usual internet sense; it's a space to give information or ask/answer questions about the page it's attached to. Chuck Entz (talk) 14:17, 2 September 2020 (UTC)

Label deactivated?

The label {{lb|en|British spelling}} isn't working, so maybe someone deactivated it. It is recommended for use in Category:British English forms (see the recommendation at the top of the page). This category has lost about half of its entries; there were over 1,000 previously. For example, see incentivisation, where the label has been added. DonnanZ (talk) 21:43, 2 September 2020 (UTC)

User:Mnemosientje removed it in this edit to Module:labels/data/regional. — Eru·tuon 21:50, 2 September 2020 (UTC)
Why? It looks like {{lb|en|American spelling}} and {{lb|en|Canadian spelling}} have been removed too. Can Mnemosientje delete himself? Can they be restored please? DonnanZ (talk) 22:00, 2 September 2020 (UTC)
@Mnemosientje: What the heck were you doing there? You left a lot of empty categories, like Category:Sinaloa Spanish and Category:Santomean Portuguese, and even removed the Yemen and al-Andalus; removing aliases["Bukowina"] = "Bukovina" was also done with little thought, since in a Polish or German context one could think of writing Bukowina, and people already have; there is no principle that all aliases be “English”, whatever that means for proper nouns. Why have North India and South Asia been replaced with North Korea and South Asia? The list goes on; I cannot spot everything. @Allahverdi Verdizade had to re-add a lot of things, although perhaps they belong in Module:labels/data/subvarieties, but you didn’t move anything to it. Five days ago I looked into the edit and did not discern everything; I thought things had been moved amongst a lot of additions in the one contribution without an edit summary, which is very bad practice. That page should be edited more in a git way. I am inclined towards restoring the previous version to allow for partial, commented, considerate amendments, rather than mashing everything into one massive commit without comment, disregarding architecture people have added for reasons. Fay Freak (talk) 00:33, 3 September 2020 (UTC)
I have literally no idea what happened there. Tried to add a category for Amsterdam Dutch. It succeeded, but somehow, well, that happened as well, without me being aware that it did. Perhaps I edited an older version, or some freak accident happened, but literally no part of that edit other than adding a category for Amsterdam Dutch was intentional. Believe me, I wouldn't make a huge edit like that to an important module out of the blue, let alone without any explanation - I am not super confident when it comes to module editing as it is. Tl;dr - mistake, not sure how it happened as I only remember adding a label for Amsterdam Dutch. Apologies for that, I am as puzzled as you are, and more than a little bit embarrassed... have reverted to previous version before my edit, and will now edit in Allahverdi's additions again. — Mnemosientje (t · c) 08:48, 3 September 2020 (UTC)
Should be fine again now, including the things Allahverdi added (most of his edits were restorations of stuff removed in my edit, but some new aliases and labels were added too). — Mnemosientje (t · c) 09:05, 3 September 2020 (UTC)
Figured out what happened, although I'm still not sure why/how: I accidentally edited the version of my last edit to that page at the time, effectively reverting the page to its February 2019 state. Guess I was editing a bit too absent-mindedly (as suggested by the incorrect alias I copy-pasted in as well, lmao.). — Mnemosientje (t · c) 09:28, 3 September 2020 (UTC)
@Mnemosientje: Heh, that's happened to me a few times. Good thing User:Donnanz noticed something was wrong. — Eru·tuon 00:43, 6 September 2020 (UTC)

@Mnemosientje: The easiest option for restoration may be reverting your edit. DonnanZ (talk) 08:43, 3 September 2020 (UTC)

Categories can take a long time to recover though. Category:British English forms is still repopulating itself over 48 hours later. DonnanZ (talk) 10:34, 5 September 2020 (UTC)

Alternative to R2A/A2R

There are close to a hundred pages in Category:ParserFunction errors. Without even checking, I can tell you that the overwhelming majority of those are from WF entering Arabic numerals in quote templates that require Roman numerals. As much as I hate coddling people with short attention sQUIRREL!!pans, wouldn't it be better to have a template that will give the same output regardless of whether the input is Roman or Arabic numerals? After all, quote templates are for formatting quotes, not testing whether people can follow directions. Chuck Entz (talk) 07:54, 3 September 2020 (UTC)
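The idea of one conversion that accepts either kind of numeral can be sketched as below. This is a minimal illustration in Python, not the wikitext or Lua that an actual template would need. The function names are hypothetical, and the sketch only upper-cases Roman input without checking that it is a well-formed numeral.

```python
import re

_ROMAN_VALUES = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(number):
    """Convert an integer in the range 1..3999 to a Roman numeral."""
    if not 0 < number < 4000:
        raise ValueError("out of range for this sketch: %r" % number)
    parts = []
    for value, symbol in _ROMAN_VALUES:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def normalize_to_roman(raw):
    """Return a Roman numeral whether the input is Arabic or Roman."""
    text = raw.strip()
    if re.fullmatch(r"[0-9]+", text):
        return to_roman(int(text))
    if re.fullmatch(r"[IVXLCDMivxlcdm]+", text):
        # Assumption: Roman input is passed through, only upper-cased.
        return text.upper()
    raise ValueError("neither Arabic nor Roman numeral: %r" % raw)
```

With something like this, a chapter given as `14`, `xiv` or `XIV` would all be displayed the same way, and only input that is neither kind of numeral would land in an error category.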

Wikidata and Wiktionary

How well-linked is Wiktionary to Wikidata? I assume pretty badly, as even Wikidata:yes isn't linked to the Wiktionary page yes. I was thinking about, for example, linking between Wikidata and some quotation templates, like I have done to the Rip Van Winkle template, whilst also linking thereto from the Wikidata item (which I haven't done - one of my new year's resolutions was to never touch Wikidata). What do you think? Too much fiddly work for very limited, if any, improvement? --Java Beauty (talk) 10:41, 3 September 2020 (UTC)

Template:ja-infl-demo is triggering WT:NORM tags

As far as I understood, this template uses asterisks (*) and line breaks in one of its parameters. But a combination of these two characters may sometimes violate WT:NORM#Lists ("One space after a sequence of #, *, : or ; at the start of a line."), as seen in the entry . -- Huhu9001 (talk) 17:35, 6 September 2020 (UTC)

Italics in ux

Hi all. Can anyone give me some guidance on this? Are we supposed to put italics in the ux or not? E.g. (without bold for simplicity) Is it:

{{uxi|de|''Übung macht den Meister.''|Practice makes perfect.}}

or:

{{uxi|de|Übung macht den Meister.|Practice makes perfect.}}

I'm confused because Wiktionary:Entry_layout#Example_sentences provides this advice: "be italicized, with the defined term boldfaced", whereas Wiktionary:Example_sentences provides this advice:

{{ux|fr|Example sentence (not in italics), with '''bonjour''' made bold.|translation=An English translation of the sentence with '''hello''' in bold.}}

Which is it?

Thanks. -- Dentonius (talk) 04:41, 7 September 2020 (UTC)

It's 2. {{ux}} carries its own italics (I assume the second link is advising people not to add italics manually?). (And if we wanted {{uxi|de|''Übung macht den Meister.''|Practice makes perfect.}} every time, we could just incorporate it automatically in the template itself!) —Suzukaze-c (talk) 04:45, 7 September 2020 (UTC)
@Suzukaze-c, Thanks! -- Dentonius (talk) 04:51, 7 September 2020 (UTC)

Unlinking "and" and "the" in part-of-speech templates

Templates like {{en-verb}}, {{en-adv}}, and {{en-noun}} automatically link all terms in a phrase. Therefore, the headers for phrases like "duck and cover", "to and fro", "cause and effect", "go to the dogs", and "king of the hill" end up linked as duck and cover, to and fro, cause and effect, go to the dogs, and king of the hill. Given the substantial usage of the words "and" and "the" in phrases for all parts of speech, and how common these words are in the English language, can we have these templates not link these particular words? bd2412 T 17:58, 7 September 2020 (UTC)

Why? They're part of the etymology of the word. king of the hill is king + of + the + hill. My observation has always been that we link the words in the POS headers so that we don't have to specify it in an etymology section. While I'll concede the actual usefulness of linking to words like of and the is a bit minimal, it's not completely devoid of usefulness. For example, what if you wanted to look at the entry for of and see which sense of of this particular connection would cover? Now you might say you could just look up the word of in the search bar yourself if you really wanted to know, but if that's the case, why not just forget the linking in the POS header altogether, since you could look up king and hill yourself without the links, too? Furthermore, even disregarding that, it introduces a bit of an inconsistency in our entries, which I dislike. I am sorry, but I will have to oppose this change, as a person who's actually been going out of my way to undo people's unlinking of these common words when I see this happen. PseudoSkull (talk) 18:58, 7 September 2020 (UTC)
Yeah, oppose. PUC – 19:20, 7 September 2020 (UTC)
Support I already routinely omit determiners, conjunctions, and clitics and sometimes pronouns and prepositions in {{&lit}}. I also sometimes create custom inflection lines which omit links to common stopwords and also link to component MWEs. Having a, an, and the as unlinked terms would make it easier to focus the user on the important words instead of having them face a blue wall of uniform typography on the inflection line. Should these words turn out to be especially important in a particular case, a conscientious contributor could create a custom inflection line that provided the link(s). We could also create default senselinks to the appropriate definitions of one, someone, and similar dummy words for lemmas that use these in the headword. I also call the attention of those opposing the proposal to the saying of Ralph Waldo Emerson: "a foolish consistency is the hobgoblin of little minds, ..." DCDuring (talk) 22:25, 7 September 2020 (UTC)
  • I would also note that taking a reader to and without taking them to the specific definition on this lengthy page is not particularly helpful. The same can probably be said of a lot of these common words that happen to have lengthy pages with numerous senses, but for which the meaning to the typical English speaker will be elementary. bd2412 T 00:55, 8 September 2020 (UTC)
    If an MWE uses only a single rare or obsolete sense of a highly polysemic component word, it might make sense to link to individual sense-ids, though this would require manual effort. In most other cases it either doesn't save the users much effort (as when all the definitions fit on a single screen) or is simply not feasible (as when more than one definition might be applicable).
In the case at hand one and someone have a specific definition of their role as a placeholder that applies in virtually all of the lemma headwords that contain them. DCDuring (talk) 14:47, 8 September 2020 (UTC)
  • The aesthetics are apparently debatable, but there is another point: I would also rather have these words unlinked if one could assume that it would increase the performance of pages (MediaWiki checking whether a page is present to decide about link-colour), but it seems like it also has adverse effects as we would have to maintain language-specific lists of omittable words, which the software would need to load. In sum it is perhaps also better not done just to avoid the maintenance work and the possible dispute points in regard to what is omittable. There are better issues to fix. Fay Freak (talk) 00:23, 10 September 2020 (UTC)

Template creation needed?

I'm unfamiliar with how to create/edit templates, but I'm encountering the need to do so. Both the {{cy-noun}} and {{cy-noun/new}} templates for Welsh nouns allow parameters for a second possible plural form, but the issue here is that some Welsh nouns have more than two possible plural forms (e.g. the plural of drwm is drymiau, drymau, drwmau, or drwmys), which the current template doesn't allow for. It looks like that template is protected so I'll probably want to make my own as a workaround, but I figured I'd ask here first before diving into the world of template creation. Guitarmankev1 (talk) 20:24, 8 September 2020 (UTC)

We don't need a new template, we just need to edit the existing ones to accommodate more plural forms. —Mahāgaja · talk 20:40, 8 September 2020 (UTC)
Should be very easy to do, but it appeared that the existing template is locked and unable to be edited... any way around that? Guitarmankev1 (talk) 20:42, 8 September 2020 (UTC)
@Guitarmankev1: {{cy-noun}} is editable by autoconfirmed users, which apparently you're not, even though you've made over 400 edits and have been here for almost 5 months. I don't know how getting autoconfirmed works. {{cy-noun/new}} is editable by anyone, and I've already edited it to accommodate up to 5 plural forms. —Mahāgaja · talk 20:49, 8 September 2020 (UTC)
Apparently I can edit it, so I must be autoconfirmed. I must've been preemptively scared off by the big yellow warning telling me that only autoconfirmed users can edit the template, haha. Thanks! Guitarmankev1 (talk) 20:54, 8 September 2020 (UTC)

Line gone missing

I'm sure I'm not dreaming this up - there has always been a line activated by ---- appearing between languages on pages with more than one language. Now the line can't be seen, yet ---- hasn't been removed from entries. The Nairobi page is a good example of this. DonnanZ (talk) 22:07, 9 September 2020 (UTC)

I see the line. Java Beauty (talk) 22:12, 9 September 2020 (UTC)
There still is the auto-generated line under the language header, I meant the border line between languages. DonnanZ (talk) 09:39, 10 September 2020 (UTC)
Yeah, Justinrleung brought this up on Discord. It seems to be because of this edit to a CSS file in the MediaWiki source code, specifically this line which assigns a height of 0 to the hr tag created by ----. It might be based on the new recommendation that this tag be used for indicating a "thematic break". But it's a bit unhelpful for us since we currently use it for a horizontal line. One solution is to override this CSS with hr { height: 1px; } in MediaWiki:Common.css but it's probably better to change any uses of ---- to use a border or something like that instead, because otherwise we'll be working at cross purposes with MediaWiki.
For language headers, I think a good solution would be to replace hr tags (----) with a top border, as was discussed years ago. I have personal CSS based on the CSS in that discussion that does this:
.ns-0 h2:not([class]) ~ h2:not(:first-of-type) {
	padding: 0.5em;
	border-top: 1px solid #aaa;
	margin-top: 1em;
}
For {{zh-pron}}, the fix was easy because the ---- has a very particular location, inside a table. But I wouldn't want to unilaterally do the same for all ---- everywhere and risk breaking something. — Eru·tuon 22:26, 9 September 2020 (UTC)
@Erutuon: I experimented with == == instead of ----, which generates a line, but it appears one line further down than previously, which is actually better I think. DonnanZ (talk) 23:12, 9 September 2020 (UTC)
Nah, empty headers are weird. I finally added the CSS above to MediaWiki:Common.css, so there will be lines above language headers again. (This means we could stop adding ---- between language sections, as far as appearance is concerned.) If anyone notices problems, please let me know. — Eru·tuon 18:34, 10 September 2020 (UTC)
For contributors the absence of a separator visible in the edit window makes it a bit harder to know where one L2 section ends and another begins. It can matter when getting oriented in the edit window. What happens to the existing separators (----)? DCDuring (talk) 18:47, 10 September 2020 (UTC)
Yeah, I guess the separator can be kept for that reason, even if it doesn't function any more. As for my "brainwave", I thought afterwards it might fall foul of WT:NORM or something silly. Anyway, thanks for fixing it. DonnanZ (talk) 19:10, 10 September 2020 (UTC)

@Erutuon: I noticed when editing this morning this dividing line has disappeared again; see kilat for example. DonnanZ (talk) 10:40, 12 September 2020 (UTC)

@Donnanz: Yesterday they made ---- visible again and so I removed the CSS that added a top border to language headers. I still see the ----; has it reappeared for you again? — Eru·tuon 18:00, 12 September 2020 (UTC)
@Erutuon: No, not yet. I checked both kilat and Nairobi. DonnanZ (talk) 18:05, 12 September 2020 (UTC)
I don't see it either (Firefox on a Mac). I copypasted the link into Safari and got the same results, and I haven't used Safari for Wiktionary in eons. Chuck Entz (talk) 19:18, 12 September 2020 (UTC)
@Erutuon: I see it is visible again this morning. DonnanZ (talk) 09:19, 14 September 2020 (UTC)

Warn when closing tab with unsaved content?

I've switched to a new computer: same setup (latest Chrome on Win10 64-bit), but if I am creating a new Wiktionary entry and I close the tab, it no longer warns me that I will lose work: it just closes the tab and loses what I was typing. Does anyone know how to turn on the warning setting? I thought it was the default. Equinox 20:27, 10 September 2020 (UTC)

@Equinox Do you have this setting: Preferences - Editing - Warn me when I leave an edit page with unsaved changes. Panda10 (talk) 21:28, 10 September 2020 (UTC)
Done. Thanks. That got unticked somehow. (I'm sure I didn't do it myself! I haven't touched settings for months.) Reticking fixed. Equinox 21:34, 10 September 2020 (UTC)

Request for personal prefix template for Ojibwe

I'm hoping someone knowledgeable in coding could create a (sub-)template for Ojibwe personal prefixes that I could insert into verb conjugation templates.

For context, Ojibwe verbs are inflected for both the subject (when animate) and the object (if applicable) and there are prefixes and suffixes to show that inflection. There are 4 verb classes and approximately 17 subclasses, each with its own paradigm of suffixes, of which there can be several hundred for the most complex paradigms.

The good news is that there are really only 3 prefixes, indicating 1st, 2nd or 3rd person (singular and plural are not differentiated in the prefixes). The bad news is that each prefix takes different forms in accordance with the first letter of the stem to which they are affixed. See an explanation here.

In short, each paradigm currently needs at least 6 templates based on whether the 1st person personal prefix is n-, ni-, nim-, nin-, nind- or nindo- (there are no verbs in Ojibwe that begin with ii so the 7th potential category is non-existent). See some examples here. My hope would be that I could just insert a code saying "insert appropriate 1st person (or 2nd or 3rd) prefix here." There are currently a bunch of templates for the simplest paradigms (intransitive verbs), but I see no hope of generating conjugations for transitive verbs if I have to recreate personal prefixes every time.

Eventually, this subtemplate could also be used for the inflection of noun possession but that can wait for another day. Thanks in advance for any help. SteveGat (talk) 20:32, 10 September 2020 (UTC)

Category for ideophonic terms

Is there some way to make the "ideophonic" label (e.g. 사르르) auto-generate a category, e.g. Category:Korean ideophones? The ideophones of Korean or Japanese can easily be analyzed as a word-class in themselves, and the fact that they don't have their own category is frustrating.--Karaeng Matoaya (talk) 10:24, 11 September 2020 (UTC)

NVM, I fixed the issue myself.--Karaeng Matoaya (talk) 13:52, 11 September 2020 (UTC)

Ideophones are not automatically categorized into Category:[Language] lemmas

I created a new part of speech for Template:ko-pos, "ideophones", to use for ideophonic adverbs like 반짝 (banjjak) or 딩딩 (dingding). Unfortunately, this seems to have blocked the automatic categorization of Korean lemmas into Category:Korean lemmas. This also happens for Bantu ideophones like balala or sombu.

Could somebody fix this please?--Karaeng Matoaya (talk) 06:06, 12 September 2020 (UTC)

The problem continues, unfortunately.--Karaeng Matoaya (talk) 15:55, 19 September 2020 (UTC)
@Erutuon? —Μετάknowledgediscuss/deeds 22:47, 21 September 2020 (UTC)
To make headword-line templates automatically add the lemmas category for a part of speech, it needs to be added to the data.lemmas list in Module:headword/data. But I don't feel qualified to make this edit because I'm not familiar with ideophones; none of the languages that I studied have them. Can a template editor or administrator who is familiar with it add it to data.lemmas? — Eru·tuon 23:02, 21 September 2020 (UTC)
Thanks, added. I'm not sure what being familiar with them has to do with adding them. —Μετάknowledgediscuss/deeds 23:05, 21 September 2020 (UTC)

Phrase ellipsis, three regular dots or two ellipsis characters (six dots)?

Hi all,

Concern A: I came across how do you say...in English and I'm ... year(s) old. The former has been moved to how do you say …… in English. After reading the page history, there seemed to be a rational explanation as to why two ellipsis characters (six dots) were used. Given that Wiktionary:Phrasebook provides an example with three regular dots (three separate characters), I'm confused about what the naming convention should be. Please advise. - Dentonius (talk) 08:09, 12 September 2020 (UTC)

Concern B: Most people cannot type the ellipsis character (…) without copying and pasting from somewhere else. Doesn't this limit the usefulness of Wiktionary as a tool for looking up words? What if a phrase starts with the ellipsis characters and the user wanted to look that up? It would likely only be found with great difficulty. - Dentonius (talk) 08:13, 12 September 2020 (UTC)

@Dentonius This is maybe more of a beer parlo(u)r issue, and you might get more traction posting it there. However, I agree with you that six dots seems a bit strange. The explanation "and two of them to mark the width of an average word, separated by spaces as usual" by User:Adam78 makes a certain amount of sense but was clearly a unilateral decision. The issue with an ellipsis character vs. three dots seems less of an issue than you might think; at least for me, if I type "I'm ..." with three dots, it autocompletes to the variant with an ellipsis character. Same thing happens if you start typing "..."; it autocompletes to the ellipsis character entry. Even using a single ellipsis character isn't completely standard; for example, there's what does XX mean and Appendix:X is a beautiful language. In addition, all the entries under Appendix:Snowclones use X, Y, Z, N, etc. For snowclones maybe this makes sense as it makes possible things like Appendix:Snowclones/I'm here to X A and Y B, and I'm all out of A. I think at least all the non-snowclone entries should use a single ellipsis character. Benwing2 (talk) 23:55, 12 September 2020 (UTC)

@Benwing2, Thanks for your input and advice! - Dentonius (talk) 00:01, 13 September 2020 (UTC)

Update: I started a conversation here: Wiktionary:Beer_parlour/2020/September#Phrase_ellipsis,_three_regular_dots_or_two_ellipsis_characters_(six_dots)? -- Dentonius (talk) 00:13, 13 September 2020 (UTC)

Module:transliteration/data

@Erutuon This junky module was formerly named Module:translations/data (totally wrong) and had a table in it called has_auto_translit that was unused and out of date. It still has a table needs_translit in it that's used by format_usex() in Module:usex to determine whether to add the page to CAT:Requests for transliteration of LANG. I think instead this should do this whenever the script is non-Latin, unless the language is missing a translit module. For example, Serbo-Croatian is conspicuously absent from the list, presumably related to the fact that it doesn't have a translit module and doesn't display transliterations for Cyrillic-script terms (intentionally, I think). Does this make sense? Benwing2 (talk) 23:42, 12 September 2020 (UTC)

@Benwing2: The list is pretty messy. Here's a set of cumulative tests on the languages in needs_translit that I thought were relevant. There are two languages that only have Latin script listed (though Forest Enets sometimes uses Cyrillic in Wiktionary entries according to User:Erutuon/scripts in link templates), and quite a few have a transliteration module and it overrides manual transliteration, which means that needs_translit will have no effect (unless the transliteration function returns nil?). Both groups could be deleted from the list, unless I'm missing something.
It's not as simple as excluding languages that don't have transliteration modules. For instance, Hebrew and Persian don't, but they do need manual transliteration. I wonder if it would be more economical to list the languages that use non-Latin scripts that don't need transliteration. — Eru·tuon 04:14, 13 September 2020 (UTC)
@Erutuon Ah right, I forgot about Persian and Hebrew. Thanks for writing the code in your sandbox. My original thought actually was exactly as you mention, essentially a blacklist rather than a whitelist. If you're OK with it, I'm thinking I'll see about adding a flag to those languages that have non-Latin scripts but don't need transliteration. I'll have to work out exactly which ones those are, though. Benwing2 (talk) 04:24, 13 September 2020 (UTC)
@Benwing2: Yeah, that sounds good to me. — Eru·tuon 05:39, 13 September 2020 (UTC)
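The blacklist approach agreed on above could look roughly like this. It is a sketch in Python rather than the Lua of the actual modules, and every name and value in it is hypothetical: the set names, the function, and the choice of `sh` as the exempted language code are assumptions for illustration, not the contents of Module:transliteration/data.

```python
# Hypothetical data, for illustration only.
LATIN_SCRIPTS = {"Latn"}

# Exemption list ("blacklist"): languages written in a non-Latin script
# that should not trigger transliteration requests.
NO_TRANSLIT_REQUEST = {"sh"}  # assumed code for Serbo-Croatian


def wants_translit_request(lang_code, script_code, translit=None):
    """Decide whether a usage example should be added to the
    requests-for-transliteration category."""
    if script_code in LATIN_SCRIPTS:
        return False          # Latin-script text needs no transliteration
    if lang_code in NO_TRANSLIT_REQUEST:
        return False          # language is flagged as exempt
    return not translit       # request only if none was supplied
```

Compared with a whitelist such as needs_translit, the exemption set only has to name the few non-Latin-script languages that opt out, so languages like Persian and Hebrew are covered without being listed.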

Help with auto-hiding Hindi conjugation tables (CSS issue)

@Erutuon Maybe you can help me? I created Module:hi-verb and copied the CSS from the existing Module:hi-conj written by User:AryamanA. On Safari the tables start out in the hidden state by default, but in Chrome they are in the open state by default. I'd like to know how to make them consistently be either hidden or open by default, but I'm not super familiar with CSS. I've looked at e.g. Template:ur-conj-head, which shows up as hidden by default on Chrome, but I don't see what it is in the CSS declarations that causes this behavior. Thanks! Benwing2 (talk) 06:30, 13 September 2020 (UTC)

@Benwing2: My guess would be that this is related to the "Show inflection" button under "Visibility" in the sidebar. It toggles all elements in that toggle category to one state or the other. The table in Module:hi-verb is assigned to the "inflection" category by data-toggle-category="inflection". The default state for visibility-toggled elements is hidden, so maybe you've pressed "Show inflection" in Chrome but not in Safari (or pressed it an even number of times in Safari but an odd number of times in Chrome through the whole time that your browser's local storage has been active). Template:ur-conj-head uses a different toggle category ("other boxes"), so it isn't necessarily in the same state as Module:hi-verb's table. To clean the storage for the Visibility section in the sidebar, you can execute delete localStorage["Visibility"] in your browser's JavaScript console. — Eru·tuon 06:45, 13 September 2020 (UTC)
@Erutuon: Thank you! I didn't even know about that feature of the sidebar. By default it said "Show other boxes" but "Hide inflection". Now it says "Show inflection" and the tables are indeed coming up hidden by default. Benwing2 (talk) 06:52, 13 September 2020 (UTC)

Generating accelerator entries for existing forms

@Erutuon Apologies for pinging you again. I have seen accelerator-assisted entries where the changelog message indicated that the user managed to override an existing page with a new accelerator entry. Do you have any idea how this is done? There are a lot of badly formatted Hindi non-lemma forms that I'd like to clean up. For me, accelerators only work for red links, which are colored green (and maybe for orange links? I haven't checked since enabling the orange link functionality). It's kind of painful and potentially dangerous to have to delete all the relevant pages before recreating them. Benwing2 (talk) 21:38, 13 September 2020 (UTC)

@Benwing2: Huh. I thought the gadget would only activate a link if the page didn't exist or if the Orange Links gadget had oranged it. Were these recent edits or could they have been done with an older version of the gadget that wasn't as careful? Or maybe the link was red because the page had just been created and the server hadn't gotten around to changing the link color. If the message was generated by the gadget, maybe you could search for it in current or past revisions. — Eru·tuon 00:36, 14 September 2020 (UTC)
@Erutuon: Unfortunately I don't remember the page this was on and it was a few weeks ago so there's no easy way to find it. If I somehow come across it I'll let you know. Benwing2 (talk) 03:18, 14 September 2020 (UTC)

(no suffix) at полымя

The inflection table has a bunch of forms that are given as "no suffix". That's probably an error. —Rua (mew) 13:03, 14 September 2020 (UTC)

@Rua Yes, and fixed. Thanks for pointing it out. Benwing2 (talk) 14:23, 14 September 2020 (UTC)

That template

What's that template we have to avoid typing :::::::::::::::::::: in a really long discussion, which brings the indentation back to the left? --Java Beauty (talk) 20:05, 14 September 2020 (UTC)

It's {{outdent}}. But you still have to write the colons into it, or the number of them... — Eru·tuon 21:56, 14 September 2020 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Thanks. "Outdent" is a good name for a dental clinic, too. --Java Beauty (talk) 22:25, 14 September 2020 (UTC)

Open call for bot work

I'm obsessed with formatting and I have no coding skills, so I've made a wishlist of bot work I would like to see done. If you have a bot, please consider knocking one or two tasks off the list. Possibly the most pressing one is converting ===External links=== to ===Further reading===. I'll be updating it as I encounter more tasks, so I'd appreciate it even if you just add the page to your watchlist. Thanks in advance to anyone who wants to help out! Ultimateria (talk) 22:03, 14 September 2020 (UTC)

@Ultimateria OK, many of these tasks are easy to do with a bot. I'll see about the External links -> Further reading change. Benwing2 (talk) 01:05, 15 September 2020 (UTC)
I did the adv= -> head= change in {{lt-adv}} and the External links -> Further reading change is running now. For some of the other requests, e.g. removing redundant params from {{*-noun}} and {{*-IPA}}, it would help if you specified exactly what "redundant" means. For the *-IPA templates, does this just mean the param is the same as the pagename? For the *-noun templates, it would help if you enumerated the rules exactly as they're currently implemented, because I need to implement those rules in order to check for redundancy. Benwing2 (talk) 02:02, 15 September 2020 (UTC)
Thank you! I'll reply on my sandbox's talk page soon. Ultimateria (talk) 02:07, 15 September 2020 (UTC)
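For anyone curious what the heading rename amounts to, here is a minimal Python sketch. It is not Benwing2's actual bot code; the function name `rename_external_links` is made up for illustration, and it assumes the heading sits alone on its line with the same number of equals signs on both sides.

```python
import re

# Matches a heading line such as "===External links===" at any level.
# [ \t]* is used instead of \s* so that no newline is ever swallowed.
HEADING = re.compile(r"^(=+)[ \t]*External links[ \t]*(=+)[ \t]*$", re.MULTILINE)

def rename_external_links(wikitext):
    """Rename well-formed 'External links' headings to 'Further reading'."""
    def repl(match):
        opening, closing = match.group(1), match.group(2)
        if opening != closing:
            # Unbalanced heading: leave it for a human to look at.
            return match.group(0)
        return opening + "Further reading" + closing
    return HEADING.sub(repl, wikitext)
```

Unbalanced headings are deliberately left untouched, so they can be listed and fixed by hand instead of being silently rewritten.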

request for some replacement by a bot administrator[edit]

We'd like to ask a bot admin to do some replacements for us: replacing "sg" with "isg" wherever "inflection of" and "mpos|poss" or "(multiple possessions)|poss" are found in Hungarian-language entries, more specifically in this temporary category, which collects entries with "n=sg" parameter value given to {{hu-infl-nom}}. (The goal is to move multiple-possession forms of nouns, like "my/your/… windows", from the singular column, where they were placed due to earlier programming limitations, into the plural column. Here is an example. isg is short for -i-type singular: singular from the perspective of the software but plural in meaning.)

I'd also like to ask this kind bot admin to replace the string "(multiple possessions)" with "mpos" and the string "(single possession)" with "spos" if possible, as it would take us a lot of time to do manually among these 5,000 entries (although some of them are not affected, such as proper names, most of which need singular-only in their own right). Thank you in advance. Adam78 (talk) 15:43, 15 September 2020 (UTC)

@Adam78 Done except for the following, which my bot rejected because they have multiple etymology sections in them:
Page 1227 egyenlítői: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 1541 festői: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 2345 kapunk: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 2400 kegyed: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 2733 lakom: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 2974 mártírom: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 3276 nyelvészetek: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 3694 régészetek: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 3733 rezsim: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 3992 szaruk: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 3993 szarunk: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Page 4865 zavarom: WARNING: Would make a change, but saw ==Etymology 1==, skipping
Benwing2 (talk) 05:33, 17 September 2020 (UTC)

Thank you so much! :) I've fixed these above. Adam78 (talk) 13:39, 17 September 2020 (UTC)
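The replacement and the skip rule described in this thread can be sketched roughly as follows in Python. This is not the bot's real code: the function name `update_hungarian_entry` is hypothetical, and the sketch assumes that the "sg" to be replaced is the `n=sg` parameter of {{hu-infl-nom}}, which is one reading of the request above.

```python
import re

SKIP_MESSAGE = "Would make a change, but saw ==Etymology 1==, skipping"

def update_hungarian_entry(wikitext):
    """Return (new_text, warning); warning is None when the page was processed."""
    # Pages with several etymology sections are left for manual editing.
    if "==Etymology 1==" in wikitext:
        return wikitext, SKIP_MESSAGE
    # Shorten the long tag names first, so one check below covers both spellings.
    new_text = wikitext.replace("(multiple possessions)", "mpos")
    new_text = new_text.replace("(single possession)", "spos")
    # Assumption: only pages with a multiple-possession form get n=sg -> n=isg.
    if "inflection of" in new_text and "mpos|poss" in new_text:
        new_text = re.sub(r"\|n=sg(?=[|}])", "|n=isg", new_text)
    return new_text, None
```

Skipping pages with multiple etymology sections matches the warnings listed above; on such pages only some sections may need the change, so a blanket replacement would be unsafe.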

Municipalities of Moldova[edit]

I created Category:Municipalities of Moldova and Category:en:Municipalities of Moldova but they are giving error messages. Could someone please integrate them into the corresponding Module page? Thanks, – Einstein2 (talk) 17:14, 15 September 2020 (UTC)

Module:place/shared-data currently provides "districts" for poldiv (recognized political division) for Moldova. Do you think it might be applicable? Adam78 (talk) 17:56, 15 September 2020 (UTC)
I'm sorry, I realized "municipalities" are meant more like towns. So the suggested categories do seem to be justified. Adam78 (talk) 00:33, 16 September 2020 (UTC)
@Einstein2, Adam78 Yes, that's right. The module automatically recognizes cities, towns, villages and rivers for all political entities, but most other sorts of political divisions are only recognized if specifically added to the module. You can see, for example, that "municipalities" are listed for Mexico (just above) and Montenegro (a few lines down). I added "municipalities" to Moldova; Wikipedia says there are 13 of them in Moldova. Benwing2 (talk) 03:37, 16 September 2020 (UTC)
@Benwing2 Thank you! – Einstein2 (talk) 05:49, 16 September 2020 (UTC)

request for checking usage of a template[edit]

We badly needed inflection parameters for {{archaic form of}} so I made a change to its code. It shouldn't involve any difference in display in most cases, except if the translation was given somewhere with the fourth unnamed parameter instead of using t=. Could someone possibly check if this template is invoked anywhere with more than three (but fewer than five) unnamed parameters and change the fourth parameter there to t=? (Fewer than five: to [mostly] eliminate those few cases I created in the past two hours, supplying the inflection there.) Adam78 (talk) 20:06, 17 September 2020 (UTC)

@Adam78 I think this should have been discussed first. There are quite a lot of uses of {{archaic form of}} that use 4= to specify a gloss. Just among the first 100 uses, for example, are buss, log, ua, cabree, morel, wile, carr, J and abbadia (as well as vala, which uses 4= to specify an inflection tag, based on a recent change of yours). People will expect {{archaic form of}} to follow the other form-of templates and allow 4= to specify a gloss. There are lots of other similar form-of templates that work the existing way, not the way you changed it: {{archaic spelling of}}, {{obsolete form of}}, {{dated form of}}, {{alternative form of}}, {{uncommon form of}}, etc. See Category:Form-of templates for more or less the complete list. With this change, {{archaic form of}} is inconsistent with all the others, and e.g. buss now has the definition "Archaic passenger vehicle form of bus". Other possibilities to get the behavior you're looking for are e.g. to create an {{archaic inflection of}}, or to add a q= param ("qualifier") to {{inflection of}} to specify an arbitrary qualifier like "archaic", "obsolete", etc. (The latter solution would have to have some magic to make the word "archaic" be linked appropriately and the appropriate category like CAT:LANG archaic forms be added, which some people might not like.) I think {{archaic inflection of}} might be a good solution, as it would be clear from the name that it is similar to {{inflection of}}. Benwing2 (talk) 03:35, 18 September 2020 (UTC)

@Benwing2 Thank you for being cooperative and constructive. Next time I'll try to be more prudent. I agree that this is a simpler and safer solution, although I don't really see the point in having two parallel templates with the same goal, their only difference being that one can handle inflections and the other cannot. It seems like forking to me, which doesn't look preferable as a rule, but never mind. I've created "archaic inflection of" and reverted my change on the other. At least there will be an option for those who'd like to make use of the flexibility and user-friendliness of inflection templates on archaic forms. Adam78 (talk) 14:31, 18 September 2020 (UTC)

@Adam78 Thanks for this change. I'm not opposed to changing all the relevant form-of templates ({{obsolete form of}}, {{archaic form of}}, etc.) to take inflection tags instead of a gloss, but I think there should be consensus first and all the templates changed at once so there is consistency. Benwing2 (talk) 16:58, 19 September 2020 (UTC)
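The check asked for at the top of this thread (calls with exactly four unnamed parameters) could be approximated with a short Python sketch like the one below. The function names are made up for illustration, and the parsing is deliberately naive: it assumes the template call contains no nested templates and no piped links, both of which would need a real wikitext parser.

```python
import re

def positional_params(call):
    """Return the unnamed parameters of a simple {{template|...}} call."""
    inner = call.strip()[2:-2]      # drop the surrounding {{ and }}
    parts = inner.split("|")[1:]    # drop the template name
    return [part for part in parts if "=" not in part]

def calls_with_four_unnamed(wikitext, template="archaic form of"):
    """Find calls of `template` that use exactly four unnamed parameters."""
    pattern = re.compile(r"\{\{" + re.escape(template) + r"\|[^{}]*\}\}")
    return [
        match.group(0)
        for match in pattern.finditer(wikitext)
        if len(positional_params(match.group(0))) == 4
    ]
```

For example, `{{archaic form of|en|bus||passenger vehicle}}` would be reported, because the gloss sits in the fourth unnamed slot, while `{{archaic form of|en|bus|t=passenger vehicle}}` would not.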

A bug in Serbo-Croatian Cyrillic conjugation template.[edit]

For some reason, the conjugation pattern of Serbo-Croatian verbs in the template does not transform the future suffix from the Latin script to the Cyrillic. So the forms are shown as научиću, шетаću instead of научићу, шетаћу. I tried to look into the template, but the thing is too complicated for me. Could somebody please look into this problem? --Тарас Ашурков (talk) 14:38, 19 September 2020 (UTC)

@Benwing2: It was you, I have narrowed it down by including a Cyrillic conjugation template before and after these revisions. Fay Freak (talk) 15:05, 19 September 2020 (UTC)
@Fay Freak, Тарас Ашурков Oops. I wasn't passing sc= into the subtemplate. Fixed. Benwing2 (talk) 16:56, 19 September 2020 (UTC)
Thank you so much. --Тарас Ашурков (talk) 06:22, 20 September 2020 (UTC)

Why is Wiktionary talk:Sandbox edit protected?[edit]

Why is Wiktionary talk:Sandbox set so only registered users can edit it? --96.244.220.178 23:43, 19 September 2020 (UTC)

So people don't try to ask questions on a page that almost nobody watches. —Μετάknowledgediscuss/deeds 02:30, 20 September 2020 (UTC)

CSS issues[edit]

@Erutuon, Atitarev, AryamanA Eru, can you help? I recently updated Module:hi-verb, redoing the show/hide functionality using NavFrame/NavHead/NavContent and extracting all the CSS into Module:hi-verb/style.css. However, the widths are all messed up; the main table has auto width, which is what I want, but the footnote section always goes to 100% width no matter what, and I can't make the title bar be auto width. (For that matter, I could only make it the right color by setting an inline style; it seems that otherwise, the skin overrides the CSS file.) What I'd like to do is have all three of title bar, contents and footnote section be the same auto-determined width, and have the background color of the title bar be controllable by the CSS file. Maybe that isn't possible using the Nav* stuff, and I need to simulate the functionality manually (however, I don't know how to do that). Thanks! Benwing2 (talk) 04:32, 21 September 2020 (UTC)

BTW a sample page to view with these templates is होना. Benwing2 (talk) 04:34, 21 September 2020 (UTC)
Also, it is OK if the "impersonal forms" and "personal forms" tables have different auto-determined widths. Benwing2 (talk) 04:36, 21 September 2020 (UTC)
I prefer vsSwitcher to NavFrame; NavFrame is older and requires adding extra divs, while vsSwitcher works with just a table, and tables tend to adapt their sizes to their contents so you don't have to either set a manual size (which is often wrong) or let the table take up the full width (always wrong). However, I did some tinkering and the template seems to behave better when NavFrame has display: table; and when NavHead doesn't have a size. Then these two elements adapt to the size of the inner table. However, this means the table is much bigger collapsed than expanded, as it was with vsSwitcher. I don't know how to get the header to be the same size as the inner table when the table is hidden. — Eru·tuon 05:12, 21 September 2020 (UTC)
@Erutuon Thanks! It's a lot better now than it was. Benwing2 (talk) 05:22, 21 September 2020 (UTC)

English Wiktionary autofill suggestions[edit]

While using the English Wiktionary, I discovered that when you type only "n" in the searchbox, the autofill suggestions include both "negro" and the n-word. Why does Wiktionary autosuggest maybe the most racist epithet as one of the ten suggestions for the letter N? This also applies for "homo" after typing just H, and to a lesser extent "cum", "penis", and "vagina". A word can certainly be in the dictionary, but Wiktionary should do better than having a terrible racial slur appear only after typing the fifth-most common letter in English. —⁠This unsigned comment was added by Wheelsgenius (talkcontribs) at 16:41, 21 September 2020 (UTC).

I believe the results depend somehow on page view counts or something like that. Possibly we could file a Phabricator issue. DTLHS (talk) 16:45, 21 September 2020 (UTC)
Maybe there is a better suggestion order than "most commonly viewed first"? All I can think of is "most commonly used in English first", though it prioritises English, won't cover our entire stock of entries, and would need a source list from somewhere. Equinox 16:54, 21 September 2020 (UTC)
The nannies are coming! The nannies are coming. DCDuring (talk) 17:07, 21 September 2020 (UTC)
How do we know who our users are and what their motivation for looking something up might be? Do we think we have skinheads looking up the exact definition, translation, or usage notes for nigger? If young teens are looking up sex-related terms, why exactly should we be discouraging them? Shouldn't we expect lots of users to be looking up common words connected to matters of serious concern like race, sex, and gender? Why not help them? DCDuring (talk) 17:23, 21 September 2020 (UTC)
This has got to be the most concern trolly comment I have ever seen you make. —Μετάknowledgediscuss/deeds 17:59, 21 September 2020 (UTC)
If our goals are reflected in the query by Wheelsgenius, then I am not signed on to our goals. I thought we were into freedom of expression, AGF, etc. How is disagreement with the direction that Wheelsgenius would have us go a manifestation of anything other than disagreement. Please lay out exactly how my comments are instances of "concern trollery". Are you saying that our users are a nasty bunch of skinheads who need to look up our nigger entry to help them spew hate? Or do you think that our users are so incapable of independent thought that they can't be trusted to be helped to the entry they (statistically) are interested in? I will not stoop to namecalling, though a few come to mind. DCDuring (talk) 22:24, 21 September 2020 (UTC)
No one's freedom of speech is restricted by a racist ethnic slur not being autocompleted after typing "n". The entry isn't being removed. IDK where this is coming from, but it seems like you jumped over many miles of terrible inferences to end up at your response here. —AryamanA (मुझसे बात करेंयोगदान) 16:38, 24 September 2020 (UTC)
Got to agree here, I'm probably in what Twitter would call the horrible "free speech extremist" camp but I still think it's just stupid for us to suggest n*r when someone only typed n. It may (also stupidly) be one of the commonest looked-up words but this should not be taken as the "average of everybody". There are probably loads of kids and wannabe vandals who look this up, and might skew the statistics. I doubt these people are foreign English learners, or Scrabble players, or our general (implied) target audience of people who want to learn meanings rather than to mess about with taboo words or commit vandalism. Yes, it is certainly very important that we don't "hide"/censor nasty words but it is also ridiculous that we are popping this stuff up after typing one letter in the search box. Equinox 17:41, 24 September 2020 (UTC)
For the purpose of moving forward from this: who can confirm the way that the predictive search lists are generated? And what do people feel about my earlier suggestion of "most used word" instead of "most looked-up word"? (I think it has serious issues [particularly for people using en.wikt for non-English languages] even before we ask the devteam if they can do it... but I hardly know what else to suggest.) BONUS APPROACH: we could just say "don't include stuff in the search list if it's got vulgar in the entry" but umm that's also probably gonna seriously hurt whatever spider/indexer the wiki nerds use, and there's the same question of whether vulgar was in English, or French, or just part of entry text rather than the glosses... Equinox 17:46, 24 September 2020 (UTC)
FWIW, taking "the n-word" you refer to to be "nigger", that word does not show up when I type "n", nor indeed when I type "ni", or "nig" (though I get "nigga" at that point). In fact, it does not even show up when I type "nigge", though by then terms like "nigger rich" and Nigger with an uppercase N are showing up, so it seems like there may be some list of terms which are suppressed, but it must be quite limited if the compound terms are not also suppressed. (On Wikipedia, "Nigger" is a suggestion as soon as I've typed "Nig".) This is only an observation about what seems to be happening; I am not sure whether we should suppress certain words or not. - -sche (discuss) 01:55, 22 September 2020 (UTC)
I was seeing it earlier today but not now. Does anyone have any official documentation on how it works? DTLHS (talk) 02:20, 22 September 2020 (UTC)
Aren't the lists of suggestions likely to be highly dynamic, reflecting recent search terms? The autocompletion suggestions I got earlier today had the terms complained about, but more recent searches have not gotten all the same terms. DCDuring (talk) 04:02, 22 September 2020 (UTC)
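Nobody in this thread has confirmed how the suggestion lists are really generated, so the following Python sketch only illustrates the two guesses made above: ranking prefix matches by page views, and filtering through a short suppression list. The function name, its parameters and the sample data are all hypothetical and say nothing about the actual MediaWiki search code.

```python
def suggest(prefix, view_counts, suppressed=frozenset(), limit=10):
    """Return up to `limit` titles starting with `prefix`, most viewed first."""
    candidates = [
        title
        for title in view_counts
        if title.startswith(prefix) and title not in suppressed
    ]
    # Sort by descending view count; break ties alphabetically.
    candidates.sort(key=lambda title: (-view_counts[title], title))
    return candidates[:limit]

# Made-up page-view numbers, for illustration only.
views = {"name": 500, "night": 300, "nice": 300, "noun": 900, "apple": 1000}
print(suggest("n", views, suppressed={"noun"}))
```

A suppressed title is only left out of the suggestions; the entry itself is untouched and can still be reached by typing the full word.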

Offensive suggestions should be removed[edit]

NOTE: I originally created this section below because I didn't see the now-container section since it is not an English-only problem. The exact problem with the N-word appearing when typing just "n" in English-language Wiktionary was worked around with a quick special-case fix. A member of the Search team has acknowledged that they consider fixing this to be in scope. The bug report T263818 in Phabricator contains more details. Please add additional discussion above. RoyLeban (talk) 02:21, 27 September 2020 (UTC)

Discussion moved from #Offensive suggestions should be removed 2.

Wiktionary provides offensive words as suggestions. It has been observed that the N-word is sometimes a suggestion when the letter "n" alone is typed. Similarly, "f" and "c" and probably many other entries return offensive suggestions. This is, of course, offensive. While dictionaries must continue to include offensive words, largely to document their inherent offensiveness, there is no reason to suggest them. People who want to look them up can type the full words.

The N-word's page includes "offensive" 25 times, "vulgar" 11 times, and "slur" 3 times. It is absolutely clear that it is offensive and racist. There is no excuse to suggest it.

I know that some people might argue that these words are real words, so they should be suggested. No, that is an argument for why offensive words must remain in the dictionary, not for why they should be suggested. I can also imagine that people will argue that there are multiple pages for phrases that begin with the N-word and that, without suggestions, they will not be found. This is a specious argument. Wiktionary does not have a mission of promotion — its purpose is documentation. If someone wants the definition of a racist phrase that is in Wiktionary, they can find it. Any important offensive phrases could also be linked from other pages. Not only is there no need to promote such phrases to people who are not looking for them, it is offensive to do so. It implies a normalization of offensive words that is not true.

I imagine other people might argue that it's just the way the suggestion algorithm works. That argument is never valid. Algorithms can and should be changed.

I would file this as a bug report, but I can't find where to do that.

RoyLeban (talk) 00:50, 25 September 2020 (UTC)

You should file a report at Phabricator. So far I have been unable to locate public information about how search suggestions specifically work, other than what is at [2]. DTLHS (talk) 00:58, 25 September 2020 (UTC)
Thank you. I didn't know about Phabricator. RoyLeban (talk) 02:20, 25 September 2020 (UTC)
I see they closed your report. In that case we would probably have to have a vote locally to convince them to make any changes. DTLHS (talk) 04:36, 25 September 2020 (UTC)

I was not too surprised by the knee jerk reaction to treat it as a censorship attempt. It has been reopened by someone on the Search team and I expect it will get fixed. The bug report is T263818 and I hope I've described well enough why suggesting offensive entries in autocomplete is actually a policy violation. Feel free to comment. RoyLeban (talk) 08:26, 26 September 2020 (UTC)

Long or unclosed HTML comments[edit]

Over on Wikipedia, it was discovered that several articles contained large sections (tens of thousands of bytes) that had been hidden in HTML comments for years: w:Wikipedia:Request a query#Extremely_long_HTML_comments_/_hidden_material. Another error WP checks for is HTML comments which are not closed, as mentioned in one of the first replies to that thread. Someone else in the thread pointed out another avenue of error, which is when an HTML comment is not closed properly, but the page is not entirely broken because the closure of another, later HTML comment closes it. Would anyone like to check if we have any imbalanced HTML comments (pages with more <!--s than --> or vice versa, which would find both unclosed and "improperly / accidentally later closed" comments) here? Should we also check for what pages have the longest HTML comments? The latter might find cases where e.g. an entire language section has been commented out, or an overly long comment has been left, both of which should perhaps be moved to the talk page or flagged for cleanup/RFC. - -sche (discuss) 02:18, 22 September 2020 (UTC)

User:Benwing2/mismatched-comments-2020-09-20-dump contains all pages with mismatched comments. Benwing2 (talk) 05:04, 22 September 2020 (UTC)
Thanks. That's a lot of pages, and an interesting variety of kinds of pages. In treat and several other pages, someone persistently miswrote their opening tags. Tamil was using a closing tag as an ersatz arrow, zucchino has one (validly, I guess) in the title/text of a quotation. I fixed several, as did Fay Freak. :) - -sche (discuss) 02:44, 23 September 2020 (UTC)
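The two checks proposed in this thread are simple enough to sketch in a few lines of Python. This is not the script that produced the dump list above; the function names are invented for illustration, and the sketch looks at one page's wikitext at a time.

```python
import re

def comment_balance(wikitext):
    """Number of '<!--' openers minus number of '-->' closers (0 means balanced)."""
    return wikitext.count("<!--") - wikitext.count("-->")

def longest_comment(wikitext):
    """Length in characters of the longest properly closed HTML comment."""
    lengths = [
        len(match.group(0))
        for match in re.finditer(r"<!--.*?-->", wikitext, re.DOTALL)
    ]
    return max(lengths, default=0)
```

A non-zero balance catches both unclosed comments and comments that are only closed by accident by a later closer. It also flags harmless cases such as a bare `-->` used as an arrow, so the results still need a human look.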

Template:character_info glitch?[edit]

The Template:character info added here seems to have a glitch as it doesn't show the next character: ۝ (U+06DD) or a link to it. The same issue happens on ۞, where the preceding character, also ۝ (U+06DD), and its link are also missing. --37.11.121.244 10:44, 24 September 2020 (UTC)

@Erutuon Any idea what's going on here? You have done some work on Module:character info. The character ۝ does have a Wiktionary entry, maybe it's just missing from one of the underlying data modules? Benwing2 (talk) 00:46, 25 September 2020 (UTC)
Shoot, I started to reply but didn't finish. U+06DD is categorized as a format character (General Category Cf). Most format characters are invisible and therefore don't get links, but U+06DD is one of a group of Arabic characters that are visible but still categorized as format characters. The module for the template assumes that all format characters cannot be displayed, so they won't have articles, and so it doesn't display a link for them. This is true for characters like the left-to-right mark (U+200E) but not for these Arabic characters. So in short we need a better definition of displayable characters, or characters that can have Wiktionary articles. I've gone looking for something like that but haven't found it. — Eru·tuon 00:53, 25 September 2020 (UTC)
@Erutuon Thanks for your quick reply. What about just adding a check to see whether the page exists? If so, include a link to it regardless of whether it's a format character. Benwing2 (talk) 01:37, 25 September 2020 (UTC)
@Benwing2: I've made the module link to but not display supposedly unprintable code points that have pages, so now there is a link to ۝ in its neighboring code points. (I wish we had a better definition of printable so we could display it.) — Eru·tuon 04:39, 25 September 2020 (UTC)
@Erutuon: Thanks! What happens if you try to display an actually unprintable code point? Does it misbehave? Benwing2 (talk) 04:45, 25 September 2020 (UTC)
@Benwing2: I took a second look at is_printable in Module:Unicode data and I'd misremembered; it actually excludes more than just format characters.
The supposedly non-printable characters are a mixed bag. The left-to-right and right-to-left marks could reorder text depending on what's around them. I haven't actually tried printing them in the template though. I guess they'd be no worse than Latin or Arabic characters respectively because we've already taken measures to prevent text reordering in the template. Others, like the zero-width non-joiner, would probably just be invisible, and the Arabic character above would be visible and maybe cause no problems. Others, like most of the ASCII control characters, like the null byte, would be replaced with a replacement code point (�) by the MediaWiki software or fail to display as characters if they were written as numeric character references. (The tab, LF, and I think CR would be unaffected, but CR and LF could cause other problems because they're interpreted as line breaks.) The whitespace characters would probably just be invisible. So the ASCII control characters that are blacklisted by the MediaWiki software are the worst behavers that I can think of right now. — Eru·tuon 06:01, 25 September 2020 (UTC)
@Erutuon It looks like you're going to have to escape character names that have meaning in Lua syntax. See CAT:E. Chuck Entz (talk) 20:58, 25 September 2020 (UTC)
@Chuck Entz: Hmm, thanks for letting me know about the error. Actually the issue is that the .exists part of mw.title.new("#").exists errors, because of mw.title.lua calling pairs on php.getExpensiveData( t.fullText ), which in this case is nil. I've fixed the error, but I'll submit a Phabricator report. — Eru·tuon 21:22, 25 September 2020 (UTC)
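The General Category point made earlier in this thread is easy to verify outside the wiki. The Python sketch below uses the standard `unicodedata` module. The `neighbour_display` function is a hypothetical simplification of the behaviour described above (link to an existing page even when the character is treated as unprintable) and is not the logic of Module:character info, whose is_printable check excludes more than format characters.

```python
import unicodedata

def is_format_character(ch):
    """True if the character's Unicode General Category is Cf (format)."""
    return unicodedata.category(ch) == "Cf"

def neighbour_display(ch, page_exists):
    """Decide how a neighbouring code point is shown (simplified assumption)."""
    if not is_format_character(ch):
        return "show and link"
    # Format characters are not displayed, but an existing page is still linked.
    return "link only" if page_exists else "omit"

for ch in ("\u06dd", "\u200e", "a"):
    print("U+%04X" % ord(ch), unicodedata.category(ch), is_format_character(ch))
```

Both U+06DD and the left-to-right mark U+200E come out as Cf, which is why a category-based rule cannot tell the visible Arabic sign apart from an invisible control.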

Adding tone to Module:ko-pron[edit]

I'd like to add Gyeongsang tone to Module:ko-pron, which currently only reflects very conservative Seoul speech. Unfortunately, I understand absolutely nothing of the module so I'd like some help.

Gyeongsang tone patterns vary extremely. The best one to implement is probably the Busan dialect which has two tonemes, H(igh) and L(ow), resulting in four distinct tones: H, L, HL/F(alling), and LH/R(ising).

The following tone patterns exist and have to be accounted for in the module:

  • For one-syllable nouns and verb/adjective stems:
    • H: Usually H, but becomes L before a multisyllabic suffix that begins with a consonant.
    • Regular H(H): Always H and makes the subsequent syllable H as well.
    • Irregular H(H): Always H and makes the subsequent syllable H as well, except before the locative suffix 에 (-e), which stays as L.
    • F: Always F.
    • R: Always R, the only examples are contractions of bisyllabic LH forms.
    • Regular L: Always L.
    • Irregular L: For verb/adjective stems with the following rules:
      • Always L before a consonant-initial suffix.
      • H before a suffix that begins with 아 (a) or 어 (eo).
      • Only if the verb stem ends in a vowel or /l/, H before a suffix that begins with an underlying 으 (eu).
    • Irregular: Tone changes unpredictably. Found in a number of very common verb stems.
  • For two-syllable nouns and verb/adjective stems:
    • HH: Always HH.
    • HL: Always HL.
    • Regular LH: Usually LH, but becomes LL before a multisyllabic suffix that begins with a consonant.
    • Irregular LH: Usually LH, but becomes LL before a multisyllabic suffix that begins with a consonant, and becomes HL before a suffix that begins with 아 (a) or 어 (eo).
    • LH(H): Always LH and makes the subsequent syllable H as well.
    • LF: Always LF.
  • For three-syllable nouns and verb/adjective stems:
    • HHL: Always HHL.
    • HLL: Always HLL.
    • LHH: Always LHH.
    • LHL: Always LHL.
    • LLH: Usually LLH, but becomes LLL before a multisyllabic suffix that begins with a consonant.
    • LLF: Always LLF.
  • For morphemes with four or more syllables:
    • These should all be recent loanwords, where the rule is that the penultimate syllable is H and everything else is L.

For the tones of suffixes:

  • If the noun or verb stem contains H (including R or F), all suffixes are L. This includes underlying H stems that become L because of the presence of a multisyllabic suffix that begins with a consonant.
  • A consonant-initial multisyllabic suffix that lowers the noun or verb stem has H only on its initial syllable.
  • If the noun or verb stem contains only L, excluding underlying forms with H that become L due to a multisyllabic suffix, the first two syllables of the suffixes are H, and the rest is L.
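To make the interaction between stem lowering and suffix tone concrete, here is a small Python sketch of a subset of the rules listed above. It is only an illustration, not a proposal for the module's code: the names `LOWERING_PATTERNS` and `noun_tones` are invented, and the sketch covers only regular noun patterns. It ignores H(H) spreading, the irregular classes, and every rule that depends on 아/어/으-initial suffixes.

```python
# Stem patterns that lose their H before a consonant-initial multisyllabic suffix.
LOWERING_PATTERNS = {"H", "LH", "LLH"}

def noun_tones(stem, suffix_syllables, consonant_initial):
    """Return (stem_tones, suffix_tones) as strings of H, L, F and R."""
    lowers = (
        stem in LOWERING_PATTERNS
        and consonant_initial
        and suffix_syllables >= 2
    )
    if lowers:
        # The stem becomes all L; the suffix carries H on its first syllable only.
        return "L" * len(stem), "H" + "L" * (suffix_syllables - 1)
    if set(stem) == {"L"}:
        # Underlyingly all-L stem: the first two suffix syllables are H.
        high = min(2, suffix_syllables)
        return stem, "H" * high + "L" * (suffix_syllables - high)
    # The stem keeps an H, F or R, so every suffix syllable is L.
    return stem, "L" * suffix_syllables

print(noun_tones("LLH", 1, False))  # one-syllable vowel-initial suffix
print(noun_tones("LLH", 2, True))   # two-syllable consonant-initial suffix
```

For an LLH noun this gives LLH plus L before a one-syllable vowel-initial suffix and LLL plus HL before a two-syllable consonant-initial one.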

Hopefully this wasn't too confusing. What I'd want ideally is:

  • For nouns, it gives the tonal pattern (in Yale with no accent for L, acute for H, circumflex for F, caron for R) when followed by the suffixes 이 (i), 까지 (kkaji), and 에 (e).
  • For verbs/adjectives, it gives the tonal pattern when followed by the suffixes 다 (da), 아서 (aseo)/어서 (eoseo), 니까 (nikka), and 더라 (deora). This would have to be extracted from Module:ko-conj somehow.

For example, {{ko-IPA|LLH}} on 마지막 (majimak) would give:

마지막이: macimák-i
마지막까지: macimak-kkáci
마지막에: macimák-e

And {{ko-IPA|LH irregular}} on 따르다 (ttareuda) would give:

따르다: ttalú-ta
따라서: ttál-ase
따르니까: ttalú-nikka
따르더라: ttalu-téla

Any thoughts on how to make this work??--Karaeng Matoaya (talk) 11:05, 24 September 2020 (UTC)

Manual input for the verbal forms should also work if making it work together with Module:ko-conj proves too much of a challenge.--Karaeng Matoaya (talk) 13:29, 24 September 2020 (UTC)
@Karaeng Matoaya This module looks to have been implemented by User:Wyang. Unfortunately he has been inactive here for over a year now so you're unlikely to get any help from him. So you would probably have to do a lot of the work yourself unless you can find some kind soul who is willing to put in the time to learn the module and fix it up. How familiar are you with programming? To implement this yourself you'd need to be able to understand how to modify the Lua code of the module. If you are not daunted by this I can definitely help you with some pointers. (I don't know Korean or Hangeul but I can figure out the grammar/writing system if necessary; Hangeul doesn't look to be super complicated esp. as it is a sort of featural alphabet.) Maybe User:Atitarev can also give you some help if he knows Korean.
You'd probably want to start with nouns and deal with verbs later. The tone pattern like LLH would not be specified using param 1=, as in your examples, because that's used to specify the word itself. Instead you'd use some other param, e.g. tp= for tone pattern. You'd want to specify in detail exactly what you want the output to look like. For example, currently there are several lines giving the IPA and different romanizations. Would there be three more lines output when tp= is given, something like "Busan dialect before 이 (i)", "Busan dialect before 까지 (kkaji)" and "Busan dialect before 에 (e)"? Benwing2 (talk) 00:43, 25 September 2020 (UTC)
@Benwing2, Karaeng Matoaya: Thanks for the post and interest. I am currently learning Korean (restarted after a long break) but my level is about low intermediate, if not upper beginner. I am very familiar with hangeul and pronunciation, relatively familiar with grammar, and understand the usage and purpose of all Korean templates including {{ko-IPA}}, but I won't be able to efficiently assist with the module development. In my personal opinion (maybe I am wrong) the Korean pitch accent is not well-studied or understood and most learners are not particularly interested/concerned about it. We do have handling for Japanese pitch accent, which may be the closest equivalent and something to look at for reference. The Japanese module doesn't use the accent symbols but it's one of the ways to represent the Japanese pitch accent. Please have a look at Template:ja-pron/documentation. Perhaps the lack of interest is unjustified. I don't know how important it is to add the handling of Korean pitch accent. @HappyMidnight, TAKASUGI Shinji, LoutK: are you able to pitch in? BTW, @Karaeng Matoaya, we use the RR system for romanising Korean, the other methods are only given as alternatives. Not sure if RR is used for Busan dialect as well, though. --Anatoli T. (обсудить/вклад) 01:07, 25 September 2020 (UTC)
@Benwing2 My only experience with programming is basic HTML, but I'm willing to learn Lua if need be. What'd be ideal is a drop-down box like in Chinese entries, with a single line showing the Busan tone of the word in isolation and you being able to see the full suffixed pattern by clicking on "More". Eventually Daegu and Hamhung tone would be added too.
@Atitarev Certainly most Korean learners are not particularly interested/concerned about Busan tone (although I know at least one American who speaks only Busan dialect), but I imagine even fewer Chinese learners are interested in, concerned about, or even aware of the existence of Sichuanese or Dungan pronunciations :P Almost a third of the South Korean population uses some form of tone aligned with the Busan system, which is a far larger proportion of the Korean speaker community than any non-Mandarin topolect can claim of the Chinese-speaking population, and yet we mark all sorts of Chinese languages in Module:zh-pron but only the Seoul dialect in Module:ko-pron.
Yale is the only acceptable option for transcribing Busan because it represents the underlying forms, while RR transcribes the surface realizations. Busan and Seoul dialects have different allomorphy rules, so RR will produce inaccurate results.--Karaeng Matoaya (talk) 02:21, 25 September 2020 (UTC)
@Benwing2 Sorry for the double ping, but is there a sandbox module that I can work with?--Karaeng Matoaya (talk) 02:35, 25 September 2020 (UTC)
@Karaeng Matoaya Module:sandbox. Benwing2 (talk) 02:47, 25 September 2020 (UTC)
@Benwing2 Thank you. Wow, this is insanely complicated.--Karaeng Matoaya (talk) 04:06, 25 September 2020 (UTC)

@Benwing2 Sorry for having to ask again, and thanks again for all the help you've offered so far.

I decided that having a separate module was probably better than trying to edit Module:ko-pron directly, and made a truly very shoddy module at Module:User:Karaeng Matoaya/ko-pron, the results of which can be seen at User:Karaeng Matoaya/Sandbox. On second thought, using Hangul directly with bolding and color coding seemed the most intuitive way to mark tone (given that Yale is unfamiliar for most readers).

I haven't been able to resolve two issues, though:

  • Is there some way to apply HTML to, or to otherwise bold or change the font size and color of, the output of the function "mw.title.getCurrentTitle().text"?
  • Is there some way to decompose the output of the function "mw.title.getCurrentTitle().text" into its constituent Unicode characters (jamo), so that e.g. HTML can be applied to some parts of the entry title but not to other parts?

Thanks in advance :) --Karaeng Matoaya (talk) 14:30, 25 September 2020 (UTC)

@Karaeng Matoaya: Sure. See this.
When you edit a module, in the green box above the edit area, there is a link to MediaWiki-Lua documentation ("Scribunto"). There's a lot of interesting things there. —Suzukaze-c (talk) 00:51, 26 September 2020 (UTC)
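As an illustration of the second question above: a precomposed Hangul syllable can be split into its jamo with plain arithmetic on its code point. The following is a minimal sketch in plain Lua, not code from Module:ko-pron. The helper names (decode3, encode3, decompose_syllable) are made up for this example, and it assumes the input is exactly one precomposed syllable in the range U+AC00 to U+D7A3. Inside Scribunto, mw.ustring.toNFD should give the same decomposition for a whole string.

```lua
-- Decode one 3-byte UTF-8 sequence (U+0800..U+FFFF) into a code point.
local function decode3(s)
  local b1, b2, b3 = s:byte(1, 3)
  return (b1 - 224) * 4096 + (b2 - 128) * 64 + (b3 - 128)
end

-- Encode a code point in U+0800..U+FFFF as a 3-byte UTF-8 string.
local function encode3(cp)
  return string.char(
    224 + math.floor(cp / 4096),
    128 + math.floor(cp / 64) % 64,
    128 + cp % 64)
end

-- Split one precomposed Hangul syllable into conjoining jamo.
-- Returns a list: { initial, vowel } or { initial, vowel, final }.
local function decompose_syllable(syllable)
  local index = decode3(syllable) - 0xAC00
  local lead = math.floor(index / 588)           -- 588 = 21 vowels * 28 finals
  local vowel = math.floor((index % 588) / 28)
  local tail = index % 28                        -- 0 means no final consonant
  local jamo = { encode3(0x1100 + lead), encode3(0x1161 + vowel) }
  if tail > 0 then
    jamo[#jamo + 1] = encode3(0x11A7 + tail)
  end
  return jamo
end
```

For 말 (U+B9D0), decompose_syllable returns the three jamo U+1106, U+1161 and U+11AF, each of which can then be wrapped in markup separately.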
@Suzukaze-c: ありがとうございます, thank you so much! All the monosyllabic noun forms appear to work now per a test of a minimal triplet at (mal), almost entirely thanks to you :P
Just one final thing. I'm trying to make the module apply HTML (i.e. bolding and red color) only to a particular Hangul block of the entry title. The relevant code I'm running on Module:User:Karaeng Matoaya/ko-pron is currently
mw.ustring.toNFC(foobar2)
mw.ustring.gsub(foobar2, mw.ustring.sub(foobar2, 1, 1), foobar5_1)
where foobar5_1 is defined as a bolded and reddened equivalent of the first letter.
But as can be seen in User:Karaeng Matoaya/Sandbox, instead of bolding and reddening an entry title letter, this currently has the effect of dissolving the code and bolding and reddening the bracket character "<". Is there a solution to this? Once I can reliably bold specific Hangul syllables of "mw.title.getCurrentTitle().text", there shouldn't be any issue left with implementing this for all nouns.--Karaeng Matoaya (talk) 12:11, 26 September 2020 (UTC)
@Karaeng Matoaya: It seems like you accidentally refer to the wrong variable: foobar2 instead of foobar5. —Suzukaze-c (talk) 06:12, 27 September 2020 (UTC)
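One way to sidestep the pattern substitution altogether is to walk the title character by character and wrap only the wanted syllable. Below is a minimal sketch in plain Lua, assuming the title is in precomposed (NFC) form so that each Hangul block is a single character. highlight_char is a hypothetical helper, not part of any existing module, and in a module the text would come from mw.title.getCurrentTitle().text. Note also that Lua strings are immutable, so the result of a call such as mw.ustring.toNFC(...) has to be assigned to a variable to have any effect.

```lua
-- Wrap the n-th UTF-8 character of `text` in a bold red <span>.
-- The pattern matches one UTF-8 character: a lead byte followed by
-- any number of continuation bytes.
local function highlight_char(text, n)
  local out, i = {}, 0
  for ch in text:gmatch("[\1-\127\194-\244][\128-\191]*") do
    i = i + 1
    if i == n then
      ch = '<span style="font-weight:bold;color:red">' .. ch .. '</span>'
    end
    out[#out + 1] = ch
  end
  return table.concat(out)
end
```

With this, highlight_char("말하다", 1) wraps only 말 and leaves 하다 untouched. Because the markup is added while the pieces are assembled, no later substitution can hit the "<" of the inserted tag.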

Offensive suggestions should be removed

Discussion moved to #Offensive suggestions should be removed.

template:circumfix

A few Indonesian words are not accommodated properly by this template. Example: ketidaksempurnaan. This word is analyzed as ke+tidak sempurna+an, while tidak sempurna is the negation of sempurna, which is SoP. —Rex Aurorum (talk) 20:29, 26 September 2020 (UTC)

The normal way to deal with SOP arguments is to put double square brackets around each part, which tells the linking module to treat each as a separate term. This template, however, tries to use the bracketed text as the sort key, and that doesn't work. You would have to use the |sort= parameter (with the uncircumfixed term in ALL UPPERCASE as the sort key) to prevent that. It might be a good idea for someone with better Lua skills than mine to add code so that wouldn't be necessary. Chuck Entz (talk) 22:02, 26 September 2020 (UTC)
@Chuck Entz, Rex Aurorum Fixed. Benwing2 (talk) 04:19, 27 September 2020 (UTC)
Thanks. —Rex Aurorum (talk) 10:40, 27 September 2020 (UTC)