Wiktionary:Beer parlour
Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.
Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.
Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!
March 2013
Help with German verbs
There are now more than 200 verbs in Category:German verbs needing inflection. Adding these inflection tables is beyond my limited knowledge of German. Please help, if you can. SemperBlotto (talk) 10:13, 1 March 2013 (UTC)
- If the German templates work anything like the Dutch ones, you could probably do most of them by starting off with the base verb and specifying a parameter for the separable part. For example, abbremsen inflects the same as bremsen, with the additional part ab. So it would be: {{de-conj-weak|brems|gebremst|h||a|ab}} —CodeCat 14:41, 1 March 2013 (UTC)
- But if I make mistakes in German, I get moaned at. I'll leave it to my betters. SemperBlotto (talk) 08:20, 2 March 2013 (UTC)
- The German conjugation templates are pretty terrible at the moment. Their syntax should be simplified when (if) they get converted to Lua. -- Liliana • 09:28, 2 March 2013 (UTC)
- Agree with Liliana. The German conjugation templates are a mess, and the declension templates are hardly better; converting them to Lua seems like the perfect opportunity to make them user-friendly rather than the terrifying, baffling quagmire they currently are. —Angr 09:32, 2 March 2013 (UTC)
- Unfortunately I am completely clueless on how Lua works, else I would have started a draft long ago. -- Liliana • 13:50, 2 March 2013 (UTC)
- I may convert them some time in the future, using the current Dutch templates as a base. Could you read Template:nl-conj-wk/doc and list anything that doesn't apply to German, as well as extra things that are needed for German that don't apply to Dutch? —CodeCat 14:13, 2 March 2013 (UTC)
What I'd wish for is that the syntax is consistent for all four conjugation templates we have for German, that is {{de-conj-weak}}, {{de-conj-strong}}, {{de-conj-weak-eln}} and {{de-conj-weak-ern}}. With Lua, we could probably do away with the two parameters that just set whether the stem ends in d/t (like in Dutch) or s/z/ß (doesn't exist in Dutch, affects forms like second person singular), because they can be automatically derived from the stem. Verbs like laden seem to be completely irregular in German and cannot be conjugated by regular means, they should probably fall back to the irregular template (unless you have a better idea?).
Thus, the first three parameters should be 1. the stem, 2. the past participle (completely irregular in German, cannot be derived from the verb at all) and 3. the auxiliary verb. The separable parameter should probably be named sep= like in the Dutch template. Non-separable prefixes are treated in our German templates as if they were part of the stem, dunno if it needs to be changed.
About strong verbs we need to think some more, because that's pretty difficult and widely different for every verb. We also need to see how to integrate {{de-conj-pp}}, the oddball one out of the German conjugation templates, into the whole thing. I guess I need to analyze German verbs for that to see how they conjugate. -- Liliana • 14:56, 2 March 2013 (UTC)
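The idea of deriving the d/t and s/z/ß behaviour from the stem instead of passing it as parameters can be sketched roughly as follows. This is only an illustration in Python, not the Lua module under discussion; the function name `weak_present` and the pronoun keys are made up for the example, and it deliberately ignores -eln/-ern verbs, stems like atm-, and irregular verbs such as laden.

```python
def weak_present(stem: str) -> dict[str, str]:
    """Present indicative of a regular weak German verb, given its stem.

    Hypothetical helper for illustration; not an existing template or module.
    """
    # Stems in -d/-t take a linking e before -st and -t (arbeit- -> arbeitest).
    needs_e = stem.endswith(("d", "t"))
    # Stems in -s/-z/-ß absorb the s of the 2nd singular ending (reis- -> reist).
    sibilant = stem.endswith(("s", "z", "ß"))
    e = "e" if needs_e else ""
    return {
        "ich": stem + "e",
        "du": stem + e + ("t" if sibilant else "st"),
        "er": stem + e + "t",
        "wir": stem + "en",
        "ihr": stem + e + "t",
        "sie": stem + "en",
    }


if __name__ == "__main__":
    for stem in ("mach", "arbeit", "brems"):
        print(stem, weak_present(stem))
```

With this approach the caller passes only the stem, and the two flag parameters become unnecessary for the regular cases.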
- The non-separable prefixes have a parameter because the script needs to know when to leave out the ge- of the past participle. I know that this happens in German too, so the German templates would also get such a parameter. It also has another advantage, too. The way the Dutch script currently works, it first conjugates the base verb, and only after that it adds the prefix and separable part to it. This means that if the script is coded to handle verbs like laden, then with a separate prefix parameter the exact same code could also handle entladen and all other verbs that have -laden, without any changes at all.
- The German equivalent of {{nl-conj-wk-cht}} is the "Rückumlaut" verbs (they have mostly become strong in Dutch), but they are handled more or less the same, and should probably get their own template/function in the script.
- Strong verbs in Dutch currently take three required parameters: present stem, past stem and past participle stem (without ge-). I think that this can be applied to German as well, but German also has umlaut in the present 2nd and 3rd singular, and I think also in the past subjunctive, so this may need two extra parameters.
- I think that the verbs in -eln and -ern could probably be unified with regular weak verbs. Is this special type of inflection automatic in German, meaning that any verb with a stem in -er- automatically inflects that way?
- The script has allowed us to drastically reduce the amount of verbs that need special treatment in Dutch. gaan and doen, which are otherwise pretty irregular, can be handled by the same script that handles all other strong verbs without any extras. A few verbs have specific but minor irregularities, like zeggen (with two past tense stems) and houden (with an alternate 1st singular), but these are treated as regular weak/strong and the script simply "knows" that those verbs need slightly different treatment. Only the preterite-present verbs (which are too irregular compared to each other to be all treated together) and a few other verbs (hebben, zijn, willen) are treated individually as simply irregular. So how irregular is laden exactly? It is a strong verb both currently and etymologically, so in what way does it differ from most other strong verbs? —CodeCat 15:52, 2 March 2013 (UTC)
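The "conjugate the base verb first, then add the prefix and separable part" approach described above might look like this in outline. It is a Python sketch under stated assumptions, not the actual Dutch script: the function name `derive` and its parameters are hypothetical, only the infinitive and past participle are shown, and verbs that drop ge- for other reasons (for example those in -ieren) are not handled.

```python
def derive(base_inf: str, base_pp: str, prefix: str = "", sep: str = "") -> tuple[str, str]:
    """Return (infinitive, past participle) of a verb built on a base verb.

    base_pp is the base verb's participle without ge- (e.g. "bremst", "laden").
    prefix is an inseparable prefix (ent-, ver-), sep a separable part (ab-, ein-).
    All names here are assumptions made for this example.
    """
    # An inseparable prefix suppresses ge-; otherwise ge- is inserted
    # after the separable part, if there is one.
    ge = "" if prefix else "ge"
    infinitive = sep + prefix + base_inf
    participle = sep + ge + prefix + base_pp
    return infinitive, participle


if __name__ == "__main__":
    print(derive("bremsen", "bremst"))             # base verb
    print(derive("bremsen", "bremst", sep="ab"))   # separable part
    print(derive("laden", "laden", prefix="ent"))  # inseparable prefix
```

Because the base forms are computed once and the prefix is attached afterwards, whatever special handling laden needs would apply unchanged to entladen and the other -laden verbs.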
- Ah, in German templates past participles are currently set up a bit differently, though I guess this could be changed if desired.
Yes, German has umlaut in 2nd and 3rd person singular present, and also in the past subjunctive. A few verbs seem to have two possible past subjunctive forms like heben, though I don't know why that is the case. (Maybe one is an obsolete form and one the modern one?)
As for -eln and -ern, I don't think they can be determined automatically. There are a few -en verbs which have stems ending in -el or -er, like verlieren.
About laden, the problem is with the suffixes. Normally, as a verb with a stem ending in -d, it would have forms like *lädest and *lädet, but this is not the case here, so the correct forms are lädst and lädt... except for the second-person plural present. That is ladet, not *ladt as expected. This is the reason why this verb is fully irregular. -- Liliana • 16:05, 2 March 2013 (UTC)
One thing that should also be added to the conjugation templates is this. Longtrend (talk) 17:53, 6 March 2013 (UTC)
I added some more conjugations. Many of the entries that are still in the category should not be there. This applies to all entries of the form "[prefix]zu[lemma form]". Those are special non-finite forms (zu-infinitives) that shouldn't get their own conjugation tables but rather be included in the conjugation tables of their lemma forms. (For example, abzuscheiden is the zu-infinitive of abscheiden and should be included in the latter's conjugation table. Unfortunately, there's no way to do this ATM.) Longtrend (talk) 20:41, 31 March 2013 (UTC)
- Before doing anything automated, note that there are lemma forms that look like "[prefix]zu[lemma form]" but aren't. For example, hinzufügen is a lemma form, not a zu-infinitive of a nonexistent *hinfügen. (Its zu-infinitive is hinzuzufügen.) —Angr 20:53, 31 March 2013 (UTC)
- Actually, just to make things interesting, hinfügen has always been very rare, but is an attested word. Your point is still very valid, though. - -sche (discuss) 21:24, 31 March 2013 (UTC)
- Good point about hinzufügen, Angr. However, I don't think there's a need to do anything automatically. IMO we should just change the verb conjugation templates such as {{de-conj-weak}} so they show the zu-infinitive. However, it should only be linked for separable verbs, because the zu-infinitive of inseparable or simplex verbs is just the two words "zu [lemma]". Also, {{de-verb form of}} should be supplemented with an option to label the zu-infinitive. Is there anyone who could do that? Longtrend (talk) 11:08, 1 April 2013 (UTC)
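A rough Python sketch of the zu-infinitive rule discussed in this thread, assuming the template already knows the separable part: `zu_infinitive` and `naive_strip_zu` are hypothetical names, not existing template features. The second function shows why guessing from the spelling alone is unsafe, as pointed out for hinzufügen.

```python
def zu_infinitive(base_inf: str, sep: str = "") -> str:
    """Build the zu-infinitive from the base infinitive and the separable part."""
    if sep:
        # Separable verbs: zu goes between the separable part and the base,
        # written as one word (ab + zu + scheiden).
        return sep + "zu" + base_inf
    # Inseparable and simplex verbs: just the two words "zu [lemma]".
    return "zu " + base_inf


def naive_strip_zu(word: str) -> str:
    """Naively remove the first word-internal "zu"; shown only as a pitfall."""
    i = word.find("zu", 1)
    return word if i == -1 else word[:i] + word[i + 2:]


if __name__ == "__main__":
    print(zu_infinitive("scheiden", sep="ab"))  # separable verb
    print(zu_infinitive("fügen", sep="hinzu"))  # separable part itself ends in zu
    print(zu_infinitive("bremsen"))             # simplex verb
    # The naive check treats the lemma hinzufügen as a zu-infinitive of hinfügen.
    print(naive_strip_zu("hinzufügen"))
```

Building the form from known parts avoids the false positive; detecting it from the spelling of an existing entry does not.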
WT:CSS
I started a new page WT:CSS, because it was pointed out that our main style sheet is not documented. —Michael Z. 2013-03-01 21:34 z
- Thanks. I found it useful already, directing me to the commonprint stylesheet which accounts for the printing of urls (though it doesn't account for why full urls are needed in index boxes). DCDuring TALK 13:12, 25 March 2013 (UTC)
Egyptian quotations
I'm interested in adding quotations to, and generally improving, Egyptian (hieroglyphic) words. Is there any policy about how Egyptian quotations should be formatted - i.e. hieroglyph vs. transliteration; how to reference stelae and their authors? I did a short test run at 𓍁𓈖𓏭𓆱 - is that kinda how it should go? Hyarmendacil (talk) 03:48, 3 March 2013 (UTC)
- In general yes. You may wish to create templates for referencing specific works (see Category:Latin quotation templates for examples). DTLHS (talk) 07:29, 3 March 2013 (UTC)
- Also add a hieroglyphic representation of the citation if that's not a problem, with the cited part bolded in wiki-syntax (even if it's not bolded in actual font in the browser).
- Those abbreviations seem a bit opaque - I'm sure they're well-known for your average Egyptologist, but not necessarily so for everyone else. E.g. for Sanskrit there is the {{sa-a}} template for such a purpose, which accepts a common abbreviation as a parameter and links to the respective appendix page where the abbreviation is further explained. To see how it's used in practice in combination with inline citations, see e.g. the entry for सूनु (the very first definition). --Ivan Štambuk (talk) 19:31, 3 March 2013 (UTC)
- It is probably best to enclose the quote in {{lang|egy}}, so that the appropriate language-specific fonts and styling are used. —CodeCat 19:43, 3 March 2013 (UTC)
- Not just the abbreviations are opaque: uncommon words like electrum should be wikilinked. Chuck Entz (talk) 03:07, 4 March 2013 (UTC)
- Even in citations? I wouldn't mess with hyperlinks to words in citations, but I don't see any reason to encourage them.--Prosfilaes (talk) 05:08, 4 March 2013 (UTC)
- Why not? I'm not being flippant, I really don't know why having links in any chunk of text, citation or otherwise, would be discouraged. -- Eiríkr Útlendi │ Tala við mig 05:23, 4 March 2013 (UTC)
- Because in "and my staff in ebony decorated with electrum", wikilinking electrum is highlighting something whose meaning is irrelevant to the entry 𓍁𓈖𓏭𓆱. Clicking that link won't explain 𓍁𓈖𓏭𓆱 in any way.--Prosfilaes (talk) 06:33, 4 March 2013 (UTC)
- Perhaps not in this case, but quotes provide context, and you're missing some of the context if you don't understand the other words. It's not the main purpose of the quotes, but sometimes it's helpful. Chuck Entz (talk) 06:49, 4 March 2013 (UTC)
- I think wikilinking is a rather minor issue really. Ivan Štambuk: The abbreviations are a standardised form of Unicode-transliteration - like in the Sanskrit quotation you linked: yes, they're opaque to non-initiates, but trying to read the hieroglyphs is worse. Egyptian is an opaque language however you write it, and transliteration is almost always a prerequisite to translation. I do agree that the quotation would benefit from the hieroglyphs, but entering these is very effort-intensive, as it has to be done via Manuel de Codage (see 𓍁𓈖𓏭𓆱 again). Finding all the determinatives in Gardiner is a real pain. Hyarmendacil (talk) 07:41, 4 March 2013 (UTC)
- I think you misunderstand what he was referring to when he said "abbreviation". At least for me, it's the whole "Ity, BM EA 586" source that gives me no idea of what it's referring to.--Prosfilaes (talk) 09:52, 4 March 2013 (UTC)
- Yes, I meant the Ity, BM EA 586 part. --Ivan Štambuk (talk) 17:03, 4 March 2013 (UTC)
Oh, sorry to misunderstand. Those are there because I wasn't really sure how to cite the work. The situation is not the same as in Sanskrit as I'm only working from assorted stelae and not something as well-documented as the Vedas (I don't have the text for the Book of Coming Forth but that would be really handy). "Ity" is the 'author' - or at least, commissioner, the actual scribe being anonymous. "BM EA 586" is the museum designation: British Museum, Egyptian Antiquities, 586. I do realise that it's rather a bad way to cite the work, but it's the only form of cataloguing there is. The only other thing would be just to call it 'the Stela of Ity'. We can't really call all the stelae 'untitled' by 'anon.'. Does anyone have any ideas? And also, what about the dating of the works? BC numbers will only be approximate; how about dating by dynasty? Hyarmendacil (talk) 08:46, 5 March 2013 (UTC)
- Why not write out "British Museum, Egyptian Antiquities, 586"?--Prosfilaes (talk) 11:34, 18 March 2013 (UTC)
French words present here and absent from fr.wikt
You might have a look at fr:Utilisateur:Darkdadaah/Diff/en/2013-03, a large page with more than 14000 French words present here and absent from fr.wikt. I looked only at verb forms. Many feminine or plural past participle forms are defined here while I told my fr.wikt bot they don't exist, e.g. claudiquée. All of them should be checked very seriously, and many should probably be deleted. There are many other conjugation mistakes, e.g. caqueteraient. And there are a few verbs which simply don't exist in French, e.g. déphlogistiguer (but there are some verbs still missing at fr.wikt, too, e.g. surtitrer, and some verbs present in fr.wikt but not yet considered by my bot when the list was generated, e.g. contenir). Lmaltier (talk) 09:36, 3 March 2013 (UTC)
- After deleting bad entries that are bot-generated, please correct the conjugation template of the base verb and add it to User:SemperBlottoBot/feedme so that the bot can have a second attempt. SemperBlotto (talk) 09:58, 3 March 2013 (UTC)
- Many forms will simply be correct but fr.wikt doesn't have them yet. And we have a different CFI to them. @SemperBlotto {{fr-past participle}} has an intr parameter for intransitive. Mglovesfun (talk) 10:02, 3 March 2013 (UTC)
- The French page is just too big to handle easily. Could someone generate a simple list of possibly bad verbs that I could check? SemperBlotto (talk) 08:07, 4 March 2013 (UTC)
Diacritics in French capital letters
In Edouard and other pages, we can read "In traditional French orthography, capital letters do not take diacritics, so É becomes E." No, this is wrong. In French orthography, diacritics are kept (e.g. you can read on many French town halls the capitalized French motto LIBERTÉ, ÉGALITÉ, FRATERNITÉ). And typographers know this basic rule. However, diacritics on capitals were not available, and therefore not used, on traditional typewriters. Some newspapers or printed books may forget them too, if not carefully produced. Anyway, this is a typographic or technical issue, not an orthographic issue. Lmaltier (talk) 09:51, 3 March 2013 (UTC)
- I wouldn't say I disagree so much as it's not what I was taught. I was taught it's an error to put the accent on a capital letter. I believe at least some people consider this to be the case so we can't blanket change all such instances (Egypte is another). Let's do some research before we change anything. Mglovesfun (talk) 09:56, 3 March 2013 (UTC)
- Things may have changed since 2000, but traditionally this matter varies by country. At least in the 20th century, capitalized text in France normally did not have diacritics (exceptions: signage and monumental texts, such as signposts with town names often did have diacritics...the rule for dropping the diacritics was for texts in books, magazines, and brochures). In Canada, it was a different story...all capitalized texts in Canada retain the diacritics. Some longstanding typographical conventions began to change in the 1990s due to computerization, so maybe this rule is different now. —Stephen (Talk) 10:09, 3 March 2013 (UTC)
- I asked a friend on Facebook because I wanted a non-Wiktionary opinion:
- "On apprend qu'on ne met pas d'accents lorsque c'est une majuscule, mais de plus en plus de personnes vont te donner des mots avec accents meme lorsque c'est une majuscule. Les deux se valent."
- "We learn that you don't put an accent on when it's a capital letter, but more and more people will put an accent on even when it's a capital letter. Both are ok." (My translation)
- Mglovesfun (talk) 10:49, 3 March 2013 (UTC)
- Looking through an assortment of French books in front of me, published variously in the last 15 years, all of them use diacritics with capital letters. Ƿidsiþ 10:55, 3 March 2013 (UTC)
- Looking through various French books I have to hand, I find that there is a lot of variation. Two rules seem to be categorical in these books: (1) Ç is never written C; and (2) text in all-caps always retains all diacritics. In addition, I find two apparent tendencies: (3) the preposition À loses its accent particularly often; and (4) short snippets in front-matter and back-matter keep more diacritics than full paragraph-style prose. —RuakhTALK 18:17, 3 March 2013 (UTC)
- I don't know French, myself, and haven't checked any for accent prevalence, but based on what everyone's written above I propose that we have an entry for whichever form is attested, and that if the accented spelling is attested and not rare then it (even if less common) be the primary entry (with the other a soft redirect).—msh210℠ (talk) 05:50, 4 March 2013 (UTC)
The Académie Française tell us that we must put the accents on capitals. Believing that they're not mandatory is a common misconception. BanunterX (talk) 22:09, 27 April 2013 (UTC)
- P.S.: Here's a list of pages that should be deleted. There's even a template. BanunterX (talk) 22:11, 27 April 2013 (UTC)
- Are you saying that there are no written texts in existence that might use those spellings? Did they all just disappear when the spelling was changed? —CodeCat 22:30, 27 April 2013 (UTC)
- The article doesn't even say they're mandatory. And of course these spellings exist. We don't delete obsolete spellings as soon as a newer spelling comes along. And these aren't even obsolete; they're still used. The fact that some people don't like them isn't irrelevant (hence the usage notes) but isn't relevant to keeping or deleting. Mglovesfun (talk) 09:54, 28 April 2013 (UTC)
- After I posted my comment I thought about it and now I totally agree that we shouldn't delete them. But keep in mind that the recommended spelling is and has always been with diacritics. But since a lot of people believe we don't have to put the diacritics on, the articles should stay. BanunterX (talk) 14:40, 2 May 2013 (UTC)
- The Académie Française link that people are citing doesn't actually say that accents are obligatory, but recommended (and for good reasons, in my opinion). Also it hasn't 'always been the case', unless you can find some evidence that it has been. Were they recommended in 1700 for example? By who? The Académie Française didn't exist yet. Mglovesfun (talk) 10:48, 3 May 2013 (UTC)
Enabling User:Yair rand/FindTrans.js by default
- Previous discussion: Wiktionary:Grease pit/2013/February#User:Yair rand/FindTrans.js
User:Yair rand/FindTrans.js is a script that, when a user searches for a word that is listed as a translation on an entry but we don't have an entry for, causes the search page to show "(word) is a (language) translation of (word) (gloss)". It's testable via WT:PREFS ("Show translations-listings of words without entries on the search page."). What do people think about enabling it by default? --Yair rand (talk) 05:05, 4 March 2013 (UTC)
- If it isn't buggy then I see little harm (only slowness) and a good deal of benefit, so support. (I haven't tried it much and don't know whether it isn't buggy, but have no reason to think it is.)—msh210℠ (talk) 06:40, 4 March 2013 (UTC)
- Well, no one has objected to it in the couple weeks this thread has been open, so I've enabled the script by default. --Yair rand (talk) 22:50, 19 March 2013 (UTC)
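For readers unfamiliar with the script, the behaviour described at the top of this thread can be sketched as below. This is a Python illustration only, not the code of FindTrans.js: the names `Listing`, `LISTINGS`, `ENTRIES` and `search_hint` are hypothetical, and the sample data is made up to stand in for the real translation tables.

```python
from typing import NamedTuple, Optional


class Listing(NamedTuple):
    language: str  # language of the searched word
    entry: str     # entry whose translation table lists the word
    gloss: str     # gloss of that translation table


# Made-up sample data standing in for the real translation tables.
LISTINGS = {
    "Bremsscheibe": Listing("German", "brake disc", "disc of a disc brake"),
}
# Made-up set of pages that already exist.
ENTRIES = {"brake disc"}


def search_hint(word: str) -> Optional[str]:
    """Return the hint shown on the search page, or None if none applies."""
    if word in ENTRIES:
        return None  # the word has its own entry, so no hint is needed
    hit = LISTINGS.get(word)
    if hit is None:
        return None  # not listed as a translation anywhere
    return f"{word} is a {hit.language} translation of {hit.entry} ({hit.gloss})"


if __name__ == "__main__":
    print(search_hint("Bremsscheibe"))
```

The hint only appears when the searched word has no entry of its own but is listed in some translation table.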
There's no Chechen Swadesh list on Wiktionary!
And what about other "North Caucasian languages" (= Circassian/Adyghe, Abhaz, Ingush, Avar, Lak, Lezgic languages)? THERE IS NOTHING! This is a kind of "DISCRIMINATION"!!! I wrote about that on User talk:Mglovesfun, Appendix talk:Swadesh lists, User talk:Stephen G. Brown pages... And I found a Chechen-English Dictionary in Latin alphabet (including English-Chechen) http://ingush.narod.ru/chech/awde/ ; but they are saying: "All the words must be written in Cyrillic!", etc. (If they want, they can find a Chechen dictionary with Cyrillic script... and they can use my dictionary as "Romanized Chechen".) Regards, Böri (talk) 13:31, 4 March 2013 (UTC)
- This is a wiki. Content is written by volunteers. We don't have volunteers for the Chechen language, so no one has created a Swadesh list for Chechen. --Vahag (talk) 14:04, 4 March 2013 (UTC)
- I want to write the Chechen list in Latin alphabet... but they said: "All the words must be written in Cyrillic!" and I said "Then you find it!" and "The Chechen people are NOT a Slavic people", etc. Böri (talk) 14:33, 4 March 2013 (UTC)
- What does Chechen being not Slavic have to do with it? —CodeCat 15:45, 4 March 2013 (UTC)
- The Cyrillic script is not their own script. (I wanted to write a Swadesh list for Chechen in Latin alphabet... but they said: "All the words must be written in Cyrillic!", etc.) Böri (talk) 15:51, 4 March 2013 (UTC)
- Who are the "they" you mention here? Did you do this at Wikipedia? Wiktionary does not have any such requirements, so far as I understand it. The bottom of the Appendix:Swadesh_lists#Assorted_Swadesh_lists box explicitly states, "The list may also include: (t) the transcription in Latin characters; (p) the phonetic transcription". In other words, I don't think you'll encounter much opposition here at Wiktionary, if you decide to create a Swadesh list for Chechen that uses the Latin alphabet. Go right ahead. -- Eiríkr Útlendi │ Tala við mig 16:25, 4 March 2013 (UTC)
- Amended above upon reading more.
- Böri, it looks like the table could include the Latin-alphabet transcription as one of the columns, but it should also include a main column using the script most often used for the language in question. For Arabic languages, editors are expected to use the Arabic script. For Korean, Hangul. For Inuktitut, Canadian Syllabics. So for Chechen, the main entry column should be in Cyrillic, as this is the script most often used to write Chechen.
- This has nothing to do with discrimination, at least not on the part of Wiktionary. Wiktionary aims to be descriptive -- describing how things are -- rather than prescriptive -- describing how things should be. -- Eiríkr Útlendi │ Tala við mig 16:33, 4 March 2013 (UTC)
- Wikipedia says that Chechen was historically written in Cyrillic only, but it recently started using the Latin alphabet. I think that may be a misunderstanding then. Whoever removed your work probably didn't realise it. So Cyrillic isn't their own script now but it still used to be. (And aside from that, many non-Slavic languages are written in Cyrillic. Many non-Slavic minority languages in Russia are, as are Mongolian, Kazakh, etc.) —CodeCat 16:29, 4 March 2013 (UTC)
- All languages of Russia are written in Cyrillic, no exceptions. -- Liliana • 17:51, 4 March 2013 (UTC)
- From Wikipedia, I gather that it is currently written in both. And although Cyrillic is official, I very much doubt that it's the only script in use, considering the politically tense situation that still exists there. I imagine that since Cyrillic is the hallmark of Russia (the "oppressor" according to nationalists), they probably use Latin as a symbol of independence. So most likely, Latin is still being used unofficially, like Böri indicates. —CodeCat 17:57, 4 March 2013 (UTC)
- Apparently I missed this bit: The choice of alphabet in Chechen is politically significant (as Russia prefers the use of the Cyrillic script, against the separatists' preference for Latin). —CodeCat 17:58, 4 March 2013 (UTC)
- @Liliana, that statement might need some qualification -- Korean and Chinese are spoken in parts of Russia, and those two at least are generally not written in Cyrillic. Perhaps your comment is limited to the official languages of Russia? -- Eiríkr Útlendi │ Tala við mig 18:03, 4 March 2013 (UTC)
- Not just that, it also applies to languages without official treatment. Chinese, as in the Dungan variety, is actually written in Cyrillic. -- Liliana • 18:54, 4 March 2013 (UTC)
- There is one more exception - Karelian. It's written in Roman letters but its written form is almost non-existent. I don't think there are any complete Karelian-other language dictionaries.
- Böri, I'm Russian. If you think that we discriminate against Chechens and the Chechen language, please add some Chechen content. There is no discrimination here, only not enough contributors for any language. You can't demand anything.
- If you create entries in the script, which is not used by majority of Chechens, by Chechen media but only by some nationalists abroad, it's useless. The Cyrillic spelling for Chechen is official and wide-spread, it's used by the Chechen government, media and people. The Roman spelling for Chechen is unregulated, it's more like chat script with all possible variations. Sorry, not interested in long talks. If you want to make a difference, do some work, if you want to make a point, join some Wikipedia discussion, we work with languages as they are used, not with politics. Any language is welcome, including Chechen but it has to be right. --Anatoli (обсудить/вклад) 21:33, 4 March 2013 (UTC)
- Chechen-Russian Dictionary in Cyrillic script: http://ingush.narod.ru/chech/dict.htm If you want, you can make a list. Böri (talk) 08:50, 5 March 2013 (UTC)
-
- Not just that, it also applies to languages without official treatment. Chinese, as in the Dungan variety, is actually written in Cyrillic. -- Liliana • 18:54, 4 March 2013 (UTC)
- All languages of Russia are written in Cyrillic, no exceptions. -- Liliana • 17:51, 4 March 2013 (UTC)
- The list could be created by a person who knows the language, at least to some extent. Do you know Chechen? If you want, I can create a template, where one just needs to add words. There are a few conditions: the words need to be in the lemma form (the dictionary form), and they have to be in the correct script; Cyrillic is correct for Chechen. The dictionary above is one-sided: one can't find translations from English into Chechen. We have no policy on the Chechen language, simply because we never had any Chechens. Take a look at Category:Chechen_nouns, Category:Chechen_pronouns and Category:Chechen_adjectives, though. Here's another list I could find:
- For transliteration, I would use Modern Latin, as in w:Chechen_language#Alphabets
--Anatoli (обсудить/вклад) 10:18, 5 March 2013 (UTC)
- Böri, I just see a total failure to understand, either a failure or a refusal to understand. If we take a valid word we don't have, say hoyau (French). We don't not have it because we're discriminating against French for political reasons, just nobody's created it. To be honest I think you're perfectly capable of understanding, but it's harder for you to make a political point if you understand what we're saying to you, so you're deliberately ignoring us. Mglovesfun (talk) 11:08, 5 March 2013 (UTC)
Putting large tables into subpages?
I can't remember if this has been suggested/discussed before. It would help pages to load faster if large tables (mostly conjugations and translations) were put into subpages. Other wiktionaries do this (French, Italian, German and maybe more). I have experimented at parlare. We might have to think of a naming convention - parlare has two language sections with conjugations and I have only moved the Italian one. Any thoughts? SemperBlotto (talk) 19:34, 4 March 2013 (UTC)
- Poking around the FR WT, the tables on those separate pages load lickety-split. Experimenting in the Edit view with transcluding those table pages into the main entry page also shows very fast load times. This makes me think that just having them on the main page wouldn't change performance much. This also makes me think that any pages we have here on the EN WT that load slowly due to tables might be because we've gotten overly fancy, or underly optimized.
- (I also note that the FR WT wiki code is quite elegant indeed, and makes much use of transclusion -- though I haven't dug beyond the first layer of transclusion so far.) -- Eiríkr Útlendi │ Tala við mig 20:25, 4 March 2013 (UTC)
- Since nobody is particularly bothered, one way or the other, I shall put it back the way it was. SemperBlotto (talk) 10:54, 9 March 2013 (UTC)
- I think the status quo is okay. Inflection tables seem unlikely to take too much time to load, anyway. What I expect to take longer time to load are extensive translation tables. --Dan Polansky (talk) 11:00, 9 March 2013 (UTC)
- I dunno, does manual inflection like prosum or memini take considerably longer to load for you? —Μετάknowledgediscuss/deeds 17:30, 9 March 2013 (UTC)
Japanese On readings
- "Katakana are used to indicate the on'yomi (Chinese-derived readings) of a kanji in a kanji dictionary." Katakana - Wikipedia
- "Generally, On-yomi are written in CAPITAL LETTERS (katakana in kana), and Kun-yomi are written in lowercase letters (hiragana in kana), the On-yomi usually having some sort of break to separate the kanji from the extra kana involved (called okurigana (おくりがな))." On-yomi and Kun-yomi (Romaji version) - TheJapanesePage.com
- "On-yomi, on the other hand, is mostly used for words that originate from Chinese, which often use 2 or more Kanji. For that reason, on-yomi is often written in Katakana." Kanji - Learn Japanese
The on readings of kanji should be in katakana (カタカナ), not hiragana (ひらがな). Hiragana is for native Japanese and katakana is for loanwords. On readings come from China, while the kun readings are native to Japan. Besides this underlying reason why on-yomi should be written in katakana, there are many other sites (in particular Japanese domain sites) that follow this guideline. Even on sites and applications that don't use kana, the on-yomi reading is written in uppercase while the kun-yomi is written in lowercase, further distinguishing the two types of readings. The following sites should provide ample proof: Denshi Jisho, WWWJDIC, goo.
As for how to correct this mistake, most of the on readings could be corrected with a bot script that targets "On: " in Japanese-related pages and changes hiragana characters to katakana. As for other instances where on-readings are used, those may need to be manually changed. --74.85.193.187 20:34, 5 March 2013 (UTC)
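For readers wondering what such a bot script might involve, here is a minimal sketch in Python, not anything that was actually run on the site. It relies on the fact that in Unicode each hiragana character from ぁ (U+3041) to ゖ (U+3096) sits exactly 0x60 code points below its katakana counterpart. The function names, the default "On: " marker handling, and the assumption that a line holds only on readings after the marker are all hypothetical.

```python
HIRAGANA_START = 0x3041  # ぁ
HIRAGANA_END = 0x3096    # ゖ
KANA_OFFSET = 0x60       # distance from a hiragana code point to its katakana twin


def hiragana_to_katakana(text: str) -> str:
    """Return text with every hiragana character replaced by katakana."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET)
        if HIRAGANA_START <= ord(ch) <= HIRAGANA_END
        else ch
        for ch in text
    )


def convert_on_readings(line: str, marker: str = "On: ") -> str:
    """Convert only the text after the marker; lines without it are unchanged.

    Assumption: everything after the marker on that line is an on reading.
    """
    head, sep, tail = line.partition(marker)
    if not sep:
        return line
    return head + sep + hiragana_to_katakana(tail)
```

A real run would also have to deal with the links from each reading to its hiragana entry, which a plain character swap like this does not address.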
- Who are you and why does this matter to you? The above appears to be your sole post.
- FWIW, my copy of Shogakukan's 国語大辞典 (Kokugo Dai Jiten) uses hiragana for all listings except 外来語 (gairaigo, “foreign loanwords”). Same for my copy of 大辞林 (Daijirin). Same for my copy of the 新明解国語辞典 (Shin Meikai Kokugo Jiten). I would hardly classify this use of hiragana as a "mistake".
- Whether to use hira- or katakana appears to be convention, and not a hard-and-fast rule. As such, I'm perfectly happy not undertaking the major task of changing all on'yomi from one to the other, especially when it would bring about no real change in the site's usability.
- If another editor decides that doing so would be a good use of their time, I don't think I'd necessarily be opposed. However, when there are so many other more important and useful things to do, I wonder why on'yomi kana would be an issue. -- Eiríkr Útlendi │ Tala við mig 22:20, 5 March 2013 (UTC)
- ... Same for Kenkyūsha Online (paywalled). Same for Eijiro when searching E>J and yomigana display is enabled. Same for 世界大百科事典 (Sekai Dai Hyakka Jiten), or the デジタル大辞泉 (Dejitaru Daijisen), or the 百科事典マイペディア (Hyakka Jiten Maipedia), all as shown for example on the 鼎 entry at Kotobank... -- Eiríkr Útlendi │ Tala við mig 18:23, 6 March 2013 (UTC)
- I'm the OP. Sorry about that, I previously didn't have a reason to make an account and the username I preferred had already been taken on a different linked service. I am not a native of Japan or Japanese in any way. I simply dislike inconsistencies.
- I have come to realize that I might have been slightly overzealous. The only time that it really needs changing is when it's specifically stated as on-yomi to show that it is not originally native to Japan, such as on pages like 一. Outside of that context, hiragana is indeed used for readings most of the time as, even though those readings have a foreign origin, they are not considered foreign words (such as アイスクリーム). From what I understand, katakana is similar to italicizing in English in some respects. --Soardra (talk) 06:45, 17 March 2013 (UTC)
- Hello Soardra, thank you for clarifying. In just about every dictionary I've looked at, readings considered to be on'yomi are written in hiragana. The only sense on the 一 entry that might take katakana would be for reading ī, coming from modern Mandarin 一 (yī) -- which we don't have yet. Notably, this reading is considered to be a 外来語 (gairaigo, “foreign loan word”), and thus naturally takes katakana. Was there something else on the 一 entry that you had in mind? -- Eiríkr Útlendi │ Tala við mig 07:24, 17 March 2013 (UTC)
- After some thought, I'd like to clarify what I meant in my previous statement. I believe that katakana should be used for on readings only as they appear in the Readings heading of Kanji entries to better distinguish the origin of the reading. --Soardra (talk) 14:02, 6 May 2013 (UTC)
- That's certainly a more limited scope (and thus easier to implement), but 1) the sizable corpus of existing kanji entries all use hiragana for the ===Readings=== section, so changing this would be non-trivial; 2) these hiragana spellings all link to the corresponding hiragana entries, so changing all of these to katakana would again be non-trivial (though I suppose you could have katakana as the displayed script, and actually link to the hiragana entries, but that would be confusing); and 3) the readings section already clearly indicates (or at least should do so, if properly formatted) which readings are on'yomi, which are kun'yomi, etc.
- I somewhat understand your desire for using katakana for on'yomi, but I don't think it's that important, and making the change would be a considerable amount of work, for no notable gain in usability. -- Eiríkr Útlendi │ Tala við mig 17:15, 6 May 2013 (UTC)
Whitelist Choor monster
Because I'm a naughty boy I can't edit WT:WL any more, but it's definitely about time that User:Choor monster stopped having to have all of his/her edits approved, right? Equinox ◑ 22:16, 5 March 2013 (UTC)
.gender-period
Is anyone using the .gender-period class to show the hidden periods after abbreviated gender templates? For example, changing “m pl” to “m. pl.”
If so, please change your code to the following:
/* add a period after abbreviated genders and numbers */
abbr.gender:after, abbr.number:after { content: '.'; }
This will work without any extra code in the relevant templates, which I will remove shortly. Previous discussion was at Wiktionary:Grease_pit/2013/February#How should Category:Gender and number templates be converted to Lua?. —Michael Z. 2013-03-05 22:35 z
sh User languages - Bosnian, Croatian, Serbian merged into Serbo-Croatian
I have moved the bs, hr, sr language user categories and templates to sh. Please update your Babel, if you're using one of the language templates more than once! --Anatoli (обсудить/вклад) 07:14, 8 March 2013 (UTC)
- I'm not sure if that's a good idea. While we can at least have a consensus that we treat them as one language on Wiktionary, it goes a bit far to expect everyone else to declare their own native language as such. We can at least accommodate them by allowing them to choose. Ivan Štambuk for example has both "sh" and "hr" on his page, even though he was one of the main proponents of merging them. —CodeCat 13:35, 8 March 2013 (UTC)
- Ivan Štambuk told me he had both because he wanted to make clear that he was a native of Croatian variant of sh. I followed his example and put two on my Babel also: "sr" and "sh". I don't have any strong feelings about this merging, but I think it is a good idea to have an option of labeling somehow which variant you natively speak. Maybe making a subtemplate of some kind. --biblbroksдискашн 19:46, 9 March 2013 (UTC)
- Does an American who never heard New Zealand, Australian, etc. accent really need to mark it specifically? Marking {{Babel-8|Cyrl}} may suffice to let users know that one knows or doesn't know Cyrillic. Serbs and Montenegrins are comfortable with Roman letters, Croatian and Bosnians are less comfortable with Cyrillic. I think it doesn't make sense to treat the knowledge of Serbo-Croatian varieties as the knowledge of different languages if we decided that we treat them as one language. Arguments still happen around words like hleb/hljeb/kruh/kruv or šta/kaj but Serbo-Croatian speakers may prefer to decide whether they want to reveal their origin or just claim to know Serbo-Croatian. --Anatoli (обсудить/вклад) 08:54, 10 March 2013 (UTC)
- I’ve been watching WT:FB for a while, and the number of Americans who complained about New Zealand English and Australian English being considered the same language as American English is zero. On the other hand, there were hundreds who complained about the Serbo-Croatian merger. Obviously this is a much more sensitive issue than English. Merging the Babel templates will do no good at all; all it will do is piss off contributors who consider Croatian/Serbian/etc. to be various languages. We can force people to edit Serbo-Croatian as one language, but we can’t force them to believe it is one language. — Ungoliant (Falai) 13:20, 10 March 2013 (UTC)
- Agreed. User pages don't need to be coerced to the same standards main space is.--Prosfilaes (talk) 08:08, 11 March 2013 (UTC)
FWOTD is lacking variety and quoted entries
FWOTD currently doesn't have much variety, and many of the listed nominations are still lacking quotations and/or pronunciation, which is making it hard to keep it going. If anyone could help out by adding quotations and pronunciations to the nominations, and by nominating new words with both as well, that would be very useful. —CodeCat 15:16, 8 March 2013 (UTC)
Bot to clean the Wiktionary:Sandbox and Template:Sandbox
Hello everyone, we need a bot cleaning Wiktionary:Sandbox and Template:Sandbox but my bot request to run one was closed on the basis that a discussion should occur deciding how often the bot should clean the sandbox. The relevant BRFA can be found here. Normally, my sandbot runs every hour but here, it seems something closer to six hours is more appropriate (IMHO anyways). What are your thoughts? -Riley Huntley (SWMT) 08:16, 9 March 2013 (UTC)
- It shouldn't be cleaned too often, because we don't want to wash away someone's sandcastles. If it is cleaned more, then there is also a higher chance that the cleanup will accidentally wipe away something that someone is working on. —CodeCat 10:16, 9 March 2013 (UTC)
- I agree. No more often than once per day. SemperBlotto (talk) 10:51, 9 March 2013 (UTC)
- Okay, once a day is reasonable. Also, the "sandcastles" aren't just washed away; a user can always retrieve their information from the history. There is also a warning on the page for this reason: "Any content added to this page may be deleted in twelve hours or less. Do not use this page for anything that you want to keep." -Riley Huntley (SWMT) 20:29, 9 March 2013 (UTC)
- Around four hours is ideal IMO, but I’ll support the bot for any period >= one hour. — Ungoliant (Falai) 20:47, 9 March 2013 (UTC)
The page history is potentially irrelevant to someone who’s just learning wikitext.
Does the bot leave the sandbox alone if someone has edited it in the last hour? There should be some minimum idle period that a new user should have to keep messing with the sandbox. Wiping it out while they are refreshing the page would just be a discouraging prank.
And if the sandbox is wiped out every six hours, then the message should be updated to reflect the reality. —Michael Z. 2013-03-09 21:23 z
- I agree (w/Michael Z.). —RuakhTALK 00:51, 10 March 2013 (UTC)
- The bot checks to see if the page has been edited in the last 15 minutes. If it has been, it delays itself for another 15 minutes. -Riley Huntley (SWMT) 04:25, 10 March 2013 (UTC)
- I really don't see the need for such a bot. But if others do, then how about cleaning daily but no less than two hours since the last edit? (And incidentally what's BRFA? I mean, I gather it means a bot-approval vote, but what is it supposed to stand for? Is it some kind of enWP jargon?)—msh210℠ (talk) 04:29, 10 March 2013 (UTC)
- Re: parenthetical question: Yup, enWP jargon; w:WP:BRFA is w:Wikipedia:Bots/Requests for approval. —RuakhTALK 07:12, 10 March 2013 (UTC)
Looking at a few pages of the sandbox’s history, I would suggest waiting 30 minutes before cleaning. Most experimenters are editing for just a few minutes, but some spend 20–30 minutes there. Of course there’s no indication of whether they are then finished, or perhaps refer to the results a few minutes afterwards. Here’s a longer editing session, interrupted. —Michael Z. 2013-03-10 17:04 z
- It is fine by me to extend the delay, we just all need to be able to decide on how long. :) -Riley Huntley (SWMT) 21:28, 10 March 2013 (UTC)
Discussion here has slightly stagnated, and our worthy Wikipedian seems anxious to run this bot — not to mention that a number of others also think it's a good idea. We have the following proposals on the table:
- every 1 hour (proposition, vote)
- every 6 hours (SGB, vote)
- every 24 hours, no more often (SB, vote and here)
- only if idle 2 hours (MZ, vote; myself, here)
- every 4 hours (Ungoliant, here, who will support anything less often than Q 1 hour)
- only if idle 0.5 hours (MZ, here)
Based on this, I think that a vote for the bot to empty out the Sandbox every 6, or every 24, hours, but no less than an hour since the last edit, will likely pass. (AFAICT now, I'll support either of those, myself, fwiw.) I recommend that you (Riley Huntley) start a vote with one of those options (or both as alternatives, if you like); of course, you can feel free to start any vote you like, if you read the above differently. Does anyone disagree with or want to voice some objection to my summary? Or, on another note, does anyone have a thought on the bot that has not already been voiced?—msh210℠ (talk) 05:36, 12 March 2013 (UTC)
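To make the combined rule in this summary concrete (clean at a fixed interval, but never sooner than some idle period after the last edit), here is a minimal Python sketch. It is only an illustration of the proposal, not the code of Riley Huntley's bot; the function name, parameter names, and default values are assumptions chosen to match the 24-hour and 1-hour figures mentioned above.

```python
from datetime import datetime, timedelta


def should_clean(
    now: datetime,
    last_clean: datetime,
    last_edit: datetime,
    interval: timedelta = timedelta(hours=24),  # assumed cleaning period
    min_idle: timedelta = timedelta(hours=1),   # assumed quiet time after an edit
) -> bool:
    """Return True only when the interval has elapsed AND the page has been idle."""
    interval_elapsed = (now - last_clean) >= interval
    page_is_idle = (now - last_edit) >= min_idle
    return interval_elapsed and page_is_idle
```

With these defaults, a sandbox last cleaned 25 hours ago but edited 30 minutes ago would be left alone until the editor has been quiet for a full hour. Passing interval=timedelta(hours=6) gives the other option on the table.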
- Works for me, although I am going on a vacation for a short while so I will have to start the vote when I am back. -Riley Huntley (SWMT) 23:26, 12 March 2013 (UTC)
Do we even need a sandbox?
- I just realised we can try a completely different approach. We're already in the habit of using a subpage of our own user page as a sandbox. So do we actually need a central sandbox? On our current sandbox page, we could provide a link to a user's own sandbox (Special:MyPage/sandbox) and then lock the page so that users can't accidentally edit it. How is that? —CodeCat 00:09, 14 March 2013 (UTC)
- What about IPs? — Ungoliant (Falai) 00:14, 14 March 2013 (UTC)
- Don't IPs have user pages? —CodeCat 00:25, 14 March 2013 (UTC)
- I mean, won’t it be a waste of space to let IPs create their sandboxes? — Ungoliant (Falai) 00:31, 14 March 2013 (UTC)
- I suppose so. Maybe we should keep the sandbox open for them then. But linking to a user's subpage is still a good idea I think. We could also add a message saying that getting such a page is one of the benefits of registering, so that you get to keep your sandbox edits indefinitely. —CodeCat 00:36, 14 March 2013 (UTC)
- Great idea, CodeCat.—msh210℠ (talk) 06:30, 14 March 2013 (UTC)
- Horrible idea. It would leave terabytes of crap all over the wiki. Just leave things as they are - it ain't broke, don't try to fix it. SemperBlotto (talk) 15:39, 14 March 2013 (UTC)
- I don't think we 'need' a sandbox more than we 'need' WT:LOP, both are really just to attract vandals away from the main namespace. Mglovesfun (talk) 15:43, 14 March 2013 (UTC)
- Keep old system per Semper. —Μετάknowledgediscuss/deeds 22:39, 14 March 2013 (UTC)
Wiktionary:Etymology scriptorium/March 2013
As much as I would love to ignore KYPark (talk • contribs) and his voluminously pointless "contributions" to the Etymology scriptorium, I have trouble accepting his creation of a subpage for March and copying all the posts from this month to it. This has the effect of removing those threads from watchlists and possibly also interrupting their edit histories in violation of our licensing. Was anyone else aware of this? And what, if anything, should we do about it? Chuck Entz (talk) 06:51, 10 March 2013 (UTC)
- Closer examination shows that all of the material was posted by KYPark, so the copyright issue is pretty marginal. It will be a bit of a mess when anyone else wants to post anything, though. Chuck Entz (talk) 07:46, 10 March 2013 (UTC)
- I don't see how it's different from any other archiving. If the main ES page is on your watchlist, you must have noticed him removing 4861 K of text, and it's easy enough to add the March subpage to your watchlist. But it is frustrating that it's so difficult to have an actual discussion of etymology at ES because of his lengthy diatribes and discussions with himself. —Angr 16:29, 10 March 2013 (UTC)
- I agree. I've tried to get through to him but he just doesn't seem to "get it". I don't really want to get more forceful with him but... what else can we do? —CodeCat 17:41, 10 March 2013 (UTC)
- The only way I've ever gotten through to him was by making concrete threats. That evidently worked, because although much fewer of his edits are actually useful mainspace edits now, at least those that he does make are always good (at least in English and Korean, I can't judge the rest). However, if you see him doing something questionable (especially in etymologies), please tell me. —Μετάknowledgediscuss/deeds 17:47, 10 March 2013 (UTC)
- So what are we going to do? —CodeCat 15:31, 14 March 2013 (UTC)
- I propose that we move every similar topic he starts to User:KYPark/FOO (where FOO is the topic’s title) from now on. — Ungoliant (Falai) 15:51, 14 March 2013 (UTC)
- Agreed. It would be nice if we could tell him not to start them in the ES in the first place, though. —CodeCat 15:53, 14 March 2013 (UTC)
- I did tell him in the most recent discussion (WT:ES#dung beetle, see my first comment), but he answered with his typical poetaster discourse. Someone should tell him in his user page though. — Ungoliant (Falai) 16:02, 14 March 2013 (UTC)
- If he could just post this stuff to a blog website and not here, that would be the ideal situation. Mglovesfun (talk) 16:30, 14 March 2013 (UTC)
- But just what is "this stuff" exactly? I think most of us know what it is, but how do we explain it? We can't just say he should not post anything we think is more of "that stuff"... We need a more objective criterion so he also knows what we expect of him. —CodeCat 16:42, 14 March 2013 (UTC)
- "This stuff" is speculative etymologies based on superficial similarities, whether semantic or phonological. He has stated that he is opposed to positivism, and his proposed etymologies reflect that, but he needs to accept that the vast majority of editors around here expect etymologies—and discussions at ES—to be based on hard linguistic evidence, not introspection. —Angr 19:12, 14 March 2013 (UTC)
- So "this stuff" is basically anything that lacks the scientific rigour we expect from etymological discussions, and which is therefore not of any concrete use for the dictionary? I suppose that since his posts aren't intended to improve Wiktionary's content directly (he should know how Wiktionary works by now), he's really just trying to stir up discussion, which falls under our "Wiktionary is not a forum" rule. We could probably use that as a justification to move his posts. —CodeCat 03:04, 15 March 2013 (UTC)
Template:bot owner
On the English Wikipedia, they have a template called Template:User bot owner. I say that we should import it, because it provides a standardised way to include very relevant information about a user's activities in an easily accessible format. A short earlier discussion on the topic is here. --Njardarlogar (talk) 08:43, 12 March 2013 (UTC)
- Instead of import, just make a Wiktionary version. Mglovesfun (talk) 12:15, 12 March 2013 (UTC)
- It's afoul of WT:UBV unless and until there's consensus allowing it.—msh210℠ (talk) 14:41, 12 March 2013 (UTC)
- So are the userboxes for script competence ({{User Latn}}, {{User Grek-4}}, {{User Cyrl-3}} etc.) and the userboxes for coding ability ({{User template-2}}, {{User Lua-0}}, etc.) —Angr 14:59, 12 March 2013 (UTC)
- Uh, no. We had discussion and nobody complained, so it's basically consensus. UBV specifically allows exceptions when supported by consensus. —Μετάknowledgediscuss/deeds 23:46, 12 March 2013 (UTC)
- I think the general consensus is that anything that's on your user page that supports your work here is ok. This user box fits that, so I support it and any other future proposals. I think a separate requirement just for userboxes is a bit silly, when it's the general idea that counts. —CodeCat 23:57, 12 March 2013 (UTC)
- WT:NPOV still says, "Language-proficiency userboxes are encouraged, and may be added easily using {{Babel}}. All other userboxes are currently forbidden (though specific exceptions may be made, after discussion)." Nowhere does it say that script and coding userboxes have been deemed acceptable, nor does it provide a link to any discussion or vote permitting them. I don't personally oppose them (at least, no more than I oppose the language-proficiency userboxes, which I consider silly and which I only have on my user page because it was expected of me when I ran for admin), but as far as I can tell they are a violation of the letter, if not the spirit, of the law around here. —Angr 21:58, 13 March 2013 (UTC)
- Yes the rules are out of date. I bet there's no appetite to update them either. If so, the status quo is the best solution. Mglovesfun (talk) 22:26, 13 March 2013 (UTC)
- Lua is a language. The vote doesn’t specify that it must be human language. — Ungoliant (Falai) 11:23, 14 March 2013 (UTC)
- Interesting. So if I were to create userboxes for all the different programming languages I'm proficient in, you don't think I'd be violating the policy? —RuakhTALK 15:23, 14 March 2013 (UTC)
- I suppose it would be a violation of the spirit but not the letter of the law. I don't think Ungoliant MMDCCLXIV is suggesting we actually do it, just pointing out that the language used is ambiguous. Mglovesfun (talk) 15:28, 14 March 2013 (UTC)
- Not in violation of what’s written, no. But, IMO, coding proficiency userboxes (for the languages and whatnot we use here) are as important as the human language userboxes. — Ungoliant (Falai) 15:58, 14 March 2013 (UTC)
I've created it at Template:bot owner. --Njardarlogar (talk) 19:20, 29 March 2013 (UTC)
- I've moved it to {{User bot owner}} because I think it's customary for user templates to begin with "user" to make them recognisable. —CodeCat 21:32, 29 March 2013 (UTC)
- Yeah, it was a rather hasty import. I created the /core bit since I am lazy. It should do its job, though. --Njardarlogar (talk) 22:25, 29 March 2013 (UTC)
Apple touch icon
I just discovered this: http://en.wiktionary.org/apple-touch-icon.png This is the image used when Wiktionary is saved as a home screen icon on an iOS device. Where the heck did that logo come from? —Michael Z. 2013-03-14 05:24 z
- That's really random, but honestly I like the look of it. No idea whence it came. —Μετάknowledgediscuss/deeds 22:42, 14 March 2013 (UTC)
- Anyone know how to go about replacing this weirdness with the newly-official favicon? I found a reference at mw:Manual:$wgAppleTouchIcon, but that doesn’t offer any help on how to upload a replacement file. —Michael Z. 2013-03-14 23:16 z
- I see. The documentation is quite clear; it says that the value can be a "relative path or [an] absolute URL". All we have to do is create a bug to let the devs know we want the same value as whatever our favicon has. I'll start the bug if nobody else does. —Μετάknowledgediscuss/deeds 00:19, 15 March 2013 (UTC)
- Please do. But we also need to make appropriately-sized versions of the png file, and get them uploaded to that standard URL.[1] I’d be glad to prepare a file, but I don’t know where to find the logo. It’s not in commons:Category:Wiktionary_icons. All I can find is File:favicon.png. Is there one larger than 32×32 px at all? —Michael Z. 2013-03-15 01:14 z
- Can't we just blow up the 32x32 and reupload? —Μετάknowledgediscuss/deeds 01:40, 15 March 2013 (UTC)
- Is there no SVG version? —CodeCat 02:58, 15 March 2013 (UTC)
- I'm not good with graphics, but we can go to Commons and get someone to retouch it for us. —Μετάknowledgediscuss/deeds 03:51, 15 March 2013 (UTC)
- I like it! —Μετάknowledgediscuss/deeds 18:57, 15 March 2013 (UTC)
Filed bug 46431: Update Apple touch icon for en.wiktionary.org. —Michael Z. 2013-03-21 19:35 z
Etymology of Entomology
Not listened to it meself yet, but (for the next two days) this BBC radio programme is available: [2]. Listening from outside the UK will require technical jiggery-pokery due to their restrictions. Equinox ◑ 20:21, 14 March 2013 (UTC)
Category:Modules
I have created a categorisation system for modules which I ask all editors who may create modules to look at for future reference. The subcategories of Category:Modules may seem somewhat empty to you due to some modules not being listed yet (if you want, you can do a null edit to cause them to be categorised). Don't let that fool you — all non-experimental modules that currently exist are categorised in a subcategory, each of which has a blurb explaining its function at the top. Categories are placed on /doc subpages (not /documentation, which is invalid).
If you have any problem with the categorisation system, now is the time to voice your ideas, before the number of modules grows unmanageable. —Μετάknowledgediscuss/deeds 05:20, 15 March 2013 (UTC)
- What do you mean, /documentation is invalid? Also, I think it's better if modules are categorised together with templates. It doesn't make sense if templates are kept separate from the modules they use when they are really closely tied to them. Or had you intended to do both? —CodeCat 14:18, 15 March 2013 (UTC)
- Re /documentation: The change hadn't happened yet, my mistake. More discussion at User talk:Metaknowledge#Could you move the documentation subpages back please?.
- Re modules categorised with templates: They are. Take a look. —Μετάknowledgediscuss/deeds 18:56, 15 March 2013 (UTC)
WebFonts.
Now that we've had WebFonts for a few months — what do people think of it? Do we want to keep it?
There's some discussion at Wiktionary:Grease pit/2013/February#English Main Page Has Started Crashing in Safari - May Be Font Download Problem of what appears to be a problem with it, but I think that how we proceed with that problem depends on how we feel about WebFonts in general. (Either way we'll presumably open a ticket, but the ticket can be "WebFonts has this problem that should be fixed", or it can be "Please remove WebFonts!")
—RuakhTALK 07:26, 16 March 2013 (UTC)
- I can't really say because I don't think it's actually ever been necessary for me. On the other hand, I have noticed that some Javascripts are replacing fonts while the page is loading, which looks a bit strange to me. —CodeCat 15:03, 16 March 2013 (UTC)
- I wish these could be off by default and enabled selectively, preferably per browser.
- On a desktop I am loading 1.6 MB of fonts which are absolutely unnecessary, because I already have fonts for all of the web fonts languages. On a mobile I presume that I am loading 1.6 MB of fonts, risking a significant increase in my monthly bill, and I still see boxes for some scripts.
- Are there any options in how these are set up? Can readers have any control? —Michael Z. 2013-03-16 15:27 z
- Special:Preferences (Appearance tab, down the bottom) has an option to turn it off, or on. No finer control is possible unless ULS is installed. This, that and the other (talk) 10:49, 17 March 2013 (UTC)
- I don't know. --Dan Polansky (talk) 12:14, 17 March 2013 (UTC)
- A very large part of our content is completely unusable for many/most users without WebFonts. I don't think removing it without a replacement is at all feasible. --Yair rand (talk) 18:48, 17 March 2013 (UTC)
- What triggers a WebFonts download? Do they stay downloaded as long as one's computer is on, as one's browser, window, or tab is open?
- If they occur when one first opens Wiktionary, then occasional users on low-bandwidth connections may already be finding Wiktionary unusable. DCDuring TALK 22:29, 17 March 2013 (UTC)
- I believe it's only the fonts used by a particular page that are loaded when you visit that page. I'm hoping that WebFonts doesn't force you to download them again when they're already in your browser's cache from the last time you visited a page with those fonts, but I don't know for sure. I've turned off web fonts for now after having my browser crash when I visit pages that use Burmese fonts (I run Firefox 16.02 on a Mac with OSX 10.5.8, and I already have a Burmese font installed. Safari 5.06 has no problem on the same pages). Chuck Entz (talk) 23:08, 17 March 2013 (UTC)
- (Re Yair.) Assume, temporarily and arguendo, that people interested in a certain language have a Unicode font for it. Is everything visible to every interested party, then, even without WebFonts?—msh210℠ (talk) 06:58, 19 March 2013 (UTC)
- I dislike it as a default because of page-load-time issues. I have no objection to having it as an option.—msh210℠ (talk) 06:59, 19 March 2013 (UTC)
I am just glad we don't have a CJK font as part of WebFonts yet. That would totally break down the entire infrastructure. -- Liliana • 16:02, 19 March 2013 (UTC)
I just turned the Webfonts option back on and reloaded this page. It loaded one font, Deva/Lohit-Devanagari.woff. When I check the url’s headers with curl -I, it returns Cache-Control: max-age=2592000, which I believe is telling your browser not to cache the resource for over 30 days. Of course, your browser could purge its cache much sooner on its own.
32 days would be better, because that might at least not force it to download twice in one billing period. —Michael Z. 2013-03-19 16:14 z
Vote for bug 46327: Don’t purge cache twice in one billing period, for webfonts and other large resources. —Michael Z. 2013-03-19 16:25 z
- Just one minor correction: 259200 seconds is 3 days. One more zero is needed for 30 days. --biblbroksдискашн 20:37, 19 March 2013 (UTC)
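The arithmetic in this thread can be checked with a short sketch. It is only an illustration: the helper name `max_age_days` is hypothetical, and it assumes the header value has already been obtained (for example from the `curl -I` output mentioned above). It converts a `Cache-Control` `max-age` value from seconds to days.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def max_age_days(cache_control: str) -> float:
    """Return the max-age of a Cache-Control header value, in days.

    Hypothetical helper for illustration; it only looks at the
    max-age directive and ignores every other directive.
    """
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            return int(value) / SECONDS_PER_DAY
    raise ValueError("no max-age directive found")


print(max_age_days("max-age=2592000"))  # 30 days
print(max_age_days("max-age=259200"))   # 3 days
print(32 * SECONDS_PER_DAY)             # seconds needed for 32 days
```

Under these assumptions, 2592000 seconds comes out as 30 days and 259200 seconds as 3 days, and the 32-day lifetime suggested above would correspond to max-age=2764800.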
Appendix:Unicode/CJK Radicals Supplement
I just wanted to get clarification regarding the Chinese/CJKV characters that are part of Appendix:Unicode/CJK Radicals Supplement. These are variations of CJK radicals that are not located in the other CJKV Unicode ranges. I attempted to make a redirect of one of these characters to a wiktionary article page of that radical's parent character several months ago but was told not to do that. The thing is, all of the characters in the CJK Radicals Supplement range that aren't red links are redirects (such as the one I attempted to make) dating back to 2010 or so. What does the community think the best course of action is? Treat these as individual entries in respect to being in a separate Unicode range than the other CJKV characters or to continue making redirects to the parent character entries (located in the main CJK Unified Ideographs ranges) that contain the main definition information? Bumm13 (talk) 14:29, 16 March 2013 (UTC)
Proposal on Meta that the WMF fund or take over WebCite
WebCite is currently in financial trouble; unless they can raise enough money to go on, they may have to stop accepting new pages or even delete pages they currently host. It's a good thing we don't consider them durable! Wikipedia, however, did rely on them, so there is currently discussion on Meta of the WMF funding WebCite, either by giving them a grant or by taking over the service. WebCite is receptive to the idea; some thoughts from w:User:Philippe (WMF) are here. You may wish to read and contribute to the discussion on Meta here. - -sche (discuss) 18:18, 16 March 2013 (UTC)
- If it is hosted by WMF, will that make it durable for our purposes? On one hand, if WebCite goes down because Wikimedia is having trouble, that probably means Wiktionary has other problems to worry about itself. But on the other hand, Wiktionary can be mirrored so it(s content) could live on after Wiktionary itself goes down. —CodeCat 21:54, 16 March 2013 (UTC)
- If WebCite goes down, it will not be because "Wikimedia is having trouble", but instead because Wikimedia editors can't be trusted to come to consensus. I hope that WMF hosts it, or at least donates. —Μετάknowledgediscuss/deeds 22:02, 16 March 2013 (UTC)
- I don't think that's what CodeCat means. I think she's suggesting that if WMF does start to host WebCite, then maybe we might as well start considering it durable, because if it were to go down due to WMF having trouble (at some hypothetical future date), then Wiktionary itself would be in doubt (except perhaps for mirrors). —RuakhTALK 08:00, 17 March 2013 (UTC)
- Ah, I see. Sorry, English is a really odd language. She was evidently using the present tense (well, present progressive and then simple present in a logically connected clause set) to refer to the hypothetical (i.e. long-term) future whereas I thought it was referring to the known (i.e. short-term) future. Why oh why did we have to give up on the subjunctive! —Μετάknowledgediscuss/deeds 16:15, 17 March 2013 (UTC)
Wiktionary:Templates
I have added a lot of information to this page, particularly concerning common practice on Wiktionary which isn't documented anywhere else. I hope it is useful, and I also hope that it is enough to get the idea across. I've tried to go for a more normative/prescriptive approach, so that it helps new users decide more easily which practice to follow and which not, because there is so much old/historical code around it may confuse people otherwise.
Wiktionary:Headword-line templates and Wiktionary:Inflection-table templates should probably be deleted. They've already been nominated for deletion, but I'm not sure if anyone wanted to keep their contents. —CodeCat 21:52, 16 March 2013 (UTC)
- Many, if not most, of the headword-line templates do not follow your advice. Perhaps it would be advisable to go through and standardise them, if you know how to find offenders. —Μετάknowledgediscuss/deeds 22:04, 16 March 2013 (UTC)
- I know, and I don't know if we'll ever be able to fix all of them. The lack of headword-line templates in many entries is a much bigger problem in my opinion, though. —CodeCat 23:20, 16 March 2013 (UTC)
- Agreed. Unfortunately for me, that's basically a bot/AWB job. —Μετάknowledgediscuss/deeds 00:25, 17 March 2013 (UTC)
- ELE states that the "inflection word" should be "(using the correct Part of Speech template or the word in bold letters)". Are you saying that is no longer allowable? SemperBlotto (talk) 07:51, 17 March 2013 (UTC)
- I don't think so, and as far as I know many others agree too. Several editors are trying to be more consistent in the way we mark text in a given language, by applying the lang= attribute wherever it is appropriate. I think Michael Zajac in particular is championing that approach and I agree with him. On the other hand, others like DCDuring don't seem to care much about those details and favour a more "old school" approach to HTML (one based on the premise that the underlying code serves only to produce the correct result visually; an approach which is strictly deprecated by the HTML standard). I've written that page partly to reflect how I think we should do things rather than how we have done things in the past. If there is disagreement (Ruakh seems to disagree because he changed some things) then I think this is a good time to come to a consensus, because this will only get worse as time goes on. —CodeCat 14:23, 17 March 2013 (UTC)
- I agree that we ought to scrap bold letter headword-lines. Our templates, even just {{head}}, have useful functions and can now be endowed with even greater powers than before. The HTML is important as well, although that's less central to me. —Μετάknowledgediscuss/deeds 16:57, 17 March 2013 (UTC)
- Re: "Ruakh seems to disagree because he changed some things": Mostly I disagree with the notion of having a page that sounds like it's describing consensual current practice but actually describes one editor's personal preferences. For the record, I agree that we need lang="" (except maybe for English headwords), though not with the other aspects of that proposal. —RuakhTALK 17:26, 17 March 2013 (UTC)
- It's not just one editor's preferences though. I know better than to push my own POV that hard... —CodeCat 17:27, 17 March 2013 (UTC)
- Above, you wrote "I've written that page partly to reflect how I think we should do things rather than how we have done things in the past", and I believe that to be a correct statement. I'm not suggesting that all of your preferences are unique to you, merely that the page is accurate only when read as a description of your preferences. —RuakhTALK 17:47, 17 March 2013 (UTC)
Agent noun in definitions
Let us remove "Agent noun of accuse" from the definition of "accuser" and proceed similarly with other agent nouns. The phrase proposed for removal is not a part of a gloss definition, as it speaks of the noun rather than of the things to which the noun refers. Moreover, the phrase is almost always redundant: a noun defined as "one who accuses" is thereby an agent noun. Definitions in which the phrase is not redundant can be rephrased to make it redundant. Nonetheless, agent nouns can be placed in Category:English agent nouns. --Dan Polansky (talk) 12:02, 17 March 2013 (UTC)
- I agree. — Ungoliant (Falai) 14:26, 17 March 2013 (UTC)
- Yes, I don't like these, either. Equinox ◑ 14:28, 17 March 2013 (UTC)
- I've never seen one of these, but yes go ahead. Mglovesfun (talk) 16:13, 17 March 2013 (UTC)
- Sounds good. Ƿidsiþ 16:21, 17 March 2013 (UTC)
- For reference, as the accuser entry no longer has "Agent noun of accuse", you can find the phrase as part of the definition in this revision. --Dan Polansky (talk) 16:48, 17 March 2013 (UTC)
Disagree. Agent nouns have distinctive syntax and semantics, and I think it's helpful to identify them explicitly. For example, "a lover of men" means "one who {loves men}", not really "{one who loves} of men". "His murderer" means "the one who murdered him", not "the one who murders him". "The grass-puller" can mean "the one who was pulling grass".[3] But you're right that it's a non-gloss statement, and should be in {{non-gloss definition}}. —RuakhTALK 17:34, 17 March 2013 (UTC)
- But "the vegetable gardener" is someone who gardens vegetables, and "the tomato's gardener" is (potentially) the person who gardened the tomato. Agent nouns are nouns, of course, with all the usual properties of nouns — but they also have their own distinctive syntax and semantics. —RuakhTALK 23:00, 18 March 2013 (UTC)
- Some of these come from {{new en agent noun}}. Mglovesfun (talk) 10:11, 18 March 2013 (UTC)
- I created that template, and many of the entries employing it, following this Tea Room discussion. My impression from that discussion (and from the content of the agent noun entry) was that every noun formed by describing the "foo"-er (or "foo"-or) of verb "foo", whether that be a thinker, tinkerer, or editor, was an agent noun. I apologize if my impression was mistaken. There is also some discussion of agent nouns here. I do think that to the extent terms are properly classified as agent nouns, they should be identified as such. bd2412 T 01:26, 19 March 2013 (UTC)
- Is judge#Noun an agent noun of judge#Verb? Is doctor#Noun an agent noun of doctor#Verb? Is typist#Noun an agent noun of type#Verb? IOW, does there need to be a specific diachronic sequence to the formation of the noun? Alternatively, does there have to be a specific morphological relationship? DCDuring TALK 02:15, 19 March 2013 (UTC)
Banning Wonderfool from requesting automatic inflection
Wonderfool doesn’t care about correctness when requesting automatic inflection. His User:Pofficerbot and User:Dawnraybot created a huge mess of wrong inflected forms ([4][5]), some of which were found out just recently ([6], amanheêsseis, amanheêssemos, amanheêreis, amanheêramos). Yesterday he requested that User:BuchmeierBot create the forms of salpimientar ([7]), but the conjugation table was incorrect ([8]). Therefore I think every sockpuppet he ever creates should be banned from requesting automatic inflection. If he needs it, he can ask someone else to check the table and request inflection. — Ungoliant (Falai) 13:09, 18 March 2013 (UTC)
- I suppose it wouldn't hurt, but it probably would be about as effective as the ban on running an unauthorized bot was yesterday. Chuck Entz (talk) 13:39, 18 March 2013 (UTC)
- Hold on, how did you conclude that User:Razorflame = Wonderfool? -- Liliana • 13:48, 18 March 2013 (UTC)
- Dawnraybot wasn’t WF?? Oops. Still, we should prevent WF from requesting wrong forms. — Ungoliant (Falai) 13:59, 18 March 2013 (UTC)
- Dawnraybot definitely was WF, as was Pofficerbot. I think Liliana was reacting to the presence of Razorflamebot here Chuck Entz (talk) 14:05, 18 March 2013 (UTC)
- Actually, I think I confused it with User:Darkicebot, which was RF. The names sound too similar. -- Liliana • 14:33, 18 March 2013 (UTC)
- I don't really feel like enforcing this myself, because I don't really care much for the whole Wonderfool hunt. So I won't be checking all of User:MewBot/feedme, but if someone else wants to check it they're free to. —CodeCat 17:10, 18 March 2013 (UTC)
- The example you gave about salpimentar was a great example of how useful WF actually is. Please bear in mind that the base form of the verb was correct, even though the conjugation table was wrong. However, also bear in mind that there is an entry for salpimientar. The page [[salpimientar]] was created by one bureaucrat, then edited by another a few years later, and then the bot forms were created by an administrator. A point worth making is that salpimientar is a misspelling of salpimentar, and that nobody realized the error until now. --Fullupfrompizza (talk) 00:29, 19 March 2013 (UTC)
- I’m not disputing that you are useful. Your work on Asturian has been good. However, you don’t check whether the conjugation is correct, creating a mess that can take half a decade for someone to find when it’s not. — Ungoliant (Falai) 00:38, 19 March 2013 (UTC)
- Nobody's perfect, dude. All bot users have surely created hundreds of erroneous entries throughout their time. But they will all get found eventually. That's the beauty of wikis. See you on Wednesday. --Fullupfrompizza (talk) 00:42, 19 March 2013 (UTC)
Category:en:Geological periods
As I've filled fr:Catégorie:Périodes géologiques en français, I'm now able to create the roughly 150 pages of Category:en:Geological periods, and 150 for their corresponding adjectives, from User:Christian COGNEAUX/English geologic names.
My script is ready, do you think that I could launch my bot without the flag for this small unique mission please? JackPotte (talk) 15:46, 19 March 2013 (UTC)
- Go for it. We can always chop it if needs be. SemperBlotto (talk) 15:56, 19 March 2013 (UTC)
In conclusion I've created the category and all the corresponding adjectives in lower case. However, two problems remain:
- Equinox showed me that there was no attestation of gzhelian, but only Gzhelian as an adjective. Reverso gave me an example in lower case, but after double-checking, that is the minority of cases (should we delete them?).
- Some entries were already created as proper nouns: Permian, Pridoli, Turonian and Maastrichtian, but I can't find any dictionary that says so. JackPotte (talk) 19:00, 19 March 2013 (UTC)
- If you can't attest a word, post it on WT:RFV. Entries like Maastrichtian should be fixed; be bold. But your bot-added terms have some formatting errors, and I'd appreciate if you could fix those by bot. See the changes I made here for example. Minor to be sure, but local template usage is pretty important in unifying formatting. Thanks! —Μετάknowledgediscuss/deeds 22:31, 19 March 2013 (UTC)
- Thank you, I've done the corrections and suggestions. JackPotte (talk) 23:32, 19 March 2013 (UTC)
- I've temporarily autopatrolled the bot; I'll be removing that now. Please post a few minutes before you plan to run it so that an admin can do this. I don't think I've ever seen the lowercase forms in English (and I'm well-versed in matters geological), and checking each one individually would be extremely tedious. Do you have a solution to that? —Μετάknowledgediscuss/deeds 23:41, 19 March 2013 (UTC)
- Any administrator can launch delete.py from a file whose content would be (in ASCII) all the pages I've created in lower case (not those which existed before). I can publish the list at Wiktionary:Requests for deletion/Others. JackPotte (talk) 08:07, 20 March 2013 (UTC)
aeronian aptian artinskian asselian bajocian bartonian bashkirian bathonian burdigalian callovian capitanian cenomanian changhsingian chattian coniacian danian dapingian darriwilian drumian eifelian emsian famennian floian fortunian frasnian gelasian gorstian guzhangian gzhelian hettangian hirnantian homerian induan ionian jiangshanian kasimovian katian kimmeridgian
kungurian ladinian langhian lochkovian ludfordian lutetian maastrichtian messinian moscovian olenekian oxfordian paibian piacenzian pliensbachian pragian priabonian pridoli rhuddanian roadian rupelian sakmarian sandbian santonian selandian serpukhovian serravallian sheinwoodian sinemurian tarantian telychian thanetian titonian toarcian tortonian tremadocian turonian visean wordian wuchiapingian ypresian zanclean
JackPotte (talk) 19:29, 20 March 2013 (UTC)
- Probably WT:RFV would be better, to at least give them a chance... —Μετάknowledgediscuss/deeds 01:35, 21 March 2013 (UTC)
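For anyone preparing such a page list, here is a minimal sketch of how the file could be generated. It is an assumption, not something confirmed in this thread, that the deletion script reads one wiki link of the form `[[title]]` per line; the function name `build_page_list` is hypothetical and the titles are a sample taken from the list above.

```python
def build_page_list(titles):
    """Format page titles as one [[wiki link]] per line.

    Assumption for illustration: the bot script reads its input
    file as a list of [[title]] links, one per line.
    """
    cleaned = (title.strip() for title in titles)
    return "".join(f"[[{title}]]\n" for title in cleaned if title)


# A few of the lower-case titles listed above, as sample input.
sample = ["aeronian", "aptian", "artinskian"]
print(build_page_list(sample), end="")
```

The same list could be posted to WT:RFV first, as suggested, and only the entries that fail verification written to the file.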
meta:Wiktionary future
It has been reported on Wikidata talk:Wiktionary#Project_page_on_meta (though not, oddly, here on Wiktionary) that "a page was created on meta to coordinate propositions concerning wiktionary future"[sic]. Among other things, it is proposed that "all WIKIs containing elements of one and only one foreign language (relativ[sic] to the general language of the Wiktionary) should be eliminated […] this means 1: Haus in the English and French Wiktionary should be eliminated; Maison in the German and English Wiktionary should be eliminated; house in the German and French Wiktionary should be eliminated and substited[sic] by another ENTITY as described later". The page is in dire need of both competent input from actual Wiktionary editors and orthographical and grammatical cleanup; please help out! - -sche (discuss) 21:14, 19 March 2013 (UTC)
- Some of what this fellow says makes sense (better semantic markup, better identification of important data); some of it is pure madness (removing any entry pages that are only for one language -- what happens when editors want to add another language later, such as Luxembourgish at [[Haus]]? -- what about target-language (in our case, English) descriptive and explanatory text? etc. etc.).
- The page reads a bit like buzzword bingo: "The current mostly text- and mark-up-based structures with more or less accidental quality of data-content should be moved, step by step, into a better apt, more easily understandable and usable data model, representing the real long term needs of the subject. The view is that of a cross-Wiktionary user." Huh? And how many of these "cross-Wiktionary users" exist?
- The proposal seems to be completely oblivious to the strikingly different ideas about linguistics and grammar that are evidenced by the various user communities. Entry structure over at the RU WT or the ZH WT is quite different from how we do things here, for instance.
- And that's just for starters. The underlying idea (better data portability and sharing) is a good one; this half-dreamt extrapolation of that idea is frighteningly like a bulldozer revving in front of my house and aimed at the living room. -- Eiríkr Útlendi │ Tala við mig 21:44, 19 March 2013 (UTC)
- There's Omega/Z already; what does this do that that doesn't? —This comment was unsigned.
- I think the difference is that this would require every Wiktionary to change dramatically and converge toward a single entity (considering the fact that the communities can't even agree on a freaking common logo, I don't think this is realistic unless the proposed solution is rock solid). While Omega seems to be separate and complementary to Wiktionary. Dakdada (talk) 09:37, 20 March 2013 (UTC)
meta:Requests for comment/Adopt OmegaWiki
One more Meta proposal which would affect Wiktionary: the proposal, by GerardM, Kip et al, that the WMF adopt OmegaWiki. OmegaWiki, formerly known as WiktionaryZ, is a WikiData-esque "project to produce a free, multilingual dictionary in every language" using a relational database "not based on 'words', but on the concept of Defined Meaning". I've just left my thoughts; consider leaving yours! - -sche (discuss) 08:54, 20 March 2013 (UTC)
Wiktionary and Wikidata
The above propositions about the Far Future of Wiktionary and OmegaWiki are fine, but I'm more interested in the near future, i.e. how can we use Wikidata without changing everything everywhere. Also, I'd like to see more synergy between projects when working on things that could be used by everyone.
As you know, Phase I of Wikidata for Wiktionaries is already being worked on. The purpose of this phase is to move the interlanguage links to Wikidata (no more interwiki bots!). This should not be too difficult, and it will probably be mostly automated.
What I'm interested in is Phase II. There have been talks about moving pronunciations, declensions etc. to Wikidata, but I think it's unrealistic right now given the heterogeneity of Wiktionaries on these matters. I believe we should focus on common data that can be shared but are independent of the communities. Such data would be, for example:
- Transliterations
- Sort keys (or collation for categories)
Both are standardized and independent of the communities. Only the word and its target script or language are required. There is already work on Lua modules, but each Wiktionary seems to be working on its own solution, which is a waste. We should either use Wikidata to store and reuse the transliterations/sort keys, or develop common libraries for Lua. The advantage of Wikidata would be that the values would not have to be computed every time.
I'm sure there are other data like that that could be shared realistically between Wiktionaries (either with Wikidata or common Lua libraries). Dakdada (talk) 10:03, 20 March 2013 (UTC)
- Both of these are probably better handled using Lua than by storing them in some place. I think sharing Lua would be beneficial, but it would be hard because modules rely on infrastructure that may not be there on another wiki. —CodeCat 11:23, 20 March 2013 (UTC)
- I'm hoping for some libraries directly included in the Lua extension, like mw:Extension:Scribunto/Lua_reference_manual#Language_library. There should be some transliteration and collation libraries that could be used, instead of writing everything all over again on every wiki.
- As for sharing code, it may be good to design them so that they could be used elsewhere (be it on other Wiktionaries or other sister projects). Otherwise there will be a lot of wheels reinvented. Dakdada (talk) 11:34, 20 March 2013 (UTC)
- Sorry, but your assumptions just aren't true. Transliterations aren't "standardized and independent of the communities"; in fact the opposite is true. Sort keys might be, but that's even easier to solve with local Lua modules. I personally haven't seen anything in Wiktionary Phase II discussions that I could support. It seems to be badly thought out by people who aren't at all familiar with the differences between the major Wiktionaries. —Μετάknowledgediscuss/deeds 01:42, 21 March 2013 (UTC)
- Frankly, I think Metaknowledge hits the nail on the head: the whole idea of shifting content to WikiData—first the idea of moving pronunciations and translations, now the idea of moving transliterations—"seems to be badly thought out by people who aren't at all familiar with the differences between the major Wiktionaries." - -sche (discuss) 03:06, 21 March 2013 (UTC)
- Putting transliterations and sort keys on a central database would be workable for Japanese and helpful, especially to other Wiktionaries. We do it well on here and I expect other sites would like to imitate us.
- As for sort keys: Japanese entries have a sorting issue that we (mostly I myself) deal with in a very menial and labor-intensive way. We sort Japanese entries on here the same way that Japanese dictionaries do, which is different from the way the servers do it automatically, so for every entry where it's different, we have to add another key (or several other keys) to tell the servers "sort this as if it were actually this." For example, changing everything to Latin transliteration, "gorira" (see ゴリラ) would normally sort under "go" but we want it under "ko", so we have to force it to be sorted by this: korira'. The apostrophe forces it to follow "korira" if it exists. Repeat for every category link and context template. It's an ugly hack and very few editors understand it. I have over 43,000 edits and a lot of them were dealing with this issue. Other WTs probably have the same issue and would want to do the same thing. Sort keys for Japanese are the same for every WT as long as they sort in the Japanese style. A quick look at French (#2 in number of entries) shows that they do too. Italian and Mandarin do not.
- Transliterations: Transliterations of Japanese into the phonetic scripts of Japanese, that is, (usually) from kanji to hiragana, are relatively uncontroversial. Those into Roman letters (or as we say, written in romaji) are a source of great disagreement. However, and this is a bit chauvinistic, I think that the method that we use on English WT is the best one, and the most modern, widely accepted one worldwide, and other Wiktionaries would like to use ours. --Haplology (talk) 03:15, 21 March 2013 (UTC)
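For readers unfamiliar with the sort-key hack described above, here is a minimal Python sketch of how such a key could be generated automatically from a kana spelling instead of being typed by hand. It works on kana rather than the Latin "korira'" used for illustration in the post. The function names are hypothetical, and treating every voiced or semi-voiced mark the same way (a single trailing apostrophe) is an assumption, not a description of any existing module.

```python
import unicodedata

# Combining marks that turn e.g. こ into ご (dakuten) or は into ぱ (handakuten).
DAKUTEN = "\u3099"
HANDAKUTEN = "\u309a"


def katakana_to_hiragana(text):
    """Map katakana letters to hiragana; leave every other character alone."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x30A1 <= code <= 0x30F6:  # katakana small a .. small ke
            out.append(chr(code - 0x60))
        else:
            out.append(ch)
    return "".join(out)


def japanese_sort_key(kana):
    """Hypothetical sort key: unvoiced hiragana, plus "'" if anything was voiced."""
    hira = katakana_to_hiragana(kana)
    # NFD splits a voiced kana into base kana + combining voicing mark.
    decomposed = unicodedata.normalize("NFD", hira)
    stripped = "".join(
        ch for ch in decomposed if ch not in (DAKUTEN, HANDAKUTEN)
    )
    key = unicodedata.normalize("NFC", stripped)
    if stripped != decomposed:
        # The apostrophe makes the voiced form sort right after the unvoiced one.
        key += "'"
    return key
```

With this sketch, `japanese_sort_key("ゴリラ")` yields the hiragana equivalent of "korira'", so the entry would file under "ko" and directly after an unvoiced "korira" if one existed.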
- @Haplology: Unfortunately, you haven't refuted my statements, but instead backed them up. All you've said is that the way we do it at English Wiktionary yields the best quality of results (and in this case I'll agree with you), but that other Wiktionaries don't agree. Forcing them to would be as bad as them forcing us: I don't think anyone wants that. More to the point, it looks like both your issues (converting between the phonetic scripts of Japanese and sorting Japanese-style) are perfect for Lua. —Μετάknowledgediscuss/deeds 03:28, 21 March 2013 (UTC)
- I've already implemented sorting keys for Dutch and Catalan, see Module:nl-common and Module:ca-common. —CodeCat 03:50, 21 March 2013 (UTC)
- Here are the standards for transliteration: List of ISO romanizations. Collations are also standardized, and they are more complicated than simply removing diacritics. Both are in the CLDR. If we could have this available in a common Lua library (not Modules, directly in the Extension), we would not have to create separate modules for every language in every Wiktionary, which is what is being done right now. Again, we are reinventing the wheel (and not in the best way).
- There are some standards for transliteration. Yes, we should definitely use standardized romanizations, but we should choose appropriate ones. The w:ISO 9:1995 romanization for Cyrillic, for example, is based purely on character glyphs and disregards language differences, so it is probably excellent for multinational document cataloguing, but it is very poor for lexicography. You’ll notice that the CLDR includes some non-ISO systems,[9] and allows for others to be added.[10] —Michael Z. 2013-03-21 16:05 z
- Also: please don't condescendingly assume that I don't know anything about Wiktionary projects. I was suggesting to share some of the work, either with Wikidata or with Lua, precisely because I know that the projects don't share anything. Dakdada (talk) 10:18, 21 March 2013 (UTC)
- I've argued several times before that using sort keys in a multilingual project is still very much a hack, and doesn't solve the real problem. The current ordering can't account for the fact that different languages might order letters differently (for example, Swedish orders ö at the end but Turkish puts it after o). The current system also doesn't allow for languages that might treat character sequences as distinct letters (Hungarian sz, cs, ny etc come to mind). A real solution that I've proposed before is to allow categories themselves to have custom collation orders. Something like a magic word {{COLLATION:nl-NL}} or something similar. —CodeCat 13:47, 21 March 2013 (UTC)
- I totally agree. However, this is a long-sought feature that does not seem to be being worked on (I think there is a bug for that lost in the limbo of Bugzilla). Maybe we should try to ask again (or at least make sure there is actually a bug for that).
- If, for the time being, we could have something that creates correct sort-keys automatically (for a given language), it would be great. And if this sort-key library in Lua is consistently available in all projects: even better. Oh, and collations can be useful for other things than Categories. Dakdada (talk) 14:03, 21 March 2013 (UTC)
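To make the collation points above concrete, here is a small Python sketch of a per-language sort key: each language gets its own alphabet, and multi-character letters such as Hungarian cs or sz are matched greedily before single characters. The alphabets are simplified (real collation rules, e.g. those in the CLDR, have several comparison levels), and the table and function names are hypothetical.

```python
# Simplified alphabets in collation order; digraphs count as single letters.
ALPHABETS = {
    "sv": list("abcdefghijklmnopqrstuvwxyzåäö"),
    "tr": ["a", "b", "c", "ç", "d", "e", "f", "g", "ğ", "h", "ı", "i", "j",
           "k", "l", "m", "n", "o", "ö", "p", "r", "s", "ş", "t", "u", "ü",
           "v", "y", "z"],
    "hu": ["a", "á", "b", "c", "cs", "d", "dz", "dzs", "e", "é", "f", "g",
           "gy", "h", "i", "í", "j", "k", "l", "ly", "m", "n", "ny", "o",
           "ó", "ö", "ő", "p", "q", "r", "s", "sz", "t", "ty", "u", "ú",
           "ü", "ű", "v", "w", "x", "y", "z", "zs"],
}


def collation_key(word, lang):
    """Return a tuple of letter ranks for `word` in the alphabet of `lang`.

    Expects lowercase input; case folding is left out of this sketch.
    """
    alphabet = ALPHABETS[lang]
    rank = {letter: i for i, letter in enumerate(alphabet)}
    longest = max(len(letter) for letter in alphabet)
    key = []
    pos = 0
    while pos < len(word):
        # Try the longest possible letter first, so "cs" wins over "c" + "s".
        for size in range(longest, 0, -1):
            chunk = word[pos:pos + size]
            if chunk in rank:
                key.append(rank[chunk])
                pos += len(chunk)
                break
        else:
            # Characters outside the alphabet sort after all known letters.
            key.append(len(alphabet) + ord(word[pos]))
            pos += 1
    return tuple(key)


words = ["öl", "ost", "zon"]
swedish_order = sorted(words, key=lambda w: collation_key(w, "sv"))
turkish_order = sorted(words, key=lambda w: collation_key(w, "tr"))
```

The same three strings come out in a different order for Swedish (ö last) and Turkish (ö right after o), which is exactly what a single sort key per page cannot express in a multilingual category.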
- For that matter, what about cases where a single spelling needs to be categorized under multiple different sortings? See meta:Help_talk:Category#Any_way_to_sort_under_multiple_sort_keys.3F for an explanation of what this means for Japanese, a language that is quite happy to apply several quite different readings to a single kanji spelling. -- Eiríkr Útlendi │ Tala við mig 16:39, 21 March 2013 (UTC)
- Collation per category is possible, but several keys for one word may require changes to the MediaWiki database itself (among other things), or a whole new dedicated extension, so I'm afraid we'll have to use workarounds for this for a while. How do they manage that on ja.wikt? Dakdada (talk) 17:04, 21 March 2013 (UTC)
- How do they manage that on ja.wikt? --> Mostly, they don't. I notice that ja:靖 is listed under せい (sei) in their index of Japanese name kanji, which is only one of the many readings for this character. I can only assume that they ran into this same technical limitation of the MediaWiki back-end and decided to adopt the most common on'yomi for indexing purposes. I also note that they don't have any given names at all (at least, after searching for a while I couldn't find any), and Japanese given names are some of the most inventive when it comes to readings for any given set of kanji. -- Eiríkr Útlendi │ Tala við mig 21:48, 21 March 2013 (UTC)
Template:comparative of and Template:superlative of
I think we should either split these in order to create {{en-comparative of}} and {{en-superlative of}} because of the more ''[[{{{1}}}]]'' (or most ''[[{{{1}}}]]'' ), or remove that bit altogether. Alternatively, suppress that bit automatically when lang=en or no language is given at all. So that's three options, or four if you include doing nothing (leaving it as it is):
- split to create en templates
- remove the more/most bit altogether
- suppress the more/most bit when lang=en or lang is not given
- leave it as it is
Mglovesfun (talk) 13:42, 20 March 2013 (UTC)
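As an illustration of option 3, the suppression logic amounts to a single conditional. The Python sketch below is only a model of that logic: the function name is hypothetical, and the exact wording of the gloss is an assumption rather than the real template output.

```python
def comparative_gloss(term, lang=None):
    """Model of option 3: drop the "more ..." gloss for English entries."""
    base = f"comparative form of {term}"
    # For English (or when no language is given) the gloss would be redundant.
    if lang is None or lang == "en":
        return base
    return f"{base}: more {term}"
```

The same shape would apply to the superlative template with "most" in place of "more".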
- I don't necessarily agree with creating English-specific versions of all the form-of templates, but for this specific case I do agree and was thinking of proposing the same (option 1). Also, option 3 is the same as option 4 because it's how it works right now. —CodeCat 15:15, 20 March 2013 (UTC)
- Also, most Slavic languages form a periphrastic comparative in much the same way as English, and those languages might want to insert their own word for "more" instead. I doubt we'd want to add support for all of those into these templates. —CodeCat 15:17, 20 March 2013 (UTC)
Use of Template:param in template documentation
I just noticed today that Template:param is being used on several template documentation subpages to mark up parameter names, but the way it's being done makes no sense. See Template talk:param for details, and to discuss the issue further. - dcljr (talk) 07:30, 21 March 2013 (UTC)
Wiktionary:Votes/pl-2013-03/Japanese Romaji romanization - format and content
FYI: Wiktionary:Votes/pl-2013-03/Japanese Romaji romanization - format and content. --Dan Polansky (talk) 14:05, 23 March 2013 (UTC)
Category collation
For those of you who did not read the #Wiktionary and Wikidata discussion, please take a look at the following bug, and add your vote (registration required):
This would allow every category to have its own collation (word order), adapted to the language of its content, instead of the default Unicode. This way we would not have to bother with setting sort keys in every article. Dakdada (talk) 12:43, 25 March 2013 (UTC)
Category collation for kanji
I don't suppose you know if there's a bug report for the problem where a single headword can only ever be indexed under one listing per category?
- This is related to collation, as single Japanese entries might need to be collated in multiple places. Take the single kanji 凹, for instance -- all on its own, it can be read variously as kubo, kubomi, nakakubo, hekomi, or boko, all of which are nouns. However, due to the current implementation of categories in the MediaWiki software, even if all of these categories are included in the wikitext, the last one on the page seems to be the only one that works (I'm guessing that this is probably because the MW software looks at each cat in turn and overwrites the previous indexing value, instead of allowing multiples). Consequently, 凹 is only listed under the boko reading in Category:Japanese nouns, when a proper dictionary would list this under all five readings, not just one of them.
- If there is such a bug report, please let me know. -- Eiríkr Útlendi │ Tala við mig 15:14, 25 March 2013 (UTC)
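The behaviour guessed at above can be modelled in a few lines of Python. This is a toy model of the suspected mechanism (one sort key stored per page and category, so later declarations overwrite earlier ones), not MediaWiki's actual code; the function names and the multi-key variant are hypothetical.

```python
def index_single_key(declarations):
    """One sort key per (page, category): the last declaration wins."""
    index = {}
    for page, category, sort_key in declarations:
        index[(page, category)] = sort_key
    return index


def index_multi_key(declarations):
    """Hypothetical alternative: keep every sort key that was declared."""
    index = {}
    for page, category, sort_key in declarations:
        index.setdefault((page, category), []).append(sort_key)
    return index


# The five readings of 凹 mentioned above, in page order.
readings = ["kubo", "kubomi", "nakakubo", "hekomi", "boko"]
declarations = [("凹", "Japanese nouns", reading) for reading in readings]

single = index_single_key(declarations)
multi = index_multi_key(declarations)
```

In the single-key model only "boko" survives for 凹, matching what the category currently shows; the multi-key model keeps all five, which is what a printed dictionary index would do.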
- You can report such a bug yourself (on bugzilla), although it would be good to have an idea to propose. One workaround would be to create several redirects which would be categorized like the main article, but with different names corresponding to the reading, e.g. よう (凹). I suppose something like that was already suggested... Dakdada (talk) 16:42, 25 March 2013 (UTC)
- That's what I did with Welsh cyngyd, which requires two different alphabetizations: one meaning treats "ng" as a single letter alphabetized between "g" and "h", and the other meaning treats "ng" as two separate letters. I made a redirect from cyngyd (with a zero-width nonjoiner) for the single-letter variant and sorted it as "cygzyd". Kind of kludgey, but it works. —Angr 16:50, 25 March 2013 (UTC)
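The "cygzyd" trick can be written down as a one-line rule. This Python sketch is only an illustration: the function name is hypothetical, and it blindly rewrites every "ng", so it suits only the reading where "ng" is a single letter.

```python
def ng_single_letter_key(word):
    """Sort key that files the single letter "ng" between "g" and "h".

    Writing "ng" as "gz" puts it after "g" followed by any ordinary letter,
    but still before anything starting with "h".
    """
    return word.replace("ng", "gz")


key = ng_single_letter_key("cyngyd")
```

For the other meaning, where "ng" is two separate letters, the page title itself already sorts correctly and needs no key.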
Yes, I mentioned that in the thread over on Meta. Due to the very large number of possible alt readings for some Japanese kanji, this kind of hack quickly becomes untenable, and very hard to maintain if anything changes. cyngyd is easy enough with just two collations needed. What about 靖, which would need at least 13 collations? Or 生, which would need somewhere around 30 to account for all the name readings? Even excluding name readings, this character would need eight or nine collations. Manually creating so many hack workaround blank pages just to handle category collations can't be the best way to solve this problem... -- Eiríkr Útlendi │ Tala við mig 06:27, 27 March 2013 (UTC)
- Even though it is a hack, this is nonetheless the way the entries should be displayed: よう (凹) or 凹 (よう). Because having just 凹 in a category doesn't help, especially if it's only there several times. Right now we can only use redirects for that. Maybe Wikidata could help for this. Dakdada (talk) 10:21, 27 March 2013 (UTC)
- Why not actually create one of those as an entry and redirect it to 凹? —CodeCat 14:55, 27 March 2013 (UTC)
- A soft redirection then? Well, there are already articles about transcriptions, so that would be similar. Dakdada (talk) 15:01, 27 March 2013 (UTC)
- I think CodeCat intends for this to be a hard redirect, much as the cyngyd redirect that Angr mentions above.
- One serious question though about Dakdada's suggestion is how do you intend for the entry to display as you've written? Are you proposing separate pages for each reading of any kanji combination? Notice that we already have kana pages, so 凹 should be listed under よう, おう, くぼ, くぼみ, etc. Are you suggesting that we should have pages listed under the combination of kanji + kana reading for those kanji?
- The standard has been to list all readings of a given kanji term within the entry for that term. If you look at the JA entry for 凹, you'll see that each reading has its own etym (as each reading historically has its own distinct derivation), with each etym section showing the reading. -- Eiríkr Útlendi │ Tala við mig 15:28, 27 March 2013 (UTC)
- Either we only use kana pages to link to the corresponding kanjis, or we don't and we have to create separate pages for each reading, in the form [kanji (kana)], redirecting to the kanji. Dakdada (talk) 16:12, 27 March 2013 (UTC)
- Given that we've already been creating kana pages to link to the corresponding kanji entries for several years now, I suggest we just continue with our current m.o. However, this has no bearing on any solution to the collation problem. I'll see about filing a bug report at some point. -- Eiríkr Útlendi │ Tala við mig 17:16, 27 March 2013 (UTC)
- Perhaps I'm missing something, but isn't the original poster's concern properly addressed by using "Index:" pages? - dcljr (talk) 08:42, 19 April 2013 (UTC)
"-des" pluralizations [edit]
While adding plural categories to nouns with "-es" endings, I came across a few dozen words that are pluralized by adding or changing to a "-des", such as ephelides, which is the plural of ephelis, and lagopodes, which is the plural of lagopus. Are these properly considered "-es" endings, or should they be categorized separately? Also, if they should have their own category, are they irregular plurals? bd2412 T 02:54, 27 March 2013 (UTC)
- They're Ancient Greek borrowings that have kept the Greek plural forms. S at the end of an Ancient Greek word tends to absorb most consonants that come into contact with it, but the vowels in the plural endings keep them separate, so you can see the real ending of the stem. These are all words with a hidden -d in the singular that shows itself in the plural. Needless to say, this is all by Ancient Greek rules: the words were either borrowed as a unit with both singular and plural taken directly from the Ancient Greek, or they had the plural endings added back by people trying to imitate the Ancient Greek. I don't think there's any process in English that produces them; it's the fact that they're left unaltered that sets them apart. They're definitely a group, but only because they're all Ancient Greek third declension -d stems that haven't been assimilated to the English ways of forming plurals. Chuck Entz (talk) 05:36, 27 March 2013 (UTC)
- [after e/c] If we go by the actual text of Category:English plurals ending in "-es", it's a hodge-podge. It's apparently supposed to include two kinds of plurals:
- plurals whose spellings are formed by adding <-es> to the singular spellings. (This is already a bit arbitrary, actually, since this is not the same as the set of plurals formed by adding /-əz/ to the singular; note that "heroes" and "tomatoes" use the /-z/ plural, whereas "ridges" and "caches" use the /-əz/ plural even though their singulars are already spelled with <-e>. But it may be useful anyway.)
- Greek-derived plurals of the -is → -es type, as in "analyses" and "diagnoses" and so on. (This is almost completely separate from the first kind. The two seem to have a tiny bit of overlap at the edges, in that you'll sometimes hear people use the /iz/ pronunciation in words like "processes" where the etymology does not support it, but really, they're still almost completely separate. Besides, people also use that pronunciation sometimes for "Reese's pieces", which does not satisfy the criterion given in the category text.)
- In addition, the category currently contains various assorted plurals that don't satisfy its description, such as "phalanges" (plural of "phalanx", after Greek) and various plurals in "-ices" of singulars in "-ix" (after Latin).
- Personally, I would support splitting this up into more logical groupings. Even just separating the orthographic-addition-of-<-es> group from the Greek-and-Latin-irregularities would be a big improvement.
- —RuakhTALK 05:41, 27 March 2013 (UTC)
- Strongly agree, either that or just allow any plural ending in -es such as pieces, races and so on. One way or the other, but not this. Mglovesfun (talk) 16:23, 27 March 2013 (UTC)
- I will see to it this weekend. Cheers! bd2412 T 02:01, 28 March 2013 (UTC)
- I have created and populated Category:English irregular plurals ending in "-ces", Category:English irregular plurals ending in "-des" and Category:English irregular plurals ending in "-ges". The question remains whether plurals formed merely by changing a final "-is" to a final "-es" should be categorized separately from other "-es" plurals (and if so, what should the category be named), and whether plurals formed merely by the addition of an "-es" to a singular ending in "-o" should be categorized separately. My opinion is that the difference in pronunciation for words like heroes and mangoes does not make the ending "-es" any different, as that merely follows from the "-o" itself. bd2412 T 03:29, 28 March 2013 (UTC)
- Another example I saw today: mamey sometimes keeps its Spanish plural mameyes in English. Equinox ◑ 16:25, 27 March 2013 (UTC)
Increasing default font-size
The Vector skin’s default body font-size is 13px. This is 80% of the HTML and web-browser default of 16px, equivalent to CSS font-size small or 0.8em, or HTML size=2.
Wiktionary’s style sheet (Common.css) has 54 declarations increasing font-size for language scripts, plus one for IPA. Their average value is 123% and median is 125% (e.g., 1.25em is equivalent to 125%), both equivalent to 16px (= medium, 1.0em, or size=3).
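The arithmetic behind these figures can be checked with a short Python sketch. The constant and function names are hypothetical; the pixel and percentage values are the ones quoted above.

```python
BROWSER_DEFAULT_PX = 16  # normal HTML/browser default
VECTOR_DEFAULT_PX = 13   # Vector skin body font-size


def effective_px(base_px, percent):
    """Rendered size of text declared as `percent` of a `base_px` parent."""
    return base_px * percent / 100


# Vector's default as a share of the browser default (quoted above as 80%).
vector_share = round(VECTOR_DEFAULT_PX / BROWSER_DEFAULT_PX * 100, 2)

# What the average (123%) and median (125%) script declarations render as.
average_px = round(effective_px(VECTOR_DEFAULT_PX, 123), 2)
median_px = round(effective_px(VECTOR_DEFAULT_PX, 125), 2)
```

Here `vector_share` comes to 81.25, and the two script sizes come to 15.99px and 16.25px, i.e. the existing exceptions mostly scale 13px text back up to roughly the 16px browser default.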
If we simply set the website’s default font-size to the normal HTML default of 16px, then we can remove 44 of these exceptional font declarations, and reduce the contrast of the remaining 11. Advantages to readers and editors would include:
- Improved readability on small and large screens
- Better consistency in different scripts and languages
- Better consistency in font rendering (e.g., stroke-width discrepancy between Latn and Arab text in Arial)
- Consistency in IPA (e.g., /abcde/ vs. abcde on the same line)
- Fewer exceptions to futz with in our CSS and templates
- Paving the way to more modern CSS
I have been browsing with the medium font-size for some days (see User:Mzajac/vector.css), and I find it to be an improvement on both the desktop and the mobile. —Michael Z. 2013-03-28 17:51 z
So I am proposing making the font bigger. No objections or comments at all? —Michael Z. 2013-04-02 15:32 z
- No objection (it might even let me move from monobook). SemperBlotto (talk) 15:39, 2 April 2013 (UTC)
POS labels and different languages
I'm puzzled by the POS labels I see for Lojban. Take rafsi, for instance. This shows up as the POS label in terms such as rin. But what the heck is a rafsi? Apparently, it's the Lojban word for an affix. So why not use the POS label Affix?
Does this imply that we are allowed to use the POS labels of the source language? This would obviate some of the difficulties we JA editors have had in finding a fitting English label for the Japanese POS known as 形容動詞 (keiyō dōshi, literally “adjectival verb”) (except they aren't at all verbs, and some of them are more like nouns). Functionally, these are basically a class of adjectives, which includes a few specific terms that can also be used as nouns.
However, using the grammar labels of the source language as POS headers introduces new difficulties, as our target audience consists of English-language readers, and English-language readers can't be expected to know what 形容動詞 means, nor what keiyō dōshi means. (Heck, I've got growing reservations about our current header of Adjectival noun for this POS, thinking more that we should just use standard EN grammar labels where possible; and "adjective" would work for this POS... but anyway.)
Similarly, English-language readers can't be expected to know what a rafsi is.
Can anyone explain what the deal is with Lojban? Is it just a weird enough language that Lojban entries are given a pass with regard to WT:ELE? Is it just that no current editors care enough to fix these? Or is this carte blanche to get creative with POS headers, and to heck with WT:ELE? -- Eiríkr Útlendi │ Tala við mig 18:50, 28 March 2013 (UTC)
- The problem with Lojban in particular is that it is different. Lojban really doesn't have nouns or verbs. What Lojban calls a gismu corresponds to a noun, an adjective, adverb or a verb. This is because such words are technically predicates: they don't represent an object, but rather a certain truth about an object. To take the top two entries in Category:Lojban gismu... bacru is what we might call a verb, because it expresses that the subject performs an action. But badna is more like a noun because it says what something is. However, this distinction isn't at all meaningful in Lojban itself; "badna" is equally a verb and then means "is a banana", and the two are completely interchangeable (insofar as they take the same number of objects).
- For other languages, the problem is similarly that we can't ever hope to adapt the terminology that is appropriate for that language, to English. Some parts of speech are not familiar to English speakers because they don't exist in English. That's something we will just have to cope with. To limit ourselves to the words used to describe English also means that we try to artificially force other languages to fit an English-shaped mold. Imagine if we tried to reverse it, like in a language where every adjective were a verb (this does exist, to greater or lesser degrees, in many languages). Would it be appropriate to give green a "Verb" PoS header, or would we instead use some word that means "Adjective" more exactly, but isn't familiar to many speakers? —CodeCat 19:56, 28 March 2013 (UTC)
- WT:About Lojban includes helpful advice like “All text in Wiktionary should be in English” and “For a gismu, list here the lujvo and type-3 fu'ivla derived from it.” Ha!
- Are these words attested English terms? —Michael Z. 2013-03-28 20:00 z
- In the context of discussions about Lojban, I assume most definitely yes. Outside that, no, because they are only meaningful within that context (just like quasar only means something in astronomy). But you can of course RFV them. The only problem is, if they fail, how do we describe Lojban if we have no words to describe it with? —CodeCat 20:05, 28 March 2013 (UTC)
- Some words in Japanese do double duty as nouns and verbs, such as 混雑 (konzatsu, “a crush, congestion; to be crowded, to be jammed in or together”). In these cases, we have been listing these under both ===Noun=== and ===Verb=== POS headers.
- Other words in Japanese function as adjectives, but can also be used predicatively without a verb, such as 良い (yoi, “good”). In strict functional grammatical analyses, these have been variously described as adjectives, stative verbs, and adjectival verbs, among other things. However, such strict functional grammatical analysis belongs in an encyclopedia article, so for purposes of POS header in EN WT entries, we describe these as adjectives.
- If badna in isolation equates to EN noun banana, and as a predicate it equates to the EN verbal phrase is a banana, then it would be much more useful for English-language readers to label this as a ===Noun=== and include links to relevant articles on Lojban grammar that explain how things that function as nouns can also be used in other ways.
- Māori could be analyzed as functioning somewhat similarly. He wai tēnā works out literally to A (or some) water that. There is no real verb that means "to be"; you just use the noun.
- But what's going on here is relevant to the syntax of the language, and how different words are used in relation to each other. Wai is still a noun, even when used predicatively -- the word is a label for a person, place, or thing, ergo it is a noun (hearkening back a bit to Schoolhouse Rock grammar lessons). Similarly, I would argue that badna is a "noun" for purposes of discussion in English. Calling this a "gismu" with no other POS or grammatical information (and not even any links to the term gismu) is just obtuse and unhelpful when your target audience is English-language learners, who cannot be assumed to have any foreknowledge of Lojban.
- For that matter, the definitions given in the gismu entry are not exactly helpful either -- apparently readers need to know Lojban and obscure notation before they can make any sense of purportedly English-language definitions of Lojban terms. Not very user-friendly.
- If the EN WT is intended to be a dictionary of many languages into English, then we must use English to describe the source-language terms. Using the source language to describe the source language fails at this. -- Eiríkr Útlendi │ Tala við mig 21:12, 28 March 2013 (UTC)
- Why should Wiktionary make up its own terminology instead of using the words that are normal in a given field? Within Lojban discussions, gismu and so on are the standard words. If we don't use them, then yes, we might no longer confuse the occasional person who isn't familiar with those terms. But we would now be confusing the vast majority of Lojbanists who now no longer can make sense of our definitions. And as I tried to argue (but this point seems to have missed you), in Lojban, nouns and verbs are the same thing. There is no distinction whatsoever between "words for things" and "words for actions", both are the same, indistinguishable and interchangeable. Distinguishing them is artificial, and would be an attempt at best to fit them into an English-shaped mold. If we decide to distinguish them, how would we make the distinction? There is nothing within Lojban itself that can give any clue as to what is a verb and what is a noun, so editors will be faced, with every entry, with the completely arbitrary decision of whether a word is a verb, noun, adjective and so on. Because those things don't exist in Lojban. —CodeCat 21:25, 28 March 2013 (UTC)
- Our readers mainly do read English and do not read Lojban. This project is a general grammar discussion and not a Lojban one. w:Lojban grammar glosses gismu as “root word,” so I don’t think it would be unreasonable to use that as a POS heading, and explain the details in WT:About Lojban. Anyway, I won’t be convinced to support a “Gismu” header until I can understand our English definition of gismu. —Michael Z. 2013-03-28 21:38 z
- (after edit conflict)
- @CodeCat, it sounds like you're arguing that the word "noun" is made-up terminology. Just comparing how other WT sites handle Lojban, I find first that few others include Lojban terms, and second that the entry for badna on the Lithuanian WT at [[lt:badna]] uses the POS header daiktavardis, i.e. "noun", while the entry for casnu on the Malagasy WT at [[mg:casnu]] uses the POS header matoanteny, which further googling reveals to be the Malagasy word for "verb".
- Again, the EN WT is for English-language readers. We should be using English in the descriptions. This is not to say that we cannot also use other languages in the descriptions, but at the bare minimum, we must write entries that an English-language reader can understand. This should be our ideal. Many of the Lojban entries I've looked at fail to achieve this ideal.
- Japanese makes distinctions that English does not, so in the EN WT entries for such terms, we (the collection of JA editors here over time) have worked hard to come up with appropriate English-language labels.
- Lojban terms like badna or grute or tsiju all look very noun-ish. These terms all seem to be labels for persons, places, or things.
- Lojban terms like casnu or cusku or tavla all look very verb-ish. These terms all seem to be labels for actions.
- I note too that Category:jbo:Verbs exists, as does Category:jbo:Nouns. These categories have at the top a sentence reading: This category is for Lojban words which would tend to be considered [ verbs | nouns ] from an English speaker's perspective.
- Although Lojban grammar and grammarians might not distinguish between nouns and verbs, the fact remains that some of these words describe persons, places, or things (i.e. "nouns"), and some of these words describe actions (i.e. "verbs"). From an English-language reader's perspective, these words function like nouns and verbs -- it would be better to label them as such. Anyone who has gone through any education regarding English grammar, which we can at least begin to assume for the English-reading target population of the EN WT, will be at least passingly familiar with terms like noun and verb. We cannot assume the same familiarity with terms like gismu, and it is for this same reason that we are not using the term keiyō dōshi as a POS heading. -- Eiríkr Útlendi │ Tala við mig 22:06, 28 March 2013 (UTC)
- I think you have a different (and possibly incorrect) idea about the purpose and meaning of the PoS header. As far as I know, it's not meant as a definition or even part of one. That is what the definition is for. I have tried to understand a bit more about Lojban to see how to explain this, and (as a side note) I noticed that not one grammar actually uses English terms to describe it. They all use bridi, gismu, selbri and so on. From what I understand, as far as Lojban content words go, they all have the same structure, which is one that would be translated as a verb. Even nouns and adjectives. badna really means "(subject) is a banana", and although we might call it a noun, it can take a subject (an argument) as if it were a verb. So in more familiar terms you might call it a stative verb, but stativeness is also a concept that is unknown to Lojban; all it has is predicates (which act like mathematical functions) taking one or more arguments. It's possible to leave the arguments out, with the idea that it's unimportant or obvious. I think badna used in a sentence by itself means "that which is a banana". So if I'm not mistaken, a predicate, when it is itself used as an argument of another predicate, becomes a relative clause. —CodeCat 23:32, 28 March 2013 (UTC)
- @CodeCat, you note that: "I think you have a different (and possibly incorrect) idea about the purpose and meaning of the PoS header." It's entirely possible that I do. :) If so, then the most apparent (to me, anyway) alternate interpretation for what POS headers are for is for labeling the part of speech in terms specific to the language being described, rather than the language being used for the description. This would seem to mean that we should use keiyō dōshi as a header for that class of words in Japanese, given that this class does not exist in English and that it doesn't map entirely to the English POS adjective. I'm certainly open to that argument, but is that the correct extension of what you're saying about Lojban headers?
- Any other editors out there with views on this subject? -- Eiríkr Útlendi │ Tala við mig 00:52, 29 March 2013 (UTC)
- What I think we should do is use the descriptions that are the most common when discussing that language's grammar in English. For Lojban, that definitely means using the Lojban words because those are the terms that are the most familiar for such descriptions. There is also the case of Zulu and other Bantu languages, which have two classes of adjective-like word: one is a closed class that is usually called "adjective" and has more noun-like properties, while the other is an open class and is inflected like a verb/relative clause and is called a "relative" in most grammars. I don't know what term is used for Japanese, but if "keiyō dōshi" is the most common term used to describe those words even in English, then we should probably use that here too. —CodeCat 02:25, 29 March 2013 (UTC)
- @Eirikr: IMO, Lojban is "just a weird enough language that Lojban entries are given a pass with regard to WT:ELE".
- For natural languages, I prefer to use terminology that is recognisable to English-speaking linguists. (Note that this is not the same as insisting that languages have only parts of speech that English has—English does not use circumpositions, but circumposition is a recognisable part of speech.) For Japanese, that means I oppose using "keiyō dōshi" as a header, but would be amenable to "adjectival verb", "adjectival noun" or "nounal adjective".
- Personally, I also prefer not to use specialised parts of speech when general ones are sufficient, hence Category:Abenaki nouns exists even though almost all of its entries could also, in very unhelpful analyses, be called nounal verbs or verb forms (segôgw (“skunk”, literally “(third-person singular) urinates”)), stative verbs/nouns (sips (“bird”, literally “(is a) bird”)), etc... if "adjective" would work for keiyō dōshi, I do think it ("adjective") should be considered. (Btw, some Abenaki nouns, such as kpiwi (“woods”), can be and might helpfully be listed as also adverbs.)
- Artificial languages are a different matter, because they can be (and Lojban was) designed to have unnatural structures. I have long thought our coverage of Lojban was a mess, because—as Eirikr notes—"readers need to know Lojban and obscure notation before they can make any sense of purportedly English-language definitions of Lojban terms", but I am content to let it remain a mess because I don't imagine anyone but Lojbanists making use of it. - -sche (discuss) 02:26, 29 March 2013 (UTC)
- Well, exactly. It's true you have to be a Lojbanist to understand the POS headers of our Lojban entries (Lord knows I don't understand them), but then you also have to be a Lojbanist to want to use our Lojban entries for anything. Although I too scratch my head in bewilderment at the POS headings of our Lojban entries, I accept that they're meaningful for the people interested in Lojban and that changing them to more familiar terms would be misleading at best and flat-out wrong at worst. —Angr 09:31, 29 March 2013 (UTC)
- I'm not entirely against switching these to English names, so long as they were at least as useful as the ones we have. But the goal here is not to pound a square peg into a round hole, and take words that are the same type of speech and label them differently just because they'd translate into English in different parts of speech. These aren't nouns, verbs, etc.--Prosfilaes (talk) 21:14, 30 March 2013 (UTC)
- As Eirikr pointed out as an aside, Māori doesn't really fit into our current L3 system very well, and a lot of languages I like don't. One Tongan dictionary I have simply categorises most words as substantives or verbs, and that seems to work pretty well. We really ought to step down from shoving other languages into L3s that are comfortable for us. If Lojbanists agree on a coherent way to present a very un-English language, then we need to respect that. —Μετάknowledgediscuss/deeds 18:14, 31 March 2013 (UTC)
- I thought the EN WT was intended as a many-languages-to-English dictionary? If no one here really understands Lojban entries, that seems to be a very firm indictment that our Lojban > English entries fail pretty hard.
- Even if we are to use Lojban-ish POS headers, some of these Lojban labels overlap sufficiently closely to the English labels that it beggars my understanding why we don't use the English. Leaving aside the issue of what the heck a gismu is, a rafsi seems pretty clearly to be an affix. So why don't we use the transparent label ===Affix=== for the POS header, instead of using the label ===Rafsi=== that no English speaker knows?
Underlying this query of mine about Lojban and POS headers is the very real and very deep concern about what we're doing here -- what is the point of the English Wiktionary? If the point is to serve as a many-to-English dictionary, then should we not be writing entries for English speakers?
- (I'm not shouting here, I just really want to emphasize that question.) -- Eiríkr Útlendi │ Tala við mig 18:37, 1 April 2013 (UTC)
- I've always thought that was our target and the justification for getting the resources that we get from WMF. I am not at all sure that they are happy with our net contribution. I have no objection to all kinds of technical and obscure linguistic terms and concepts being used here, but they should almost certainly not be displayed by default to casual users. L3 headers are a prime example of what should be somewhat intelligible to normal users, but also glosses and "context" tags. Non-English L2's that contain definitions that use words that we label obsolete, dated, or archaic should be cleaned up, with more current words substituted. Gratuitous use of arcane terms when a more ordinary and current term is a synonym or near-synonym should be avoided in English definitions as well. And "grammar" context labels really don't need to use terms like ambitransitive, ergative, ditransitive when modest rewriting can eliminate the need.
- I think we need the capability to have context labels that by default do not display, but which can be displayed by user preference. This might allow us to have our cake and eat it too in most cases. DCDuring TALK 19:17, 1 April 2013 (UTC)
- Re: just the bit about labels that don't display by default:
- Sounds like the equivalent of "advanced" settings in configuration UIs. :) That could be handled using CSS classes in the context / label templates, no? Should be relatively easy to implement. -- Eiríkr Útlendi │ Tala við mig 19:24, 1 April 2013 (UTC)
- One might think so, but {{context}} is surprisingly complicated. Ruakh said he thought it was among the templates that most merited Lua/Scribunto-ization. From my limited knowledge of CSS, it would seem that we would want each context tag to have attributes that determined whether it was hidden by default and categorized. There are further questions as to whether topical categories should display differently when they do not reflect a limited usage domain (e.g., airplane is topically in an aviation category, but is not at all limited in its usage). DCDuring TALK 20:24, 1 April 2013 (UTC)
- If we're writing entries for English speakers, why do all these entries have crap about gender? Words don't have gender!
- If you actually want to use the dictionary, you probably don't want it dumbed down for people who don't know anything about the language. If you're an English speaker who knows nothing of the language, our dictionary--any dictionary--will drive you nuts. Wann fängt man an, das Haus zu sanieren? We don't even mention that an is a separable prefix in German (which we probably should), but even if we did, the English reader would still have to figure out whether it's a separable prefix (of what? sanieren? Wann? Probably man, because that's closest) or one of many prepositions.
- To grab another example, suomi is of type ovi (?!?) and has declensions of inessive, illative, adessive, abelative, allative, essive, translative, abessive, and comitative. They all look made-up to me, and even with the language-savvy people here, I'll be impressed if you've already realized I made up one and know which one that is. I, however, do not believe that just because I don't understand it, there must be an easier way to describe Finnish.
- People who actually want to use our Lojban entries know what gismu means, because if you don't, you don't have a shot in hell of making more than word salad from a Lojban sentence no matter what we do. There's an argument for using affix instead of rafsi, but to me there's a stronger argument for using a consistent set of terminology for Lojban instead of mixing gismu and affix. If you wish to propose a set of replacements for the Lojban POS, I'm all ears, but consistency and usefulness to someone who knows enough Lojban to actually use a Lojban-English dictionary is important to me.--Prosfilaes (talk) 21:50, 1 April 2013 (UTC)
- I agree with everything Prosfilaes said... I couldn't have said it better! —CodeCat 22:25, 1 April 2013 (UTC)
- I'd love some evidence about who actually uses en.wikt for Lojban. I expect that most of the learning that takes place is by the contributors.
- If someone is a serious student of Lojban, for how long in the language learning process do they need English glosses? There is a Lojban Wiktionary, after all, just a click away for a non-novice user, which gives the Lojban PoS. DCDuring TALK 22:34, 1 April 2013 (UTC)
- So the world can be divided up into people that don't need Lojban definitions because they don't know enough Lojban grammar, and those who can use a Lojban dictionary? Why is that true of Lojban and not any other language?--Prosfilaes (talk) 01:17, 2 April 2013 (UTC)
- (...after edit conflict...) Re: Finnish:
- ovi isn't the POS. That's just plain ===Noun===. There's also a link, leading to an explanation in relatively clear, if somewhat obscure, English about what an ovi-type is. (FWIW, ovi means door, and is listed in that header as an exemplar term for that class of nouns.)
- By contrast, gismu is given as the POS. There's no link. Manually looking up the term gives me a gibberish definition that's more notation than description, such that I have no clear explanation of what this is, even if I take the bother to try to find out. I note that other editors have commented that the definition of gismu, such as it is, leaves them similarly confused.
- I've previously made the argument that Lojban terms that are labels for persons, places, or things should be listed under a ===Noun=== header, while those terms that are labels for actions should be listed under a ===Verb=== header, specifically from the point of view that noun and verb are generally understandable English terms, while gismu is not. The fact that a noun-ish gismu can be used predicatively strikes me much more as a matter of syntax, and thus something that should be explained in Appendix:Lojban_grammar or some similar place. I find it interesting that Category:Lojban_appendices doesn't even exist, particularly given how deliberately odd this language is.
- As best I can tell from reading the gismu section of the EN WP Lojban grammar article, the label gismu is more a statement of the term's intended functioning within the language as a root word from which other words may be constructed. This might be helpful for learning about the morphology, but it tells us nothing about the semantics, and is not a very useful POS label for lexicographical purposes. One may as well adopt a similar position for root English terms like talk or sink, and simply list all senses under a single ===Root=== POS header. -- Eiríkr Útlendi │ Tala við mig 22:52, 1 April 2013 (UTC)
- And as an aside about gender, while English may not have grammatical gender aside from pronouns, the language does have the word gender, complete with an entry that is understandable by English readers. Grammatical gender and other features of non-English languages are at least described in English. Lojban alone seems to be described using Lojban. -- Eiríkr Útlendi │ Tala við mig 22:58, 1 April 2013 (UTC)
- I feel like you missed much of the import of what Prosfilaes said. You still seem to be intent on catering to the lowest common denominator without thinking about how Lojban dictionaries are written. The only conclusion I draw is that we should wikilink L3 headers that use obscure words. That's a policy I could support. PS: As soon as I saw the joke grammatical case, I noticed it and it started to bother me. But declension is a special interest of mine, even though I know no Finnish. —Μετάknowledgediscuss/deeds 01:08, 2 April 2013 (UTC)
- I don't see the distinction between the POS and anything else there. Adessive is about as much an English word as gismu; that is, complete and total gibberish to the vast majority of English speakers, but hopefully familiar to the people who need to know what it is. Go ahead and improve gismu; I don't see that as relevant.
- I think this is getting confused, because you're conflating two somewhat different things, the choices of the words we use for the parts of speech and what the parts of speech are themselves. As I said, I'm open to discussion on the first part. The second part, however, is absurd. Are we going to cram adessive words in with nominative, dative, accusative or genitive words? Because those are the English options, and the fact that all Finnish grammarians put them in their own class is irrelevant. Gismu is a part of speech in Lojban; stop trying to force it into English shaped boxes.--Prosfilaes (talk) 01:17, 2 April 2013 (UTC)
- Well... technically gismu isn't a part of speech. As far as I know a gismu is a kind of brivla that is not a compound or foreign word, but is a basic root, much as you might call green an English root word but not greenish or blue-green. Brivla is the actual part of speech, because the different kinds of brivla, gismu included, are interchangeable. However, Lojban is unusual in that it generally requires loanwords to be specially marked as such using particles, so technically "loanword" (called fu'ivla in Lojban) is a part of speech in Lojban. It's comparable to the Japanese use of Katakana, except that the distinction is also spoken. —CodeCat 01:44, 2 April 2013 (UTC)
- Allow me to restate. What is the purpose of the EN WT? If it is to serve as a many-to-English dictionary, should we not write entries that English readers can understand?
- It seems no one here understands the word gismu well enough to explain what one is. This therefore seems to be a very poor choice of term to use as an entry label, especially one as important as the POS header.
- Some have brought up the issue that other labels can also be obscure. While I might agree that adessive is not a word I use in daily conversation, and not one that I can claim to know very well, I must also point out that at least we have the resources available here for interested readers to find out what adessive means. I'd also like to point out that adessive is not a POS, but rather a declensionary subset of nouns. Much as adessive is a category of sorts for nouns, I see no reason why gismu (or, ideally, something more intelligible to the general English-reading public, such as root word) could not be a category for Lojban nouns, verbs, etc.
- @Prosfilaes, you asked above,
So the world can be divided up into people that don't need Lojban definitions because they don't know enough Lojban grammar, and those who can use a Lojban dictionary? Why is that true of Lojban and not any other language?
- Ironically, that's a different take on part of what I'm asking, only I think you're trying to make the opposite point. My point is that the labels for Lojban entries are in Lojban, which no one here seems to know very well, and for which we have pretty wholly inadequate definitions -- inadequate for any English reader who doesn't already know Lojban. So why is that true of Lojban and not other languages? It seems to be true in part because we're using Lojban to describe Lojban.
- Reading through w:Part of speech, I see plenty of room for argument that a Lojban > English dictionary would do well to use labels like noun, verb, etc. Note that a dictionary that only Lojbanists can use does not meet my definition of a Lojban > English dictionary.
- -- Eiríkr Útlendi │ Tala við mig 05:50, 2 April 2013 (UTC)
- Then your definition of a Lojban > English dictionary is pretty weird and limited. As above, to make proper use of a dictionary like ours, you have to know something of both languages. There's other choices out there; I'm sure there are Germanic dictionaries that store Yiddish and Gothic in the Latin script, since most English speakers don't read Hebrew script. It certainly would help me when investigating what part Yiddish vocabulary played in Esperanto, but instead we made a Yiddish -> English dictionary that can only be used by people with some familiarity with Yiddish.
- Reading through w:Part of speech, a general summary of a subject with emphasis on Latin and English, tells you everything you need to know about the structure of Lojban? No disrespect, but the Dunning–Kruger effect comes to mind. You haven't established that "noun" and "verb" are reasonable headings to use for Lojban words. Again, which words are used as Lojban POS labels is much more up for discussion to me than whether or not Lojban words should be split up into categories that Lojban grammarians don't use.
- Moreover, who is actually working on Lojban? If nobody who knows Lojban is interested in splitting them into nouns and verbs, the idea is dead on arrival. Renaming is doable by bot; rearrangement needing intelligence isn't.--Prosfilaes (talk) 07:25, 2 April 2013 (UTC)
- Well, gast my flabber. With your comment that my "definition of a Lojban > English dictionary is pretty weird and limited" (i.e., my definition that a Lojban > English dictionary must be something that doesn't require being a Lojbanist to use), you seem to be saying that the EN WT is not a many-to-English dictionary, but rather a project for specialists in their individual fields (i.e., in this specific case, that the Lojban portion of the EN WT should only be for Lojbanists, and need not be intelligible to anyone else). Do I understand you correctly? Is that what you are saying? If that's not your intended meaning, then clearly I'm confused as to your position. I'm increasingly getting the sense that you and I are talking past each other. -- Eiríkr Útlendi │ Tala við mig 15:24, 2 April 2013 (UTC)
- Did you read beyond that? A Yiddish-English dictionary written in Hebrew script, like ours and most are, is not intelligible to English speakers. Any Foo-to-English dictionary is a tool for people with some knowledge of Foo. People who know enough Lojban to understand gismu and rafsi aren't specialists; they're the natural market for a Lojban to English dictionary.--Prosfilaes (talk) 21:45, 2 April 2013 (UTC)
- We do try to get contributors in languages that use non-Latin script to take the trouble to add non-idiosyncratic transliterations that can be found from our search box, don't we? Or is that too much to expect, too much catering to the lowest common denominator, the most scriptively challenged?
- The use of the Gismu (“Root”) PoS header would almost certainly not have been tolerated if folks knew that's all that was meant. We don't have "Root" as a PoS for any other language, do we? That is not because other languages don't have such morphemes. It is because the Lojban PoS headers effectively excluded outside review. The Lojban "parts of speech" seem to confound etymological ("loanword"), morphological, and functional categories even more than our PoS headers for English. And a "name word" can't be called a Noun? Would we accept Metaphor as a PoS?
- Perhaps the real problem is that Lojban is not a natural language and does not belong in a dictionary of natural languages. DCDuring TALK 22:34, 2 April 2013 (UTC)
- It's amazing the number of people who argue that we don't know what gismu means who also argue that we know what it means.
- I've never opposed adjusting the PoS of Lojban. So far nobody that knows Lojban has suggested it and proposed an adjustment, and certainly nobody has offered to make the complex changes that are being proposed here. And the changes you're proposing are not trivial; is red an adjective, a noun (a name word, "red object"), or a verb (łichííʼ)? The lines between adjective and noun can be thin enough in English; you're telling me we know certainly that it must exist in Lojban?--Prosfilaes (talk) 23:30, 2 April 2013 (UTC)
- (...after edit conflict...) Yes, I read beyond that. And I read enough into it to realize that, if our underlying operating assumptions were so radically different, then many of the points I've been trying to communicate would not be understood in the way I had intended them. Consequently, I thought it best to try to clarify what your operating assumptions are regarding the goal of the EN WT.
- Incidentally, I still don't know that -- what is your take on the point of the English Wiktionary? Is it to be a many-to-English dictionary, or is it to be a collection of specialist reference materials?
- You mention Yiddish. I don't work on Yiddish, and I have almost no knowledge about that. I've run into similar problems of intelligibility in researching etymologies that led me to Sanskrit terms that had no romanization. This site has some very profound issues in terms of information accessibility -- I think either that many of us have gotten so far into our specialized mindsets that we forget that beginners can't understand what we write, or that many of us do not have any real goal of making this information available and understandable, or at the very least discoverable, to the average English reader. (This covers cases like adessive as a label, which can be discovered and at least somewhat understood by clicking on the entry, but not gismu, which no one here seems to understand, and which entry is quite opaque.)
- My personal point in tackling Japanese entries here is in part to lower the barriers of entry to English readers who might try to understand that language. I recall my own painful frustration at feeling like I needed to read the whole dictionary just to be able to understand a single entry. I had multiple reference books commonly open on my desk -- an EN > JA dictionary, a JA > EN dictionary, and a kanji dictionary so I could try to puzzle out the other two. Until Kenkyusha came along with a furigana dictionary for non-Japanese-speaking learners, working with any JA dictionary was a real chore -- the information was not very accessible, nor very discoverable. Wiktionary offers tools to overcome these two profound issues, and I hope that some of my work here might make things easier for the next generation of English-reading learners of Japanese.
- My understanding of the whole underlying point of the English Wiktionary is that it is intended to be a many-to-English dictionary, as a resource targeting the average English-reading dictionary user, with (ideally) no bias towards any specific target language. My misgivings about Lojban, and indeed about lack of romanization for Sanskrit or Yiddish or Georgian entries, arise entirely from this foundational precept.
- Prosfilaes, what is your view of the basic point of the English Wiktionary? Is the ideal to create a resource for potentially any reasonably fluent English-language reader? Or is the ideal to create a resource for specialists in the field? -- Eiríkr Útlendi │ Tala við mig 22:47, 2 April 2013 (UTC)
- Is it to be a many-to-English dictionary, or is it to be a collection of specialist reference materials?? You're trying to force a dichotomy that I do not accept in the least. The Oxford English Dictionary is a rare specialist reference material; why would you expect a many-to-English version of that to be any different? The places where we fall short of being maximally generalist, in deleting things that are SoP yet people might look up, are things that lean towards specialist reference material as opposed to the more generalist dictionaries.
- You can't click on adessive and find out what it means; it doesn't link anywhere. You could improve gismu, and until you do I don't think you're competent to discuss splitting up the Lojban PoS.
- The Lojban entries in Wiktionary will be most used by people who know some Lojban. Since all the grammars use words like gismu, that means those words will be familiar to the people who are actually using Lojban entries in Wiktionary. Changing them to some other words will probably make it harder for the people trying to use those entries. Changing them to something that doesn't reflect the underlying nature of the language will be a substantial dumbing down that is not worthy of any work calling itself a dictionary and will seriously hurt the few people actually trying to use these entries.--Prosfilaes (talk) 00:57, 3 April 2013 (UTC)
- I haven't read the full discussion but having specific POS for some languages is not so uncommon. Consider Arabic "masdar" (مصدر (máṣdar)), sometimes called a "verbal noun" but that's only an approximation, the usage of masdars is wider. Russian predicatives (Category:Russian_predicatives) were also considered non-standard at some stage. We have to cater for the English audience but also educate users about language-specific parts of speech. The Japanese の adjectives (Category:Japanese の-no adjectives) got deleted, even though English linguists often use this term, because Japanese linguists consider them noun phrases, e.g. 病気の人 (byōki no hito) "a sick person", lit. "person of the sickness". I have no hard feelings about the deleted category but can this collocation really be considered the same as "fur coat" by the English reader? A new term "adjectival noun" is used to describe such nouns. Just my two cents. I haven't read the full discussion, so it may not be quite to the point. --Anatoli (обсудить/вклад) 01:21, 3 April 2013 (UTC)
- Because my previous comment did not make this clear, I want to point out that I oppose changing Lojban's headers from "gismu" etc to "noun" etc. It would be inaccurate to call gismus "nouns"—Lojban was designed, as artificial languages can be, not to have nouns, verbs etc, with the unsurprising result that English-speaking linguists don't describe it as having nouns, verbs etc. ("Lojban nouns" is barely attested on raw Google, and nonexistent in Books.) - -sche (discuss) 03:38, 3 April 2013 (UTC)
- If it helps, I've changed the definitions of gismu and brivla somewhat, hopefully to make it more clear. I'm not sure if "gismu" is actually a part of speech in Lojban, I think brivla really is the PoS. On the other hand, a fu'ivla (loanword) often behaves differently from other brivla so the idea that all brivla are interchangeable doesn't work here. They do all work more or less the same, though. —CodeCat 14:05, 3 April 2013 (UTC)
- I oppose changing the headers to noun, etc, but I think it would probably be helpful if we could link the headers to an Appendix:Lojban gismu which could explain exactly what it means. --Yair rand (talk) 01:34, 4 April 2013 (UTC)
Romanization and definition line
FYI: Wiktionary:Votes/pl-2013-03/Romanization and definition line.
Let us discuss as needed, and then start the vote; let us postpone the vote as needed.
I propose that each romanization entry is required to have a definition line in the wikitext. This is already the case with Pinyin romanization entries and Gothic romanization entries. Entries that direct the reader to another page hosting definitions include alternative form entries and inflected form entries; these do have a definition line in the wikitext per common practice. See also Category:Gothic romanizations and Category:Mandarin pinyin. --Dan Polansky (talk) 18:01, 30 March 2013 (UTC)
Recognizing User-Page Spam
We've been getting lots of user pages lately created by spambots. Looking through the deletion logs, I see quite a few deletion comments showing that admins aren't recognizing them for what they are.
Spam isn't just for putting advertising where people will see it. The main purpose, lately, is to get Google to see it. Google partly bases the order in which search results are listed on how many links and references there are on high-traffic sites like ours. The content of the spam page is irrelevant: as long as it contains links to the site or key phrases associated with links to the site anywhere on the page, it will improve the page ranking on Google.
The most common type of user-page spam is text taken from other websites with links and brand names hidden in it. The purpose of the text is to camouflage the links and brand names so the page is less likely to be deleted. Such pages aren't just spam, they're also copyvio. Delete them as promotional material, and permanently block the poster for spamming (some bots will re-create the pages if they're deleted and the account isn't blocked).
The other type is a fake profile, with a randomly-generated name, randomly-generated personal details, and a spam link said to be a favorite site or the user's home page. The combination can be really funny: a user name starting with "Dave" that belongs to a 16-year-old girl in Switzerland who likes horses, and whose favorite site sells erectile-dysfunction meds, or tractors in India. These should also be deleted as promotional material and the user permanently blocked.
The first of the types I just mentioned also gets posted to talk pages, so it should be treated as spam there, as well. Chuck Entz (talk) 18:48, 30 March 2013 (UTC)
- The easiest way to patrol these is to look for the "new-user-page" tag. Not all edits with that tag are bad, but most of them will be. —CodeCat 19:13, 30 March 2013 (UTC)
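A minimal sketch of how that patrol could be scripted, assuming the standard MediaWiki API endpoint at /w/api.php and the recent-changes `rctag` filter; the helper name `tagged_changes_url` is hypothetical, and the snippet only builds the query URL rather than fetching anything:

```python
from urllib.parse import urlencode

# Assumed location of the MediaWiki API for English Wiktionary.
API_ENDPOINT = "https://en.wiktionary.org/w/api.php"


def tagged_changes_url(tag: str, limit: int = 50) -> str:
    """Build a recent-changes query restricted to edits carrying one change tag."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctag": tag,                       # only changes with this tag
        "rcprop": "title|user|timestamp",   # fields to return per change
        "rclimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # The tag name is the one mentioned in the comment above.
    print(tagged_changes_url("new-user-page"))
```

As noted above, not every hit will be spam, so the result is a review list, not a deletion list.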
Standards of identity and legal definitions
Following this RFD, this BP discussion, and this old discussion, I have created Wiktionary:Votes/pl-2013-03/Standards of identity and legal definitions of terms. Most of the credit for the proposal, which I hope I've done an adequate job of wording, goes to bd2412. Please discuss here or on the vote's talk page any change to the wording or any entirely different approach you'd like to see, after you read the previous BP discussion (link) for background. - -sche (discuss) 21:34, 30 March 2013 (UTC)
April 2013
Idea for proper noun entries that belong in an encyclopedia
There seem to be a lot of proper nouns that show up on WT:RFD. Many of these have articles in the EN WP. Since people are clearly looking for these entries, and some editors mistakenly think such entries belong here, while some readers mistakenly think they can find those entries here, it's clear there's some demand for having proper noun entries here at EN WT.
What would folks say to allowing the creation of proper noun entries, such as Mona Lisa or Mini Cooper or Hound of the Baskervilles, but just as redirects (soft or hard, as deemed appropriate) to the corresponding EN WP article? This would meet the apparent demand for such entries, while not wasting EN WT editor time writing and maintaining them, and while avoiding the inclusion of encyclopedic material in this dictionary project. -- Eiríkr Útlendi │ Tala við mig 17:06, 2 April 2013 (UTC)
- I don't think hard redirects to Wikipedia are even possible; they'd have to be soft. Wikipedia itself already has w:Template:Wiktionary redirect for pages that will only ever be dictionary entries; all we need to do is make a corresponding template here. —Angr 17:32, 2 April 2013 (UTC)
- Sounds good to me. I see, however, that Semper Blotto deleted Template:Wikipedia redirect way back in 2006... -- Eiríkr Útlendi │ Tala við mig 17:35, 2 April 2013 (UTC)
- I don't see why this is a good idea. How is it better than just not having the entries at all? How do we decide which entries need {{only in|{{in wikipedia}}}} and which are red links? Or do we create such redirects for all entry titles which have Wikipedia articles? Mglovesfun (talk) 17:38, 2 April 2013 (UTC)
- It seems fine to create {{only in}} redirects to WP for all proper nouns (Why not all entries of any kind?) for which we do not have an entry. Editors can replace the redirect with an entry, which is subject to the usual reviews. At the very least we should use the redirects for proper noun entries that have failed RfD for whatever reason. DCDuring TALK 17:58, 2 April 2013 (UTC)
- Sorry, I thought my initial comment explains the "why" -- users, both as editors and as readers, are clearly coming to Wiktionary in search of such entries.
- As to which entries to convert, any proper noun entry that editors think should not be in Wiktionary would be a candidate for such redirection. If deemed necessary for clarity, the redirection template could include text explaining that Wikipedia might not yet have such an article, but that if anyone were to create such an article, it belongs in Wikipedia and not here.
- I'm simply floating an idea about how to respond to apparent user demand for encyclopedic proper noun entries in a way that 1) meets that demand, 2) points users to the appropriate place for such entries, and 3) doesn't require much work from editors. -- Eiríkr Útlendi │ Tala við mig 17:59, 2 April 2013 (UTC)
A good idea, but there already is a page that comes up when someone goes to an undefined proper noun. See, for example Starry Night, Mini Cooper S, or A Study in Scarlet. It just doesn’t serve the required needs. The page that comes up for starry night, mini cooper s, or a study in scarlet is a bit better, but still could be improved.
I wonder if there is a way to improve “perhaps there is a page xxx in our sister encyclopedia project, Wikipedia.”
Anyway, let’s improve the 404 page instead of reinventing the wheel. —Michael Z. 2013-04-02 18:18 z
- The fact that people are searching for things doesn't mean we should include them, even as redirects to Wikipedia. The number one search on a user-generated replacement for Special:WantedPages was in fact the Mandarin for 'naked porno movies'. Mglovesfun (talk) 20:59, 2 April 2013 (UTC)
- Well, dang it, someone should create that Wikipedia article already.
- <ahem.> On a more serious note, the issue is not just that folks are searching for such pages, but that they are actually creating them. This generates maintenance overhead for WT editors. Redirecting users to Wikipedia might help reduce this overhead. -- Eiríkr Útlendi │ Tala við mig 21:03, 2 April 2013 (UTC)
- But redirects are bluelinks. If we make tens of thousands of redirects, how will anyone notice the few bluelinks which have, wrongly, been created as full entries that we (by our current policies and culture) tend to subject to WT:RFD? I agree with Michael: improve the "404" that comes up when someone clicks on [[Some Proper Noun]], goes to [11] or uses the search bar to search for "Some Proper Noun". - -sche (discuss) 21:28, 2 April 2013 (UTC)
- We already make color distinctions in our links: a lighter blue for links to other projects, orange for links with the wrong section. A bot could replace links to {{only in}} entries with {{w}} links or "w:" piped plainlinks. Improving the 404 only partially addresses the problem, though it has the enormous advantage of, in principle, being easier to implement. DCDuring TALK 22:04, 2 April 2013 (UTC)
- I have never noticed light blue or orange, and as far as I know I have good computer displays and good color vision. —Michael Z. 2013-04-03 14:28 z
- Orange links have to be turned on in your Per-browser preferences; as for light blue links, don't you see a difference between blue and blue? For me the difference is subtle but real. —Angr 15:24, 3 April 2013 (UTC)
- Exactly. We also have greenlinks for no page corresponding to inflected forms, if you have the gadget for accelerated creation of these selected on user preferences. (See conquest#Verb.) DCDuring TALK 15:33, 3 April 2013 (UTC)
- I realize that I had possibly misinterpreted Mglovesfun's previous comment as suggesting we shouldn't even rework our 404. To clarify, I am not advocating that we start creating scores of pages solely for the purpose of redirecting to WP. My intent instead was originally just to ask if perhaps proper noun pages, particularly those that fail RFD (which I should have stated more specifically earlier), would benefit by having redirects to WP. Michael's suggestion of reworking our 404 sounds like a wonderful idea, either alongside specific redirects for pages that failed RFD, or as a replacement for that idea. -- Eiríkr Útlendi │ Tala við mig 22:19, 2 April 2013 (UTC)
- Then I'll back Michael's idea as well. Mglovesfun (talk) 09:16, 3 April 2013 (UTC)
- MediaWiki:Noarticletext contains the "Wiktionary does not yet have a mediawiki page for Noarticletext" message; you can change the message by editing that page. (There's also MediaWiki:Noexactmatch, but I don't know that it's used anywhere.) MediaWiki:Searchmenu-new, and possibly other pages, control(s) what's displayed when someone searches for a term we don't have. - -sche (discuss) 20:22, 3 April 2013 (UTC)
- Thanks. And do you know where to find the 404-from-a-link page, e.g. mini cooper s, and the additional wrong-case message added to Mini Cooper S? —Michael Z. 2013-04-03 21:09 z
- I presume it's one of these pages, but I don't know which one. - -sche (discuss) 21:40, 4 April 2013 (UTC)
- The easiest way to find out is to visit http://en.wiktionary.org/w/index.php?title=mini_cooper_s&action=edit&redlink=1&uselang=qqx and examine the indicated messages. For example, (creating: mini cooper s) holds the place of a message generated by MediaWiki:Creating with $1 set to mini cooper s. (qqx is in the "private use" range of language-codes, so some enterprising MediaWiki developer decided to appropriate it for this purpose. I'm guessing the feature's primary target audience was interface translators, so they could find the message that they need to translate, but I've found it very useful myself.) —RuakhTALK 04:39, 8 April 2013 (UTC)
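The qqx trick above can be sketched as a small helper that builds the same kind of URL for any page title. This is only an illustration: the function name `qqx_url` is hypothetical, and the parameters are simply the ones visible in the example link.

```python
from urllib.parse import urlencode

# Base path taken from the example link in the comment above.
INDEX_PHP = "http://en.wiktionary.org/w/index.php"


def qqx_url(title: str) -> str:
    """Return the edit-a-redlink URL for `title` with message keys shown (uselang=qqx)."""
    params = {
        "title": title.replace(" ", "_"),  # MediaWiki writes spaces as underscores
        "action": "edit",
        "redlink": 1,
        "uselang": "qqx",                  # show message names instead of message text
    }
    return f"{INDEX_PHP}?{urlencode(params)}"


if __name__ == "__main__":
    print(qqx_url("mini cooper s"))
```

Opening the resulting link shows each interface message as its key in parentheses, which identifies the MediaWiki: page to edit.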
Gothic romanisation template
I have created Template:got-romanization (different from Template:got-romanization of!) and a sample entry "afdrausjan" (modified to use the new template). As with Japanese Template:ja-romaji the definition line with # is generated by the template, so it has both the headword and a definition. It has the same look and feel as a new romaji entry. Like Japanese, the Gothic entries only link to the main entry, no other information. --Anatoli (обсудить/вклад) 03:38, 4 April 2013 (UTC)
- How is it different from {{got-romanization of}}? The output seems to be the same: they both say, "See XYZ" where XYZ is the spelling in the Gothic alphabet. I preferred it when it said "Romanization of", though. —Angr 10:52, 4 April 2013 (UTC)
- It's an attempt to make romanisation entries of different languages more similar to each other. Template:ja-romaji is increasingly used for Japanese romaji entries, and Dan Polansky has created two votes in protest at the change that was agreed on by JA editors after a very long discussion in BP. The votes: 1. Wiktionary:Votes/pl-2013-03/Japanese Romaji romanization - format and content and 2. Wiktionary:Votes/pl-2013-03/Romanization and definition line. The second vote is specifically about how the definition line is added. Usually it's # on a new line in the wikitext. The new Japanese and the proposed Gothic templates generate the definition line, so it is not directly editable.
- User:Mzajac raised a concern that Japanese and Gothic are different from each other. Both Japanese and Gothic by default don't produce any definition as such, only a link to the main entry. Using a template will enforce this rule. The definition line will still be there (thus complying with Wiktionary:ELE#Definitions) but a new definition line is only added when a new parameter is added. The suggested template is much shorter and as proved by the current work on Japanese romaji entries can be generated very quickly both by people and bots.
- Re: "See" and "Romanization of". Again, just to make both templates (Gothic and Japanese) look similar. There's already the word "romanization" at the header level.
- New:
==Gothic==
===Romanization===
{{got-romanization|𐌰𐍆𐌳𐍂𐌰𐌿𐍃𐌾𐌰𐌽}}
-
- Old:
==Gothic==
===Romanization===
{{got-rom}}
# {{got-romanization of|𐌰𐍆𐌳𐍂𐌰𐌿𐍃𐌾𐌰𐌽}}
--Anatoli (обсудить/вклад) 11:23, 4 April 2013 (UTC)
Transpondine Portuguese
There is nothing in Wiktionary:About Portuguese concerning spellings on opposite sides of the Atlantic. I have been adding Brazilian forms as "alternative forms" of the spelling used in Portugal. But often, I see that the Portuguese Wiktionary does the exact opposite. Does anyone have an opinion on what we should do - or should it be up to the personal preference of our editors? SemperBlotto (talk) 15:55, 4 April 2013 (UTC)
Possible inadequacies in Template:Han char
In a discussion with User:Gdbf137, we discovered that Mac and MS seem to use different Cangjie input sequences. The Unihan database entry for 农 gives a Cangjie input sequence of LBV. Apparently, that works correctly on Mac OS X Lion. On Windows 7, however, MS's Cangjie IME accepts HBV to input this character, while LBV just generates an error beep and no character is output.
Does anyone else have a handle on what's going on? Do we need someone to change {{Han char}} to allow for multiple Cangjie input strings, one per OS? Or, more frighteningly, has Microsoft and/or Apple been changing things willy-nilly, and we need to allow for multiple Cangjie input strings, one per OS version? -- Eiríkr Útlendi │ Tala við mig 17:35, 4 April 2013 (UTC)
- Is this related to Cangjie_input_method#Versions_of_Cangjie? "Currently, version 3 (第三代倉頡) is the most common; it is the version of Cangjie supported natively by Microsoft Windows ... The Cangjie input method supported on the Mac OS is somewhat like Version 3 and somewhat like Version 5." I don't know what the solution to this would be other than to specify what version the template is referring to. DTLHS (talk) 04:49, 5 April 2013 (UTC)
Cross-script/mutated semi-borrowings
This seems to be a repeated question, and it's come up again at Wiktionary:Requests for deletion#da. What do we do with half-borrowed words? Stuff like "da", which is clearly a Russian word being used in English, or "si", which is clearly a Spanish word being used in English, even if both would never be spelled that way in their original language. google books:si senor gets a lot of English hits, even once we've excluded "Sí, Señor". Writings across the world are dropping a little bit of foreign language that their audience will understand in their text, and whenever there are orthographic differences, we'll probably see this type of change. "Da" can probably be attested in every major European language in this sense. Instead of creating senses under da for all the languages, maybe we could create an {{orthographically mangled (for foreigners) version of|да}} template (name to be changed, of course) and stick it under Russian. Same thing with si and danke schon and probably some mangled Latin we've deleted, etc. (This doesn't intend to change real borrowings, just one language stuck into another.) (The template could maybe use a foreign lang tag, so {{orthographically mangled (for foreigners) version of|old_lang=de|new_lang=en|[[danke schön#]]}}; I do suspect that da and friends are used in multiple Latin-script languages, but it's too common a particle to make that easy to check.)--Prosfilaes (talk) 05:17, 5 April 2013 (UTC)
- There are so many edge cases that it's hard to draw the line. Da might be meant as transliterated Russian, in which case WT:ARU disallows its existence. But I have a Hispanophone friend who sometimes says /siː ˈsɛn.nɚ/ as a joke, and si senor might be a valid English entry. I don't think you've made it crystal clear when to use this hypothetical template and when to create a normal entry, so I can't really support it yet. —Μετάknowledgediscuss/deeds 19:50, 7 April 2013 (UTC)
- I'm confused too. What if we use the normal process of finding citations? Words from one language used in another could be labeled as such, using {{context}} or {{qualifier}}. It's dangerous if we go too far, e.g. if we start quoting all English words in Latin letters in another language, especially in non-Roman based languages. For the moment, I wouldn't go with romanised Russian either. --Anatoli (обсудить/вклад) 02:36, 8 April 2013 (UTC)
- I'm not suggesting we don't find citations; I'm worried about the stuff where we can find plentiful citations that establish it's between two languages. What I'm most concerned about is stuff like "Do svidanya", which English speakers can find in English texts and want to look up, but which is likely to get treated as Russian and then get deleted because it's romanized. There seems to be a hole here where things that can be cited, and might actually get looked up, are deleted because they aren't real Russian or Latin, etc. I think si senor is a good example; it's not English, it's clearly Spanish or at least pseudo-Spanish. But we deleted danke schon for the same reasons, as not German. I'm not comfortable that, if it's created as English, it will survive. I am sure that eminently citable words and phrases like that need to be stored on Wiktionary in the spelling that people will find them used under, and what language tags them is less important than that.--Prosfilaes (talk) 06:27, 9 April 2013 (UTC)
- I'm not sure I understand your suggestion. We could have redirects for commonly known foreign words if they are incorrectly spelled or written in the wrong script. do svidanya -> до свидания, danke schon -> danke schön (danke schon previously failed RFD but I see in the history, it was a full entry, not a redirect). Note: schon in German is a different word from schön. I don't think si senor or si señor merit an entry, English or Spanish. konnichi wa already exists as a romaji entry and can be looked up. --Anatoli (обсудить/вклад) 06:58, 9 April 2013 (UTC)
Template:sense
This template is used to label specific synonyms or antonyms. With antonyms that leads to problems though, like this edit shows: diff. People get confused because they expect that the sense being shown is the sense of the words listed after it. And that isn't really a strange assumption either, except that it's not how we use the template. So, would it be ok if some extra text were added to the template, so that it displays this instead: of the sense "(sense)" ? —CodeCat 14:19, 7 April 2013 (UTC)
- What is the before and after of your proposal in the general case? DCDuring TALK 14:54, 7 April 2013 (UTC)
- What do you mean? —CodeCat 16:05, 7 April 2013 (UTC)
- This is a fairly well-known problem; what are you actually proposing? Mglovesfun (talk) 16:34, 7 April 2013 (UTC)
- Um... I'm proposing to change the text that the template displays, like I said? —CodeCat 17:24, 7 April 2013 (UTC)
- To exactly what. DCDuring TALK 18:51, 7 April 2013 (UTC)
- Quote: "So, would it be ok if some extra text were added to the template, so that it displays this instead: of the sense "(sense)" ?" —CodeCat 18:58, 7 April 2013 (UTC)
- If anything, it should display "definition" or "def". "Sense" communicates mostly to us, perhaps to linguists. DCDuring TALK 19:12, 7 April 2013 (UTC)
- Status quo ante:
- CodeCat proposal (as I [incorrectly] understood it):
- CodeCat proposal (from below):
- Alternative proposal 1:
- Alternative proposal 2:
- There are numerous other arrangements of brackets, font types, and wording possible. I don't know that any of these will solve the problem of communicating the intent of the antonym section (and the less familiar semantic relations) while simply providing a breadcrumb back to the definition. We could also try putting "NOT" in front of the gloss for the antonyms heading only or we could skip trying to communicate to ordinary users. DCDuring TALK 19:42, 7 April 2013 (UTC)
- @CodeCat you said you wanted to change the text of the template, just not what you wanted to change it to. Mglovesfun (talk) 19:48, 7 April 2013 (UTC)
- I’ve proposed:
- in a previous discussion. — Ungoliant (Falai) 11:40, 9 April 2013 (UTC)
- Missed that. It has the advantage of brevity over the other proposals. And it makes sense if one reads linearly from the headings to the individual items: 'Antonyms of "definition"', 'Coordinate terms of "definition"' etc. How could a user misread it? Perhaps by ignoring the quotation marks and reading the "of" as part of the following text. Should "of" also be italicized? DCDuring TALK 11:57, 9 April 2013 (UTC)
- If the gloss is italicised and the ‘of’ isn’t, it will help prevent misreading. — Ungoliant (Falai) 12:05, 9 April 2013 (UTC)
- Actually I meant to include quotes around the sense, but that kind of got lost in translation. —CodeCat 13:01, 9 April 2013 (UTC)
- So some possibilities with "of" are:
- (of definition):
- (of "definition"):
- (of definition):
- (of "definition"):
- Of these my favorite is the last, because: 1., we often put glosses in quotes, eg in {{term}}, 2., 'Of' needs to be distinguished, 3., the whole thing needs to be visually distinct from the terms following, including any that are not links, eg SoP circumlocutions. DCDuring TALK 14:00, 9 April 2013 (UTC)
- The wording "of [sense]" or "of the sense [sense]" works for 'nyms and pronunciations, but not for usage notes.
I propose "in the sense '[sense]'" (or "in the sense of [sense]" or whatever), which is, I think, how normal people speak about a particular sense of a word. It works for 'nyms and pronunciations also: in fact, for me at least, it seems much more natural even for 'nyms and pronunciations.—msh210℠ (talk) 18:53, 9 April 2013 (UTC) ← Portions struck through at 04:45, 10 April 2013 (UTC).—msh210℠ (talk)
- I was thinking of "of". Mglovesfun (talk) 22:06, 9 April 2013 (UTC)
- How about allowing an alternative wording, specified, say, by an "alt=" parameter for whatever cases are not well served by "of". There are, in English at least, relatively few uses of {{sense}} in Usage notes AFAICT. Is it commonly used there in other languages? DCDuring TALK 22:52, 9 April 2013 (UTC)
- I guess my issue is partially that {{sense}} is often used with not a gloss but a usage restriction or a field of endeavor as its parameter. For example, work (which currently has no 'nyms listed at all) might list 'nyms of the "Said of one's workplace (building), or one's department, or one's trade (sphere of business): He mostly works in logging, but sometimes works in carpentry" sense using {{sense|of a workplace or trade}} and 'nyms of the "(zymurgy) To cause to ferment" sense using {{sense|zymurgy}}. I've definitely seen examples of each of these types of uses of {{sense}}. Adding "of" would make no sense in those cases either. (The 'nyms aren't 'nyms of zymurgy.)
And even in the more common case, viz even when the parameter of {{sense}} is a gloss of the headword, what we're really listing aren't 'nyms of "to cause to ferment" — as the wording "of cause to ferment" (or the awkward "of to cause to ferment") would imply. Rather, what we're listing are 'nyms of work in the sense of "to cause to ferment". So adding "of" doesn't cut it, in my opinion — not even for 'nyms and pronunciations.
Perhaps best would be "for [pagename] in the sense of:" with a colon at the end and no quotation marks around what follows. Quotation marks (and even italicization if the prefatory text isn't italicized) wouldn't work in the zymurgy (or field of endeavor) case, as it'd seem like "zymurgy" is a gloss. The colon is then necessary, as "in the sense of [gloss]" doesn't flow. Using only "in the sense of:" is still slightly ambiguous, not solving the problem we started with here: it could be referring to the listed antonyms rather than the headword. I think "for [pagename] in the sense of:" takes care of all these issues — though of course there may be others I haven't thought of.—msh210℠ (talk) 04:45, 10 April 2013 (UTC)
Some small changes to Mandarin (also Cantonese, Min Nan) entry structure and about topic categories - suggestion
I will run this by all our active Chinese contributors but I'd like to suggest to dump the rs (radical sort) value in Chinese entries, e.g. {{cmn-noun}}.
The rationale is the following:
- Finding the sorting order for the Chinese character entries is not straightforward, although Wiktionary itself has this info. Lack of this knowledge keeps casual editors, and anyone who is sure about words but not sure about the structure, from adding new entries.
- The mistakes are numerous, I have fixed some when I noticed but I'm sure I missed many.
- Simplified and traditional topic categories are sorted differently but there is no real reason for it, e.g. 標準 (biāozhǔn) (“standard”) is sorted by "木11標準" (so it will appear under the "木" (tree) radical), but its simplified equivalent 标准 is sorted by "biao1zhun3" and will appear under the letter "B".
- A Chinese person who relies on radical sorting and is very familiar with radicals and their order would probably be better off just entering the word they are searching for in Chinese and finding it, rather than searching in the category listings.
Take a look at this Category:cmn:Intermediate_Mandarin_in_traditional_script:
You see, a small number are sorted by Latin letters, others by radicals. Those under Roman letters are incorrectly formatted. Errors are often introduced when a traditional entry is created by copying a simplified entry and the initial character is different.
I suggest removing the "rs" from entries and from category sorting and just sorting by numbered pinyin (e.g. "biao1zhun3"), and perhaps no longer splitting topical Mandarin categories into simplified/traditional. Serbo-Croatian entries don't separate Cyrillic/Latin entries into separate categories. Otherwise we need to check/fix all incorrectly formatted entries, for which we just don't have enough resources.
I'm not insisting on this change, but User:A-cai, who did a great job, is no longer very active here, and we could get more people on board if Mandarin entries were simpler.
Just want to check the mood and get opinions. We have tens of thousands of entries in traditional script, so there needs to be an agreement before anything happens. --Anatoli (обсудить/вклад) 04:24, 8 April 2013 (UTC)
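To make the numbered-pinyin sort key concrete, here is a minimal sketch, not the code of any existing template or bot. The function names are hypothetical, and writing the neutral tone as "5" is an assumption made only for this example. It derives "biao1zhun3" from the tone-marked syllables, so the traditional and simplified forms end up with the same key.

```python
import unicodedata

# Combining diacritics that mark the four Mandarin tones in pinyin.
TONE_MARKS = {
    "\u0304": "1",  # macron: first tone
    "\u0301": "2",  # acute: second tone
    "\u030c": "3",  # caron: third tone
    "\u0300": "4",  # grave: fourth tone
}


def numbered_syllable(syllable: str) -> str:
    """Turn one tone-marked pinyin syllable into its numbered form."""
    tone = "5"  # assumption for this sketch: neutral tone is written as 5
    letters = []
    # NFD splits each accented vowel into a base letter plus combining marks.
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)
    # NFC recombines what is left, so "u" plus diaeresis becomes "ü" again.
    return unicodedata.normalize("NFC", "".join(letters)) + tone


def pinyin_sort_key(syllables: list[str]) -> str:
    """Join the numbered syllables into one key such as 'biao1zhun3'."""
    return "".join(numbered_syllable(s) for s in syllables)


if __name__ == "__main__":
    entries = {
        "標準": ["biāo", "zhǔn"],
        "标准": ["biāo", "zhǔn"],
        "扇贝": ["shàn", "bèi"],
    }
    # Both scripts sort together because the key ignores the characters.
    for word in sorted(entries, key=lambda w: pinyin_sort_key(entries[w])):
        print(pinyin_sort_key(entries[word]), word)
```

A key of this kind needs only the pinyin an editor already supplies, which is the point of the proposal: no radical or stroke count has to be looked up.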
- I have no strong opinion on this. The rs value is autogenerated when using {{cmn new}}, which relies on {{zh-sortkeys}} to produce the rs of the first character in the page title. So it doesn't really bother me. (I wish the language sections were just a single template, with various parameters included, eg.
- {{language_name|標|準|p1=biāo|p2=zhǔn|jy1=biu1|jy2=zeon2|poj=piau-chún|n|[[standard]]|eg=|syn=基準|syn2=|ant=}} (effectively everything needed to generate 標準),
- and all the rest (trad-simp detection/conversion, pinyin analysis, sort key, even generating pinyin for character) are automated.) Wyang (talk) 05:16, 8 April 2013 (UTC)
- Thanks. You're well equipped, others are not so lucky. :)
- What about maintenance of topic categories. Many have been moved or deleted, just because they don't follow the structure of other languages.
- Category:Mandarin terms derived from English exists on its own (35 entries), although initially was meant to be split.
- Category:Mandarin terms in simplified script derived from English (356)
- Category:Mandarin terms in traditional script derived from English (301)
- Category:Mandarin terms derived from Japanese is now a separate category (21) but Category:Mandarin terms in simplified script derived from Japanese and Category:Mandarin terms in traditional script derived from Japanese were deleted or moved (like many others, they are not empty!). It's a mess. Some long-time editors like Tooironic seem to be confused about categories in Mandarin, so people just stopped categorizing Mandarin entries or categorise them at random (with or without the words traditional/simplified). Well, the reason is simple - trad. and simpl. entries are sorted differently and therefore categorised differently. --Anatoli (обсудить/вклад) 05:54, 8 April 2013 (UTC)
- I do like the idea of getting rid of the duplication in categories- it always struck me as rather kludge-y. The main drawbacks/issues I can see would be characters that have multiple pronunciations, and the fact that we would instantly increase the membership of most categories and decrease the number of distinct entries per page. Also, the difference between traditional and simplified characters isn't as easy to see for those who don't know one or the other as for the difference between Latin and Cyrillic. I can see how there might be confusion about which terms in a category are traditional, simplified, or the same in both, and even which ones are paired with which. I'm sure those aren't terribly difficult to deal with, so I'm in favor of changing the category sorting.
- As for the rs parameter: we wouldn't have to get rid of it. It would be easier to just make it non-mandatory and ignore it in category sorting. Maybe someday we can give users the option of choosing which sort order to use, though we'd have to populate the rs parameters by bot, first. Chuck Entz (talk) 07:18, 8 April 2013 (UTC)
- Of course we should keep separate entries for simplified and traditional characters and words. Wiktionary after all aims to catalogue all words in all languages, in whatever forms. However I too support the abandoning of the old system under A-cai. It's simply not worth the extra effort. At present I add about 50 or so Mandarin entries a week. I imagine I, along with other editors, could create double the number of entries if we didn't have to deal with the rs field. But now Wyang says the rs field is generated automatically. Is that really the case? I just created a new Mandarin entry at 扇贝 - where is this automatic rs field you speak of? Did I do it wrong? If so advise me how. Cheers. ---> Tooironic (talk) 09:49, 8 April 2013 (UTC)
- When you create the entry, you can use the code {{subst:cmn new/a|p1=shàn|p2=bèi|n|[[scallop]]}} in both forms, and this will generate the entire content. Wyang (talk) 12:03, 8 April 2013 (UTC)
- Wow, that script is powerful. I just created 拆開 and 拆开 in seconds. Wish someone had told me about that earlier. But is the IPA on those entries correct? It doesn't look right to me... ---> Tooironic (talk) 23:09, 8 April 2013 (UTC)
- @Tooironic. Re: simplified/traditional separation. With Serbo-Croatian it's easier. The words in Cyrillic and Roman sort themselves differently automatically. As you know, the parameter "t" in {{cmn-noun}} is an indicator that the noun is traditional, "s" is simplified. They are automatically added to Category:Mandarin nouns in traditional script or Category:Mandarin nouns in simplified script or both if the value is "ts". A word, which is both simplified and traditional will appear in both categories but if you just want Category:Mandarin nouns they will appear in the alphabetical order - both forms. We could apply the same sorting for both traditional and simplified noun categories but abandon trad/simp approach for topical categories? What do you think?
- In a nutshell - I don't suggest removing "t", "s" and "ts" params, so SoP will always be separated into trad/simp categories as parts of speech. I suggest sorting by numbered pinyin instead of radical + number of strokes, i.e. "biao1zhun3" ("pint" parameter) instead of "rs" - "木11" for both simplified and traditional entries and remove words traditional/simplified from topical categories.
-
--Anatoli (обсудить/вклад) 13:15, 8 April 2013 (UTC)
-
- I personally don't have any issues at finding the "rs" value, it only takes a few seconds longer to create a Mandarin entry and I have to open another tab. Don't get me wrong, guys. I am just worried that most templates we use for other languages don't work for Mandarin, like for example {{etyl}}. Japanese entries also use sorting parameters (hiragana) but it's more consistent. Consider entries like 傍晚. It's adding to Category:cmn:Elementary Mandarin using "人10" as "skey" and Category:cmn:Elementary Mandarin in simplified script using "bang4wan3" as the sorting key. Why is it not categorised as a traditional version? If we treat simplified and traditional categories equally (using one sorting key) and move all topic categories to match other languages, then it would be easier for everyone. Musical instruments categories - trad/simp and without suffix all seem independent from each other - these entries ended up belonging to three topic categories, obviously using whatever sort order.
-
- Category:cmn:Capital cities in simplified script and Category:cmn:Capital cities in traditional script don't have a common supercategory, they go directly under generic Category:Capital cities. Whatever category you take, there are problems. I stopped categorising a while ago, except for HSK, which is still OK, sort of.
-
- Allowing a bot to load rs value may not be such a bad thing but it's probably better to normalise categories (make them similar to other languages - no trad/simp suffixes) and use numbered pinyin or radical sort (whatever we decide) but equally for both trad and simp entries. --Anatoli (обсудить/вклад) 13:03, 8 April 2013 (UTC)
-
-
- @Tooironic. I have modified your 屌絲 and created 屌丝. With my suggested way of categorising - # {{slang|vulgar|lang=cmn|skey=diao3si1}}. Now both entries appear in Category:Mandarin slang and Category:Mandarin vulgarities sorted by diao3si1 (under letter "D") (note categories are without words "traditional"/"simplified").
- They are still in Category:Mandarin nouns in traditional script and Category:Mandarin nouns in simplified script - not suggesting to change that but we could change the sorting of the traditional term to be the same as simplified (pinyin, not rs), if we are in agreement.
- Please check whoever is interested, if this is worth attention. --Anatoli (обсудить/вклад) 00:24, 9 April 2013 (UTC)
- I don't have any problem with this. I've never liked the idea of separating categories based on script types, especially two that share some characters. I wasn't even aware that some traditional terms were sorted differently. If this goes ahead, you will get my support. Jamesjiao → T ◊ C 01:41, 9 April 2013 (UTC)
- Great stuff. Will invite the creator - User:A-cai. I hope he will not be upset. We could still have some bots to do tricks with automatically adding rs values to Mandarin values, right?
- Wyang, you expressed suggestions how to add rs automatically but have not expressed your opinion on categories and sorting. What do you say?
- The hardest bit would be converting or automating this change but as I said, Mandarin topical categories are in a mess, anyway. --Anatoli (обсудить/вклад) 01:48, 9 April 2013 (UTC)
- I think simp/trad should be merged into one single category and sorted by pinyin. Adding the pinyins everywhere would be troublesome, but like I said I would prefer if all the templates in one language section are merged into one template {{language_name|..., with various things defined by various parameters, including definitions and context labels. But I can't see this being actualisable on Wiktionary any time soon, so... Wyang (talk) 04:24, 9 April 2013 (UTC)
- Both entries 動能 动能 belong to Category:cmn:Physics (not in Category:cmn:Physics in simplified script or Category:cmn:Physics in traditional script!) and are sorted by "dong4neng2", so appearing under letter "D", not under radical "力". If everyone is OK with this, I will update Wiktionary:About Sinitic languages. All entries in Mandarin categories with "...in simplified script" and "...in traditional script" should gradually be moved to categories without these suffixes, with the numbered pinyin sort order e.g. skey=dong4neng2 or just by adding |dong4neng2 in the category name, e.g. [[Category:cmn:Physics|dong4neng2]]
- It's a lot of work and I am currently busy with other things but will get to this eventually.
- Parts of speech categories remain as they are for now, with the traditional/simplified distinction. We could change the sorting key for traditional entries to use pint rather than rs but I don't know how. Simplified entries are sorted by pinyin. --Anatoli (обсудить/вклад) 00:17, 11 April 2013 (UTC)
- Sorry for not responding sooner. I haven't had as much time to devote to the project in recent years. I'm all for automation and making things easier in order to make the site attractive to more contributors. No objections to your modification proposals. -- A-cai (talk) 17:49, 27 April 2013 (UTC)
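The numbered-pinyin sort keys discussed in this thread (for example dong4neng2 or diao3si1) could in principle be derived from tone-marked syllables like the ones passed as p1=/p2= when creating an entry. The following is only a minimal sketch of that conversion: the function names are hypothetical, it does not describe how any existing Wiktionary template works, and writing the neutral tone as 5 is an assumption.

```python
import unicodedata

# Combining diacritics that carry the four Mandarin tones after NFD decomposition.
TONE_MARKS = {
    "\u0304": "1",  # macron
    "\u0301": "2",  # acute accent
    "\u030c": "3",  # caron
    "\u0300": "4",  # grave accent
}


def numbered_syllable(syllable: str) -> str:
    """Convert one tone-marked pinyin syllable to numbered form (hypothetical helper)."""
    tone = "5"  # assumption: an unmarked (neutral-tone) syllable is written with 5
    letters = []
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)
    # Recompose so that a ü (u + combining diaeresis) stays a single character.
    return unicodedata.normalize("NFC", "".join(letters)) + tone


def sort_key(*syllables: str) -> str:
    """Join the numbered syllables into one category sort key (hypothetical helper)."""
    return "".join(numbered_syllable(s) for s in syllables)
```

With this sketch, sort_key("dòng", "néng") gives dong4neng2, which is the key used for 動能 and 动能 above, so both script forms would sort under the same letter.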
En dash in {was wotd}?
Per user request at Template talk:was wotd#request to exchange hyphen for en dash, is it ok to change the hyphen “-” for an en dash “–” in {{was wotd}}?
This is a v. minor change, but it’s highly visible, so I thought it best to ask.
- I support. Good thing you asked, as some editors seem to really hate the use of typographic characters instead of plain ASCII ones. — Ungoliant (Falai) 12:31, 8 April 2013 (UTC)
-
-
- I would like to know what kind of person (other than a trained Wikipedia pedant) actually writes Bose–Einstein condensate rather than Bose-Einstein condensate. Equinox ◑ 21:47, 8 April 2013 (UTC)
- Writes? I don't think anyone uses a hyphen-minus in writing. People type it, but typesetters (who have nothing to do with Wikipedia) have always had to choose the correct dash-type character from the type tray or now character set. Pick up any properly typeset book, and you will find that Bose–Einstein condensate is typeset with an en dash.--Prosfilaes (talk) 01:35, 9 April 2013 (UTC)
-
-
It’s a weird world where we set type and publish it to the world on a typewriter keyboard. —Michael Z. 2013-04-09 02:58 z
- DC, regarding typographical correctness: yes, an en dash is correct here, while a hyphen is incorrect; see hyphen and dash. The hyphen is reserved for intraword usage, such as line-wrapping and compounds (such as line-wrapping ;), while en dash is used in varied contexts, including interword use such as this. See Wikipedia:Manual of Style: Hyphens for usage at ’pedia.
- Beyond correctness, there’s also aesthetics – a hyphen jumps out at me here as conspicuously too short (it’s sized for intraword use, and thus feels stubby surrounded by spaces), which is the standard typographical judgment.
- The main objections to use of non-typewriter typographical characters I’ve heard are:
- Rendering problems – non-ASCII characters render poorly on some computers, particularly older ones.
- Input or editing difficulties – some editors have difficulty entering non-ASCII characters (due to needing to use a character picker) or editing entries with non-ASCII characters (esp. due to rendering issues).
- Personal preference – some users prefer typewriter characters over book-style typographical characters.
- Use of typewriter characters is naturally common online, due to ease of input, though we needn’t be limited by it. In the case of templates (as opposed to use in entries), there aren’t any editing difficulties, and we have lots of Unicode throughout Wiktionary, so I don’t think there are significant problems, but want to check.
- Sounds like people are generally supportive (or “meh”); will wait another few days for more comments.
- —Nils von Barth (nbarth) (talk) 15:38, 9 April 2013 (UTC)
-
- Just go ahead and change it. Why on earth start a discussion about using the correct character in a template, where it will never have to be re-entered?
There being no opposition, I have gone and dunnit. —Michael Z. 2013-04-09 22:02 z
Facebook
I set up this page on Facebook for promoting Wiktionary of all languages. You are welcome to become co-administrators of the page, so you can update the page with inspiring messages. --LA2 (talk) 20:55, 8 April 2013 (UTC)
- Where do I apply? — Ungoliant (Falai) 21:29, 8 April 2013 (UTC)
- I noisily hate social meeja and would prefer us to "promote" ourselves through just making a good dictionary that people want to use. But I suppose it can't hurt :) Equinox ◑ 21:34, 8 April 2013 (UTC)
- I am boycotting Facebook, but why not promote Wiktionary there? DCDuring TALK 21:44, 8 April 2013 (UTC)
-
- I'm using that page to pull people out of Facebook and into Wiktionary. Whether you boycott Facebook doesn't matter, since you are already here. However, if someone would like to help to pick a "word of the day" for the Facebook page, I think that could make the page quite popular. --LA2 (talk) 22:37, 8 April 2013 (UTC)
-
-
- That's not the first page on Wiktionary in Facebook. Earlier this one was advertised. I liked both. Don't see why not. Would also be useful if we could recruit some native speakers and talented editors but promoting among users is also important, Wiktionary is for users, not for editors :) --Anatoli (обсудить/вклад) 00:57, 9 April 2013 (UTC)
- I really don't understand the amount of hate here for Facebook. Use it wisely and use it to your advantage. Don't post info that you don't want others to see.... Simple... I will take a look at the page on my home btw. I liken this attitude to the one on StackExchange towards Wiktionary. Take a look at this: How much should I trust Wiktionary?. I tried to defend Wiktionary and provide my own arguments (thanks Hippietrail for chiming in), but I can't change everyone's mind I guess. Jamesjiao → T ◊ C 01:56, 9 April 2013 (UTC)
- It is correct that in a Beer parlour discussion in March 2012, the existing Facebook page was mentioned, but that is a placeholder page that Facebook created based on a Wikipedia entry. That page doesn't get updated and there is no way to claim it, it's a dead end. The page I created now has a dozen co-administrators that are able to update the page and appoint more co-administrators. It's an anarchy of the same kind as the Wikisource page on Facebook, that I set up last year. It gets updated sometimes, but not very often. Right now, the Wikisource page has 418 fans and Wiktionary has 69. --LA2 (talk) 13:58, 9 April 2013 (UTC)
- 69? That's a good position to be in. Mglovesfun (talk) 22:04, 9 April 2013 (UTC)
-
- *groans loudly* OK, seriously, Facebook pages have some sort of automated system in which you can write a bunch of posts and they'll come out on a schedule. Assuming somebody's willing to put some time in, we could easily have posts and to spare. —Μετάknowledgediscuss/deeds 15:11, 13 April 2013 (UTC)
- We could have a Facebook widget on our Front Page, that users could click on. I think the code is something like <a title="Tell Facebook" href="http://www.facebook.com/sharer.php?u=http://en.wiktionary.org/&amp;t=Wiktionary">Facebook</a> SemperBlotto (talk) 15:20, 13 April 2013 (UTC)
Proposal of a pronunciation recording tool
Hello, Rahul21, a developer, offers to develop a pronunciation recording tool for Wiktionary, helped by Michael Dale as part of GSoC. The tool would allow users to record and add audio pronunciations to Wiktionary entries while browsing them (see background discussion on Wiktionary-l). Please read and comment on the proposal! Regards, Nemo 22:37, 9 April 2013 (UTC)
A slightly different way to show etymologies derived from Latin verbs
Romance languages use the infinitive as the lemma, but for Latin we use the 1st person singular present. This means we can't write "from Latin cantō" in any of the etymologies at cantar, because the infinitive derives from cantāre. Most entries solve this by just saying "from Latin cantāre, present active infinitive of cantō". But that is rather wordy, more so than what's really needed to get the point across: the word cantar derives from cantāre, but its Latin lemma/paradigm entry is at cantō. For that reason I've started to use another approach, by writing {{term|canto|cantāre|lang=la}}. So it will show "cantāre", but link to canto. Since not many entries have this, I wondered if nobody had considered doing it that way yet, so I'm sharing the idea here. :) —CodeCat 02:22, 11 April 2013 (UTC)
- That’s a good idea. — Ungoliant (Falai) 02:37, 11 April 2013 (UTC)
- I'd done that and waited for someone to complain about it. The case that CodeCat mentions seems ideal for that approach. What about derivations from participle forms? DCDuring TALK 03:30, 11 April 2013 (UTC)
- I've done this for months :) —Μετάknowledgediscuss/deeds 15:00, 13 April 2013 (UTC)
- I just do "from Latin cantō", exactly as you say we "can't". (I guess I've found a way! :-P) The French verb chanter really does come from the Latin verb cantō, so it's straightforward and correct. It's only a problem when people try to gloss cantō as "I sing" (as though they were glossing the specific form) instead of the correct "to sing" (which is how we gloss verbs). —RuakhTALK 16:39, 14 April 2013 (UTC)
- I like CodeCat's suggestion. Also, had I in the past noticed any entry glossing "canto" as "to sing" rather than "I sing", I would have changed it and (though this discussion informs me not to do so) I would have marked the edit as minor, assuming I was uncontroversially correcting a simple error by a random IP unfamiliar with Latin grammar. - -sche (discuss) 00:10, 15 April 2013 (UTC)
- Another issue is the descendant section of Latin verbs. Should, say, video’s descendants be linked to as {{l/pt|ver}} or {{l/pt|ver|vejo}}? — Ungoliant (Falai) 01:02, 15 April 2013 (UTC)
- I have no problem with the concept, but you need to make another template for this purpose instead of overloading the meaning of {{term}}'s parameters. To anyone not familiar with this usage, it's confusing and looks like an error. I've probably accidentally "corrected" one of these before so that the macron and no-macron versions matched. Pengo (talk) 02:12, 7 May 2013 (UTC)
- Why would we need another template to do effectively the same thing? The template doesn't mandate that the linked entry and the displayed term are in any way "the same", and as far as I know other people have been doing this for a long time. For example, you sometimes see definitions like this: # [[break|broken]]. I don't see anything wrong with that in principle. —CodeCat 02:20, 7 May 2013 (UTC)
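The convention described in this thread (display the inflected form with macrons, link to the lemma page, whose title has none) can be sketched as below. This is only an illustration: the helper names are hypothetical, and only the shape of the {{term|...|lang=la}} call is taken from the discussion.

```python
import unicodedata

MACRON = "\u0304"  # combining macron, as it appears after NFD decomposition


def strip_macrons(word: str) -> str:
    """Remove macrons so the result matches a macron-free page title (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", "".join(c for c in decomposed if c != MACRON))


def term_link(lemma: str, displayed: str, lang: str = "la") -> str:
    """Build a {{term}} call that links to the lemma page but shows another form."""
    return "{{term|%s|%s|lang=%s}}" % (strip_macrons(lemma), displayed, lang)
```

Calling term_link("cantō", "cantāre") yields {{term|canto|cantāre|lang=la}}, the same wikitext quoted at the start of the thread.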
Appendix:1000 Japanese basic words
This may not be appropriate for the BP but since this is the most visible spot, I want to ask everyone their opinion about Appendix:1000 Japanese basic words and what to do with it. (I wrote something on the talk page too.) It's a good appendix now, but it's "1000 Japanese basic words" and the description is "This appendix is a specific list of one thousand basic words," and yet there are about 700 words in it.
Some background: I don't know the full story but as far as I can tell, in a nutshell, the Japanese Wiktionary was building the list ja:Wiktionary:日本語の基本語彙1000 some time ago, and the editors here decided to copy it. At the time the original list was incomplete. Since then, the original list has grown but en.WT's list has not been maintained. Now, ja.WT's list has surpassed 1000 words and their list says "作業中 現在:989項目 2008年11月16日 一旦、1,000以上挙げ、その後取捨選択するなり基本語彙2,000にタイトルを変更するなりする方針としたいと思います。" which means that their list broke 1000 entries and that they are considering changing the name to 2000 basic words.
We can go two routes: depart from ja.WT and keep it a list of 1000 basic words, or mirror their version, and exceed 1000 words in the process.
I don't have exact numbers, but if you search for "Japanese word list" on Google, our appendix is the first result. That suggests to me that the wider world is making use of it as a resource. While ja.WT's version is good, it lacks essential words such as 可愛い (kawaii), いっぱい (ippai, "very"), たくさん (takusan, "many"), or すごい (sugoi, "very/wow!"). You can't have a 30-second conversation with high school students without using those words. Conversely, ja.WT's appendix has quite specific words such as ミミズ (mimizu, "earthworm") and 十二指腸 (jūnishichō, "duodenum"). Duodenum is a basic word?
How about both routes? I would like to combine the most basic of the "basic" words and the Japanese Language Proficiency Test Level 5 appendix (the lowest level) for a "1000 basic Japanese words" appendix, and maybe mirror ja.WT's appendix on a different page. --Haplology (talk) 05:02, 12 April 2013 (UTC)
- Your last paragraph sounds eminently reasonable, and I fully support that method (although I think perhaps mirroring ja.wikt's appendix is less important, because it would appear that we are a better arbiter of basicness than they are). —Μετάknowledgediscuss/deeds 14:59, 13 April 2013 (UTC)
-
- This appendix is not a very scientific one and was made by amateurs. It's worth adding words to make a thousand, choosing carefully from the JLPT or frequency lists, and/or removing those that are identified as not being basic.
- The valuable time could be spent on making Appendix:JLPT better - fixing the word format and choosing the spelling we actually have here, e.g. we have 上がる but not 上る, or create the alternative spellings.
- JLPT appendices could be made similar to Appendix:HSK list of Mandarin words with new categories like Category:JLPT/N5 Category:ja:JLPT-5 or similar. --Anatoli (обсудить/вклад) 01:53, 15 April 2013 (UTC)
- I'm glad we all agree. I've been adding common words from the N5 list to the category, and once the category reaches 1000 items, I plan to add them to the appendix and add the sort keys to the categories. I've been through the whole N5 list once and added common words at my discretion (but not all of them), and there are now almost 900 words in the category. I plan to go through N5 again, and also look at the N4 list and try to find any other essential words that may have been missed. The original list is biased toward nouns, so other parts of speech would be good places to look for new candidates. It also ignores casual words like ちゃう, which is also essential to high school students, or pretty much anybody. To anyone who is so inclined, if you see anything that strikes you as essential in the real world, then please add it. --Haplology (talk) 05:42, 17 April 2013 (UTC)
- I have just created new categories. What I meant is something like this: Category:Japanese by difficulty level with five categories. I only added two words as examples: 会う to level 5 (Category:ja:JLPT-5) and 安心 to level 4 (Category:ja:JLPT-4). The actual names of categories and templates, format and links can be discussed. The HSK categories provide a bit more info and look better. Please take a look. --Anatoli (обсудить/вклад) 06:09, 17 April 2013 (UTC)
- Sure, that sounds good. I just have a few questions. So basically this means completing the JLPT appendices project, as well as the 1000 basic words project, and having both exist in parallel? That's what I would hope for, as both projects have already been made, and they serve slightly different purposes. I assume that no new words would be added to the JLPT categories, only the ones already on the appendices? In the process of reviewing the appendices, it sounds like you want some revision to be done to them, such as adding more common forms like 上がる rather than 上る. I agree with that. I just changed "掃除 そうじする to clean" to "掃除 そうじ cleaning", but perhaps "掃除する そうじする to clean" would be better, and have that link to 掃除? I think there is also 近く, so what should be done with that? In the past there was some opposition to creating pages like 近く, but I think there's precedent for pages like that in other languages and there's no policy against them. It's mainly just that the Japanese editors have enough work with lemmas, and if there are going to be forms like 近く with their own entries, I'd rather a bot add them. The L5 appendix was a bit slow to edit, but did not time out or have any problems like that, so I guess there's no need to break it up like L1 (which was too much for the server to display.) What do you think about breaking up appendices? --Haplology (talk) 04:28, 18 April 2013 (UTC)
-
-
- Yes, I think both templates and category groups could easily coexist.
- する-verbs, I'd link to lemma but display lemma + する because they are verbs. Having "掃除 to clean" would look weird because 掃除 is a noun. I have adopted this for translations. Same thing for な-adjectives.
- Cleaning sounds good but I don't know if JLPT would prescribe 上る for the tests, not 上がる. JLPT is a bit more strict in nature than 1000 basic words but I have no idea who made original lists, how accurate and up-to-date they are. Should students for level 5 know both forms? We can always have simple entries with links to main entries, even skipping conjugations, etc. to save time. What do you think?
- No strong opinion on 近く but since く-adverbs are simple in structure, I don't see why we should discourage them, also for the sake of back translations from English. No need to create them, if a bot could do it but I wouldn't delete if they exist.
- Breaking up appendices - OK. You already did one. --Anatoli (обсудить/вклад) 04:53, 18 April 2013 (UTC)
-
tt
A lot of editors are used to typing <tt> to make things look typewritery. In HTML5, tt is “entirely obsolete, and must not be used by authors.”[12] The W3C suggests:
Where the tt element would have been used for marking up keyboard input, consider the kbd element; for variables, consider the var element; for computer code, consider the code element; and for computer output, consider the samp element.
It looks to me like code is a good general replacement. More specific semantics can be conveyed with samp, kbd, and var. Continuing to use tt in discussions won’t break anything, but we should replace it in templates and entries, so we don’t have to endure the shame of unnecessary validation errors after the MediaWiki software is brought up to par. —Michael Z. 2013-04-12 17:51 z
By the way, also gone the way of the rotary dial are acronym, big, center, font, strike, and u, and all of those styling attributes on table elements. —Michael Z. 2013-04-12 17:59 z
- What does "obsolete" mean in HTML-world? I went to an HTML class today, and we were using some of these (well, definitely font) without any indication that they could ever be a problem. —Μετάknowledgediscuss/deeds 03:39, 14 April 2013 (UTC)
-
- Font? Ouch – I should have a word with your teacher.
-
- During the 1990s’ browser wars, every browser was making up new features and displaying them differently, and web development was a fragmented nightmare. Since then, the W3C approves the official open standards that make up the web based on feedback from browser developers, and we can mostly write HTML for one standard instead of for five current and twenty-seven past browsers (but don’t get me started on MSIE 6). The wide adoption of CSS, which allows for the separation of presentation from document structure, has led to newer versions of HTML deprecating and obsoleting purely presentational elements.[13] Unfortunately, the nature of wikitext encourages editors to include lots of presentation guff repeated many times in every page, but this is bad practice because it bloats pages and makes maintenance difficult. Like templates, style sheets let us centralize presentation and reduce page bloat.
</pedantry>
- Browsers are built for backwards-compatibility, so most of the old elements will still work. But as an organization for openness, we should follow the recommendations of current open standards, and certainly abandon practices deprecated in the last century.
-
-
- Thanks for that explanation. Specifically, my teacher recommended using CSS (which I'm learning now), but said that for basic formatting, just using the HTML tags is fine (although it may not be much faster than inline CSS). I agree with replacing them in templates but not giving a damn on discussion pages. —Μετάknowledgediscuss/deeds 04:44, 15 April 2013 (UTC)
-
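The replacement suggested in this thread (swap tt for code in templates and entries) is mechanical enough to sketch. The function below is a hypothetical illustration, not an existing bot or MediaWiki feature; it rewrites the tags with a regular expression and makes no attempt to skip text inside nowiki or pre blocks, so real use would need more care.

```python
import re

# Matches an opening or closing tt tag, keeping any attributes on the opening tag.
TT_TAG = re.compile(r"<(/?)tt\b([^>]*)>", re.IGNORECASE)


def replace_tt(wikitext: str, replacement: str = "code") -> str:
    """Rewrite <tt>...</tt> as <code>...</code> (or kbd, samp, var) in a string."""
    return TT_TAG.sub(
        lambda m: "<%s%s%s>" % (m.group(1), replacement, m.group(2)),
        wikitext,
    )
```

Passing replacement="kbd", "samp" or "var" covers the more specific cases the W3C text mentions; choosing between them still needs a human to judge what each snippet means.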
100 million edits
According to our sources, the 100 millionth edit was made to Wiktionary (all languages taken together, humans and bots included) during Friday April 12. Congratulations to us all! About 20% of the edits have gone into the English Wiktionary. --LA2 (talk) 02:05, 13 April 2013 (UTC)
Foreign word of the day: reconstructed terms, constructed terms and name.
In the vote for creating the FWOTD feature, the points “eligibility of reconstructed languages” and “eligibility of constructed languages” didn’t achieve consensus (except conlangs which don’t meet CFI, which failed) by the end of the vote.
Also, we’ve had a few people complain about the name “Foreign word of the day,” so if anyone wants to suggest a change feel free to do so.
Summarising, I’m consulting the community on:
- whether terms in reconstructed languages (Proto-Indo-European, Vulgar Latin, Proto-Germanic, etc.) should be allowed to be foreign words of the day;
- whether terms in constructed languages that meet CFI (Esperanto, Ido, Lojban, etc.) should be allowed to be foreign words of the day;
- whether the feature’s name should be changed.
— Ungoliant (Falai) 14:18, 13 April 2013 (UTC)
- I support the eligibility of reconstructed languages, because they are some of our most interesting content. Naturally, for reconstructed terms we shouldn’t require pronunciation and should require a reference from a trustworthy source instead of citations.
- I support the eligibility of constructed languages that meet CFI. Don’t see why not.
- I oppose changing the name. I don’t find it offensive in any way whatsoever.
- — Ungoliant (Falai) 14:18, 13 April 2013 (UTC)
- I support the first two, and I kind of oppose the third because I don't see anything wrong with the current name. In Dutch, there is a nice word anderstalig, but I don't know if English has an equivalent word. Maybe that would be a good word to feature? :) —CodeCat 14:43, 13 April 2013 (UTC)
- I oppose the eligibility of reconstructed languages since they are by definition uncitable. That's why they're not in mainspace, too. I support the eligibility of constructed languages that meet CFI. I abstain on the issue of the name; I don't understand what could be offensive about it, though I can see it might be misleading, but I can't think of a better name besides "non-English word of the day" which sounds dumb. Incidentally, although you didn't ask, I also oppose allowing mentions rather than uses to count as cites in FWOTD nominations. I know that mentions are good enough for RFV when it comes to LDLs, but I think FWOTD ought to have higher standards than RFV/CFI. Note that FWOTD already requires pronunciations, even though nothing at CFI requires them. —Angr 14:51, 13 April 2013 (UTC)
- While I sympathise with your point, this would make it much harder to feature words from languages without contributors who speak them, like Kaingang and Quechua, and it’s already Indo-European dominated enough as it is. — Ungoliant (Falai) 15:16, 13 April 2013 (UTC)
- The trouble with allowing a single mention is that there's no protection against errors. If the single source we use for Kaingang or Quechua has a fictitious entry (whether deliberate or accidental) or even just a typo, then we are at risk of propagating that error if we don't confirm it elsewhere. Bad enough when that happens in any entry, but worse when it happens in an entry being featured on the main page. —Angr 17:08, 13 April 2013 (UTC)
- I vote per Ungoliant, although I also support the eligibility of terms in conlangs, which Ungoliant took no stance on. —Μετάknowledgediscuss/deeds 14:54, 13 April 2013 (UTC)
- I did. — Ungoliant (Falai) 15:16, 13 April 2013 (UTC)
- Sorry. Rectified above. —Μετάknowledgediscuss/deeds 03:37, 14 April 2013 (UTC)
- I vote per Angr. I'm undecided on whether the name needs to change; we don't have a great alternative, but I do understand why people might want to change it.--Prosfilaes (talk) 19:56, 13 April 2013 (UTC)
Not much point in voting against a title if there is no clear proposal for a replacement.
What exactly were the complaints against “foreign?” It’s not exactly offensive, but kind of ignorant when it’s a minority of English speakers who live in countries where other languages are truly foreign. Calling French a foreign language in Canada, for example, is incorrect and at least off-putting to a francophone Quebecker who takes his or her first or only language for granted as native.
What alternatives are there?
- foreign-language word of the day
- non-English word of the day
- other-language word of the day
- alterlingual word of the day (is there a real Latinate word?)
- alloglossal word of the day (ditto Greek?)
- interlingual word of the day
- international word of the day
- global word of the day
- world word of the day
- exotic word of the day
- other word of the day
—Michael Z. 2013-04-13 19:21 z[updated list —Michael Z. 2013-04-14 14:40 z]
- But suppose you are an anglophone Canadian who learned French. If someone asks you “do you speak any foreign language?”, isn’t “French” a correct answer? — Ungoliant (Falai) 19:45, 13 April 2013 (UTC)
- No? I would regard it as sloppy usage of the word "foreign" = from a different country. In any case, suppose you are a francophone Frenchman; why would French be foreign?--Prosfilaes (talk) 19:50, 13 April 2013 (UTC)
- Well, foreign also means “from a different language,” and many Canadians live with only one of the official languages, which is why such misunderstandings can happen.
- "From a different language" is not listed as a definition at foreign, and it doesn't sit right with me when ASL or Native American languages get lumped in as foreign languages, though the lack of a better term often means they do. At Distributed Proofreaders, we got in a habit of using "languages other than English (LOTE)", precisely because they weren't foreign to our site or users.--Prosfilaes (talk) 10:40, 14 April 2013 (UTC)
- Can we agree to enhance the name by moving WT:Foreign Word of the Day to WT:Foreign-Language Word of the Day? We can get used to that in a month or two and see if it still raises readers’ ire. And reconsider renaming if it appears warranted later? —Michael Z. 2013-04-14 17:47 z
- I support "non-English"; "foreign-language" strikes me as having pretty much all the problems "foreign" does.--Prosfilaes (talk) 07:10, 16 April 2013 (UTC)
Regarding the entry foreign, definition 2, example "eating with chopsticks was a foreign concept to him": Certainly, this use of "foreign" is not restricted to other cultures? Things can be "a foreign concept" to a person that has never met that idea before. I think good synonyms are "unfamiliar, unknown, strange", and that these should be added to the explanation. But English is not my native tongue. --LA2 (talk) 23:43, 14 April 2013 (UTC)
Increasing default font-size
I proposed this a couple of weeks ago, and had little feedback. Not sure if everyone doesn’t care or just didn’t notice. So I’m posting this reminder, and will change the site’s default font-size, shortly. —Michael Z. 2013-04-14 02:02 z
- It looks perfectly readable to me so I see no reason to change it. Why do you think it's too small? —CodeCat 03:04, 14 April 2013 (UTC)
- As I wrote in the original post, editors have used Common.css to enlarge the font for 54 languages and scripts, affecting thousands of entries. The discrepancies bug me. —Michael Z. 2013-04-14 05:14 z
- It is odd to me that the existing "default" font size for the site would not be the default for the user's browser, i.e. not medium. But Web designers seem to work upon contrarian principles of their own. Bigger is fine by me, but I hope it can be set to browser default rather than a hard-coded "what looks good on this year's monitors". Equinox ◑ 03:32, 14 April 2013 (UTC)
- I did the math. Browser default. For a preview, copy the first bits from my vector.css. —Michael Z. 2013-04-14 05:19 z
- I will update MediaWiki:Vector.css within the hour. Complaints welcome. —Michael Z. 2013-04-14 15:34 z
- I thought two BP discussions with no opposition would constitute consensus to try out a harmless improvement. Your single subjective opinion after a one-minute look at a major visual change doesn’t constitute any consensus or evidence either. Thanks for speaking for everybody. —Michael Z. 2013-04-14 16:18 z
- I'll be opting out anyway. I didn't like it at all. Mglovesfun (talk) 16:00, 14 April 2013 (UTC)
Could someone actually respond to the evidence I have cited, instead of blowing away a major change based on “I don’t like it,” without even using it? —Michael Z. 2013-04-14 16:19 z
- Sorry, I don't see anything that I would call "evidence". In the previous discussion you gave a list of putative advantages, but seemingly no "evidence" for them. (Perhaps you and I define the term differently?) At any rate, if you want people to reply to something specific, please indicate what. In particular, if you could highlight some part of your argument that would justify increasing the font size even if no one liked the result, that would certainly be interesting! —RuakhTALK 16:45, 14 April 2013 (UTC)
- The biggest objective evidence that our font-size is small is that other editors have been increasing it, to the tune of over 50 CSS declarations in our style sheet, the majority setting the font-size to the browser default. No one has mentioned any disadvantage of setting the font-size to the browser default.
- I’ve put in significant time doing research and testing, tried to outline my reasoning, and did my best to get feedback. Not one objection was made. Now, could someone here at least do me the courtesy of actually trying to use this for an hour or a day, instead of taking one glance, blurting out “I don’t like it” because it is different, and blowing off my effort completely? —Michael Z. 2013-04-14 17:00 z
- Re: "could someone here at least do me the courtesy of actually trying to use this for an hour or a day": http://en.wiktionary.org/wiki/User:Ruakh/common.css?diff=20160735. —RuakhTALK 17:23, 14 April 2013 (UTC)
- Thank you for that. Sorry to get cranky. I included a list of what I see as concrete advantages in my original proposal. I think things can be improved, and I would appreciate critical feedback. —Michael Z. 2013-04-14 17:50 z
- I've been trying out the larger size for the past several days. While it's more legible, there are other drawbacks. While this larger size may correspond to the "de jure" default browser size, it doesn't correspond to the "de facto" default size for web pages. Almost every other text-based website I look at has smaller text, much closer to the "traditional" Vector size. People get used to one font size on webpages and when they encounter something noticeably smaller or (as in the proposed new Vector size) much larger, it looks absurd. And more urgently, if we change the default Vector size here at English Wiktionary we're out of sync with every other Wikimedia project's Vector skin. I know that perfect unity isn't possible across languages, but at least every English-language project's Vector should look like every other English-language project's Vector. If I'm looking at Wikipedia, then at Wikisource, then at Commons, and then at Wiktionary, it's startling when Wiktionary's text is so much larger than everyone else's. And if I didn't know that it's that way because I deliberately set it that way on my own CSS page, I would be baffled and put off by it. —Angr 21:31, 17 April 2013 (UTC)
- Some good points I hadn’t considered in detail.
- Wikimedia branding. Indeed, most Wikimedia projects use 13px font size. I see that zh and ja Wiktionaries use 15px, Arabic, Pashto and Farsi 14px. However, explicit branding elements in the other projects vary a lot. Among Wiktionaries, even the site logos (!), home-page layout, use of tone and colour, icons, etc., vary wildly. The only thing all these sites have in common is the basic MediaWiki interface with grey and white background and blue rules. Also, the favicon is identical on all but cs. and en.Wiktionary. Choosing font-size for branding over readability would be poor prioritizing, when it would make an insignificant difference in the visual identity, but potentially a large one in readability. If we value our uniform branding at all, why don’t we coordinate site design, or unify even the most basic branding elements before compromising readability?
- The appearance of credibility. It’s true that 13px may be the most popular font-size,[22] but that isn’t a “de facto default” in any sense I can think of, nor does being widely used make it the best choice for anything specific.[23] A website doesn’t look smart or credible by picking the most popular font size for no other reason. It does it by considering the factors that font size affects, and choosing an appropriate size for the particular site. Increasing font size for over 50 languages while sticking to a 13px default looks “absurd” to me.
- Readability. As you say, a larger font than 13px is more legible. This is particularly true on both the extra-small and extra-large screens that more readers are using these days. Still more so for many of the language scripts we use, as we have concretely demonstrated in our style sheet.
- Accessibility. Overlaps with the above, but it should be mentioned that many of the designers of the average 13px websites have good eyes, good displays, and are poorly schooled in accessibility and internationalization. Many of these “average” websites are aimed at youthful or moneyed markets. Ours is the broadest possible audience, including non-native readers, aging, vision-impaired, impoverished, having only mobile internet access, etc. Failing to optimize readability harms segments of our audience that many other websites ignore.
- I still think any disadvantages of increasing font size are minor at worst, and far outweighed by the concrete benefits. —Michael Z. 2013-04-21 00:52 z
- Wikimedia projects' appearance varies widely from language to language, but not so much from project to project within a single language. When I had my Wiktionary font size set larger, I found it genuinely distracting to go from Wikipedia to Wiktionary because of the font size difference. It makes you notice the print rather than the content, which is a sign of poor typography. As for column width, as the page you linked to says, the solution is to define columns as being a certain number of ems wide, not to force the text to appear larger. But that too is something that ought to be done to all (English-language) projects' Vector skins, not just Wiktionary. —Angr 22:02, 23 April 2013 (UTC)
Tracking category for missing inflected forms
Feel free to let me know if there is a better way of doing this already in place, but an idea struck me recently upon seeing red links in inflection lines. I think that we should have a system to track these links, since they are either valid missing entries for inflected forms of lemma entries or incorrect inflections being displayed on entries (for example, words lacking plurals or different feminine forms where the editor has not changed the template's default behavior). In both cases, they should be actively dealt with, either by creating pages for missing inflected forms or correcting the inflection templates. This seems like low-hanging fruit, since it is simple work and a motivated editor could do dozens of these in a sitting, or far more with acceleration. It would be relatively simple to use the ifexist parser function so that pages with red links in their inflection templates are put in a maintenance category recording that, so that editors can come along and address them.
As an example of what I am talking about, I made an edit to {{es-adj}}, so that it now puts entries with red-linked feminine singular forms in inflection templates into Category:Missing Spanish feminine adjectives. Have a look at that category to see what I mean. There are 884 of these (as of now) being detected, which means potentially 884 missing entries just in looking at Spanish singular feminine adjective forms alone. Ideally, I think this kind of category could be useful across all of the inflection templates and all of the inflected forms they output, but I wanted to raise the idea here for comment. We may want to have broader categories than the "Missing Spanish feminine adjectives" one I created; maybe all entries with missing inflected forms should go in a single big maintenance category. Is this a useful idea? Dominic·t 07:02, 16 April 2013 (UTC)
- There is one major difficulty with that. To check whether a given page exists is considered "expensive" by the MediaWiki software, and we're limited to about 100 of those checks per page. Once a page reaches that limit, any remaining checks will return "does not exist". So, we can't use this too much on pages because there is a danger that it will break the page if overused. —CodeCat 12:43, 16 April 2013 (UTC)
- Agreed. This is a bot job; we just need to convince somebody like SB to take it on. —Μετάknowledgediscuss/deeds 13:55, 16 April 2013 (UTC)
- I would be more afraid of false positives from people not changing the inflection template defaults if we just created them all at once than I would be of pages which will hit the limit of parser function checks from adding this new one. Do we have any reason to think there would be many, or any, pages that would break? I am fairly sure the limit is actually 500 calls, not 100. That's a lot of inflection templates for one page. Also, once the limit is reached, it does not make the functions return false, creating false positives. It actually just refuses to expand the templates after the limit. Dominic·t 14:53, 16 April 2013 (UTC)
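For illustration only, the bot-side alternative mentioned in this thread could look roughly like the sketch below. It is a minimal Python sketch, not anyone's actual bot: the function name `missing_forms`, the `expected` mapping and the `existing_titles` set are all hypothetical, and it assumes the set of existing page titles has already been collected offline (for example from a dump), so that no #ifexist checks are needed on the pages themselves.

```python
def missing_forms(expected, existing_titles):
    """Return, per lemma, the inflected forms that have no page yet.

    expected: dict mapping a lemma to the list of forms its inflection
              line would display (hypothetical input, prepared elsewhere).
    existing_titles: set of page titles known to exist (assumed to come
              from an offline source such as a dump).
    """
    missing = {}
    for lemma, forms in expected.items():
        # Keep only the forms that would show up as red links.
        absent = [form for form in forms if form not in existing_titles]
        if absent:
            missing[lemma] = absent
    return missing


# Hypothetical sample data: two Spanish adjectives and their feminine forms.
expected = {"rojo": ["roja"], "alto": ["alta"]}
existing_titles = {"rojo", "alto", "alta"}
report = missing_forms(expected, existing_titles)
```

A report like this only says which links are red. It cannot tell a genuinely missing entry from a wrong template default, so a human would still need to review each item, which is the false-positive concern raised above.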
암글
Can somebody delete 암글--wasn't sure how/where to ask? King jakob c 2 (talk) 20:47, 16 April 2013 (UTC)
Done. Thanks. Adding the {{delete}} template is enough. — Ungoliant (Falai) 20:58, 16 April 2013 (UTC)
Template term and lang parameter
I oppose template {{term}} requiring the "lang=" parameter, showing "???" before the term if the lang parameter is not provided. This change seems to have been introduced to the template today or yesterday by CodeCat (talk • contribs). An example of use of template "term" without lang parameter: physics. --Dan Polansky (talk) 08:07, 20 April 2013 (UTC)
- Something like this seems to have been discussed at Template_talk:term#lang. People should not use such obscure pages to discuss significant changes! --Dan Polansky (talk) 08:09, 20 April 2013 (UTC)
I feel the same way. Ƿidsiþ 08:52, 20 April 2013 (UTC)
- Why do you oppose it exactly? Not specifying the language leaves many problems: the link does not link to the correct section, the script template is not applied, and the word is marked in HTML as English (which creates usability problems). I wonder what justification there can be for ignoring those problems. —CodeCat 12:33, 20 April 2013 (UTC)
- This change breaks many, many, many discussion pages. -- Liliana • 12:36, 20 April 2013 (UTC)
- I don't think displaying a small notification really breaks anything. It's just a friendly reminder that something is missing and needs to be corrected. I don't know how to make it less obvious without making it so unobvious that nobody sees it. —CodeCat 12:39, 20 April 2013 (UTC)
- Others' posts should never be edited, even in case of incorrect syntax and such. At best, this should be restricted to the main namespace. -- Liliana • 12:41, 20 April 2013 (UTC)
- We've edited or broken people's posts in the past. Whenever a template is deleted, if that template is used in a past post, deleting it will break the page, but we do it anyway. In some cases we've replaced the template with an equivalent, but in other cases the pages remain broken. For example look at the transclusions of {{hr}}; some were replaced by "sh" but some still remain. Similar with {{zh}}. This isn't really any different. We can't always guarantee backwards compatibility, and indeed we shouldn't try to go too far out of our way for it. —CodeCat 12:52, 20 April 2013 (UTC)
- @CodeCat: Naturally, I am not opposing using "lang=" for non-English languages to add script, and whatnot. I am opposing making "lang=en" mandatory for English. What you wrote does not seem to apply to English terms without lang=: "the link does not link to the correct section, the script template is not applied, and the word is marked in HTML as English". What I am saying is, if there is no lang=, let "term" template assume the term is English, as it did before your edits. --Dan Polansky (talk) 13:09, 20 April 2013 (UTC)
- I think you are a bit mistaken. It always has been mandatory, because specifying lang=en has never, in the history of the template, been equivalent to specifying no language. So it never assumed that the term is English, not before my edits and not after them. That is one of the biggest flaws in this template in particular, which others (which do default to English) never had because they were created properly from the beginning. The result is that we now have thousands of entries that use this template both for English and for many other languages, without specifying which. Simply changing the template so that English is the default is therefore not an option, because it would not be correct for the many thousands of non-English words that lack a language. The only option that I know of is to mark lack of a language as an error so that it be corrected. I am currently running a bot to correct some of the most obvious ones (uses where the {{term}} template is preceded by {{etyl}}, which allows the bot to figure out the correct language), but there are still many many more that need to be fixed. —CodeCat 13:19, 20 April 2013 (UTC)
- Re: "It always has been mandatory, ...": That seems incorrect. If lang= really were mandatory, the template would complain of a missing parameter. The parameter could only have been "mandatory" in a sense that I do not know. --Dan Polansky (talk) 13:24, 20 April 2013 (UTC)
- What I meant is that the template doesn't do what it should do if the language is left out. The correct behaviour, when lang=en is given, is to use Latn as the script, "en" as the language, and link to the English section. But when no language is given, it uses None as the script, "" as the language, and links to no section. Therefore, to correctly link to English terms, the language is mandatory. —CodeCat 13:29, 20 April 2013 (UTC)
- All of that is irrelevant. This is one of our most heavily-used templates, especially by our less-template-sophisticated editors. Changes that significantly affect its behavior should be discussed thoroughly in an appropriate venue before being implemented. Most of the people who use it aren't going to have a clue what the ??? means, and a good many won't know where to go to find out. There should have been some steps taken to educate people before implementing it. Chuck Entz (talk) 16:24, 20 April 2013 (UTC)
- There is a help message when you hover the cursor over it. That may not be entirely obvious, but actually writing the message out would look really bad and would have made even more people angry. The real "education" has been in {{term}}'s documentation, which I presume is the proper place to put it. —CodeCat 17:03, 20 April 2013 (UTC)
Support the change.
No one has changed any discussion pages, but if you want your talk posts to continue looking the same, don’t leave live templates in them. Use subst. —Michael Z. 2013-04-20 23:24 z
- Totally support making lang= obligatory, but we should wait until the bot run is over before displaying the ???s, and not display them at all outside the content namespaces. — Ungoliant (Falai) 23:53, 20 April 2013 (UTC)
- The bot doesn't really have anything to do with the ??? either, the bot works from a category that can be added or removed independent of the question marks. But from the way the bot is running now, it's not really making a serious dent in the amount of pages. It is making the occasional change but it's skipping most of the pages in the list without doing anything (because it sees no change it can make). There were around 45 thousand pages in the list when it started, and I expect it won't be able to get rid of more than a few thousand of them currently; it's at 41 thousand now. —CodeCat 00:14, 21 April 2013 (UTC)
- But in this revision of warlock the term lie has ???s, and after the bot edit it doesn’t. — Ungoliant (Falai) 01:01, 21 April 2013 (UTC)
- That's true, but that's only because the bot has changed something that happened to both remove the ??? and remove it from the category. What I am saying is, the bot works from the category, and the ??? doesn't influence that. If we removed the ??? the category would still be there, and we could also put in ??? and remove the category. —CodeCat 01:06, 21 April 2013 (UTC)
- But the bot does influence the ???s. What I was saying is that we should wait for the bot run to be over before displaying them, because there would be no benefit displaying something that makes our entries look bugged when it’s going to be automatically fixed soon enough. But I changed my mind, since the bot isn’t going to make a serious dent (unfortunately). — Ungoliant (Falai) 01:18, 21 April 2013 (UTC)
- Support considering lang= obligatory (meaning only that it must be present: I think it's fine for it to be explicitly blank, but English should be lang=en), probably oppose whatever "bot run" Ungoliant and CodeCat are referring to (it doesn't seem like it was ever discussed or approved?), weakly support some sort of visual indication of missing lang= once that's rare (though I'd strongly support such a visual indication if it were visible only to admins and opters-in), and oppose distinguishing content namespaces from non-content namespaces in this respect, since that will just make it harder for editors to learn what they're supposed to be doing. —RuakhTALK 00:30, 21 April 2013 (UTC)
- The bot run is adding lang= to uses of {{term}} in etymologies where it can use a preceding {{etyl}} template to determine the correct language. Basically, it's replacing {{etyl|xx|yy}}{{term|word}} with {{etyl|xx|yy}}{{term|word|lang=xx}}. It didn't seem like a very controversial change. —CodeCat 00:35, 21 April 2013 (UTC)
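As a rough illustration of the replacement described above, here is a minimal Python sketch. It is not the actual bot code, which is not shown in this discussion: the function name `add_lang` and the regular expression are assumptions made for the example. It only handles a {{term}} that directly follows an {{etyl}} (optionally separated by whitespace) and leaves any {{term}} that already has a lang= parameter untouched.

```python
import re

# An {{etyl|src|dest}} template followed (after optional whitespace)
# by a {{term|...}} template with no nested templates inside it.
ETYL_TERM = re.compile(
    r"(\{\{etyl\|(?P<src>[^|{}]+)\|[^|{}]+\}\}\s*)"
    r"\{\{term\|(?P<args>[^{}]*)\}\}"
)


def add_lang(wikitext):
    """Add lang=<source language> to a {{term}} that follows {{etyl}}."""

    def fix(match):
        args = match.group("args")
        # Skip uses that already specify a language.
        if any(part.strip().startswith("lang=") for part in args.split("|")):
            return match.group(0)
        return (match.group(1) + "{{term|" + args
                + "|lang=" + match.group("src") + "}}")

    return ETYL_TERM.sub(fix, wikitext)


before = "From {{etyl|xx|yy}}{{term|word}}."
after = add_lang(before)
```

A {{term}} with no preceding {{etyl}} is left alone, because nothing in the wikitext tells the script which language to use. That matches the observation above that a run of this kind can only fix a small share of the pages.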
Perhaps we should have a class of error messages that are hidden from readers but displayed for all logged-in editors. —Michael Z. 2013-04-21 01:18 z
- That might both be a good idea and a detrimental one. {{nl-noun}} shows "error" messages when some of its parameters are missing, and calls on the viewer to provide them. Since those messages were added to the template, I have seen quite a lot of editors - IPs, newly registered and experienced alike - take the messages to heart and provide the forms. We even have an editor, User:DrJos, who registered specifically to provide the forms and has now made it his life's work to fix them all. :) So I would say that's first-hand evidence that this kind of notice not only works, but it even gets IPs to lend a hand. So if we decide to hide these requests from IPs, we will be losing some of the editors who might help out. —CodeCat 01:26, 21 April 2013 (UTC)
- Don't forget that we also serve a lot of site visitors who don't edit and have no idea what "lang=" is. Why should some 10-year-old doing his or her homework have part of the content replaced by ??? so you can send a wake-up call to someone else? Are we the "dictionary that anyone can edit", or "the dictionary that everyone has to edit"? Chuck Entz (talk) 02:49, 21 April 2013 (UTC)
- Re: " […] part of the content replaced by ??? […] ": That's a straw man, since the version with ??? still has all the same content. (The ??? appears before the term, not instead of the term.) Maybe you meant to say that the 10-year-old would think that the ??? had replaced actual content? —RuakhTALK 03:43, 21 April 2013 (UTC)
- My mistake. I had already forgotten what the actual effect was, having only seen it on one page. Although I obviously overstated the effect, it still seems a bit much to clutter the main body of the text used by non-editors with stuff aimed strictly at editors. It might indeed cause concern among non-editors that something was broken that they didn't know how to fix.
- I'm not opposing the eventual implementation of such a change, just the massive scale of the change combined with the lack of effort taken to get consensus and to get feedback about what effect it might have, let alone to prepare people for it. Something that noticeably changes the appearance of a significant percentage of our millions of entries should require more than a general mention of the principle behind it here and there, followed by a discussion on the template talk page that only a very few would even know about. Chuck Entz (talk) 04:15, 21 April 2013 (UTC)
- I have rolled back CodeCat's edits to {{term}} because currently, it seems that only CodeCat and Michael support the ???s, whereas Dan, Widsith, Liliana, Chuck, and I oppose some aspect of CodeCat's edits altogether, and Ruakh and Ungoliant do not support putting in the ???s until non-lang-specified uses become much more rare. That's only 22% of editors in support so far. This is why we need to have BP discussions before making sweeping changes to the interface as readers view it. —Μετάknowledgediscuss/deeds 02:03, 21 April 2013 (UTC)
- Not read the whole discussion, but do we need '???'? Is there any way of making these stick out less like a sore thumb, this is a dictionary after all, readers come here for lexical information, not to correct wiki syntax. PS there is a line in User:Mglovesfun/vector.js that converts {{term|foo}} into {{term|foo|lang=en}}. Mglovesfun (talk) 09:45, 21 April 2013 (UTC)
- I think that so far, the majority of people in this discussion agree that it's a good idea to make sure {{term}} always has a language code. But that immediately brings up the question, how do we get there? Even if people want to add a language where it's missing, how can they do it? The reason why I added ??? was that it would make it obvious to editors that something needs fixing there. Making the problem visible and apparent is the first step towards fixing it, and that has been a real problem before. I also argued that showing a similar message on {{nl-noun}} has indeed helped to make the problem visible and therefore has led to more people fixing it. The bot I am running is helping, but it can only do so much; it has almost passed over all entries with a missing language but it has only managed to fix about 10% of the total (from 45 thousand to 40 thousand). A bot could never fix the majority of the entries that remain. So I suppose the real goal of this discussion is: if at least some of us agree that adding a language in all cases is a good thing, what can we do to make that happen and make it happen more quickly? If adding ??? to the entry is not the right way, then what is? —CodeCat 12:06, 21 April 2013 (UTC)
- If there isn't a bot solution for the remaining 90%, then I guess we'll just have to use MG's JS (or a modified form of it) on every page we're already editing. The reason why the ???s don't work is that instead of solving the problem, they create a new one. It looks messy and unprofessional, and users have to go for an unintuitive tooltip to find what's gone wrong. (Don't get me wrong, I love xkcd, but tooltips are not what people try first upon seeing a cryptic message.) This is not an acute crisis, so if a chronic solution is the best we have, so be it. —Μετάknowledgediscuss/deeds 14:39, 21 April 2013 (UTC)
- What exactly does the script do? Blindly adding lang=en is not correct... if it were, we probably would have done that already. I think there is one approach that we could try in the long term. If we could weed out all the uses that are not English (which are presumably a minority) then it becomes more feasible to add lang=en to the remainder. Using Lua, we might be able to recognise some of the languages, and we can use other means as well. For example, anything with {{polytonic}} as the script is bound to be Ancient Greek (and that template even sets lang="grc" if nothing is provided), so adding lang=grc whenever sc=polytonic is present is safe. Adding lang=got where sc=Goth is also safe, and many other scripts are only used for one language so we can derive the language from the script. We can also look at the characters in the term being linked to. Templates can't recognise which characters a word consists of, but Lua can. So if a word contains, say, Hiragana or Cyrillic, we can be pretty certain it's not English. We could also separate out calls to {{term}} that use Latin characters that are not used in English, like å. Granted, none of those approaches is absolutely failsafe, but it would probably be right more than 99% of the time, and it would make it much easier to chip away gradually at the number until it becomes more manageable. And making a few mistakes (marking a link with the wrong language) is not serious, especially not considering that currently 40000 are marked with the wrong language (it can only get better!). —CodeCat 14:55, 21 April 2013 (UTC)
- I was imagining that most would be English, and then it would be easy for a human to scan it and fix the langcode if necessary. I don't know what percentage the script/character method can handle, but I'm sure it's noncontroversial for you to attempt it. —Μετάknowledgediscuss/deeds 14:59, 21 April 2013 (UTC)
- I don't know how many there would be either, but I can add an invocation to a module (that needs to be made) which would add a category to the page when lang= is not present. That module can then decide to add the page to different categories depending on other factors like the script code or the characters in the word. The number of entries in each category would then be used to gauge what needs to be done. And even if one category contains only a few hundred entries, that's still a few hundred fixed and done. Every little bit helps, and we'll need to do this in little bits. :) —CodeCat 15:05, 21 April 2013 (UTC)
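The sorting idea in the exchange above (derive the language from the script code where possible, otherwise look at the characters in the term, then count how many uses land in each bucket) can be sketched roughly as follows. This is only an illustration in Python, not the Lua module being proposed: the function names, the bucket labels, and the choice of Unicode character names as the test for "Latin" are all assumptions, and only the two script-to-language pairs come from the discussion.

```python
import unicodedata
from collections import Counter

# Script codes that the discussion treats as implying a single language.
# Only these two pairs come from the thread; a real table would be longer.
SCRIPT_TO_LANG = {"polytonic": "grc", "Goth": "got"}

# The 26 unaccented letters, in both cases, taken here as "used in English".
ENGLISH_LETTERS = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")


def classify_term(term, sc=None):
    """Return a (hypothetical) bucket name for a term call without lang=."""
    # 1. A script code that maps to exactly one language settles it.
    if sc in SCRIPT_TO_LANG:
        return "lang=" + SCRIPT_TO_LANG[sc]
    # 2. Otherwise inspect the letters of the term itself.
    for ch in term:
        if not ch.isalpha() or ch in ENGLISH_LETTERS:
            continue
        # Any other letter means the term is unlikely to be English.
        if unicodedata.name(ch, "").startswith("LATIN"):
            return "non-English Latin"   # e.g. å
        return "non-Latin script"        # e.g. Hiragana, Cyrillic
    # 3. Nothing suspicious found: a candidate for lang=en, pending review.
    return "possibly English"


def tally(calls):
    """Count how many (term, sc) pairs fall into each bucket."""
    return Counter(classify_term(term, sc) for term, sc in calls)


if __name__ == "__main__":
    sample = [("word", None), ("år", None), ("слово", None), ("λόγος", "polytonic")]
    for bucket, count in tally(sample).items():
        print(bucket, count)
```

As noted in the thread, none of this is failsafe: an English entry such as "café" would land in the "non-English Latin" bucket, so the buckets are a way to split the backlog into reviewable piles, not a final answer.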
How about an error message like the one after this term [! Editors: the preceding term template lacks a language code]. Visible to all, relatively unobtrusive, self-explanatory and ignorable. The copy could be made more accessible; it should convey that an improvement is needed but doesn’t affect the accuracy of the information. —Michael Z. 2013-04-21 15:57 z
- How about Greek όρος (óros, “term”) [! Editors: the preceding term template lacks a language code]. I think it belongs after the whole template, because it refers to that construction. If it were before the brackets, it would look more like it was referring to the term itself.
- If we wanted a bit more urgency and context, a background appearing on hover or expand could tie it all together like Greek όρος (óros, “term”) ⊕ Editors: this term template lacks a language code. —Michael Z. 2013-04-21 23:27 z
- Can you point me to one, or tell me how to generate one? I remember something like that, but now when I try to save a module with a script error, I just see the big red box at the top of the page. —Michael Z. 2013-04-22 00:50 z
- You can look in Category:Pages with script errors. —CodeCat 01:03, 22 April 2013 (UTC)
Support making lang mandatory. It may also be possible to include automatic transliteration later. Perhaps rather than "???", it should say "which language???". --Anatoli (обсудить/вклад) 23:52, 21 April 2013 (UTC)
- I don't like calling it an error. For one thing, it's beside the point and adds extra verbiage, but mostly, it gives the impression that things are falling apart. I would suggest following the lead of some of our rf- templates: "This term template is lacking a language code. If you know it, please add it as a lang= parameter". Still verbose, but it would only show on hover. The symbol should be something small and innocuous, like the one Michael suggested above, or maybe a bullet (•). Even the question marks might not be so bad- as a trailing superscript. Or how about: όρος (óros, “term”)[→?] (I'm sure there are attributes that would make it look more like a live control, but you get the idea). Chuck Entz (talk) 02:45, April 21, 2013 (UTC)
- I have no strong opinion on the message and the format of the warning. Whatever the community decides but making the "lang" mandatory is important, otherwise, just use square brackets or something. I also think {{etyl}} should have the second parameter mandatory as well. Otherwise, people just add to English loanwords, even if they mean another language. --Anatoli (обсудить/вклад) 02:56, 22 April 2013 (UTC)
- There are a good number of uses of etyl with a null lang= parameter, as a way of standardizing the language name. I even used to do it myself, before I was aware of things like template overhead. I suppose they would be pretty easy to locate and subst out using a bot, though. Chuck Entz (talk) 03:30, 22 April 2013 (UTC)
I support keeping lang mandatory (i.e., not defaulting to English) for now, fixing transclusions, and then defaulting to English. Or just keeping it mandatory. But I oppose any error message visible to not-logged-in users. This is a technical error, not a content error: it is a missing language parameter in the HTML, not a missing etymology or pronunciation. There's no need for visitors to see the error message.—msh210℠ (talk) 05:12, 22 April 2013 (UTC)
I oppose CodeCat's recent change to {{term}} for "keeping lang mandatory" (msh210) and for using ??? as "just a friendly reminder that something is missing" (CodeCat), as it is a little well- and a lot ill-done. What's that ill-done at all? First of all, void is "just a friendly reminder." Liliana wisely noted: "At best, this should be restricted to the main namespace." So did Chuck Entz.
CodeCat's change made all the past discussions look so ugly, freckled with so many ???, looking like subduing again the "global readership at CodeCat's disposal" (User:KYPark/mulberry, 16 April 2013), doing without due consensus again; again and again in my cases! As I've discussed most for a year with {{term}} heavily used, mine must look ugliest so that CodeCat looks like aiming at me, megalomanically speaking.
Again, I'd attend to Liliana saying to CodeCat: "Others' posts should never be edited, even in case of incorrect syntax and such." This should be so because so vital and prior is the global readership of end users as end judges. Just unjust would be the interference or intervention of unequal, intermediary administrators with "others' posts" being arbitrarily edited. Delete could be the worst edit. My posts are supposed to suffer the worst blocking in effect. I wish CodeCat and others could learn a lesson from this happening.
--KYPark (talk) 12:46, 22 April 2013 (UTC)
- I'm sorry if I can't take your arguments seriously if you're turning this into a personal vendetta against me. Please go and do something more useful. —CodeCat 12:52, 22 April 2013 (UTC)
- How dare you accuse CodeCat of “aiming at” you? Whether the changes to {{term}} were ultimately good or not, she is just trying to improve Wiktionary. The world doesn’t revolve around you. — Ungoliant (Falai) 14:28, 22 April 2013 (UTC)
- If you're legitimately arguing against the change to the term template (belatedly, since it's already reverted), it would be best not to bring your own disputes with CodeCat into it, since I don't think anyone agrees with your assessment of them- some may disagree with her methods, but I don't know of anyone who doesn't sympathize with her reasons for doing what she's been doing. If you're trying to use this discussion as a forum for complaining about that matter, please don't. You'll just get people annoyed at you for cluttering an already-too-long discussion with unrelated issues. Chuck Entz (talk) 14:31, 22 April 2013 (UTC)
Editors are always wanted to do their best, say, even in using so many hard templates. Such is simply an ideal, esp. of wikis, more or less away from the reality in theory and practice. Rational choice theory is a mere theory heavily counter-balanced by bounded rationality.
- The better editorship, the better readership. Both go together in concert. Much easier is to interfere with editorship than readership, ill or well. Liliana would advise CodeCat not to interfere (too much or trivially) with editorship in discussion, and I would with readership. It is regretable indeed if the past discussions remain freckled with so many a ???, only to be hardly corrected in response to the "friendly reminder".
- If ''[[term]]'' is valid without adding #English, then {{term|term}} is valid as well without lang=en. For someone else to edit to add such additives, esp. in discussions, is overdone or ill-done, I fear, as far as I understand Liliana. Why? We can only talk more or less perfectly, hence either strength or weakness worth to be archived as given.
- Anyway the technical phase of mess and fuss is over; the majority deny mandatory lang=. Yet, so ain't the moral phase behind that, to be taken seriously at least right here. I'd argue it is painfully arbitrary and immoral to ignore the priority global readership and do without the due community consensus, repeatedly. The validity of my argument should not be upset by such a "tail of speech" as taking my own examples, however double-barreled I may look.
- --KYPark (talk) 05:04, 23 April 2013 (UTC)
It depends on too many factors to sum up easily! I prefer to talk in detail, focuslessly, not always wisely, while Liliana may prefer the short cut. The shorter speech, the more penetrating, like the proverb. You could speak only of the first thing first. And we could wisely interpret what is implied below the tip of the iceberg. Anyway, Liliana made me a perfect, most impressive sense, I guess. Fair enough? --KYPark (talk) 06:42, 23 April 2013 (UTC)
"Do you think you could sum [it] up in a simple sentence?" That is, for me to respond to three most unfriendly plus you unwittingly unleveling with them? Even omniscient and omnipotent God couldn't do so in human language of all imperfection, I guess. This ridiculous fuss was caused by making non-mandatory lang= mandatory, arbitrarily, as if the global readership and editorship should be at CodeCat's mercy! Originally, Z wished to do without that boring parameter, and suddenly CodeCat complicated it "horrible" (Z) for some reasons, said and unsaid. This is a genius for making one out of another at will. Such was the case with WT:Beer parlour/2013/March #Wiktionary:Etymology scriptorium/March 2013. Incidentally, the above three were most responsible for so doing. Assisted by Ungoliant, CodeCat did make a horrible ending out of Chuck Entz's unwitting beginning, remindful of the idiom make a mountain out of a molehill. This case implies too much for me to keep from saying much more, yet ...
- --KYPark (talk) 07:38, 28 April 2013 (UTC)
- Re: "Do you think you could sum that up in a simple sentence?". From all available evidence, no. Nothing will ever be said simply that can be bloated with tables, graphics, massive blocks of text taken from other pages, rambling discourses in poor English, etc. CodeCat started moving his more irrelevant topics from the Etymology Scriptorium to his user space, so now he's roaming through the discussion pages looking for any excuse to take potshots at her. Chuck Entz (talk) 15:43, 28 April 2013 (UTC)
- The really sad thing is that I warned him some time ago against annoying everyone else so much that he'd lose sympathy, but it looks like he's doing just that. —CodeCat 16:05, 28 April 2013 (UTC)
Support the change, but "???" looks horrible, I prefer "[language code?]". --Z 12:07, 25 April 2013 (UTC)
- This is not the right place for you to blame me, but perhaps over there. I am just staying here to inform you where you'd better respond and trivially to know from you what is "lolwut" at all in English. No reason to stay any more. --KYPark (talk) 15:18, 28 April 2013 (UTC)
- lolwut is equally English as the next term, deal with it. Also, I support the change. User: PalkiaX50 talk to meh 15:30, 28 April 2013 (UTC)
- Oh really? Ungoliant looks like a genius for pushing WTF, lolwut, etc., to me of en-2 so as to embarrass me and perhaps to delay. Anyway why do you jump in on his behalf? I don't really care you support the change but the arbitrary change at CodeCat's mercy, as if the world should revolve around CodeCat. Do you like that way? --KYPark (talk) 16:04, 28 April 2013 (UTC)
- Oh please, I never said (nor was I implying) that Ungoliant is a genius. Secondly, I am not specifically highlighting to you that I support the change; I just decided to say so, seeing as others have opposed and supported as well. User: PalkiaX50 talk to meh 16:22, 28 April 2013 (UTC)
- I think I have a solution that would please everyone. With CSS, we can change the formatting of a word depending on whether it has a language or not. So, {{term}} could be changed so that it applies a CSS class to the text when no language has been specified. That way, everyone can decide individually how they want the "error" to appear to them, while the default would just appear as normal. So it would be opt-in and customisable for each user. Is that ok? —CodeCat 14:19, 9 May 2013 (UTC)
Portuguese reflexive verbs
I have just added compadecer-se, but have no idea how to show its inflections. There is nothing in Wiktionary:About Portuguese and no obvious templates. The entry in Portuguese Wiktionary has no conjugation table. Any ideas? SemperBlotto (talk) 10:50, 21 April 2013 (UTC)
- I don't know Portuguese, but in general, it is worth considering whether to direct the reader from compadecer-se to compadecer, along the lines of mračit se directing the reader to mračit. Nonetheless, as regards reflexive forms, different languages seem to use different approaches. Portuguese entry dirigir-se directs the reader to dirigir for conjugation, as does encaminhar-se. --Dan Polansky (talk) 14:55, 21 April 2013 (UTC)
- What do you do in cases where the non-reflexive verb doesn't exist? Then there is no entry to direct the reader to. On the other hand, let's imagine dirigir didn't exist and was not attestable, only dirigir-se. Then dirigir-se would have to have a conjugation table. But what should it contain? Suppose that it contains forms with the reflexive particle attached, so that it has te diriges. Then that would violate "all words in all languages" because diriges gets no entry, and that would confuse users who don't realise that "te diriges" is one term. Suppose on the other hand that the table instead displays te diriges, linked separately. Then we're faced with another dilemma: what would the entry diriges contain? It can't say "second person singular present of dirigir-se" because that's not correct, "te diriges" is the second person singular of dirigir-se, not "diriges". On the other hand, it can't be "second person singular present of dirigir" either, because dirigir doesn't exist. —CodeCat 15:37, 21 April 2013 (UTC)
- (after edit conflict) For Czech, I always create a non-reflexive entry even if all its uses are reflexive. Thus, for "mračit se", the definition is at mračit, where "se" is stated on the definition line; "mračit" is always used with "se". As for inflected forms, there would be e.g. mračila. Note that, in Czech, the reflexive particle se or whatever that is is separated from its verb, as in "pořád se na něho mračila", so I do not see it necessary to have mračila se as an inflected-form entry. --Dan Polansky (talk) 15:51, 21 April 2013 (UTC)
- Yes, it's awkward, isn't it. In Italian, we hard code the pronoun in the inflection table (with no wikilink) and wikilink the inflected verb (even in the few cases in which the non-reflexive form doesn't exist (Hmm)). In French, we redirect the "pronoun + infinitive" to "infinitive". SemperBlotto (talk) 15:42, 21 April 2013 (UTC) (See lavarsi and se laver as typical of these)
- In Dutch we don't have separate entries for reflexive verbs either. But that may not really be the best idea for all languages, because in some there is no space to separate the particle from the verb. Spanish and Portuguese are examples, but Catalan also has many pronouns that contract with the verb when next to a vowel (like in French). So Catalan might have adormir-se with the form m'adormo, and the imperative of acostumar-se is acostuma't. —CodeCat 16:46, 21 April 2013 (UTC)
- My practice has been using:
====Conjugation====
See {{l/pt|compadecer}}.
- Listing each combination would be too messy. A verb form like compadeceria can give se compadeceria, compadeceria-se and compadecer-se-ia. — Ungoliant (Falai) 17:09, 21 April 2013 (UTC)
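To illustrate why listing every combination gets messy, here is a rough Python sketch that reproduces just the three variants named above for one verb form. It assumes the form is built on the full infinitive plus an ending (as in the conditional compadecer + ia); the function name and parameters are made up for illustration, and it makes no attempt to handle irregular verbs, other tenses, or other pronouns.

```python
def clitic_variants(infinitive, ending, clitic="se"):
    """Return the three placements of a clitic around an infinitive-based form.

    Hypothetical helper: only valid for forms that are literally
    infinitive + ending, such as the regular conditional.
    """
    verb = infinitive + ending
    return [
        f"{clitic} {verb}",                  # clitic before the verb
        f"{verb}-{clitic}",                  # clitic after the verb
        f"{infinitive}-{clitic}-{ending}",   # clitic between stem and ending
    ]


if __name__ == "__main__":
    for form in clitic_variants("compadecer", "ia"):
        print(form)
```

Multiplying three placements by every person, tense, and pronoun is what would make a full table unwieldy, which is the point of linking to the non-reflexive conjugation instead.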
- Hmm, Czech and Dutch verbs don't have entries for reflexive verbs but Polish and German do (not too many). Russian reflexive verbs are included because they are always spelled together and have variations in stress (на́чался or начался́) and the actual particle (-ся and -сь) can be different, -ся - after consonant, -сь - after vowel. I think it would be beneficial to have entries for reflexive verbs in Portuguese and other languages, even if as a soft redirect. --Anatoli (обсудить/вклад) 03:26, 24 April 2013 (UTC)
Should Wiktionary really include entries for characters?
Dictionaries are normally about words, and not the things that those words refer to. A definition on Wiktionary is therefore mainly concerned with giving enough information so that someone who is familiar with the referent knows that the word refers to it. So the goal of Wiktionary is not to describe in detail what the thing is that a word refers to. That's encyclopedic information, and belongs on Wikipedia. When you look at letters and other characters, it's really the same thing. When seen as a character in themselves, they are symbols and aren't really any different from, say, a triangle or a sine wave. They're concepts, not words. Presumably, Wiktionary has decided to include them because they form words, but I'm not sure if that is the best decision. It is definitely lexicographical to say C is pronounced /siː/ and has the plural Cs. But is it really lexicographical to say "C is the 3rd letter of the English alphabet"? I don't think it is, because that definition refers to the symbol C itself, not to its use as a lexical term. Definitions should say what something means, not what it is. C might indicate the third of a sequence, but that's what it means, not what the letter C actually is, so not quite the same. Similar for the etymology: describing where the shape of the letter C came from doesn't strike me as particularly dictionary-worthy. So I would like to ask whether this should be reconsidered? —CodeCat 20:14, 23 April 2013 (UTC)
- Amusingly, WT:CFI doesn't specifically include alphabetic characters. It mentions "characters used in ideographic or phonetic writing" but no mention of syllabic or alphabetic characters is ever made. Do what you want with this bit of trivia. -- Liliana • 20:18, 23 April 2013 (UTC)
- I agree. In addition to not being lexicographic information, I see the following issues:
- Because scripts like Latin and Cyrillic are used in many languages, entries for characters in those scripts end up being excessively large.
- Who is the target audience of character entries? My best guess is people who are starting to learn a language. In that case, it is much better to have per-language appendices containing “entries” for every character of a language.
- — Ungoliant (Falai) 21:09, 23 April 2013 (UTC)
- I'm not sure if my suggestion would really make the pages a lot smaller. In every language, letters still have a pronunciation, which is definitely material for a dictionary. —CodeCat 21:15, 23 April 2013 (UTC)
- I’d move the pronunciations to the appendix pages I suggested as well. — Ungoliant (Falai) 21:19, 23 April 2013 (UTC)
- I don't mind including characters like letters and punctuation that are used linguistically. But I really don't see any lexicographic value in having entries for things like →, ∟, ┌, ▒, and ☺. I once removed ⍾, ⎙, and ⎆ from WT:Wanted entries because they aren't words, but I got reverted. I still don't think they're dictionary-worthy, though. —Angr 22:11, 23 April 2013 (UTC)
- There is a relatively small universe of such terms. With respect to those that are actually used as components of words, that seems to me to be lexical significance enough to keep them - even if most did not have additional meanings capable of being reported in a dictionary. bd2412 T 01:45, 24 April 2013 (UTC)
- I don't see why we shouldn't. They are generally included in single language dictionaries, and they are lexical information, not encyclopedic. Even the non-letter characters I find useful, convenient to be able to look up like words.--Prosfilaes (talk) 08:48, 24 April 2013 (UTC)
- I would rather keep entries for letters of Latin alphabet. Even for →, it is kind of nice to find the Unicode code point for the symbol in Wiktionary. So I would rather keep all Unicode codepoints. --Dan Polansky (talk) 20:00, 24 April 2013 (UTC)
Yes, dictionary entries are for terms, including names, but not for things. Most of the letter entries would remain, because in English, at least, A is the name of the letter A, among a few other senses or subsenses.
But punctuation marks and diacritics certainly aren’t words, nor are mathematical and logical symbols. Just look at any professional dictionary, and see what is included as entries, and what appears in tables and appendices.
A “Unicode code point” isn’t even a character, it is an encoded representation of a character. We don’t have an entry for the code point U+0041, any more than we should have one for Morse code “dot-dash”, for the signal flag, or for the CDC 1604 key-punch card code 31 – these are all ways to encode the letter A, and not lexical entities in themselves. —Michael Z. 2013-04-25 20:39 z
- We aren't a printed dictionary; we're a computerized dictionary. There's no concern about how people are supposed to look up →, whether it should go before A or after Z or under arrow, since we don't have to worry about order. We also don't have to worry about space or many other things; we can worry about what people want to look up.
- I don't get your point about Unicode code points; I don't think Dan Polansky wants us to add U+0041, but for A and → and 倀 and ─ and the rest. "A" may be a string of bits referring to Unicode, etc., but for our purposes we can just call it a character or word.--Prosfilaes (talk) 11:03, 26 April 2013 (UTC)
- I don’t get your point about dictionaries. Professional dictionaries don’t refrain from “defining” symbols like arrows (→) because they can’t be printed or sorted – they certainly can – they omit them because they are not words.
- This is exactly my point about Unicode. We include words (technically, “terms” or “lexical items”). Some editors think that having a code point in the ultimate encoding scheme makes a thing a lexical item, but it does not. Or that it proves that these characters are significant lexical entities, because each has a code point: { → ⇾ ➙ ➔ ➛ ➝ ➞ ➡ ➤ ➧ ➨ ➫ ➯ ➱ ➺ ➻ ➼ ➽ ⟶ → }. (They are arguably not even characters in the sense of writing. I can “encode” another three dozen such “characters” with a pen on paper, but that doesn’t make them dictionary items any more than having a Unicode code point does.) No matter how great it is, Unicode is merely one way of representing text, and does not define language. —Michael Z. 2013-04-26 22:30 z
- I think anything used to convey meaning in human language is good. Don't ask me to give a robust definition of that because I can't. Mglovesfun (talk) 22:50, 26 April 2013 (UTC)
- Didn't we have this discussion before, about encoding things like Mʳ that we could actually find encoded that way?
- We don't include words; we include strings of letters or code points. color and colour are different pages, for example, and yet we combine /kɑt/ (caught) and /kɔt/ (caught) and separate /kɑt/ (cot). I don't see any reason to get overpure here; Unicode is the substrate for our system and we should rely on that and use it.
- From another direction, we are a part of Wikimedia; Unicode code points are not something that any other project covers in depth, and thus we should stretch our ambit so that Wikimedia covers everything.--Prosfilaes (talk) 23:40, 26 April 2013 (UTC)
- Michael Z's argument makes a bit more sense to me than Prosfilaes's. While Wiktionary is encoded in Unicode, it's not tied to Unicode; we shouldn't be making editorial decisions based on the encoding we use, we should be independent of it. From a lexical/typographical point of view, "Mʳ" is a capital M with a superscript small R, and that's the way that Wiktionary should treat it as well. I think Michael Z's point about encoding your own codepoints by drawing little doodles on paper is very interesting, because it makes it clear how detached Unicode can be from the written reality that we are actually trying to document. In our modern society, we've become almost enslaved to our computer's capabilities and what we write is determined in many ways by what a computer is capable of producing. But just 50 years ago, that wasn't the case, and people happily made up new characters and used them in their works. Esperanto (late 19th century) introduced a whole new set of letters with diacritics, and APL (a programming language of all things!) made up a whole set of characters that nobody else used. Going back even further, you see that people made letter types so that they could print what they wrote, even going so far as to make up ligatures that mimicked handwriting. In medieval times the situation was still further removed from Unicode's "reality", where people would happily stack characters on top of each other, write little lines and squiggles all over the text, and make up all kinds of abbreviations which would use whatever formatting they found useful. So if Wiktionary's task is to document usage, then we can't let Unicode decide what to document because it's clear that Unicode is quite far from an accurate representation of the usage our CFI wants us to record and cite. If Unicode and its characters are the lexical reality, then I guess the sky must be made out of 5 megapixels. :) —CodeCat 23:53, 26 April 2013 (UTC)
- From a modern typographical point of view, "Mʳ" is U+004D U+02B3. Both in the computer, and in the way that it's written, it's not a superscript small R, and the font maker will have to deal with that.
- (Esperanto wasn't new letters, then or now; both the typography of the time and Unicode have no problem with arbitrary combinations of accents on existing characters.)
- As I mentioned in my post, we don't handle the language that a computer is capable of recording and playing back; neither /kɔt/ nor, more accurately, an audio clip (Audio (US)) is handled by Wiktionary. As a practical thing, being able to say or play a word into your phone and have it come up with a definition and spelling would be worlds more useful and used than anything based on medieval handwriting.
- In any case, whether or not we should try and handle all the non-Unicode stuff strikes me as irrelevant to the question of whether we should handle the Unicode stuff. Whatever they did in the past is irrelevant to the fact that Unicode is the dominant system today, and no matter what you can imagine creating, people are likely to select Unicode characters and enter them into Wiktionary and not random doodles.--Prosfilaes (talk) 10:51, 27 April 2013 (UTC)
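The code-point claim about "Mʳ" in this thread can be checked directly. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` and the variable names are illustrative only and are not part of any Wiktionary tooling. It lists the code points and Unicode character names of the string, then applies NFKC compatibility normalization, which folds the modifier letter to a plain "r".

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode character name) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# "M" followed by U+02B3, written with an escape so the source stays ASCII.
mr = "M\u02b3"

for code_point, name in describe(mr):
    print(code_point, name)

# Compatibility normalization replaces the modifier letter with a plain "r".
folded = unicodedata.normalize("NFKC", mr)
print(folded)
```

Whether a page title should be stored as entered or in a folded form is an editorial question that this sketch does not settle; it only shows what the encoding contains.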
- Unicode’s development follows language, imperfectly, and not the other way around. Unicode is also designed to represent non-linguistic writing, like typographical ornaments, mathematical equations, computer code, and UI elements. We are limited by Unicode in how we can represent the language. But we are a dictionary, not a code book. Our subject is written language. —Michael Z. 2013-04-27 14:45 z
- ☞ Re: Mʳ: this is an encoding error. It contravenes the Unicode Standard: “The fact that the latter two letters contain the word ‘superscript’ in their names instead of ‘modifier letter’ is an historical artifact of original sources for the characters, and is not intended to convey a functional distinction in the use of these characters in the Unicode Standard. ¶ Superscript modifier letters are intended for cases where the letters carry a specific meaning, as in phonetic transcription systems, and are not a substitute for generic styling mechanisms for superscripting of text, as for footnotes, mathematical and chemical expressions, and the like.”[25]
- I don't see why you see a difference between someone willfully using a number for a letter or a phonetic letter for another letter; "pr0n" is as much an encoding error as "Mʳ". In any case, the point was not to restart the argument, just to remind you that we'd had that discussion before.
- Our organization is not inconsistent; each page is denoted by a string of characters. The occasional redirect is the only break from that.
- Don't tell me not to do something; tell me why I shouldn't do it. Dictionaries frequently include a lot of stuff that's not just words; many dictionaries come with biographical, geographical, and scientific data. Storing Unicode codepoints is something that would have unique value for us, and it would no more stop us from being a dictionary than including entries on "George Washington" stopped other dictionaries from being dictionaries.--Prosfilaes (talk) 22:46, 27 April 2013 (UTC)
If WT after all is to help readers resolve the semantic ambiguity anyway involved in speech and writing, then the punctuation marks (PM's) should necessarily come in as semantic functors. However, the trouble is that the main pages are designed or structured around words rather than PM's. Then we'd have a few options, say, as follows:
- A main page for each PM in spite of inadequate design.
- A main page for all PM's. In this and next cases, REDIRECT's may be well used.
- WT:Punctuation marks, probably subpaged for each PM.
Why not? It should! Traditionally, lexicographically, and perhaps lexicologically. Generally, it is quite desirable to review anything, esp. from the bottom up, as low as possible. Biased, however, you'd fall into the pitfall or vicious cycle of circular reasoning, as usual. Say, you presuppose, to begin with: "Dictionaries are normally about words, and not the things that those words refer to." This may be enough for you to begin too wrong! See first: A most commonsensical fatal fallacy is such that the word in and of itself does refer or relate to the thing or referent, likely magically, remindfully of "word magic". Unconvinced by my words, be convinced by: This includes the opening quotation of:
In a nutshell:
That is, "Words are magical" only to "the minds of those who use them." Put more precisely, cognitive minds are magical, rather than words, hence cognitive sciences since the late 70s! Recall "The Delta Factor" (1975). This recognition was quite a queer revolution in sheer silence or sheer mystery. All life comes back to the question of our ideas -- the medium through which we relate words to things, ill or well. This is my parody of:
This is the first of the ten quotations that open:
As far as my knowledge goes, you'd better believe me, this is the origin or center of the cognitive earthquake or revolution, vividly evolved since the late 70s but in sheer mystery! All I'm saying is perhaps U're doing too wrong more often than not!
- (This'd best be where CodeCat, etc., would respond to "Why not? It should!" above. Thanks. --KYPark (talk) 08:18, 26 April 2013 (UTC))
Request for comment on inactive administrators
(Please consider translating this message for the benefit of your fellow Wikimedians. Please also consider translating the proposal.)
Hello!
There is a new request for comment on Meta-Wiki concerning the removal of administrative rights from long-term inactive Wikimedians. Generally, this proposal from stewards would apply to wikis without an administrators' review process.
We are also compiling a list of projects with procedures for removing inactive administrators on the talk page of the request for comment. Feel free to add your project(s) to the list if you have a policy on administrator inactivity.
All input is appreciated. The discussion may close as soon as 21 May 2013 (2013-05-21), but this will be extended if needed.
Thanks, Billinghurst (thanks to all the translators!) 04:34, 24 April 2013 (UTC)
- Distributed via Global message delivery (Wrong page? You can fix it.)
- Looking at the other projects' policies, and at our own inactive admins, I'd like to propose that we have a policy vote about this. What do you think about removal of adminship from admins who make less than 10 mainspace edits in a year? That sounds reasonable (compare with our voting requirements, for example). —Μετάknowledgediscuss/deeds 05:40, 24 April 2013 (UTC)
- As that proposal will override local consensus if it passes, I invite all Wiktionarians to oppose the proposal so we can stay independent and govern ourselves. -- Liliana • 08:04, 24 April 2013 (UTC)
Category:Hungarian nouns suffixed with -acs, et al
Why are these suffix categories sorted by PoS? It's especially confusing that Hungarian prefixes aren't. Would anyone object if I changed them to the same format as most (if not all) other languages? Ultimateria (talk) 14:46, 24 April 2013 (UTC)
- I think we should wait, {{suffix}} allows for this sort of thing (not only {{hu-suffix}}) and last time I talked to User:Panda10 about it, he opposed deleting {{hu-suffix}}, {{hu-prefix}} and {{hu-affix}}. We should at least let some of our Hungarian editors comment; a couple of days is nothing. Mglovesfun (talk) 14:53, 24 April 2013 (UTC)
- Please do not change it. In several cases, the same suffix will create a different PoS and it is best to keep these categories of words separately. --Panda10 (talk) 13:58, 1 May 2013 (UTC)
Multiple user pages using "/"
Greetings. I've noticed that some users have created multiple pages for their user by creating pages with a slash after their user name. (E.g. User:[username]/1000EnglishEntries.) Is this normally accepted, and is there a limit as to how many pages you can have? Thanks. TeragR (talk) 17:16, 25 April 2013 (UTC)
- If the content supports the work of Wiktionary, there is no limit that I am aware of. Any significant volume of content not related to the work of Wiktionary (including maintaining friendly relations useful for that work), whether or not on a subpage, is not permitted. DCDuring TALK 21:09, 25 April 2013 (UTC)
- Normally accepted and no limit. WT:USERPAGE should cover this. Mglovesfun (talk) 21:46, 25 April 2013 (UTC)
- It just says the same rules apply to subpages as to the main user page; that's enough, right? Mglovesfun (talk) 11:32, 27 April 2013 (UTC)
Administrator communication
I'm having trouble with an administrator that often reverts legitimate revisions en masse and deletes entries without bothering to message the user or start a talk page on the matter. I wouldn't be so troubled about it if they were simply an editor, but as an administrator, I expect more from them. I've contacted them, but they deny any culpability. Is there anyone that can intervene and ask them to better communicate with others, in both initiating contact and conducting themselves in a cooperative manner? Thanks. --Victar (talk) 20:37, 25 April 2013 (UTC)
- We do have the problem of having a very high ratio of pages to patrollers. This leads to curt interaction. You could try posting to the entry talk pages or visiting one of the pages like Wiktionary:About Frankish or Wiktionary:About Proto-Indo-European and leaving a message on a talk page there to determine what problem there may have been with your contribution. DCDuring TALK 21:17, 25 April 2013 (UTC)
- Another thing you could try is recognizing the fact that CodeCat is a knowledgeable, experienced, and respected editor on this project, and you are still relatively new. Showing some politeness, respect, and dare I say even a bit of deference would get you a long way. Simply put, the administrators on this project have all had to put up with new editors who think they know better, which is a tiresome process, and one which is somewhat jading. As far as I can tell, CodeCat has good reasons for their reversions, has been reasonably professional in their conversation with you, and generally met English Wiktionary admin standards that I am comfortable with. -Atelaes λάλει ἐμοί 00:12, 26 April 2013 (UTC)
- Again though, it isn't a matter of the quality of their revisions or even the way in which they communicate; it's the lack of communication that I find troublesome. If an admin is going to delete your work en masse without even discussing it with you, why would anyone want to contribute? If you need more people, this is not the way to attract them. --Victar (talk) 00:33, 26 April 2013 (UTC)
- You don't realise the vast quantity of vandalism, idiocy, ill-informed edits, and good faith errors we have to clean up after and weed through. If we left a personalised message for everyone who made a mistake, it would take far too much time and (wo)manpower. We were all newbies once too, and we learned the inane template system and confusing structure as well. As long as you don't make it into a conflict, it almost certainly won't become one, and you can learn and move on. —Μετάknowledgediscuss/deeds 00:41, 26 April 2013 (UTC)
Victar is supposed to be concerned or complaining, as I used to, about both the administrator's moral(ity) and the editor's morale, mismatched, rather than the technicality of administration. No doubt, it is fatally self-defeating and immoral within the participatory wikis to discourage editors as if vandals anyway. --KYPark (talk) 03:43, 26 April 2013 (UTC)
Number/Numeral categories what's the story?
I'm just wondering since I remember discussion about the categories for numbers and/or numerals a while back. Did any decisions or anything of the like come out of it? I mean, for a given language, what number related categories should we have, and what shouldn't we have? Ever since I noticed the controversy or uncertainty about this issue months and months ago I made sure to ignore any I saw on WantedCats. But I'm just curious atm and probably won't be staying on wikt much longer for the moment today so I was wondering if someone could easily tell me what happened and perhaps even direct me to the relevant discussion if they feel the need. User: PalkiaX50 talk to meh 13:24, 27 April 2013 (UTC)
- As far as I know, the decision was to use numeral when it represented a distinct part of speech, and to use some other part of speech when more appropriate. In particular, ordinal number words are almost never a "numeral" part of speech, they are usually adjectives. The cardinal and ordinal numbers are categorised in their own topical categories, which are based on meaning rather than on part of speech. —CodeCat 13:35, 27 April 2013 (UTC)
Is German infinitive ending -en considered a suffix?
In German, every verb in its infinitive form ends in the morpheme -en (or in its much rarer variants -ern or -eln). Should it still be considered a suffix? Some users seem to think so. For example, the etymology of the verb vernetzen was recently changed from {{prefix|ver|Netz|lang=de}} to {{confix|ver|Netz|en|lang=de}}. I don't think that makes sense, since -en is just a grammatical morpheme that marks the infinitive rather than a lexical morpheme of word formation. Longtrend (talk) 11:35, 29 April 2013 (UTC)
- I’d say it is. Inflectional suffixes are also suffixes. — Ungoliant (Falai) 11:47, 29 April 2013 (UTC)
- I'd say it isn't. Our suffix categories are usually broken up into several subcategories based on usage, so there's Category:German noun-forming suffixes, Category:German verb-forming suffixes and so on. There's also Category:German inflectional suffixes, which would contain things like -em, -en, -te. Although -en can be used to form new verbs, something like Category:German verbs suffixed with -en is horribly misleading. -en has to be present in a verb, so it's not true suffixation but really more like adapting the word morphologically into a verb. That's quite different. Latin -us is the same; it's not used as a way to form nouns, but rather as a way to make morphologically non-conforming nouns conform to the grammar of the language. —CodeCat 12:54, 29 April 2013 (UTC)
- But that doesn't belong in the etymology. That would mean using {{suffix}} or {{prefix}} for almost every entry in almost every inflected language. Does falar really have a different etymology from falo? One that's worth distinguishing in the etymology section? Chuck Entz (talk) 12:58, 29 April 2013 (UTC)
- I understand your concerns, but when you say "adapting the word morphologically into a verb"....well that's done by means of a suffix, isn't it? When German invents a new verb based on a noun, it sticks the suffix -en on the end of it. So in my opinion, it is a suffix. (In my opinion also, all of these categories are a complete waste of time and energy, but that's a separate issue...) Ƿidsiþ 13:17, 29 April 2013 (UTC)
- And vernetzt is ver + netz +t and vernetzte is ver + netz +te. The inflectional morphology is the result of the conversion to a verb, not the cause. In an analogous English case, would you describe a verb derived from a noun as noun + null ending, since our lemma is the unmarked form? Chuck Entz (talk) 13:55, 29 April 2013 (UTC)
- No, and that is a silly analogy. A better one would be to consider if regular English past-tense forms are formed by adding a suffix -ed. In my opinion (and that of the OED), this is the case (although I see no value in putting them all in a category). Ƿidsiþ 14:40, 29 April 2013 (UTC)
- No it wouldn’t. Falar is not derived from a noun + an inflectional suffix, vernetzen is.
- Maybe it’s better to think of inflectional suffixes as sets, instead of single suffixes. For example, the Portuguese 1st conjugation has {-ar, -ando, -ado, -o, -as, -a, -amos, -ais, -am, etc.}, the 2nd has {-er, -endo, -ido, -o, -es, -e, -emos, -eis, -em, etc.} and the 3rd has {-ir, -indo, -ido, -o, -es, -e, -imos, -is, -em, etc.}. Consider the word monitorar, it is the noun monitor + the 1st conjugation paradigm; since the lemma of Portuguese verbs is the impersonal infinitive, the etymology should, in my opinion, display monitor + the 1st conjugation paradigm’s impersonal infinitive suffix (-ar). — Ungoliant (Falai) 13:27, 29 April 2013 (UTC)
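The "sets of suffixes" view described above can be written out as a small data structure. This is only an illustrative sketch in Python: the suffix lists are copied from the post (which truncates each set with "etc."), while the names `PARADIGMS`, `lemma_suffix`, `derive_verb` and `etymology` are hypothetical, and the naive concatenation assumes a base such as monitor that needs no stem change.

```python
# Suffix sets as listed in the post above (each set is truncated there with "etc.").
PARADIGMS = {
    "1st": ["-ar", "-ando", "-ado", "-o", "-as", "-a", "-amos", "-ais", "-am"],
    "2nd": ["-er", "-endo", "-ido", "-o", "-es", "-e", "-emos", "-eis", "-em"],
    "3rd": ["-ir", "-indo", "-ido", "-o", "-es", "-e", "-imos", "-is", "-em"],
}

def lemma_suffix(conjugation):
    """Return the impersonal-infinitive suffix, which is listed first in each set."""
    return PARADIGMS[conjugation][0]

def derive_verb(base, conjugation):
    """Attach every suffix of the chosen paradigm to the base (naive concatenation)."""
    return [base + suffix.lstrip("-") for suffix in PARADIGMS[conjugation]]

def etymology(base, conjugation):
    """Format the etymology as base + the paradigm's lemma suffix."""
    return f"{base} + {lemma_suffix(conjugation)}"

forms = derive_verb("monitor", "1st")
print(forms[0])
print(etymology("monitor", "1st"))
```

The point of the sketch is that choosing the paradigm is the derivational decision; the lemma suffix shown in the etymology is just that paradigm's first member.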
- Not in Portuguese, but it ultimately comes from fabula. My point is that it's not the sticking of the inflectional ending on it that made it a verb. Because it became a verb, the inflectional ending was added. Chuck Entz (talk) 13:55, 29 April 2013 (UTC)
- That "inflectional ending" can be considered a suffix though. Ƿidsiþ 14:40, 29 April 2013 (UTC)
- Not really, they are separate things from a formal point of view. In Indo-European at least, much derivation involves adding a suffix to extend the basic stem, which is distinct from the inflectional ending that comes after it. Indo-European words are formed as root + one or more suffixes + inflectional ending. This has become somewhat more muddled in later languages, because many languages have "zero endings" which are inflectional endings that are empty, and also because the distinction between root and suffix is no longer as apparent. So for modern IE languages, it's easier to just consider stem + ending, and treat the stem as the more "invariable" part. In this view, -en is definitely not part of the stem in German, and neither is -ar in Portuguese (although the -a- on its own might be). In German in particular, creating a verb from a noun is often a so-called "zero derivation" where both stems are the same. So there is really no suffixation involved, just a change of endings from the noun set (genitive -s, plural -e(n), dative plural -en) to the verb set (infinitive -en, 3sg present -t and so on). The reason that it appears as suffixation is because the lemma form (nominative singular) of a noun stem generally has a zero ending, whereas the lemma form of a verb stem has an overt ending. But if it had been the other way around (say, nouns had -en in their nominative singular, and the infinitive had a zero ending) then would it still appear as suffixation? I don't think it would. And if you look at Latin or (to a lesser extent) Portuguese, there are few forms that have no ending, so something like "replace -us (la) / -o (pt) with -ar(e)" can hardly be called suffixation by itself. Rather, the real suffixation is adding the verbal derivation morpheme -a- (first conjugation) to the noun stem, which then creates a verbal stem that requires -r(e) as the infinitive ending. —CodeCat 14:52, 29 April 2013 (UTC)
- Re: “although the -a- on its own might be”: -a- is the suffix -ar’s thematic vowel.
- Re: “"replace -us (la) / -o (pt) with -ar(e)" can hardly be called suffixation by itself”: it can, because suffixes are usually added to the stem, not the whole word. Cabeludo is cabel- (stem of cabelo) + -udo; here the suffix is added to the stem and, similarly, verbs formed from nouns have the conjugation suffixes added to the noun’s stem.
- Even if “suffixation” and “suffix” aren’t the correct terms used by linguists, our etymology sections don’t use those terms. — Ungoliant (Falai) 15:16, 29 April 2013 (UTC)
- Irrespective of whether the suffix was added because it became a verb, or whether it became a verb because the suffix was added (and how can you tell?), a new word appeared and this new word is a previous word + a set of suffixes. You could just say that monitorar came from monitor, but then why isn’t it *monitorer or *monitorir? That’s because it’s monitor + {-ar, -ando, -ado, etc.}, not monitor + {-er, -endo, -ido, etc.} nor monitor + {-ir, -indo, -ido, etc.}. The word was derived with a specific set of suffixes, and this set’s lemma suffix should be added to the etymology. — Ungoliant (Falai) 15:16, 29 April 2013 (UTC)
- That's true, but that works only if the lemma form actually shows this distinction in paradigms. In the case of German, the infinitive suffix doesn't show the verb paradigm. Of course, the paradigm is included in the derivation process, but you don't see it. —CodeCat 15:30, 29 April 2013 (UTC)
- How about verbs that are formed from a noun? As in Dutch schaats (a skate) ==> schaatsenrijden ==> schaatsen (verb). Couldn't you say in that case that affixing -en is a (productive) way of generating new verbs? A productive suffix? Think of faxen or sms'en. Jcwf (talk) 15:40, 29 April 2013 (UTC)
- I'm not saying it's not productive, I'm saying that -en isn't the suffix used to perform the derivation. The verb is more than just its infinitive... alongside schaatsen there is also schaats, schaatste, and so on. How would you say schaats (the verb form) is derived from schaats (the noun)? To say that people first create the infinitive and then replace the infinitive ending with a zero ending would be silly. Treating the infinitive as the lemma form is only a lexicographical convenience but not the reality; verbs can exist without their infinitives, and choosing one of the forms as the lemma is arbitrary. What if Dutch verbs were lemmatised as the 1st person singular? How would we denote the etymology then? When schaatsen is derived from schaats, there is no suffixation involved (or rather zero-suffixation). Rather, we just change the lemma form of one part of speech to the lemma form of another, but the actual derivational process is completely independent of which lemma form you choose, so deriving schaats from schaats is not only a valid alternative, it's the exact same thing. —CodeCat 15:52, 29 April 2013 (UTC)
- I think "schaats" (Dutch noun stem) and "schaats" (Dutch verb stem) aren't the same thing and that we can still see the etymology because "schaats" (noun: runner or blade) still rules out "schaats" (noun: the act/result of skating), while we do have the noun "loop" (a competition in "lopen"; a track to "lopen"). --80.114.178.7 00:18, 4 May 2013 (UTC)
- That said, if "-en" is a suffix, it is a suffix to the verb "schaats" to make the infinitive (and 1st person plural of the present tense &c.), not a suffix to transform a noun into a verb. Perhaps it would be good to have a header "Stem" (like we have "Noun" and "Verb"), if only to have terminology and/or a place to link to (do we need "Noun stem"/"Verb Stem"?). --80.114.178.7 00:18, 4 May 2013 (UTC)
- I agree with Widsith and Ungoliant, it's a suffix. I'm surprised that there's debate on this point. I find it no more or less useful to categorise all German verbs suffixed with -en together than to categorise all English past tense forms suffixed with -ed together, but -en and -ed remain suffixes. (And I can conceive of such categorisation being at least slightly useful, in that there are other suffixes—wandeln in German, dreamt in English—and someone might want to find only -en / -ed words.) - -sche (discuss) 20:21, 29 April 2013 (UTC)
- I definitely don't agree with making categories based purely on allophonic grounds. The ending of wandeln isn't somehow a different one from the -en that most other verbs end in. It's the same thing, just with a different shape depending on the stem. And I am surprised that you think there is no debate. What arguments do you have against the ones I've raised? How does one get schaats (“I skate”) from schaats (“a skate”) by suffixing -en? That doesn't make any sense to me at all. —CodeCat 20:48, 29 April 2013 (UTC)
- If one forms words in Dutch the way one forms them in German, then it seems one takes schaats (“a skate”, n), suffixes -en to get a verb, and conjugates that verb like other verbs that end in the suffix -en, resulting in forms like schaats (“I skate”). That seems as obvious to me as your analysis seems to you, so I don't know if we'll be able to do anything but agree to disagree... - -sche (discuss) 23:24, 29 April 2013 (UTC)
- Your analysis depends on treating the infinitive form as the basis from which other forms of the verb are derived. But it doesn't work like that in reality. The fact that we choose different lemmas in different languages is a reflection of that. For example, in Latin or any of the Balkan languages, would you suggest that people first add a suffix to create the first person singular, and then conjugate that? I'd say that it's more realistic to say that when deriving a new verb, people create the whole verb and the complete set of its forms, and then select the one they need in that particular situation. Seen that way, the process of word derivation is what creates one paradigm from another, rather than just one lemma form from another. That is why I think it's misleading to treat -en as a suffix: it doesn't actually create new lemmas. The true derivational part is the zero suffix, which is attached to a noun paradigm in order to form a verb paradigm. That the lemma of the noun paradigm has no ending while the lemma of the verb paradigm has -en isn't relevant; this could easily change just by selecting another lemma, since it's arbitrary which lemma you choose. —CodeCat 23:44, 29 April 2013 (UTC)
Note that the definition of desinence considers that a desinence is a kind of suffix. Lmaltier (talk) 20:07, 1 May 2013 (UTC)
- I agree that endings like -en are suffixes from the linguistic point of view, but from the lexicographical point of view it's no help to anyone to have Category:German words suffixed with -en. That category currently has just 48 words, all of them infinitives, but in principle it could have the vast majority of German infinitives, plus all 1st and 3rd person plural preterite forms, plus all 1st and 3rd person plural past subjunctive forms, plus all the plural nouns in -en, plus all the dative plurals in -en, plus the -en form of every single German adjective. No one would be able to use a category like that for navigation. —Angr 21:22, 1 May 2013 (UTC)
- It could be useful if it had sub-categories like Category:German infinitives ending with -en, Category:German 3rd person plural preterite forms ending with -en and so on, and just a few words which don't fall into those categories. But probably almost all words in the base category would just have to be moved to (often several) subcategories. --80.114.178.7 00:26, 4 May 2013 (UTC)
[en] Change to wiki account system and account renaming
Some accounts will soon be renamed due to a technical change that the developer team at Wikimedia are making. More details on Meta.
(Distributed via global message delivery 03:31, 30 April 2013 (UTC). Wrong page? Correct it here.)
- For the lazy... Ƿidsiþ 07:46, 30 April 2013 (UTC)
- The developer team at Wikimedia Foundation is making some changes to how accounts work, as part of our on-going efforts to provide new and better tools for our users (like cross-wiki notifications). These changes will mean users have the same account name everywhere. This will let us give you new features that will help you edit and discuss better, and will allow more flexible user permissions for tools. One of the pre-conditions for this is that user accounts will now have to be unique across all 900 Wikimedia wikis.
- Unfortunately, some accounts are currently not unique across all our wikis, but instead clash with other users who have the same account name. To make sure that all of these users can use Wikimedia's wikis in future, we will be renaming a number of these accounts to have “~” and the name of their wiki added to the end of their accounts' name. This change will take place on or around 27 May. For example, a user called “Example” on the Swedish Wiktionary who will be renamed would become “Example~svwiktionary”.
- All accounts will still work as before, and will continue to be credited for all their edits made so far. However, users with renamed accounts (whom we will be contacting individually) will have to use the new account name when they log in. It will now only be possible for accounts to be renamed globally; the RenameUser tool will no longer work on a local basis - since all accounts must be globally unique - therefore it will be withdrawn from bureaucrats' tool sets. Once this takes place, it will still be possible for users to ask for their account to be renamed further here on Meta, if they do not like their new user name.
- Oh, Christ, am I going to become Equinox~enwiktionary because of that prior Equinox who made one edit on Wikipedia back in 1843? Equinox ◑ 12:54, 30 April 2013 (UTC)
- The Internet was steam-powered in those days. Ƿidsiþ 13:02, 30 April 2013 (UTC)
- Is there a way to avoid becoming Astral~enwiktionary? I'm pretty sure there are other Astrals scattered throughout various Wikimedia projects. Perhaps by renaming my account before May 27? Astral (talk) 02:35, 8 May 2013 (UTC)
- To both Equinox and Astral: Yes, if there is anyone on any other Wikimedia project with the same username, you will both be automatically renamed. You can seek renaming here now (or seek to have the existing other account(s) usurped in your favor now) or have this done at Meta after the change. bd2412 T 03:07, 8 May 2013 (UTC)
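The renaming scheme described in the announcement can be sketched roughly as follows. This is only an illustration, not the developers' actual procedure: the function name `plan_renames` and the input format are hypothetical, and the sketch follows bd2412's reading that every clashing account is renamed, whereas the announcement itself only says "a number of these accounts".

```python
from collections import defaultdict


def plan_renames(accounts):
    """Map (username, wiki) to a proposed new global name.

    `accounts` is an iterable of (username, wiki) pairs, for example
    ("Example", "svwiktionary"). Hypothetical helper for illustration only.
    """
    wikis_by_name = defaultdict(list)
    for name, wiki in accounts:
        wikis_by_name[name].append(wiki)

    renames = {}
    for name, wikis in wikis_by_name.items():
        # A name is a clash when it exists on more than one wiki.
        if len(wikis) > 1:
            for wiki in wikis:
                # Append "~" and the wiki name, as in the announcement.
                renames[(name, wiki)] = f"{name}~{wiki}"
    return renames


sample = [
    ("Example", "svwiktionary"),
    ("Example", "enwiki"),
    ("Unique", "enwiktionary"),
]
planned = plan_renames(sample)
```

Under these assumptions the Swedish Wiktionary account in the sample gets the name used as the example in the announcement, while an account whose name occurs on only one wiki is left alone.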
[en] Change to section edit links
The default position of the "edit" link in page section headers is going to change soon. The "edit" link will be positioned adjacent to the page header text rather than floating opposite it.
Section edit links will be to the immediate right of section titles, instead of on the far right. If you're an editor of one of the wikis which already implemented this change, nothing will substantially change for you. Scripts and gadgets depending on the previous implementation of section edit links will have to be adjusted to continue working; however, nothing else should break even if they are not updated in time.
Detailed information and a timeline is available on meta.
Ideas to do this go back to 2009 at least. It is often difficult to track which of several potential section edit links on the far right is associated with the correct section, and many readers and anonymous or new editors may even be failing to notice section edit links at all, since they read section titles, which are far away from the links.
(Distributed via global message delivery 18:21, 30 April 2013 (UTC). Wrong page? Correct it here.)
- I see this has gone live. I think it's bad for usability. Finding an edit link now requires horizontal scanning, since its position is relative to the header's text length. It used to be easy: absolute far right. Equinox ◑ 18:27, 1 May 2013 (UTC)
- I like this change; I find it easier to find the "edit" links when they're next to their headers; when they floated right, it took me longer to sort out which edit link went with which section on pages that had multiple immediately adjacent headers, e.g. pages with an L2 immediately followed by an empty Etymology 1 section immediately followed by a POS section, especially if only some of those headers were indented by right-floating Wikipedia boxes. (I've also had experience with this leftist placement for a long time, due to de.Wikt using it.) - -sche (discuss) 19:15, 1 May 2013 (UTC)
- Okay, I've fixed (hopefully all of) the resulting breakages in TabbedLanguages, DefSideBoxes, AddDefinition, RhymesEdit, and VisibilityToggles. Did anything else break that anyone's aware of? --Yair rand (talk) 19:17, 1 May 2013 (UTC)
- Yep, thanks. Equinox ◑ 19:37, 1 May 2013 (UTC)
- @Yair: Nope, I'm still getting the crap Equinox reported. I can provide diffs if you want. —Μετάknowledgediscuss/deeds 03:40, 8 May 2013 (UTC)
- I don't know if it is just coincidence, but vandalism marked as (Mobile Edit) has gone up alarmingly since this change. SemperBlotto (talk) 14:58, 2 May 2013 (UTC)
- I now see section edit links in the mobile view. Whatever was hiding them before doesn’t seem to work with the new HTML.
- Would it be possible for me to reverse this change? I don't think I like it all that much, it makes pages appear messier. —CodeCat 02:11, 3 May 2013 (UTC)
- Add .mw-editsection {float: right;} to Special:MyPage/common.css. --Yair rand (talk) 03:41, 3 May 2013 (UTC)
- It seems net beneficial to me personally as I formerly made lots of errors clicking on the wrong section link and often found the edit link hidden by project link and similar boxes. It should be easier for newbies too. DCDuring TALK 15:31, 3 May 2013 (UTC)
May 2013
Homophones
homophone provides the following definition: A word which is pronounced the same as another word but differs in spelling or meaning or origin, for example: carat, caret, carrot, and karat. It's important to note the use of as another word in this definition. This means that many pages provide wrong information, e.g. familiarisât mentions familiarisa and familiarisas as homophones, while they are different forms of the same word. It's possible to state that the forms are homophonous, but not to mention them in a section named Homophones. In French, when you list the homophones of sot, you mention saut, seau, sceau, Sceaux, but never sots, because it's the same word. Lmaltier (talk) 06:09, 1 May 2013 (UTC)
- If familiarisât, familiarisa and familiarisas are the same word, are their definitions incorrect? — Ungoliant (Falai) 06:12, 1 May 2013 (UTC)
- No, definitions are correct. It's the same as provide and provides, or cat and cats: inflected forms, different forms of the same word. Lmaltier (talk) 07:20, 1 May 2013 (UTC)
- cat and cats are different words; one is a conjugated form of the other, but they're still different words. Certainly that depends on the arbitrary definition of word, but I don't think my definition is idiosyncratic or unusual.--Prosfilaes (talk) 08:09, 1 May 2013 (UTC)
- So I oppose removing the homophones. They’re different words which mean different things, so if they have the same pronunciation they should list each other as homophones, whether they have the same lemma or not. — Ungoliant (Falai) 19:49, 1 May 2013 (UTC)
- What's included as a ==French== homophone can be decided by the French editors without community input, much as language editors decide on transliteration, hyphenation, and so on. As for English, if (hypothetically, since I can't think of a real example) the present-tense tread and the past trod were pronounced the same in some common accent, I'd consider them homophones in that accent and IMO we should list them as such in the entry.—msh210℠ (talk) 07:29, 1 May 2013 (UTC)
- I don't see why we wouldn't class these as homophones; what's the reason not to? Mglovesfun (talk) 10:37, 1 May 2013 (UTC)
- I agree. No reason to say French sots isn't a homophone of sot. —Angr 11:26, 1 May 2013 (UTC)
- Yeah. Lmaltier, you just arbitrarily pick out one of the many definitions of "word" (= lexeme) and say that the current practice of adding homophones isn't in line with it. But there's no reason to pick out that particular definition in the first place. Longtrend (talk) 11:35, 1 May 2013 (UTC)
- From the PoV of someone not too familiar with French, it is at least minimally useful to know that there are many forms of a given lemma that are pronounced the same. DCDuring TALK 12:01, 1 May 2013 (UTC)
- For a learner of French it would actually be more useful to have a list of plural nouns that are not homophones of the singular. There aren't many; œufs and yeux spring to mind. —Angr 13:15, 1 May 2013 (UTC)
- You are right. No, I don't arbitrarily pick out one definition of word. I just want to use the noun homophone in the sense actually used by almost everybody referring to homophones. I understand that this sense is probably less clear in English, because inflected forms are not pronounced the same, but in languages such as French, different inflected forms are very often pronounced the same. Look at books or websites addressing homophones (such as http://tempsreel.nouvelobs.com/abc-lettres/saut-sceau-seau-sot/S/homophone.html). You'll see that they exclude inflected forms (because they consider they are the same word). I feel that Wiktionary homophone lists often misinterpret the sense of the word homophone, this was why I mentioned the issue. If you disagree, just try to find a book mentioning sots as a homophone of sot (or the same kind of case). Just a few references: http://books.google.fr/books?id=lTb3xSgNPecC&pg=PA3&dq=homophones&hl=fr&sa=X&ei=xGqBUcvtAYGK7Aa67YEw&ved=0CDUQ6AEwAA#v=onepage&q=homophones&f=false states words different in origin and signification. http://books.google.fr/books?id=Z7ZyXZAJgx8C&pg=PP7&dq=homophones&hl=fr&sa=X&ei=xGqBUcvtAYGK7Aa67YEw&ved=0CDoQ6AEwAQ states words which sound the same but have totally different meanings. http://books.google.fr/books?id=0crig9rvzpMC&pg=PA8&dq=homophones&hl=fr&sa=X&ei=xGqBUcvtAYGK7Aa67YEw&ved=0CFwQ6AEwCA#v=onepage&q=homophones&f=false states that homophones have a different meaning and a different spelling. I disagree with this last definition: bear as a noun and bear as a verb are true homophones. Anyway, I think that all these books seem to agree on the fact that homophones have unrelated meanings, which excludes inflected forms. Lmaltier (talk) 19:43, 1 May 2013 (UTC)
- Do those books ever list nonlemma forms? Do they list œufs as a homophone of eux? Do they list aime/aimes/aiment as homophones of M? (In English, bear and bear are usually called homonyms rather than homophones.) —Angr 20:06, 1 May 2013 (UTC)
- An example: http://people.mpim-bonn.mpg.de/zagier/files/exp-math-2/fulltext.pdf mentions œufs as a homophone of eux, and ôte as a homophone of haute, but never mentions inflected forms of the same word as homophones. Lmaltier (talk) 20:15, 1 May 2013 (UTC) No, this is not a good example: this mathematical paper mentions parle and parlent (but its objective is the demonstration of a theorem...). A better example is the linguistic site I mentioned above, which mentions œufs as a homophone of eux: http://tempsreel.nouvelobs.com/abc-lettres/eux-oeufs/E/homophone.html. They provide many homophones, classified by their first letter. Try to find cases such as sot/sots... Lmaltier (talk) 20:30, 1 May 2013 (UTC)
- Neither of those sources makes an attempt to be exhaustive, and they're not going to list forms that native speakers (who are their target audience) will find obvious and trivial—every French speaker knows (if only subconsciously) that virtually every plural noun is homophonous with its singular. But since we're an English-language dictionary, our target audience is English speakers, not French speakers, and our readers can't be expected to just know which inflected forms of words are going to be homophonous and which aren't. (Are aimait and aimer homophones? Without looking it up, I as a French learner honestly do not know.) —Angr 21:30, 1 May 2013 (UTC)
- “words different in origin and signification”: familiarisât, familiarisa and familiarisas have different signification and, judging from the different endings, were derived with (or descend from words with) different suffixes.
- “words which sound the same but have totally different meanings”: familiarisât, familiarisa and familiarisas sound the same and have different meanings.
- — Ungoliant (Falai) 20:33, 1 May 2013 (UTC)
- Are you serious? If you don't want to understand references as I do (and I find my interpretation really obvious), I cannot add anything except that: just try to find a linguistic book or a dictionary including inflected forms of the same word in their examples of homophones. Lmaltier (talk) 21:35, 1 May 2013 (UTC)
- Yes. Evanildo Bechara, Moderna Gramática Portuguesa:
- “Pode haver homofonia em um mesmo paradigma (“sincretismo”), como em cantava, 1.ª e 3.ª pess. do imperfeito, […] ”
- There can be homophony in the same paradigm (“syncretism”), as in cantava, 1st and 3rd person of the imperfect, […]
- He is claiming that cantava (1st person singular imperfect indicative of cantar) is homophonous with cantava (3rd person singular imperfect indicative of cantar). — Ungoliant (Falai) 22:00, 1 May 2013 (UTC)
- http://legacy.earlham.edu/~peters/writing/homofone.htm, Suber and Thorpe, "An English Homophone Dictionary", offers us axis and its plural axes. It's entirely natural to exclude them in French; it would be odd to exclude them in English but it's an extremely rare case, and I'm not familiar with any other language whose spelling is so confused as for homophones to be a major issue. (Well, Chinese, but that's a whole nother ballgame.)--Prosfilaes (talk) 07:39, 2 May 2013 (UTC)
- Your 1st example refers to the noun homofonia, not to the noun homophone (and even in English, I find the translation quite normal; my issue is not about homophony). Your 2nd example is more interesting: this case is so rare in English that it's understandable that it's interesting to mention that axis and axes are pronounced the same. But it seems clear to me that it's outside the scope of many definitions of the word homophone. Nonetheless, Webster's definition also seems to include this case, as a difference in spelling is one of possible conditions, according to this definition. This discussion seems to show that the general idea is rather clear, but that each author interprets the precise sense differently when trying to define it precisely. Lmaltier (talk) 19:58, 2 May 2013 (UTC)
- Homofonia means the quality of words being homophones. — Ungoliant (Falai) 20:09, 2 May 2013 (UTC)
- Actually, axis and axes are pronounced completely differently in English, at least in the General American accent. In axis the 's' has a soft 'ss' sound while axes has a harder 'zz' and the 'e' is slightly longer. In short: "axis" = ack-sis while "axes" (plural of axe) = ack-siz and "axes" (plural of axis) = ack-seez. On-topic, I agree that plurals should be considered different words to clarify homophones to non-French speakers. --Soardra (talk) 20:13, 5 May 2013 (UTC)
- I agree with Prosfilaes and Ungoliant, familiarisât, familiarisa and familiarisas are homophones. - -sche (discuss) 21:06, 1 May 2013 (UTC)
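The two readings of "homophone" argued over in this thread differ only in whether forms of the same lemma are filtered out. A minimal sketch of that difference, using a small hand-made sample: the `FORMS` table, the `homophones` function and its `exclude_same_lemma` flag are all hypothetical and not part of any Wiktionary tool.

```python
# Hypothetical sample data: (spelling, lemma, pronunciation).
FORMS = [
    ("sot", "sot", "so"),
    ("sots", "sot", "so"),
    ("saut", "saut", "so"),
    ("seau", "seau", "so"),
    ("sceau", "sceau", "so"),
]


def homophones(word, forms, exclude_same_lemma=False):
    """List other spellings pronounced like `word`.

    With exclude_same_lemma=True, inflected forms of the same lemma are
    dropped (Lmaltier's reading); otherwise they are kept (the reading
    defended by Ungoliant, Prosfilaes and others).
    """
    _, lemma, pron = next(f for f in forms if f[0] == word)
    return [
        spelling
        for spelling, other_lemma, other_pron in forms
        if spelling != word
        and other_pron == pron
        and not (exclude_same_lemma and other_lemma == lemma)
    ]


inclusive = homophones("sot", FORMS)
strict = homophones("sot", FORMS, exclude_same_lemma=True)
```

With this sample, `inclusive` contains sots alongside saut, seau and sceau, while `strict` leaves sots out.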
Context labels
Hi! I am adding context labels to my Wiktionary parser. There are 1001 context labels in the English Wiktionary, which my colleagues and I have to add to the parser manually :) I have several questions:
- Should Template:Karabakh, Template:Kromanti and Template:Tigranakert be moved from Category:Context labels to Category:Regional context labels?
- Is Template:item really a context label template (it currently belongs to Category:Context labels), or is it an ordinary (non-context-label) template which adds links to "[talk] and [citations]"?
- If a context label template has two categories, a category and its subcategory (e.g. Template:Ijekavian has categories "Context labels" and "Regional context labels"), should only the more specific category be kept?
-- Andrew Krizhanovsky (talk) 08:15, 1 May 2013 (UTC)
- They look regional to me.
- {{item}} is not used in principal namespace. Are you parsing outside principal namespace?
- I personally favor having only the more specific category for any large category of "context labels".
- HTH. DCDuring TALK 12:09, 1 May 2013 (UTC)
Helping parsers and scrapers might be a good reason to explicitly use {{context|something}} or {{label|something}} instead of having an open set of labels {{something}}. Would this be helpful for the parser project? —Michael Z. 2013-05-01 15:47 z
Thank you!
Yes, the parser will use {{context|something}}, but the open set of labels {{something}} will also be parsed.
Templates Karabakh, Kromanti and Tigranakert moved to "Category:Regional context labels".
Dear DCDuring, I didn't catch what "outside principal namespace" means. -- Andrew Krizhanovsky (talk) 18:17, 1 May 2013 (UTC)
- Don’t thank me yet. This is a proposal on the table, but we haven’t moved forward yet.
- DCD wonders if you will parse Appendix:, Wiktionary:, Talk:, or pages in other namespaces. —Michael Z. 2013-05-01 18:45 z
- OK, for now I am parsing only the main namespace. -- Andrew Krizhanovsky (talk) 21:12, 1 May 2013 (UTC)
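To make the difference between the two label styles concrete, here is a rough sketch of how a parser might collect labels from both {{context|something}} and the open set {{something}}, restricted to main-namespace pages. It is an assumption-laden illustration and not Andrew's parser: `KNOWN_LABELS`, `OTHER_NAMESPACES`, `extract_labels` and `in_main_namespace` are hypothetical names, the regular expression ignores nested templates, and the namespace list is incomplete.

```python
import re

# Matches a simple, non-nested template: {{name|arg1|arg2}}.
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+)((?:\|[^{}|]*)*)\}\}")

# Hypothetical subset of the open label set; the full list would have to be
# maintained by hand, which is the burden discussed above.
KNOWN_LABELS = {"skateboarding", "mathematics", "dialectal"}

# Illustrative, incomplete list of prefixes for pages outside the main namespace.
OTHER_NAMESPACES = ("Appendix:", "Wiktionary:", "Talk:", "Template:", "Category:")


def in_main_namespace(title):
    """Treat a title as main-namespace unless it starts with a known prefix."""
    return not title.startswith(OTHER_NAMESPACES)


def extract_labels(wikitext, known_labels=KNOWN_LABELS):
    """Collect context labels from wrapper templates and from bare label templates."""
    labels = []
    for match in TEMPLATE_RE.finditer(wikitext):
        name = match.group(1).strip()
        args = [a.strip() for a in match.group(2).split("|")[1:]]
        if name in ("context", "label"):
            # Explicit wrapper: every positional argument is a label.
            labels.extend(a for a in args if a and "=" not in a)
        elif name in known_labels:
            # Open set: the template name itself is the label.
            labels.append(name)
    return labels


wrapped = extract_labels("# {{context|mathematics|lang=en}} A sum.")
bare = extract_labels("# {{skateboarding}} A trick. {{unknown}}")
```

The wrapper form needs no list of label names, which is the advantage Michael Z. points to; the open form only works for names that are already in the hand-maintained set.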
Someone broke something here. Why is a category "Regional context labels Armenian" appearing in e.g. ճղոպուր (čłopur)? --Vahag (talk) 12:18, 3 May 2013 (UTC)
- It's wiki-magic :( I don't understand how it happens. -- Andrew Krizhanovsky (talk) 08:00, 4 May 2013 (UTC)
Dialects (Context labels)
There are four templates: {{dialect}}, {{dialectal}}, {{dialectal-n}} (not used now!), {{dialects}}. Is there any difference between these templates? -- Andrew Krizhanovsky (talk) 04:30, 2 May 2013 (UTC)
- {{dialect}}, {{dialectal}}, and {{dialects}} all categorise an entry into Category:Language name dialectal terms, but they display different text in contextual descriptions, according to the name of the template. I don't know about {{dialectal-n}}. I'm so meta even this acronym (talk) 12:20, 2 May 2013 (UTC)
- OK. Thank you. -- Andrew Krizhanovsky (talk) 13:19, 2 May 2013 (UTC)
- You're welcome. :-) I'm so meta even this acronym (talk) 21:47, 2 May 2013 (UTC)
{{context-n}} looks like 'context new' to me. It seems to do the same job but without using brackets. Am gonna rfd it. Mglovesfun (talk) 10:45, 3 May 2013 (UTC)
Yorubic (Religion context labels)
There are no entries with religion context label {{Yorubic}}. Should it be kept or deleted? -- Andrew Krizhanovsky (talk) 16:15, 8 May 2013 (UTC)
- Nominated for deletion. Mglovesfun (talk) 15:05, 16 May 2013 (UTC)
board sports vs. skateboarding
Should we merge the two sports context label templates {{board sports}} and {{skateboarding}}? There are only 3 entries with template "board sports" (see list). -- Andrew Krizhanovsky (talk) 07:48, 14 May 2013 (UTC)
- I think so, unless board sports is meant to include surfing and snowboarding. —Michael Z. 2013-05-16 14:46 z
- OK, I see. -- Andrew Krizhanovsky (talk) 10:44, 17 May 2013 (UTC)
video games vs. video game genre
Should we merge {{video games}} and {{video game genre}}? There are only 4 entries with template "video game genre". -- Andrew Krizhanovsky (talk) 08:20, 14 May 2013 (UTC)
Template:mathematics
Should we move {{mathematics}} from Category:"Topical context labels" to Category:"Mathematics context labels"? -- Andrew Krizhanovsky (talk) 07:56, 14 May 2013 (UTC)
- We have never achieved consensus on how to consistently treat such labels. Originally, we only had "context labels" that indicated limited usage contexts. Then we introduced large-scale use of topical context labels without differentiating cleanly the four cases:
- a term-definition is widely used and understood, but clearly has a topic associated, eg, sum has a definition that belongs to the "topic" 'arithmetic'.
- a term-definition has a topic associated, but is only understood and used in a narrow, usually technical context, eg, affine transformation.
- a term is used by a technical community, but the subject matter is not limited to that community. Examples might be military slang for a civilian.
- widespread use, no specific topic. We act as if most definitions are of type 4, without having defined what "no specific topic" might mean. Most function words would be of type 4. Presumably also most basic verbs.
- Topical context labels should apply to types 1 and 2. Usage context labels should apply to types 2 and 3. Ruakh has defended the use of topical context labels, even for words that are widely understood, presumably including sum (type 1). MZajac has advocated more or less banning the use of topical context labels where the topic did not also provide a usage context (ie, type 1). I don't think there is a consensus. One can find dictionaries that seem to follow either. But I have yet to find a print or online dictionary that seems to impose topical labels on all the term-definitions that could "logically" be assigned to a topic.
- If you look at the topical category Category:en:Arithmetic, you will find several terms that are widely understood and used, but not sum which clearly has a definition that belongs in that topic.
- IOW, this is a can of worms. But it can be swept under the carpet again. DCDuring TALK 11:57, 16 May 2013 (UTC)
- I’m not sure I am clear on the four use cases. Is no. 4 the case where no label is normally applied?
- I think I would add one other, perhaps overlapping with any of 1 to 4: 5. a term whose technical meaning is prescribed by some authority and accepted in its field of usage, even though it may not be easily attested by citations. Examples may include technical or legal definitions. —Michael Z. 2013-05-16 17:12 z
- "Is no. 4 the case where no label is normally applied?" Yes, exactly. Does that make the rest of it clearer?
- "meaning is prescribed by some authority and accepted in its field of usage" That's an interesting situation - not uncommon - that adds an additional dimension, ie, another four types. DCDuring TALK 17:20, 16 May 2013 (UTC)
- DCDuring, we're talking about the categorization of the template, not how it is used. Context labels can be divided into subcategories using |tcat=foo. There's really no consensus over whether to do this or not, and pretty much nobody cares. I started doing it a few years ago and realized there are many more important things I could do. Mglovesfun (talk) 10:48, 17 May 2013 (UTC)
Template:game theory
Must we move {{game theory}} from Category:"Games context labels" to Category:"Mathematics context labels"?
See w:Game theory. -- Andrew Krizhanovsky (talk) 08:27, 14 May 2013 (UTC)
Template:element symbol
This template prints the text "(chemistry)". It would be reasonable to move it from Category:Context labels to Category:Science context labels. -- Andrew Krizhanovsky (talk) 10:48, 16 May 2013 (UTC)
- Agreed.
Done —Μετάknowledgediscuss/deeds 02:41, 17 May 2013 (UTC)
Latin
The template {{la-proper noun-indecl}} belongs to Category:Grammatical context labels. I think it is an error, because:
- context labels are not bound to a specific language (Latin here),
- in two entries (Adam#Latin and Abraham#Latin) this template is located in a place that is unusual for context labels.
The template {{la-conj-form-gloss/iacio/context6}} is also a strange context label template, and it is not used in any entries. -- Andrew Krizhanovsky (talk) 11:03, 16 May 2013 (UTC)
- Technically, it is a grammatical label, although usually we build these into our declension templates. A weakness of our whole labelling system is that sometimes a label belongs on the headword, but it is technically and visually awkward to put it there. (Also, “grammatical context label” is a nonsense phrase demonstrating that we should stop calling usage and grammatical labels “context labels.”) —Michael Z. 2013-05-16 15:02 z
- It's because {{indecl}} was called by this template; now it isn't, it uses {{qualifier|indeclinable}}. Mglovesfun (talk) 15:03, 16 May 2013 (UTC)
Japanese Romanization
Although I know this is an old topic and something that had already been discussed and decided on back in 2006, I think there was a MAJOR point that wasn't discussed. Most keyboards (at least in America) do not have keys for the extra symbols used in Hepburn romanization, which would make searching difficult. Thus, it makes more sense for words like どうじょう to be romanized as dojo (which is what most people are accustomed to seeing, as it is the form used in most mainstream publications) and to include doujou in the article (which is how you would input it using romaji in Microsoft IME) as well as dōjō (which I've only seen in Google translate). In addition, it's a hassle for me, and probably for others who work on these entries, to manually input the macrons when creating links to romanizations.
In short, what I propose is that the vowel romanizations be mainstream and keyboard-friendly, with the alternate romanizations mentioned (and even hard redirected from), but not linked to. I know that this would lead to some words linking to the same romanization, but the different words can be distinguished on the romaji page, which would add actual functionality to those pages. As it stands right now, they are simply soft redirection pages that serve no real purpose.
Example:
Romanization
dojo (hiragana: どうじょう)
I know that this would require a LOT of work, but I think it will make the Japanese romanization entries more search-friendly and help to bring purpose to those pages beyond a simple redirect. --Soardra (talk) 20:13, 5 May 2013 (UTC)
- I don’t disagree, but I’d like to point out that searching doesn’t appear to be an issue to me. OMM searching for plain dojo finds macronized dōjō on this page, in the search field’s suggestions, in the search results, and in google. —Michael Z. 2013-05-06 00:14 z
- All or most Roman diacritic symbols are not an issue in the Wiktionary or Google search. Adding additional transliterations would add an overhead on editors but eventually there could be a module or a template that does it automatically. I personally don't see the need to add new outdated transliterations.
- As a side-note, Wiktionary transliteration is slightly tweaked and follows the trend of popular dictionaries and practical needs. (The Hepburn standard has various versions as well.) Combinations like the kana おう ("o + u") are transliterated as "ō" when it's a long sound (お父さん (おとうさん - otōsan)) or "ou" when it's a verb form (思う (おもう - omou)). いい (ii) can be either "ī" or "ii". Adjective endings are always "ii". Other long vowels are consistently romanised with a macron - ā, ē, ū. Particles "は", "へ" and "を" are "wa", "e" and "o", not "ha", "he" and "wo" as when you type them in Microsoft IME. Microsoft IME is only needed when you need to enter Japanese text on a computer. For this you need to know the simple rules for which input corresponds to which kana letter (before converting to kanji), but there are variants, like tu = tsu, hu = fu, etc. --Anatoli (обсудить/вклад) 02:39, 6 May 2013 (UTC)
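The observation above that a search for plain dojo still finds dōjō can be illustrated with a small sketch. This is not how the MediaWiki or Google search is implemented; it only shows, under that assumption, how folding away combining marks makes a macronized title match an unaccented query. The function names `fold_diacritics` and `matches_plain_query` are made up for the illustration.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with combining marks (such as macrons) removed."""
    # NFD splits "ō" into "o" plus U+0304 COMBINING MACRON.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def matches_plain_query(query: str, title: str) -> bool:
    """Hypothetical matcher: compare query and title after folding."""
    return fold_diacritics(query).casefold() == fold_diacritics(title).casefold()


if __name__ == "__main__":
    print(fold_diacritics("dōjō"))
    print(matches_plain_query("dojo", "dōjō"))
    print(matches_plain_query("doujou", "dōjō"))
```

Note that folding only helps with the macron-less spelling: an IME-style spelling such as doujou differs in its letters, not just its diacritics, so it would still need a redirect or a separate mapping.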
- Hmm, I guess I assumed that you wouldn't be able to search for it using a standard keyboard. I see that this assumption was wrong, but I do, however, still think that there should be centralized romanization pages with variations listed or the current romanizations should be hard redirects to hiragana entries since the most recent ruling turned them into redundant soft redirects. --Soardra (talk) 03:23, 6 May 2013 (UTC)
- I see no problem in creating hard redirects from nonstandard romanisations to standard ones (if terms don't exist in other languages). Soft redirects are not redundant, since there are variant Japanese spellings and the romanisations also happen to be words in other languages. Nobody wished to convert them to hard redirects and the new ruling is the result of long discussions and a consensus. You can put your proposal about the romanisation pages in Wiktionary_talk:About_Japanese. The number of links needed to get to full Japanese entries just seems to be growing: "non standard Roman spellings" -> standard romaji -> katakana/hiragana -> kanji (if exists). Do we really need that many steps? JA editors are better off focusing on the Japanese language, not on all possible misspellings (with various numbers of spaces) in a script which is not used by the Japanese. --Anatoli (обсудить/вклад) 04:03, 6 May 2013 (UTC)
- I think there are a couple different things at work here. Some of the alternate spelling conventions that Soardra mentions are non-standard in terms of both “not according to [official] standards” and “not according to Wiktionary’s selected standards”. For instance, spelling Japanese long vowels without indicating the length in some way (such as dojo for 道場 dōjō) is common in writings by folks who 1) aren't that specific and/or knowledgeable about Japanese and 2) can't be bothered to deal with diacritics / spellings. Neither of these considerations are appropriate for a dictionary, and given that vowel length is phonemic in Japanese, we really shouldn't be including such spellings at all, provided that these common spellings can still be used to find the proper entries -- and this does appear to be the case, thankfully.
- Spelling Japanese in Latin letters in a manner similar to input for the Japanese IMEs provided by Microsoft, Apple, and the various Linux communities works fine for input, but is inconsistent with usage by any English-based learning materials I've seen, and with any official governmental romanization scheme, and also with any academic romanization scheme. One might see Japanese romanized in this fashion, and it's common enough that it has its own moniker in Japanese (ワープロ式 wāpuro-shiki, "word-processor style"), but again, I don't think this romanization scheme has any place in a dictionary (other than the term itself).
- In terms of what we use here, that's explained at Wiktionary:About_Japanese/Transliteration, pointed to from WT:AJA#Transliteration. Perhaps the nub of the real issue here is that we don't make WT:AJA prominent enough for new users?
- (Then again, Wiktionary:About_Japanese/Transliteration is rather horribly out of date and does not describe either what we do here or what I've perceived as general practice for Japanese romanization schemes in general -- it is quite badly in need of a rewrite. Examples: we try to avoid hyphenation in most cases, using spaces instead, and we split suru verbs with a space before the suru, among other issues. I'll set to reworking that as time allows.)
- More general background information is available at w:Romanization of Japanese.
- -- Eiríkr Útlendi │ Tala við mig 00:47, 13 May 2013 (UTC)
- @Mzajac. Eirikr has answered in the first paragraph. I have no strong opinions on redirects from non-standard transliterations, but I lean towards not having them.
- @Eirikr. Yes, the page needs rewriting. I actually don't think it's either our practice or other dictionary practice to romanise long vowels as "aa", "ee", or "ii" (unless they are different symbols or part of an inflected adjective form (い-adjectives)). Niigata (not Nīgata) is another notable exception, perhaps for historical reasons; I don't know why 新 in 新潟 is read "nī". Somehow, the exception is made for long "ō" and "ū", which is very strange, even if "ō" is more controversial: おお (oo) vs おう (ou) (with well-known exceptions - verb endings - おもう (omou) or separate stems - こうま (kouma)). I don't remember the outcome of the last discussion but I consistently use macrons, if they are not exceptions as above. I'm sure one of the versions of the Hepburn standard supports this, anyway. We can discuss details on Wiktionary talk:About Japanese/Transliteration --Anatoli (обсудить/вклад) 01:09, 13 May 2013 (UTC)
- @Anatoli, 新 (nī) is an OJP-derived prefix attaching to nouns, meaning "new" or "fresh" or "first", or some variation thereof. The 新#Japanese entry is much in need of expansion. FWIW, I think the romanization Niigata uses the doubled-"i" for historical reasons, making this an(other) exception to the rule. :) -- Eiríkr Útlendi │ Tala við mig 03:40, 13 May 2013 (UTC)
Golin, a Papuan language
Just an observation that gĺ is the only Golin word that we have listed. It might be amusing if it wasn't sad. Pengo (talk) 02:22, 7 May 2013 (UTC)
- Okay? I can't find a list of counts of entries by language, but looking at Category:Nouns by language and Category:All languages, there's no more than 1,500. Given that there's 5,000 living languages, Golin's doing better than many other languages. Moreover, I'm not sure of the value of adding vocabulary that no one is going to look up; the few people who know this language and have Internet access probably own a copy of the existing dictionary, of which we would probably not be more than a pale copy. If that's what you want to do, more power to you, but I don't think departments of anthropology are going to reference Wiktionary.--Prosfilaes (talk) 12:01, 7 May 2013 (UTC)
- I notice that the pronunciation section of nil kabe is n'l kabé. Aren't pronunciations supposed to be in IPA (or SAMPA)? - -sche (discuss) 05:22, 8 May 2013 (UTC)
- Have fixed it up. Not sure if/how the tones should be incorporated into the IPA though, so I've kept it as a separate "tone guide". (source material) Pengo (talk) 07:57, 8 May 2013 (UTC)
Translations lacking transliteration categories
After testing a change to {{t}} (thanks to CodeCat and Yair rand), I have so far created categories for 36 selected languages under Category:Translations which need romanization (each category has a subcategory): Abkhaz, Adyghe, Arabic, Armenian, Bashkir, Belarusian, Bengali, Bulgarian, Burmese, Chechen, Georgian, Greek, Hebrew, Hindi, Japanese, Kannada, Kazakh, Khmer, Korean, Kyrgyz, Lao, Macedonian, Malayalam, Mandarin, Mongolian, Ossetian, Persian, Russian, Sinhalese, Tajik, Tamil, Tatar, Telugu, Thai, Ukrainian, Yiddish. I've taken out Mandarin from the template, since translations into the traditional version are usually not supplied.
New translations into the above languages that use {{t}} without a transliteration are added immediately; any edit to an English entry with translations will also cause its translations to be picked up. It takes some time for the categories to fill with older translations. Please don't add languages you're not going to work with. Adding a new language requires a change to {{t}} and two new categories. Any help in adding transliterations is appreciated. --Anatoli (обсудить/вклад) 00:09, 9 May 2013 (UTC)
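As a rough sketch of the rule described above (not the actual {{t}} code, which is wiki template markup), the check amounts to: if the language is one of the tracked ones and no transliteration was given, add the entry to the tracking category. The language codes listed here are only a small illustrative subset, and the function name `tracking_category` is an assumption; the per-language subcategory names are not given in the post, so the sketch returns only the parent category.

```python
# Illustrative subset of tracked languages (codes assumed for the example).
TRACKED_LANGUAGES = {"ar", "el", "he", "ja", "ko", "ru"}

PARENT_CATEGORY = "Category:Translations which need romanization"


def tracking_category(lang_code, translation, transliteration=None):
    """Return the tracking category for a translation, or None.

    A translation is tracked only when its language is in the tracked set
    and the transliteration is missing or blank.
    """
    has_translit = bool((transliteration or "").strip())
    if lang_code in TRACKED_LANGUAGES and not has_translit:
        return PARENT_CATEGORY
    return None


if __name__ == "__main__":
    print(tracking_category("ru", "собака"))            # tracked, no transliteration
    print(tracking_category("ru", "собака", "sobaka"))  # transliteration supplied
    print(tracking_category("fr", "chien"))             # language not tracked
```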
Special:Contributions/MewBot
CodeCat (talk • contribs) via her bot MewBot is going way too far in my opinion. Violating Wiktionary:Bots#Policy by making controversial edits en masse, such as orphaning {{inv}} before the end of the deletion debate (where it looks very likely to pass as well), removing genders from {{fr-adj-form}} such as this and converting {{m|p}} to {{m-p}} (has this been discussed anywhere or is this a CodeCat personal project?)
I'm well aware of all the good work done by CodeCat, and I understand she has very specific ideas about how she wants Wiktionary to progress, but some of these ideas are controversial and shouldn't be implemented in this way. Mglovesfun (talk) 18:49, 9 May 2013 (UTC)
- About the orphaning of {{inv}}, I kind of did expect someone to speak up about that. But I reasoned, if there had been no deletion debate, I still would have made these changes wherever I found them to bring it in line with Wiktionary practice, and nobody would have minded. I thought, if nobody is going to complain about many incidental edits spread over time, it would be strange if it wasn't also ok to do it all at once. I found it a bit strange that people were saying keep based on the template's widespread usage, which came across as circular reasoning ("we agree with using it, because it's widely used"). I hoped that if it became less widely used, people would judge the template more on its merits and not on its current usage.
- Removing genders from adjective forms was done because the gender information is already in the definition, so duplicating it seemed a bit strange.
- As far as I know, the format of genders was discussed before, in particular in regards to Module:gender and number. I don't remember when exactly but it was shortly after Lua was introduced, and the module has been around for a while now, steadily increasing in usage as more modules make use of it. —CodeCat 19:25, 9 May 2013 (UTC)
- I haven't really kept in touch with policy discussions as of late but I certainly can't remember anything about genders being discussed, and so it should never have been processed by bot.
I think I've stated it before but: we've known each other since 2006, and I know that CodeCat is usually very reasonable, but it seems that there are occasional issues with changes being performed with no consensus behind it. I am confident that CodeCat does not have any bad intent and just needs to be told to take a bit more time with changes. We're not in a hurry or anything. -- Liliana • 21:13, 9 May 2013 (UTC)
First-person singular imperative (Portuguese)
All our Portuguese verb conjugation templates include a first-person singular (affirmative and negative) imperative. I have programmed the bot accordingly. A native Portuguese speaker (ValJor) has pointed out that no such thing exists. This sounds reasonable to me. Shall I modify all the templates (and my bot) accordingly? SemperBlotto (talk) 10:34, 10 May 2013 (UTC)
- Hmm, does Portuguese have a different form for giving oneself orders? I will occasionally shout orders at myself out of frustration using the imperative in English. Does Portuguese have a different form, a non-second person form? Mglovesfun (talk) 10:39, 10 May 2013 (UTC)
- I'm only a pt-1, but if it is similar to Italian, there should not be a first-person singular imperative. I believe that Italians use the third-person when encouraging themselves. SemperBlotto (talk) 10:45, 10 May 2013 (UTC)
- It doesn’t exist. It appears that the person who created {{pt-conj}} invented it. See Talk:cantar, Wiktionary:Beer parlour archive/2012/April#First-person Singular Imperative of Portuguese Verbs and WT:T:APT#Banning first-person imperative.
- MG: Portuguese uses the subjunctive present when giving the first person singular and the third person orders. — Ungoliant (Falai) 12:03, 10 May 2013 (UTC)
- OK. I'll make sure my bot doesn't create any more. Then I'll update the five thousand Portuguese conjugation templates. SemperBlotto (talk) 07:11, 11 May 2013 (UTC)
Module:gender and number
I have worked on this a bit more, and it now supports everything that our templates do, and a bit more as well. This module is already fairly widely used, not just in modules I made, either... others have used it as well. But a few people were wondering about this module and how it works, so I wrote some documentation for it to explain it, and I am now "introducing" it. I think this module can replace the current templates like {{m}}, at least as far as other templates go. If someone writes {{m}} in an entry directly, it can't be replaced, but we could change {{m}} and such themselves so that they use this module rather than the current wiki code.
There is a rather strong point to note, though. There is a slight incompatibility between this module and the templates in the way we have traditionally denoted combinations of gender and number. If you write {{m|p}}, then the templates will "know" not to display a separator between the two, it "knows" that both form a single gender specification. But in the module, m|p means "masculine or plural", and you need to write m-p instead to get the combination. This is done to keep things simpler but it also has another purpose. Gender specifications like m|f|p are ambiguous. Does it mean masculine (singular), feminine (singular) and plural (all genders)? Or does it mean masculine (singular) and feminine plural? Or masculine plural and feminine plural? The older scheme does not distinguish this, while the module does. The difference can be significant as well. Dutch, German and Swedish for example do not have a "masculine plural" or "neuter plural", only a generic plural for all genders, so for them, "masculine or plural" makes sense while "masculine plural" does not. On the other hand, French or Spanish do have "feminine plural", so then "plural" on its own isn't a valid gender. For that reason I created a set of new combined templates like {{m-p}}, and started to add them where appropriate. A few people have complained that this wasn't discussed properly, and I kind of agree, so I am mentioning it here now.—CodeCat 15:41, 10 May 2013 (UTC)
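The distinction described above can be sketched outside the module. This is not the code of Module:gender and number (which is written in Lua); it is a minimal Python illustration, assuming that each item in the list is one alternative and that a hyphen joins a gender and a number into one combined specification. The sets of accepted codes and the function names are assumptions made for the example.

```python
GENDERS = {"m", "f", "n", "c"}   # assumed gender codes
NUMBERS = {"s", "d", "p"}        # assumed number codes


def parse_spec(spec):
    """Parse one specification such as "m", "p" or "m-p" into (gender, number)."""
    gender = None
    number = None
    for part in spec.split("-"):
        if part in GENDERS and gender is None:
            gender = part
        elif part in NUMBERS and number is None:
            number = part
        else:
            raise ValueError(f"unrecognised or repeated part {part!r} in {spec!r}")
    return (gender, number)


def parse_list(specs):
    """Each list item is a separate alternative; the caller does the splitting."""
    return [parse_spec(spec) for spec in specs]


if __name__ == "__main__":
    print(parse_list(["m", "p"]))      # masculine OR plural: two alternatives
    print(parse_list(["m-p"]))         # masculine plural: one combined spec
    print(parse_list(["m-p", "f-p"]))  # masculine plural OR feminine plural
```

Under this reading, the ambiguous m|f|p of the older scheme has to be written out as whichever list is actually meant, for example ["m", "f", "p"] or ["m-p", "f-p"].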
- Neat. While making perfect sense within our template system, it also manages to elegantly use the vertical bar symbol (|) here with the meaning familiar to programmers of "or" (e.g. m|p means "masculine or plural") Pengo (talk) 22:02, 11 May 2013 (UTC)
- Oh, that was actually my own shorthand. The module itself is indifferent to the method of separating the individual genders from each other, because it receives them already split, in the form of a list. The module can also be invoked from a template, like this:
{{#invoke:gender and number|show_list|m|f}}
- This will display m, f currently. This means that any code that uses this module will have to perform the split itself. This is intentional, because there is currently a wide variety of different ways to specify multiple genders. {{head}}, {{l}} and {{nl-noun}} use a g2= parameter, {{t}} uses additional unnamed parameters, while {{fr-noun}} uses mf and then "interprets this" accordingly. It would actually not be a very good idea to use "|" as the separator to separate multiple specifications in a single string, because that would interfere with how templates interpret that character and you'd end up having to use {{!}} all the time. If we do decide to use single strings for multiple genders in this module, a comma would probably be a better choice. —CodeCat 22:29, 11 May 2013 (UTC)
- If no one objects, I would like to replace the remaining occurrences of {{f|p}} and such with {{f-p}}, so that we can then look at migrating our templates to this module. —CodeCat 10:32, 15 May 2013 (UTC)
Using talk pages for RFV, RFD, Etymology Scriptorium and Tea Room
I realise that we've tried this before, but I'm not sure why it failed exactly. What I also wonder is why it seems to work better on Wikipedia. Keeping the discussions on the talk pages would have several advantages:
- Things are kept in the place where they are the most relevant.
- The discussions wouldn't be forgotten or missed once they are no longer at the bottom of the page.
- Archiving becomes much easier (which is what I like about Wikipedia's method).
I am wondering what exactly would be needed to make this work. The major downside of talk pages is that any edits to them go unnoticed by the large majority of editors, so keeping things in a centralised place would be good. That's why we have the discussion rooms. Doesn't Wikipedia have bots that automatically add new discussions to the list? —CodeCat 13:13, 12 May 2013 (UTC)
WT:Families
Hey all, just thought I'd make it more publicly known by posting here that I made a few changes to this Families page. It's nothing controversial I hope; the change I am highlighting can be seen here. It's mainly to go with the fact that I made {{etyl:ngf-sbh}} to replace {{etyl:South Bird's Head}}. User: PalkiaX50 talk to meh 14:30, 12 May 2013 (UTC)
- I'm glad we have a 'regularly-formed exceptional code' for South Bird's Head now, if that's not too much of an oxymoron. :) I've moved the explanation of how to create codes for subfamilies whose superfamilies have codes (and the example, ngf-sbh) into the previous paragraph. I meant to include such a line there when I overhauled the page last year, but as you can tell, I forgot (or did a bad job of it); that's why "For example, the Pama-Nyungan family is aus-pam: "aus" is the ISO 639-5 code for Australian languages" was sitting around after the bit about Germanic for no apparent reason, lol. - -sche (discuss) 17:40, 12 May 2013 (UTC)
- Cool, thanks for that. User: PalkiaX50 talk to meh 18:49, 12 May 2013 (UTC)
Is this code used in HTML lang attributes? If we are just making up language codes, then let’s make up ones that won’t break our web pages.
HTML5 requires a lang attribute to contain a valid language code.[26] ngf-sbh is not valid. ngf-x-sbh would be a valid language tag, as a private-use extension.[27] —Michael Z. 2013-05-16 02:35 z
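The rewriting suggested above, from ngf-sbh to ngf-x-sbh, is mechanical and can be sketched as follows. The sketch assumes the made-up code has the form base-suffix and relies only on the BCP 47 rule that private-use subtags follow the singleton x and are 1 to 8 alphanumeric characters each. It does not check the base code against the language subtag registry, and the function name `to_private_use` is hypothetical.

```python
import re

# A private-use subtag is 1 to 8 ASCII letters or digits.
_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def to_private_use(code: str) -> str:
    """Rewrite a made-up code like "ngf-sbh" as the private-use tag "ngf-x-sbh"."""
    base, sep, suffix = code.partition("-")
    if not sep or not suffix:
        raise ValueError(f"{code!r} has no suffix to move into private use")
    parts = suffix.split("-")
    if not all(_SUBTAG.fullmatch(part) for part in parts):
        raise ValueError(f"{code!r} contains a subtag that is not 1-8 alphanumerics")
    lowered = [part.lower() for part in parts]
    if lowered[0] == "x":
        # Already in private-use form; only normalise the case.
        return "-".join([base.lower()] + lowered)
    return "-".join([base.lower(), "x"] + lowered)


if __name__ == "__main__":
    print(to_private_use("ngf-sbh"))
    print(to_private_use("aus-pam"))
```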
map of American English dialects
Those interested in American English dialects may find this large, detailed map of dialects and their features interesting: [28]. - -sche (discuss) 19:17, 12 May 2013 (UTC)
- Thanks. It seems quite good. There's still a bit more to do. For example, I think that Chicago has a dialect distinct from its surrounding communities, just as Pittsburgh, New York, New Orleans, Cincinnati, and San Francisco do. DCDuring TALK 21:03, 12 May 2013 (UTC)
- I have a challenge to anyone I meet online for them to guess where I grew up based on my accent (hint: it's in the United States). I will give narrow IPA transcriptions to the best of my abilities for any words requested, answer vocabulary questions, and if necessary, record audio. Anyone who wants to try can have a crack at it on my talkpage or by emailing me! —Μετάknowledgediscuss/deeds 21:54, 12 May 2013 (UTC)
- I find it interesting that the map indicates no cot-caught merger in Texas, while w:Texan English says (attributed to what is apparently a reliable source) "The cot-caught merger is found almost everywhere in Texas." Who to believe? —Angr 22:10, 12 May 2013 (UTC)
- UPenn has a map of just that merger, which seems to suggest the two words are distinct in southern Texas, merged in northern Texas (with a split similar to that which Aschmann marks between Inland and Lowland). The merger has also spread over time, so it's possible the different information comes from different times. - -sche (discuss) 02:42, 13 May 2013 (UTC)
small template idea
Would it be acceptable to make any declension templates for displaying definite and indefinite articles with nouns? It is probably better suited for more inflected nouns, though. I bring this up because the German Wiktionary has something like this, for any languages that have articles. --Æ&Œ (talk) 20:26, 15 May 2013 (UTC)
- Can you make an example of what you want? Like a table or something. — Ungoliant (Falai) 03:22, 16 May 2013 (UTC)
Standard spelling of
I noticed something on [[licence]] which I think could be used more widely when handling US/UK/India/etc spellings: the use of Standard spelling of rather than Alternative spelling of. Obviously, a dedicated template would be preferable to the {{form of|...}} that licence uses at the moment, but what do you think of the general idea?
{{alternative spelling of}} would still be used when spellings are equally standard within the same dialect(s), e.g. aarrghh vs argh. {{standard spelling of}} would only be used in entries like disfavor, which is not merely an "alternative" to [[disfavour]] that some people in the US use, but the standard US spelling. In those entries, "standard" would be more accurate and less likely to be misinterpreted — as someone commented earlier, we don't mean "alternative" as a value judgement, but some people perceive it as one, and either (as non-native speakers) go away thinking the lemma is preferred or (as native speakers) get upset that their variant has been "slighted".
We could even use parameters like {{standard spelling of|foo|in=US|in2=Australian}}, rather than context labels, to effect display of Standard US and Australian spelling of and sort entries into Category:American English standard forms (or just Category:American English) etc, with reciprocal qualifier-like templates on the lemmata—like {{British spelling}}, except displaying (British spelling) rather than just (British) so as not to imply the sense that followed was what was restricted to the UK—to sort the lemmata into Category:British English standard forms/Category:British English.
For sets of spellings in which one entry has already been lemmatised (e.g. disfavour, disfavor), we should keep the status quo; for sets where there isn't a lemma yet and content is currently duplicated (color, colour), we could make the oldest entry the lemma, rather like WP does.
Thoughts? - -sche (discuss) 21:45, 17 May 2013 (UTC)
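To make the proposed display concrete, here is a minimal sketch of how the text for a call such as {{standard spelling of|foo|in=US|in2=Australian}} might be built. No such template exists yet, so everything here is an assumption: the function name, the handling of more than two regions, and the fallback when no region is given.

```python
def standard_spelling_label(term, regions):
    """Build the display text for a hypothetical {{standard spelling of}} call.

    `regions` holds the values of in=, in2=, ... in order.
    """
    if not regions:
        return f"Standard spelling of {term}"
    if len(regions) == 1:
        joined = regions[0]
    else:
        # "US and Australian", or "US, Canadian and Australian" for three.
        joined = ", ".join(regions[:-1]) + " and " + regions[-1]
    return f"Standard {joined} spelling of {term}"


if __name__ == "__main__":
    print(standard_spelling_label("foo", ["US", "Australian"]))
    print(standard_spelling_label("disfavour", ["US"]))
```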
- We wouldn't need this so much if we marked all the spellings that are less common currently so that we could leave the standard one unmarked. But that is completely unrealistic, at least for the next six months. Thus, if we can improve a set of entries that are alternative spellings of a single underlying term by marking the standard one using {{standard spelling}}, we should do so.
- I can't support categories at this time, because they would be completely misleading for at least a "six-month" transition period until all English spellings were properly marked. We would only have to mark about 2,500-3,000 a day to get this done in six months for English and about 16-19,000 a day to get this done for all languages. DCDuring TALK 23:48, 17 May 2013 (UTC)
- DCDuring, it seems like you've completely misunderstood the entire post. Please read it through again. —Μετάknowledgediscuss/deeds 01:48, 18 May 2013 (UTC)
- No, thanks. DCDuring TALK 02:12, 18 May 2013 (UTC)
- OK then. Assuming that I am the one who has completely misunderstood, care to explain what I got wrong? —Μετάknowledgediscuss/deeds 04:20, 19 May 2013 (UTC)
- @Metaknowledge. How could I know that?
- Perhaps I should have asked more questions about the proposal. Most of my concerns were with the categorization, which would have to be either complete or very well explained to be useful. As I am very skeptical about users reading and understanding our categorization criteria, which are rarely (never?) documented, at best subjective and unsubstantiated, and often whimsical, I focused on completeness.
- Though you didn't say why, you also expressed opposition to categorization. DCDuring TALK 12:32, 19 May 2013 (UTC)
- @-sche: I strongly support all of it except creating new categories. —Μετάknowledgediscuss/Special:Contributions/Metaknowledge 01:48, 18 May 2013 (UTC)
- AFAICT, you both understood the proposal.(?) I've started switching a small number of {{alternative form of}}s to Standard form (pending the creation of a dedicated {{standard form of}} template). I won't create new categories. We could still use the existing categories (Category:American English, etc), but I won't do that without further discussion. - -sche (discuss) 15:51, 19 May 2013 (UTC)
Migrating towards Module:languages
By way of experiment, I have changed {{languagex}}, {{derivcatboiler}} and a few other templates to use Module:language utilities (which is a "gateway" to Module:languages) instead of the traditional language code templates. From what I've seen, this move hasn't broken any more than a handful of pages (which I fixed), and it seems like it was rather easy. I noticed that some of our current templates have already become completely orphaned through this change, including most (if not all) of the proto-language code templates. I expect that the /family subtemplates will also end up orphaned once the software has worked its way through the queue.
So I would like to ask if it's ok to continue with the migration, by changing all remaining uses of the language code templates and their subpages to use the Lua module instead. —CodeCat 19:31, 18 May 2013 (UTC)
- I don't think the module has been updated to reflect the changes in the mean time to various /script and /family subtemplates. Unless I'm wrong, can you deal with that first? —Μετάknowledgediscuss/deeds 04:22, 19 May 2013 (UTC)
- I fixed those yesterday. I edited {{langt}} to check whether the two match, and add the code template to Category:Language codes with desynchronized data if they do not. I don't know how often that category updates, so it's possible that more changes were made in the meantime that are not shown in the category. I suppose that's one reason why we should do this sooner rather than later. —CodeCat 11:48, 19 May 2013 (UTC)
- I've been working on deleting the /names subtemplates. They were never really used for anything to begin with. For the next step I would like to orphan and delete the /family subtemplates. This will be more work because there are a lot more of them, and there is a chance that some of them are still being transcluded. So I propose the following "plan":
- Edit {{langt}} to categorise all language code templates that currently have a /family subtemplate into Category:language code templates with family.
- Go over each of those templates with a bot, checking the /family subtemplate for transclusions. If there are no transclusions, replace the contents of the /family template with [[Category:family subtemplates to be deleted]]. That category will then need to be emptied out by some means.
- If any codes remain in Category:language code templates with family after this is complete, those need to be orphaned manually.
- Is this ok? —CodeCat 14:28, 21 May 2013 (UTC)
- You are preserving the contents of the templates' /names pages in the module, yes? Some of them were used / contained content. - -sche (discuss) 17:22, 21 May 2013 (UTC)
- Yes, their contents were moved over to the module so the names were not lost. I actually think that we can use that information in the future to make things like {{langrev}} automatically generated. It would also be useful to add it to the category of each language. —CodeCat 18:05, 21 May 2013 (UTC)
- I updated Wiktionary:Languages to use the names from the module now. —CodeCat 18:11, 21 May 2013 (UTC)
- Just let me know whenever it becomes time to start deleting the language templates themselves ({{aaa}}, etc). For one thing, a lot of direct uses (i.e., "Ghotuo") will have to be modified. For another, I've put a lot of effort into making sure the information in the templates is up-to-date, whereas I've noticed places where the module is not up to date, so I volunteer to delete the templates by hand after cross-checking them and the module against each other, as described on your talk page. - -sche (discuss) 17:22, 21 May 2013 (UTC)
- As I noted above they were cross-checked a few days ago and the module was updated to match the templates. But if someone makes edits to the templates now, the module won't be affected of course. So for now we need to check edits to the Template: namespace regularly to see if anyone made any changes. The names and family templates are probably not used anywhere anymore, but the main (name) and the script still are, so they need to be kept synchronised. I'm not sure what to do with the direct uses. When we eventually get around to it, we can delete the templates that are orphaned, and change the remainder so that they "forward" the call to the module. That would give us some more time to work on them without having to worry about synchronisation issues. —CodeCat 18:05, 21 May 2013 (UTC)
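For illustration only, the decision step of the /family clean-up plan proposed above might be sketched in Python roughly as follows. This is not the actual bot: the class and function names are hypothetical, and the transclusion counts are assumed to have been fetched from the wiki by some other means that is not shown here.

```python
from dataclasses import dataclass

# Marker text taken from the plan above; everything else here is a hypothetical sketch.
DELETION_MARKER = "[[Category:family subtemplates to be deleted]]"


@dataclass
class FamilySubtemplate:
    code: str           # language code, e.g. "aaa"
    transclusions: int  # pages still transcluding Template:<code>/family (assumed input)


def plan_family_cleanup(subtemplates):
    """Return (pages to overwrite with the deletion marker, pages needing manual orphaning)."""
    to_tag = {}
    needs_manual = []
    for sub in subtemplates:
        title = f"Template:{sub.code}/family"
        if sub.transclusions == 0:
            # Nothing uses it any more: safe to tag for deletion.
            to_tag[title] = DELETION_MARKER
        else:
            # Still transcluded somewhere: leave it for a human to orphan.
            needs_manual.append(title)
    return to_tag, needs_manual
```

Keeping the decision separate from any page-saving code means the list of affected templates can be reviewed before a single edit is made.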
Category:Asturian verb forms
Are these valid? Do we assume them to be valid? Made by a banned user using an illegal bot, but if they're valid I guess we can't delete them no matter who created them. Mglovesfun (talk) 10:10, 19 May 2013 (UTC)
Category:en:Latvian demonyms
I've just created this category (actually, I meant to work on its Latvian counterpart Category:lv:Latvian demonyms, but I tend to create an English equivalent for a category when I see there isn't one yet), which made me have a doubt about demonyms. Is the term supposed to cover only words that refer to a person born in a specific place -- i.e., only nouns -- or also adjectives that can refer to the place, or to people who were born there? In other words, should only Courlander be placed in Category:en:Latvian demonyms, or do Curonian, Courlandish and Courish also belong there? (Of course, there is also the derived question of whether it is a good idea to subclassify demonyms by larger areas -- 'French demonyms', 'Russian demonyms', 'American demonyms', etc. -- and whether the names of these categories should be simply in the form 'Geographic Adjective + demonyms', or 'demonyms in + Geographic Noun'). --Pereru (talk) 12:08, 19 May 2013 (UTC)
- I dunno. Maybe these could be added to Category:en:Latvia instead of its own category. — Ungoliant (Falai) 21:21, 19 May 2013 (UTC)
Do we really need horizontal rules between language sections?
Our standard practice has always been to add ---- right above a language header. But I don't really understand why. I can imagine that people did it because they liked the visual appearance of the extra line above the header. But it is not really necessary (or desirable) to do it that way; a better way would be to add a top border to h2 through CSS. So should we abandon this practice, or is there another reason? —CodeCat 15:17, 19 May 2013 (UTC)
- Yes, it was added for the visual effect, to differentiate the language headers clearly from POS and other L3 headers. In November of 2005, somebody suggested that it should be handled automatically through CSS rather than manually, and some editors began removing all instances of it. After a few hundred pages had ---- removed, we put a stop to the effort until someone could get CSS to handle the task correctly. Unfortunately, after all these years I just do not remember the details anymore, but I recall that no one was able to figure out how to do it via CSS. I believe that our HTML experts of that time concluded that it could not be done in CSS (but I don’t remember the reasons or anything like that). So we gave up on this idea and reverted all of the pages that had been changed. —Stephen (Talk) 16:02, 19 May 2013 (UTC)
- Yes. But I'm pretty sure we don't need a blank line above and below it. SemperBlotto (talk) 16:06, 19 May 2013 (UTC)
- We seem to use H2 for other things (e.g. "Latest revision" message), so apparently a class would be needed on the "----"-generated H2s. I bet some bots rely on finding the "----", too. Equinox ◑ 16:11, 19 May 2013 (UTC)
- How would you remove the line from the first section of the page? What about pages with only 1 language section? DTLHS (talk) 16:13, 19 May 2013 (UTC)
Since each h2 also has a gray rule below, this is not an ideal visual-design solution to separate it from the content above.
But, off the top of my head:
/* dispense with hr for rules above h2 */
body.ns-0 #mw-content-text > h2 { border-top: 3px double #aaa; margin-top: 2em; } /* direct-child h2’s of the content in the main namespace */
body.ns-0 #mw-content-text > h2:first-child { border-top: none; margin-top: 0; } /* but not the one at the top */
body.ns-0 #mw-content-text > #toc + h2 {border-top: none; margin-top: 0; } /* and if the TOC appears, not the first one after the TOC */
body.ns-0 #mw-content-text > hr { display: none; } /* hide now-redundant hr’s */
The immediate-child > selector ensures that none of this is rendered in MSIE 6 or earlier. Adjusting the margin-top and padding-top of the h2 might improve the visual separation.
Untested. Should be tested with both TOC shown and hidden, because the TOC contains another h2. Probably needs testing in MSIE 7, because I’m not sure that browser has proper support for these selectors. —Michael Z. 2013-05-19 19:14 z
- After a quick test, it seems to work as expected in Safari/Mac. Put the code above in your vector.css to try it out.
- Sure, if no one can think of any disadvantages.
- The hr’s are being used as presentational elements, and with this CSS we can obviate the requirement for manual work and its inevitably inconsistent results, and reduce wikitext clutter. Hr is properly a “paragraph-level thematic break,”[30] so this is not good usage. The spec adds “There is no need for an hr element between the sections themselves, since the section elements and the h1 elements imply thematic changes themselves.” And it could fool the makers of bots and scrapers into thinking they can determine page structure from it, as you mention below. —Michael Z. 2013-05-19 21:45 z
- Even if the actual wikicode for the horizontal line isn't necessary for presentation, it's still a lot easier to parse ---- than to use a regex to match ==(langname)==. Just something to consider. DTLHS (talk) 20:53, 19 May 2013 (UTC)
- I'm aware of that, but a bot really shouldn't be dividing sections based on the presence or absence of ----. After all, an entry may occasionally be missing it, and we don't want that to cause the bot to break things or make bad edits. The ---- is just a convenience but it can never be relied on, the headers are what counts. Furthermore, if a bot does want to divide a page into sections, then it will surely need to know the name of each section. It would be rather pointless otherwise. So even if a bot uses ---- to split the sections, it will need to parse the header anyway to find out what the name of the language is. Having said that, it really isn't all that hard to just parse the headers. I recently made MewBot do it and it was rather easy. —CodeCat 21:02, 19 May 2013 (UTC)
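As a rough illustration of the point about headers, a bot could split an entry on its level-2 headers and simply ignore any ---- lines, along the lines of the Python sketch below. This is not MewBot's code; the function name and regular expressions are assumptions made for the example, and anything before the first language header is deliberately dropped.

```python
import re

# Level-2 header with exactly two '=' on each side, e.g. "==English==".
L2_HEADER = re.compile(r"^==\s*([^=]+?)\s*==\s*$")
# A horizontal rule line: four or more hyphens on their own.
RULE = re.compile(r"^-{4,}\s*$")


def split_language_sections(wikitext):
    """Map each language name to its section text, relying only on the headers."""
    sections = {}
    current = None
    for line in wikitext.splitlines():
        match = L2_HEADER.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None and not RULE.match(line):
            # Rules are skipped, so their presence or absence makes no difference.
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


entry = "==English==\n===Noun===\n# a word\n\n----\n\n==French==\n===Noun===\n# un mot\n"
sections = split_language_sections(entry)
```

The same result comes back whether or not the ---- is present, which is the behaviour a bot would want if the rules were ever dropped from the wikitext.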
- I recommend using ~ instead of :first-child, as far more browsers support it. That would be:
/* dispense with hr for rules above h2 */
body.ns-0 #mw-content-text > h2 ~ h2 { border-top: 3px double #aaa; margin-top: 2em; } /* siblings of the first h2 of the content in the main namespace */
body.ns-0 #mw-content-text > hr { display: none; } /* hide now-redundant hr’s */
We really need to add back glosses to pinyin entries
I never do anything with Chinese so I'm not normally affected by the decisions made by its editors, but this is an exception. I wanted to find out what "huo long" meant. No tone marks, just that. So how do I find out what that means? The entries huo and long show a few possibilities for tones, so I choose one and then I'm presented with even more possibilities. It's just too much work to look through them all. In the end, the most helpful thing was when I did a full search for "huo long" and found out that huǒ means fire, which is the most likely meaning given the context. But what about the second word? In the old situation, the search would have been limited to 4-5 entries - one for each possible tone. That's still doable. But now I have to look through dozens of entries, which is tedious and I just thought "I'm not going to bother, this does not work for me". This is a really bad usability issue. We've already had several people complain on the feedback page that the new entry format was useless and that they want the old format back. And I definitely agree. —CodeCat 22:13, 19 May 2013 (UTC)
- 火龙 means fire dragon, I believe. I agree, we've gotten so many complaints from IPs on various talkpages and on the Feedback page that it looks like we're really making a mistake. —Μετάknowledgediscuss/deeds 22:19, 19 May 2013 (UTC)
- Oppose. This is an undue burden on Chinese editors. Anything that puts pressure on us to develop a better way of managing glosses across multiple pages is IMO a good thing. DTLHS (talk) 22:26, 19 May 2013 (UTC)
- Somewhere it says that we favour readers over editors.
- But could something like the following work?: For each Han character (e.g., 火), put the gloss text in 火/gloss. In the romanization entry (huǒ), have {{pinyin reading of|火}} link to the character, but additionally show the text of character/gloss – and in huǒ/gloss aggregate all of the romanized characters’ glosses. In the ambiguous diacritic-less form (huo), aggregate all of the romanizations’ glosses. It would still be more work, but it would eliminate error and duplication if each gloss were typed in only one place. —Michael Z. 2013-05-20 01:42 z
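To make the aggregation idea above concrete, here is a small Python sketch under stated assumptions: each gloss is stored once per character, and the toned and toneless views are derived from it. The function names and the three lookup tables are hypothetical, and the two sample entries are illustrative only, not a proposal for real gloss text.

```python
def glosses_for_pinyin(pinyin, pinyin_to_chars, char_gloss):
    """Glosses of every character read as the given toned pinyin (e.g. "huǒ")."""
    return {
        char: char_gloss[char]
        for char in pinyin_to_chars.get(pinyin, [])
        if char in char_gloss  # characters without a gloss yet are skipped
    }


def glosses_for_toneless(toneless, toneless_to_pinyin, pinyin_to_chars, char_gloss):
    """Glosses for a diacritic-less form (e.g. "huo"), grouped by toned reading."""
    return {
        pinyin: glosses_for_pinyin(pinyin, pinyin_to_chars, char_gloss)
        for pinyin in toneless_to_pinyin.get(toneless, [])
    }


# Illustrative sample data: each gloss is typed in exactly one place.
char_gloss = {"火": "fire", "或": "or"}
pinyin_to_chars = {"huǒ": ["火"], "huò": ["或"]}
toneless_to_pinyin = {"huo": ["huǒ", "huò"]}

toned_view = glosses_for_pinyin("huǒ", pinyin_to_chars, char_gloss)
toneless_view = glosses_for_toneless("huo", toneless_to_pinyin, pinyin_to_chars, char_gloss)
```

Because both views are computed from the per-character table, correcting a gloss once would update the toned and toneless entries together.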
- Toneless pinyin should not be used for anything, unless they have become English loanwords. There are just too many tone combinations. Monosyllabic pinyin is used for disambiguation. I know the IP guy - a longtime user. (He/she uses multiple IP addresses but that may be related to his work, home, ipad, whatever IP). He works with Vietnamese and Mandarin. See also Talk:ya3 --Anatoli (обсудить/вклад) 01:45, 20 May 2013 (UTC)
- ^This this this...it's an extra click on from the pinyin entry to get to the real info, so what? Also, IMO nonstandard pinyin entries should not be made, with the exception of the basic "syllables" I guess. User: PalkiaX50 talk to meh 01:57, 20 May 2013 (UTC)
- It's not one click. It's as many clicks as there are Han entries for a single pinyin entry, which in the case of huǒ is six, but for lóng it's 47. Do you really expect users to look through all 47 of them to find the right one? I certainly gave up when I saw that long list. It's even worse for the many users who don't even know the tone, because then they have to look through all the tones' pinyin entries as well and it multiplies. —CodeCat 02:00, 20 May 2013 (UTC)
- It takes real effort to find a Chinese word matching toneless pinyin. If a word exists, then it's easy to find by pinyin. "huolong" would also yield "huǒlóng" in the search window. Even with the same tones, "huǒlóng" may mean not only "fire dragon" but also 火笼 (fire cage) or 或隆 (or Long (name)). --Anatoli (обсудить/вклад) 02:13, 20 May 2013 (UTC)
- The answered complaints are at Wiktionary:Feedback#jin4 and Wiktionary:Feedback#li4. I'm sure they are from the same person. I don't think you can get an accurate analysis from a CEDict dump, which was done once many years ago. Single-character definitions exist in the translingual sections; the Mandarin ones are badly behind and many have no definitions. --Anatoli (обсудить/вклад) 02:07, 20 May 2013 (UTC)
- I've given it some thought. Given that editors may not catch up with single-character entries quickly enough, perhaps we could revert the edits of User:MglovesfunBot on monosyllabic toned pinyin entries if they are in demand, like yǎ#Mandarin? I would make entries like ya3#Mandarin redirects to yǎ#Mandarin. The translations need to be used with care: a character translation is not the same as a word translation in Chinese. There are many specifically Japanese characters, hanzi that are only used in combinations, and purely phonetic hanzi whose "definitions" are hardly used in real Chinese. Still, they are perhaps 95% right and may give an idea of the meaning. --Anatoli (обсудить/вклад) 02:56, 21 May 2013 (UTC)
- Bot-made mass edits like diff, removing glosses from Pinyin entries, have made Wiktionary less usable for readers, for bad reasons, IMHO anyway. Wiktionary would be better off without these edits. --Dan Polansky (talk) 19:46, 21 May 2013 (UTC)
Calques in "derived from" categories
As a result of a recent edit, I noticed that a term that I created, uncanny valley, is categorized (through use of the etyl template) into "English terms derived from Japanese." I'm a little bit uneasy with the idea of a calque like this one being listed as "derived from" the originating language (it seems a little misleading to me), so I thought I'd see if I could find a more official thought on it. —Dajagr (talk) 02:35, 20 May 2013 (UTC)
- Sorry, I disagree. If the Japanese term is a calque from English uncanny valley then I think it is, in a way, derived from English. — Ungoliant (Falai) 21:08, 21 May 2013 (UTC)
Tech newsletter: Subscribe to receive the next editions
- Recent software changes
- (Not all changes will affect you.)
- The latest version of MediaWiki (version 1.22/wmf4) was added to non-Wikipedia wikis on May 13, and to the English Wikipedia (with a Wikidata software update) on May 20. It will be updated on all other Wikipedia sites on May 22. [31] [32]
- A software update will perhaps result in temporary issues with images. Please report any problems you notice. [33]
- MediaWiki recognizes links in twelve new schemes. Users can now link to SSH, XMPP and Bitcoin directly from wikicode. [34]
- VisualEditor was added to all content namespaces on mediawiki.org on May 20. [35]
- A new extension ("TemplateData") was added to all Wikipedia sites on May 20. It will allow a future version of VisualEditor to edit templates. [36]
- New sites: Greek Wikivoyage and Venetian Wiktionary joined the Wikimedia family last week; the total number of project wikis is now 794. [37] [38]
- The logo of 18 Wikipedias was changed to version 2.0 in a third group of updates. [39]
- The UploadWizard on Commons now shows links to the old upload form in 55 languages (bug 33513). [40]
- Future software changes
- The next version of MediaWiki (version 1.22/wmf5) will be added to Wikimedia sites starting on May 27. [41]
- An updated version of Notifications, with new features and fewer bugs, will be added to the English Wikipedia on May 23. [42]
- The final version of the "single user login" (which allows people to use the same username on different Wikimedia wikis) is moved to August 2013. The software will automatically rename some usernames. [43]
- A new discussion system for MediaWiki, called "Flow", is under development. Wikimedia designers need your help to inform other users, test the prototype and discuss the interface. [44].
- The Wikimedia Foundation is hiring people to act as links between software developers and users for VisualEditor. [45]
If you want to continue to receive the next issues every week, please subscribe to the newsletter. You can subscribe your personal talk page and a community page like this one. The newsletter can be translated into your language.
You can also become a tech ambassador, help us write the next newsletter and tell us what to improve. Your feedback is greatly appreciated. guillom 20:28, 20 May 2013 (UTC)
Monkey business in biological class entries.
I have noticed that our entries for fish, reptile, amphibian, etc., all include a collection of pictures of animals from that biological class, but each also contains a picture of a chimpanzee. I understand that the intent is to convey that all of these things are animals, but it seems jarring and unnecessary. bd2412 T 01:55, 21 May 2013 (UTC)
- This is part of the Wiktionary "picture book" project that somebody started (and, I believe, abandoned) several years ago. I think it was intended to be hierarchical, with trees of related pictures. It is probably a dead thing. Equinox ◑ 02:53, 21 May 2013 (UTC)
- Easy to find all the offenders. Looks like an AWB job to dispose of them, but some entries (like szympans) are using it correctly, so it'll have to be a process with human supervision. —Μετάknowledgediscuss/deeds 02:59, 21 May 2013 (UTC)