Wiktionary:Beer parlour
Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.
Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.
Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!
April 2013
Idea for proper noun entries that belong in an encyclopedia
There seem to be a lot of proper nouns that show up on WT:RFD. Many of these have articles in the EN WP. Since people are clearly looking for these entries, and some editors mistakenly think such entries belong here, while some readers mistakenly think they can find those entries here, it's clear there's some demand for having proper noun entries here at EN WT.
What would folks say to allowing the creation of proper noun entries, such as Mona Lisa or Mini Cooper or Hound of the Baskervilles, but just as redirects (soft or hard, as deemed appropriate) to the corresponding EN WP article? This would meet the apparent demand for such entries, while not wasting EN WT editor time writing and maintaining them, and while avoiding the inclusion of encyclopedic material in this dictionary project. -- Eiríkr Útlendi │ Tala við mig 17:06, 2 April 2013 (UTC)
- I don't think hard redirects to Wikipedia are even possible; they'd have to be soft. Wikipedia itself already has w:Template:Wiktionary redirect for pages that will only ever be dictionary entries; all we need to do is make a corresponding template here. —Angr 17:32, 2 April 2013 (UTC)
- Sounds good to me. I see, however, that Semper Blotto deleted Template:Wikipedia redirect way back in 2006... -- Eiríkr Útlendi │ Tala við mig 17:35, 2 April 2013 (UTC)
- I don't see why this is a good idea. How is it better than just not having the entries at all? How do we decide which entries need {{only in|{{in wikipedia}}}} and which are red links? Or do we create such redirects for all entry titles which have Wikipedia articles? Mglovesfun (talk) 17:38, 2 April 2013 (UTC)
- It seems fine to create {{only in}} redirects to WP for all proper nouns (Why not all entries of any kind?) for which we do not have an entry. Editors can replace the redirect with an entry, which is subject to the usual reviews. At the very least we should use the redirects for proper noun entries that have failed RfD for whatever reason. DCDuring TALK 17:58, 2 April 2013 (UTC)
- Sorry, I thought my initial comment explains the "why" -- users, both as editors and as readers, are clearly coming to Wiktionary in search of such entries.
- As to which entries to convert, any proper noun entry that editors think should not be in Wiktionary would be a candidate for such redirection. If deemed necessary for clarity, the redirection template could include text explaining that Wikipedia might not yet have such an article, but that if anyone were to create such an article, it belongs in Wikipedia and not here.
- I'm simply floating an idea about how to respond to apparent user demand for encyclopedic proper noun entries in a way that 1) meets that demand, 2) points users to the appropriate place for such entries, and 3) doesn't require much work from editors. -- Eiríkr Útlendi │ Tala við mig 17:59, 2 April 2013 (UTC)
A good idea, but there already is a page that comes up when someone goes to an undefined proper noun. See, for example Starry Night, Mini Cooper S, or A Study in Scarlet. It just doesn’t serve the required needs. The page that comes up for starry night, mini cooper s, or a study in scarlet is a bit better, but still could be improved.
I wonder if there is a way to improve “perhaps there is a page xxx in our sister encyclopedia project, Wikipedia.”
Anyway, let’s improve the 404 page instead of reinventing the wheel. —Michael Z. 2013-04-02 18:18 z
- The fact that people are searching for things doesn't mean we should include them, even as redirects to Wikipedia. The number one search on a user-generated replacement for Special:WantedPages was in fact the Mandarin for 'naked porno movies'. Mglovesfun (talk) 20:59, 2 April 2013 (UTC)
- Well, dang it, someone should create that Wikipedia article already.
- <ahem.> On a more serious note, the issue is not just that folks are searching for such pages, but that they are actually creating them. This generates maintenance overhead for WT editors. Redirecting users to Wikipedia might help reduce this overhead. -- Eiríkr Útlendi │ Tala við mig 21:03, 2 April 2013 (UTC)
- But redirects are bluelinks. If we make tens of thousands of redirects, how will anyone notice the few bluelinks which have, wrongly, been created as full entries that we (by our current policies and culture) tend to subject to WT:RFD? I agree with Michael: improve the "404" that comes up when someone clicks on [[Some Proper Noun]], goes to [1] or uses the search bar to search for "Some Proper Noun". - -sche (discuss) 21:28, 2 April 2013 (UTC)
- We already make color distinctions in our links: a lighter blue for links to other projects, orange for links with the wrong section. A bot could replace links to {{only in}} entries with {{w}} links or "w:" piped plainlinks. Improving the 404 only partially addresses the problem, though it has the enormous advantage of, in principle, being easier to implement. DCDuring TALK 22:04, 2 April 2013 (UTC)
- I have never noticed light blue or orange, and as far as I know I have a good computer display and good color vision. —Michael Z. 2013-04-03 14:28 z
- Orange links have to be turned on in your Per-browser preferences; as for light blue links, don't you see a difference between blue and blue? For me the difference is subtle but real. —Angr 15:24, 3 April 2013 (UTC)
- Exactly. We also have greenlinks for no page corresponding to inflected forms, if you have the gadget for accelerated creation of these selected on user preferences. (See conquest#Verb.) DCDuring TALK 15:33, 3 April 2013 (UTC)
- I realize that I had possibly misinterpreted Mglovesfun's previous comment as suggesting we shouldn't even rework our 404. To clarify, I am not advocating that we start creating scores of pages solely for the purpose of redirecting to WP. My intent instead was originally just to ask if perhaps proper noun pages, particularly those that fail RFD (which I should have stated more specifically earlier), would benefit by having redirects to WP. Michael's suggestion of reworking our 404 sounds like a wonderful idea, either alongside specific redirects for pages that failed RFD, or as a replacement for that idea. -- Eiríkr Útlendi │ Tala við mig 22:19, 2 April 2013 (UTC)
- Then I'll back Michael's idea as well. Mglovesfun (talk) 09:16, 3 April 2013 (UTC)
- MediaWiki:Noarticletext contains the "Wiktionary does not yet have a mediawiki page for Noarticletext" message; you can change the message by editing that page. (There's also MediaWiki:Noexactmatch, but I don't know that it's used anywhere.) MediaWiki:Searchmenu-new, and possibly other pages, control(s) what's displayed when someone searches for a term we don't have. - -sche (discuss) 20:22, 3 April 2013 (UTC)
- Thanks. And do you know where to find the 404-from-a-link page, e.g. mini cooper s, and the additional wrong-case message added to Mini Cooper S? —Michael Z. 2013-04-03 21:09 z
- I presume it's one of these pages, but I don't know which one. - -sche (discuss) 21:40, 4 April 2013 (UTC)
- The easiest way to find out is to visit http://en.wiktionary.org/w/index.php?title=mini_cooper_s&action=edit&redlink=1&uselang=qqx and examine the indicated messages. For example, (creating: mini cooper s) holds the place of a message generated by MediaWiki:Creating with $1 set to mini cooper s. (qqx is in the "private use" range of language-codes, so some enterprising MediaWiki developer decided to appropriate it for this purpose. I'm guessing the feature's primary target audience was interface translators, so they could find the message that they need to translate, but I've found it very useful myself.) —RuakhTALK 04:39, 8 April 2013 (UTC)
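For anyone who wants to repeat that check on other titles, here is a minimal sketch that rebuilds the same kind of URL. The function name `qqx_url` and its `host` parameter are made up for illustration; only the query parameters (`title`, `action`, `redlink`, `uselang=qqx`) are taken from the link above.

```python
from urllib.parse import urlencode


def qqx_url(title: str, host: str = "en.wiktionary.org") -> str:
    """Build an edit-view URL that shows message names instead of message text.

    Hypothetical helper: the parameter set simply mirrors the example link
    quoted in the discussion; nothing else about the site is assumed.
    """
    params = {
        "title": title.replace(" ", "_"),  # wiki titles use underscores for spaces
        "action": "edit",
        "redlink": "1",
        "uselang": "qqx",  # pseudo-language code that exposes message keys
    }
    return f"http://{host}/w/index.php?" + urlencode(params)


if __name__ == "__main__":
    print(qqx_url("mini cooper s"))
```

Opening the printed address in a browser should show placeholders such as (creating: mini cooper s) in place of the interface text, as described above.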
Gothic romanisation template
I have created Template:got-romanization (different from Template:got-romanization of!) and a sample entry "afdrausjan" (modified to use the new template). As with Japanese Template:ja-romaji the definition line with # is generated by the template, so it has both the headword and a definition. It has the same look and feel as a new romaji entry. Like Japanese, the Gothic entries only link to the main entry, no other information. --Anatoli (обсудить/вклад) 03:38, 4 April 2013 (UTC)
- How is it different from {{got-romanization of}}? The output seems to be the same: they both say, "See XYZ" where XYZ is the spelling in the Gothic alphabet. I preferred it when it said "Romanization of", though. —Angr 10:52, 4 April 2013 (UTC)
- It's an attempt to make romanisation entries of different languages more similar to each other. Template:ja-romaji is increasingly used for Japanese romaji entries, and Dan Polansky has created two votes in protest against the change that was agreed on by JA editors after a very long discussion in BP. The votes: 1. Wiktionary:Votes/pl-2013-03/Japanese Romaji romanization - format and content and 2. Wiktionary:Votes/pl-2013-03/Romanization and definition line. The second vote is specifically about how the definition line is added. Usually it's # on a new line in the wikitext. The new Japanese and the proposed Gothic templates generate the definition line, so it is not directly editable.
- User:Mzajac raised a concern that Japanese and Gothic are different from each other. Both Japanese and Gothic by default don't produce any definition as such, only a link to the main entry. Using a template will enforce this rule. The definition line will still be there (thus complying with Wiktionary:ELE#Definitions) but a new definition line is only added when a new parameter is added. The suggested template is much shorter and, as proved by the current work on Japanese romaji entries, can be generated very quickly both by people and bots.
- Re: "See" and "Romanization of". Again, just to make both templates (Gothic and Japanese) look similar. There's already the word "romanization" at the header level.
- New:
==Gothic==
===Romanization===
{{got-romanization|𐌰𐍆𐌳𐍂𐌰𐌿𐍃𐌾𐌰𐌽}}
- Old:
==Gothic==
===Romanization===
{{got-rom}}
# {{got-romanization of|𐌰𐍆𐌳𐍂𐌰𐌿𐍃𐌾𐌰𐌽}}
--Anatoli (обсудить/вклад) 11:23, 4 April 2013 (UTC)
Transpondine Portuguese
There is nothing in Wiktionary:About Portuguese concerning spellings on opposite sides of the Atlantic. I have been adding Brazilian forms as "alternative forms" of the spelling used in Portugal. But often, I see that the Portuguese Wiktionary does the exact opposite. Does anyone have an opinion on what we should do - or should it be up to the personal preference of our editors? SemperBlotto (talk) 15:55, 4 April 2013 (UTC)
Possible inadequacies in Template:Han char
In a discussion with User:Gdbf137, we discovered that Mac and MS seem to use different Cangjie input sequences. The Unihan database entry for 农 gives a Cangjie input sequence of LBV. Apparently, that works correctly on Mac OS X Lion. On Windows 7, however, MS's Cangjie IME accepts HBV to input this character, while LBV just generates an error beep and no character is output.
Does anyone else have a handle on what's going on? Do we need someone to change {{Han char}} to allow for multiple Cangjie input strings, one per OS? Or, more frighteningly, have Microsoft and/or Apple been changing things willy-nilly, and we need to allow for multiple Cangjie input strings, one per OS version? -- Eiríkr Útlendi │ Tala við mig 17:35, 4 April 2013 (UTC)
- Is this related to Cangjie_input_method#Versions_of_Cangjie? "Currently, version 3 (第三代倉頡) is the most common; it is the version of Cangjie supported natively by Microsoft Windows ... The Cangjie input method supported on the Mac OS is somewhat like Version 3 and somewhat like Version 5." I don't know what the solution to this would be other than to specify what version the template is referring to. DTLHS (talk) 04:49, 5 April 2013 (UTC)
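To make the "multiple Cangjie input strings" idea concrete, here is a rough sketch of how such data could be stored and looked up. It is not how {{Han char}} works: the dictionary layout, the key names (`unihan`, `mac_os_x_lion`, `windows_7`) and the function `cangjie_code` are all hypothetical, and only the two codes for 农 come from the reports above.

```python
from typing import Optional

# Hypothetical layout: one code per reported source or platform.
# The LBV/HBV values for 农 are the ones reported in this discussion.
CANGJIE_CODES = {
    "农": {
        "unihan": "LBV",         # Unihan database entry
        "mac_os_x_lion": "LBV",  # reported to work on Mac OS X Lion
        "windows_7": "HBV",      # reported to work in the Windows 7 IME
    },
}


def cangjie_code(char: str, platform: str, default_key: str = "unihan") -> Optional[str]:
    """Return the code recorded for `platform`, else the default one, else None."""
    codes = CANGJIE_CODES.get(char, {})
    return codes.get(platform, codes.get(default_key))


if __name__ == "__main__":
    for platform in ("mac_os_x_lion", "windows_7", "some_other_os"):
        print(platform, cangjie_code("农", platform))
```

Keying by platform rather than by Cangjie version number is only one option; keying by version, as DTLHS suggests, would avoid repeating identical codes but requires knowing which version each IME actually implements.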
Cross-script/mutated semi-borrowings
This seems to be a repeated question, and it's come up again at Wiktionary:Requests for deletion#da. What do we do with half-borrowed words? Stuff like "da", which is clearly a Russian word being used in English, or "si", which is clearly a Spanish word being used in English, even if both would never be spelled that way in their original language. google books:si senor gets a lot of English hits, even once we've excluded "Sí, Señor". Writings across the world are dropping a little bit of foreign language that their audience will understand in their text, and whenever there are orthographic differences, we'll probably see this type of change. "Da" can probably be attested in every major European language in this sense. Instead of creating senses under da for all the languages, maybe we could create an {{orthographically mangled (for foreigners) version of|да}} template (name to be changed, of course) and stick it under Russian. Same thing with si and danke schon and probably some mangled Latin we've deleted, etc. (This doesn't intend to change real borrowings, just one language stuck into another.) (The template could maybe use a foreign lang tag, so {{orthographically mangled (for foreigners) version of|old_lang=de|new_lang=en|[[danke schön#]]}}; I do suspect that da and friends are used in multiple Latin-script languages, but it's too common a particle to make that easy to check.)--Prosfilaes (talk) 05:17, 5 April 2013 (UTC)
- There are so many edge cases that it's hard to draw the line. Da might be meant as transliterated Russian, in which case WT:ARU disallows its existence. But I have a Hispanophone friend who sometimes says /siː ˈsɛn.nɚ/ as a joke, and si senor might be a valid English entry. I don't think you've made it crystal clear when to use this hypothetical template and when to create a normal entry, so I can't really support it yet. —Μετάknowledgediscuss/deeds 19:50, 7 April 2013 (UTC)
- I'm confused too. What if we use the normal process of finding citations? Words from one language used in another could be labeled as such, using {{context}} or {{qualifier}}. It's dangerous if we go too far, e.g. if we start quoting all English words in Latin letters in another language, especially in non-Roman based languages. For the moment, I wouldn't go with romanised Russian either. --Anatoli (обсудить/вклад) 02:36, 8 April 2013 (UTC)
- I'm not suggesting we don't find citations; I'm worried about the stuff where we can find plentiful citations that establish it's between two languages. What I'm most concerned about is stuff like "Do svidanya", which English speakers can find in English texts and want to look up, but which is likely to get treated as Russian and then get deleted because it's romanized. There seems to be a hole here where things that can be cited, and might actually get looked up, are deleted because they aren't real Russian or Latin, etc. I think si senor is a good example; it's not English, it's clearly Spanish or at least pseudo-Spanish. But we deleted danke schon for the same reasons, as not German. I'm not comfortable that, if it's created as English, it will survive. I am sure that eminently citable words and phrases like that need to be stored on Wiktionary in the spelling that people will find them used under, and what language tags them is less important than that.--Prosfilaes (talk) 06:27, 9 April 2013 (UTC)
- I'm not sure I understand your suggestion. We could have redirects for commonly known foreign words if they are incorrectly spelled or written in the wrong script: do svidanya -> до свидания, danke schon -> danke schön (danke schon previously failed RFD, but I see in the history that it was a full entry, not a redirect). Note: schon in German is a different word from schön. I don't think si senor or si señor merit an entry, English or Spanish. konnichi wa already exists as a romaji entry and can be looked up. --Anatoli (обсудить/вклад) 06:58, 9 April 2013 (UTC)
Template:sense
This template is used to label specific synonyms or antonyms. With antonyms that leads to problems though, like this edit shows: diff. People get confused because they expect that the sense being shown is the sense of the words listed after it. And that isn't really a strange assumption either, except that it's not how we use the template. So, would it be ok if some extra text were added to the template, so that it displays this instead: of the sense "(sense)" ? —CodeCat 14:19, 7 April 2013 (UTC)
- What is the before and after of your proposal in the general case? DCDuring TALK 14:54, 7 April 2013 (UTC)
- What do you mean? —CodeCat 16:05, 7 April 2013 (UTC)
- This is a fairly well-known problem; what are you actually proposing? Mglovesfun (talk) 16:34, 7 April 2013 (UTC)
- Um... I'm proposing to change the text that the template displays, like I said? —CodeCat 17:24, 7 April 2013 (UTC)
- To exactly what. DCDuring TALK 18:51, 7 April 2013 (UTC)
- Quote: "So, would it be ok if some extra text were added to the template, so that it displays this instead: of the sense "(sense)" ?" —CodeCat 18:58, 7 April 2013 (UTC)
- If anything, it should display "definition" or "def". "Sense" communicates mostly to us, perhaps to linguists. DCDuring TALK 19:12, 7 April 2013 (UTC)
- Status quo ante:
- CodeCat proposal (as I [incorrectly] understood it):
- CodeCat proposal (from below):
- Alternative proposal 1:
- Alternative proposal 2:
- There are numerous other arrangements of brackets, font types, and wording possible. I don't know that any of these will solve the problem of communicating the intent of the antonym section (and the less familiar semantic relations) while simply providing a breadcrumb back to the definition. We could also try putting "NOT" in front of the gloss for the antonyms heading only or we could skip trying to communicate to ordinary users. DCDuring TALK 19:42, 7 April 2013 (UTC)
- @CodeCat you said you wanted to change the text of the template, just not what you wanted to change it to. Mglovesfun (talk) 19:48, 7 April 2013 (UTC)
- I’ve proposed:
- in a previous discussion. — Ungoliant (Falai) 11:40, 9 April 2013 (UTC)
- Missed that. It has the advantage of brevity over the other proposals. And it makes sense if one reads linearly from the headings to the individual items: 'Antonyms of "definition"', 'Coordinate terms of "definition"' etc. How could a user misread it? Perhaps by ignoring the quotation marks and reading the "of" as part of the following text. Should "of" also be italicized? DCDuring TALK 11:57, 9 April 2013 (UTC)
- If the gloss is italicised and the ‘of’ isn’t, it will help prevent misreading. — Ungoliant (Falai) 12:05, 9 April 2013 (UTC)
- Actually I meant to include quotes around the sense, but that kind of got lost in translation. —CodeCat 13:01, 9 April 2013 (UTC)
- So some possibilities with "of" are:
- (of definition):
- (of "definition"):
- (of definition):
- (of "definition"):
- Of these my favorite is the last, because: 1., we often put glosses in quotes, eg in {{term}}, 2., 'Of' needs to be distinguished, 3., the whole thing needs to be visually distinct from the terms following, including any that are not links, eg SoP circumlocutions. DCDuring TALK 14:00, 9 April 2013 (UTC)
- The wording "of [sense]" or "of the sense [sense]" works for 'nyms and pronunciations, but not for usage notes.
I propose "in the sense '[sense]'" (or "in the sense of [sense]" or whatever), which is I think how normal people speak about a particular sense of a word. It works for 'nyms and pronunciations also: in fact, for me at least, it seems much more natural even for 'nyms and pronunciations.—msh210℠ (talk) 18:53, 9 April 2013 (UTC) ← Portions struck through at 04:45, 10 April 2013 (UTC).—msh210℠ (talk)
- I was thinking of "of". Mglovesfun (talk) 22:06, 9 April 2013 (UTC)
- How about allowing an alternative wording, specified, say, by an "alt=" parameter, for whatever cases are not well served by "of"? There are, in English at least, relatively few uses of {{sense}} in Usage notes AFAICT. Is it commonly used there in other languages? DCDuring TALK 22:52, 9 April 2013 (UTC)
- I guess my issue is partially that {{sense}} is often used not with a gloss but with a usage restriction or a field of endeavor as its parameter. For example, work (which currently has no 'nyms listed at all) might list 'nyms of the "Said of one's workplace (building), or one's department, or one's trade (sphere of business): He mostly works in logging, but sometimes works in carpentry" sense using {{sense|of a workplace or trade}} and 'nyms of the "(zymurgy) To cause to ferment" sense using {{sense|zymurgy}}. I've definitely seen examples of each of these types of uses of {{sense}}. Adding "of" would make no sense in those cases either. (The 'nyms aren't 'nyms of zymurgy.)
And even in the more common case, viz even when the parameter of {{sense}} is a gloss of the headword, what we're really listing aren't 'nyms of "to cause to ferment", as the wording "of cause to ferment" (or the awkward "of to cause to ferment") would imply. Rather, what we're listing are 'nyms of work in the sense of "to cause to ferment". So adding "of" doesn't cut it, in my opinion, not even for 'nyms and pronunciations.
Perhaps best would be "for [pagename] in the sense of:" with a colon at the end and no quotation marks around what follows. Quotation marks (and even italicization if the prefatory text isn't italicized) wouldn't work in the zymurgy (or field of endeavor) case, as it'd seem like "zymurgy" is a gloss. The colon is then necessary, as "in the sense of [gloss]" doesn't flow. Using only "in the sense of:" is still slightly ambiguous, not solving the problem we started with here: it could be referring to the listed antonyms rather than the headword. I think "for [pagename] in the sense of:" takes care of all these issues — though of course there may be others I haven't thought of.—msh210℠ (talk) 04:45, 10 April 2013 (UTC)
Some small changes to Mandarin (also Cantonese, Min Nan) entry structure and about topic categories - suggestion
I will run this by all our active Chinese contributors, but I'd like to suggest dumping the rs (radical sort) value in Chinese entries, e.g. in {{cmn-noun}}.
The rationale is the following:
- Finding the sorting order for the Chinese character entries is not straightforward, although Wiktionary itself has this info. Lack of this knowledge keeps casual editors, and anyone who is sure about words but not about the structure, from adding new entries.
- The mistakes are numerous. I have fixed some when I noticed them, but I'm sure I missed many.
- Simplified and traditional topic categories are sorted differently but there is no real reason for it, e.g. 標準 (biāozhǔn) (“standard”) is sorted by "木11標準" (so it will appear under the "木" (tree) radical) but its simplified equivalent 标准 by "biao1zhun3" (so it will appear under the letter "B").
- A Chinese person who relies on radical sorting and is very familiar with radicals and their order would probably be better off just entering the word they are searching for in Chinese, rather than searching in the category listings.
Take a look at this Category:cmn:Intermediate_Mandarin_in_traditional_script:
You see, a small number are sorted by Latin letters, the others by radicals. Those under Roman letters are incorrectly formatted. Errors are often introduced when a traditional entry is created by copying a simplified entry and the initial character is different.
I suggest removing the "rs" from entries and from category sorting and just sorting by numbered pinyin (e.g. "biao1zhun3"), and perhaps no longer splitting topical Mandarin categories into simplified/traditional. Serbo-Croatian entries don't separate Cyrillic/Latin entries into separate categories. Otherwise we need to check and fix all incorrectly formatted entries, for which we just don't have enough resources.
I'm not insisting on this change, but User:A-cai, who did a great job, is no longer very active here, and we could get more people on board if Mandarin entries were simpler.
Just want to check the mood and get opinions. We have tens of thousands of entries in traditional script, so there needs to be an agreement before anything happens. --Anatoli (обсудить/вклад) 04:24, 8 April 2013 (UTC)
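As an illustration of what a numbered-pinyin sort key like "biao1zhun3" involves, here is a rough sketch that derives one from tone-marked syllables. It is not the code behind any existing template: the function names are invented, the syllables are assumed to be supplied one by one (as in p1=, p2=), and using "5" for the neutral tone is just one convention among several.

```python
import unicodedata

# Combining tone marks mapped to tone numbers (macron, acute, caron, grave).
TONE_MARKS = {"\u0304": "1", "\u0301": "2", "\u030C": "3", "\u0300": "4"}


def numbered_syllable(syllable: str, neutral: str = "5") -> str:
    """Turn one tone-marked syllable into numbered pinyin, e.g. 'biāo' -> 'biao1'.

    Assumption: syllables without a tone mark get the `neutral` digit.
    """
    tone = neutral
    kept = []
    # NFD splits each accented vowel into a base letter plus combining marks.
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            kept.append(ch)  # keeps other marks, e.g. the diaeresis of ü
    return unicodedata.normalize("NFC", "".join(kept)) + tone


def pinyin_sort_key(syllables) -> str:
    """Join the numbered syllables of a word into a single sort key."""
    return "".join(numbered_syllable(s) for s in syllables)


if __name__ == "__main__":
    entries = {"標準": ["biāo", "zhǔn"], "标准": ["biāo", "zhǔn"], "扇贝": ["shàn", "bèi"]}
    for word in sorted(entries, key=lambda w: pinyin_sort_key(entries[w])):
        print(word, pinyin_sort_key(entries[word]))
```

With a key like this, the traditional and simplified forms of a word get the same key and so sort next to each other, which is the behaviour proposed above.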
- I have no strong opinion on this. The rs value is autogenerated when using {{cmn new}}, which relies on {{zh-sortkeys}} to produce the rs of the first character in the page title. So it doesn't really bother me. (I wish the language sections were just a single template, with various parameters included, e.g.
- {{language_name|標|準|p1=biāo|p2=zhǔn|jy1=biu1|jy2=zeon2|poj=piau-chún|n|[[standard]]|eg=|syn=基準|syn2=|ant=}} (effectively everything needed to generate 標準),
- and all the rest (trad-simp detection/conversion, pinyin analysis, sort key, even generating pinyin for character) were automated.) Wyang (talk) 05:16, 8 April 2013 (UTC)
- Thanks. You're well equipped, others are not so lucky. :)
- What about maintenance of topic categories. Many have been moved or deleted, just because they don't follow the structure of other languages.
- Category:Mandarin terms derived from English exists on its own (35 entries), although initially was meant to be split.
- Category:Mandarin terms in simplified script derived from English (356)
- Category:Mandarin terms in traditional script derived from English (301)
- Category:Mandarin terms derived from Japanese is now a separate category (21) but Category:Mandarin terms in simplified script derived from Japanese and Category:Mandarin terms in traditional script derived from Japanese were deleted or moved (like many others, they are not empty!). It's a mess. Some long-time editors like Tooironic seem to be confused about categories in Mandarin, so people just stopped categorizing Mandarin entries or categorise them at random (with or without the words traditional/simplified). Well, the reason is simple - trad. and simpl. entries are sorted differently and therefore categorised differently. --Anatoli (обсудить/вклад) 05:54, 8 April 2013 (UTC)
- I do like the idea of getting rid of the duplication in categories- it always struck me as rather kludge-y. The main drawbacks/issues I can see would be characters that have multiple pronunciations, and the fact that we would instantly increase the membership of most categories and decrease the number of distinct entries per page. Also, the difference between traditional and simplified characters isn't as easy to see for those who don't know one or the other as for the difference between Latin and Cyrillic. I can see how there might be confusion about which terms in a category are traditional, simplified, or the same in both, and even which ones are paired with which. I'm sure those aren't terribly difficult to deal with, so I'm in favor of changing the category sorting.
- As for the rs parameter: we wouldn't have to get rid of it. It would be easier to just make it non-mandatory and ignore it in category sorting. Maybe someday we can give users the option of choosing which sort order to use, though we'd have to populate the rs parameters by bot, first. Chuck Entz (talk) 07:18, 8 April 2013 (UTC)
- Of course we should keep separate entries for simplified and traditional characters and words. Wiktionary after all aims to catalogue all words in all languages, in whatever forms. However I too support the abandoning of the old system under A-cai. It's simply not worth the extra effort. At present I add about 50 or so Mandarin entries a week. I imagine I, along with other editors, could create double the number of entries if we didn't have to deal with the rs field. But now Wyang says the rs field is generated automatically. Is that really the case? I just created a new Mandarin entry at 扇贝 - where is this automatic rs field you speak of? Did I do it wrong? If so advise me how. Cheers. ---> Tooironic (talk) 09:49, 8 April 2013 (UTC)
- When you create the entry, you can use the code {{subst:cmn new/a|p1=shàn|p2=bèi|n|[[scallop]]}} in both forms, and this will generate the entire content. Wyang (talk) 12:03, 8 April 2013 (UTC)
- Wow, that script is powerful. I just created 拆開 and 拆开 in seconds. Wish someone had told me about that earlier. But is the IPA on those entries correct? It doesn't look right to me... ---> Tooironic (talk) 23:09, 8 April 2013 (UTC)
- @Tooironic. Re: simplified/traditional separation. With Serbo-Croatian it's easier. The words in Cyrillic and Roman sort themselves differently automatically. As you know, the parameter "t" in {{cmn-noun}} is an indicator that the noun is traditional, "s" is simplified. They are automatically added to Category:Mandarin nouns in traditional script or Category:Mandarin nouns in simplified script, or both if the value is "ts". A word which is both simplified and traditional will appear in both categories, but if you just want Category:Mandarin nouns they will appear in alphabetical order - both forms. We could apply the same sorting for both traditional and simplified noun categories but abandon the trad/simp approach for topical categories? What do you think?
- In a nutshell - I don't suggest removing the "t", "s" and "ts" params, so SoP will always be separated into trad/simp categories as parts of speech. I suggest sorting by numbered pinyin instead of radical + number of strokes, i.e. "biao1zhun3" ("pint" parameter) instead of "rs" - "木11" for both simplified and traditional entries, and removing the words traditional/simplified from topical categories.
-
--Anatoli (обсудить/вклад) 13:15, 8 April 2013 (UTC)
-
- I personally don't have any issues finding the "rs" value; it only takes a few seconds longer to create a Mandarin entry and I have to open another tab. Don't get me wrong, guys. I am just worried that most templates we use for other languages don't work for Mandarin, like for example {{etyl}}. Japanese entries also use sorting parameters (hiragana) but it's more consistent. Consider entries like 傍晚. It's added to Category:cmn:Elementary Mandarin using "人10" as "skey" and Category:cmn:Elementary Mandarin in simplified script using "bang4wan3" as the sorting key. Why is it not categorised as a traditional version? If we treat simplified and traditional categories equally (using one sorting key) and move all topic categories to match other languages, then it would be easier for everyone. Musical instruments categories - trad/simp and without suffix all seem independent from each other - these entries ended up belonging to three topic categories, obviously using whatever sort order.
-
- Category:cmn:Capital cities in simplified script and Category:cmn:Capital cities in traditional script don't have a common supercategory, they go directly under generic Category:Capital cities. Whatever category you take, there are problems. I stopped categorising a while ago, except for HSK, which is still OK, sort of.
-
- Allowing a bot to load the rs value may not be such a bad thing but it's probably better to normalise categories (make them similar to other languages - no trad/simp suffixes) and use numbered pinyin or radical sort (whatever we decide) but equally for both trad and simp entries. --Anatoli (обсудить/вклад) 13:03, 8 April 2013 (UTC)
- @Tooironic. I have modified your 屌絲 and created 屌丝. With my suggested way of categorising - # {{slang|vulgar|lang=cmn|skey=diao3si1}}. Now both entries appear in Category:Mandarin slang and Category:Mandarin vulgarities sorted by diao3si1 (under letter "D") (note that the categories are without the words "traditional"/"simplified").
- They are still in Category:Mandarin nouns in traditional script and Category:Mandarin nouns in simplified script - not suggesting to change that but we could change the sorting of the traditional term to be the same as simplified (pinyin, not rs), if we are in agreement.
- Please check whoever is interested, if this is worth attention. --Anatoli (обсудить/вклад) 00:24, 9 April 2013 (UTC)
- I don't have any problem with this. I've never liked the idea of separating categories based on script types, especially two that share some characters. I wasn't even aware that some traditional terms were sorted differently. If this goes ahead, you will get my support. Jamesjiao → T ◊ C 01:41, 9 April 2013 (UTC)
- Great stuff. Will invite the creator - User:A-cai. I hope he will not be upset. We could still have some bots to do tricks with automatically adding rs values to Mandarin entries, right?
- Wyang, you expressed suggestions how to add rs automatically but have not expressed your opinion on categories and sorting. What do you say?
- The hardest bit would be converting or automating this change but as I said, Mandarin topical categories are in a mess, anyway. --Anatoli (обсудить/вклад) 01:48, 9 April 2013 (UTC)
- I think simp/trad should be merged into one single category and sorted by pinyin. Adding the pinyins everywhere would be troublesome, but like I said I would prefer if all the templates in one language section are merged into one template {{language_name|..., with various things defined by various parameters, including definitions and context labels. But I can't see this being actualisable on Wiktionary any time soon, so... Wyang (talk) 04:24, 9 April 2013 (UTC)
- Both entries 動能 动能 belong to Category:cmn:Physics (not in Category:cmn:Physics in simplified script or Category:cmn:Physics in traditional script!) and are sorted by "dong4neng2", so appearing under letter "D", not under radical "力". If everyone is OK with this, I will update Wiktionary:About Sinitic languages. All entries in Mandarin categories with "...in simplified script" and "...in traditional script" should gradually be moved to categories without these suffixes, with the numbered pinyin sort order e.g. skey=dong4neng2 or just by adding |dong4neng2 in the category name, e.g. [[Category:cmn:Physics|dong4neng2]]
- It's a lot of work and I am currently busy with other things but will get to this eventually.
- Parts of speech categories remain as they are for now, with the traditional/simplified distinction. We could change the sorting key for traditional entries to use pint rather than rs but I don't know how. Simplified entries are sorted by pinyin. --Anatoli (обсудить/вклад) 00:17, 11 April 2013 (UTC)
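The numbered-pinyin sort key proposed in this thread (for example skey=dong4neng2) can be derived mechanically from tone-marked syllables such as those passed as p1, p2 to the cmn new/a template. What follows is only a rough Python sketch of that idea, not code that runs on Wiktionary: the function names are hypothetical, and writing a neutral tone as 5 is an assumption the discussion does not settle.

```python
import unicodedata

# Combining marks used for the four Mandarin tones in tone-marked pinyin.
TONE_MARKS = {
    "\u0304": "1",  # macron  (ā)
    "\u0301": "2",  # acute   (á)
    "\u030c": "3",  # caron   (ǎ)
    "\u0300": "4",  # grave   (à)
}


def numbered_syllable(syllable):
    """Convert one tone-marked syllable, e.g. 'dòng', to 'dong4'."""
    decomposed = unicodedata.normalize("NFD", syllable.lower())
    tone = "5"  # assumption: an unmarked (neutral) syllable is numbered 5
    letters = []
    for ch in decomposed:
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)  # keeps the diaeresis of ü
    return unicodedata.normalize("NFC", "".join(letters)) + tone


def pinyin_sort_key(syllables):
    """Join the numbered syllables into one sort key, e.g. 'dong4neng2'."""
    return "".join(numbered_syllable(s) for s in syllables)


def category_link(category, syllables):
    """Build a category link that sorts by numbered pinyin (hypothetical helper)."""
    return "[[Category:%s|%s]]" % (category, pinyin_sort_key(syllables))


print(category_link("cmn:Physics", ["dòng", "néng"]))
```

Because the key depends only on the pronunciation, the traditional and simplified forms of a word get the same key and sort next to each other in a shared category.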
- Sorry for not responding sooner. I haven't had as much time to devote to the project in recent years. I'm all for automation and making things easier in order to make the site attractive to more contributors. No objections to your modification proposals. -- A-cai (talk) 17:49, 27 April 2013 (UTC)
En dash in {was wotd}?
Per user request at Template talk:was wotd#request to exchange hyphen for en dash, is it ok to change the hyphen “-” for an en dash “–” in {{was wotd}}?
This is a v. minor change, but it’s highly visible, so I thought it best to ask.
- I support. Good thing you asked, as some editors seem to really hate the use of typographic characters instead of plain ASCII ones. — Ungoliant (Falai) 12:31, 8 April 2013 (UTC)
- I would like to know what kind of person (other than a trained Wikipedia pedant) actually writes Bose–Einstein condensate rather than Bose-Einstein condensate. Equinox ◑ 21:47, 8 April 2013 (UTC)
- Writes? I don't think anyone uses a hyphen-minus in writing. People type it, but typesetters (who have nothing to do with Wikipedia) have always had to choose the correct dash-type character from the type tray or now character set. Pick up any properly typeset book, and you will find that Bose–Einstein condensate is typeset with an en dash.--Prosfilaes (talk) 01:35, 9 April 2013 (UTC)
It’s a weird world where we set type and publish it to the world on a typewriter keyboard. —Michael Z. 2013-04-09 02:58 z
- DC, regarding typographical correctness: yes, an en dash is correct here, while a hyphen is incorrect; see hyphen and dash. The hyphen is reserved for intraword usage, such as line-wrapping and compounds (such as line-wrapping ;), while en dash is used in varied contexts, including interword use such as this. See Wikipedia:Manual of Style: Hyphens for usage at ’pedia.
- Beyond correctness, there’s also aesthetics – a hyphen jumps out at me here as conspicuously too short (it’s sized for intraword use, and thus feels stubby surrounded by spaces), which is the standard typographical judgment.
- The main objections to use of non-typewriter typographical characters I’ve heard are:
- Rendering problems – non-ASCII characters render poorly on some computers, particularly older ones.
- Input or editing difficulties – some editors have difficulty entering non-ASCII characters (due to needing to use a character picker) or editing entries with non-ASCII characters (esp. due to rendering issues).
- Personal preference – some users prefer typewriter characters over book-style typographical characters.
- Use of typewriter characters is naturally common online, due to ease of input, though we needn’t be limited by it. In the case of templates (as opposed to use in entries), there aren’t any editing difficulties, and we have lots of Unicode throughout Wiktionary, so I don’t think there are significant problems, but want to check.
- Sounds like people are generally supportive (or “meh”); will wait another few days for more comments.
- —Nils von Barth (nbarth) (talk) 15:38, 9 April 2013 (UTC)
-
- Just go ahead and change it. Why on earth start a discussion about using the correct character in a template, where it will never have to be re-entered?
There being no opposition, I have gone and dunnit. —Michael Z. 2013-04-09 22:02 z
Facebook
I set up this page on Facebook for promoting Wiktionary of all languages. You are welcome to become co-administrators of the page, so you can update the page with inspiring messages. --LA2 (talk) 20:55, 8 April 2013 (UTC)
- Where do I apply? — Ungoliant (Falai) 21:29, 8 April 2013 (UTC)
- I noisily hate social meeja and would prefer us to "promote" ourselves through just making a good dictionary that people want to use. But I suppose it can't hurt :) Equinox ◑ 21:34, 8 April 2013 (UTC)
- I am boycotting Facebook, but why not promote Wiktionary there? DCDuring TALK 21:44, 8 April 2013 (UTC)
-
- I'm using that page to pull people out of Facebook and into Wiktionary. Whether you boycott Facebook doesn't matter, since you are already here. However, if someone would like to help to pick a "word of the day" for the Facebook page, I think that could make the page quite popular. --LA2 (talk) 22:37, 8 April 2013 (UTC)
- That's not the first page on Wiktionary in Facebook. Earlier this one was advertised. I liked both. Don't see why not. Would also be useful if we could recruit some native speakers and talented editors but promoting among users is also important, Wiktionary is for users, not for editors :) --Anatoli (обсудить/вклад) 00:57, 9 April 2013 (UTC)
- I really don't understand the amount of hate here for Facebook. Use it wisely and use it to your advantage. Don't post info that you don't want others to see.... Simple... I will take a look at the page on my home btw. I liken this attitude to the one on StackExchange towards Wiktionary. Take a look at this: How much should I trust Wiktionary?. I tried to defend Wiktionary and provide my own arguments (thanks Hippietrail for chiming in), but I can't change everyone's mind I guess. Jamesjiao → T ◊ C 01:56, 9 April 2013 (UTC)
- It is correct that in a Beer parlour discussion in March 2012, the existing Facebook page was mentioned, but that is a placeholder page that Facebook created based on a Wikipedia entry. That page doesn't get updated and there is no way to claim it, it's a dead end. The page I created now has a dozen co-administrators that are able to update the page and appoint more co-administrators. It's an anarchy of the same kind as the Wikisource page on Facebook, that I set up last year. It gets updated sometimes, but not very often. Right now, the Wikisource page has 418 fans and Wiktionary has 69. --LA2 (talk) 13:58, 9 April 2013 (UTC)
- 69? That's a good position to be in. Mglovesfun (talk) 22:04, 9 April 2013 (UTC)
-
- *groans loudly* OK, seriously, Facebook pages have some sort of automated system in which you can write a bunch of posts and they'll come out on a schedule. Assuming somebody's willing to put some time in, we could easily have posts and to spare. —Μετάknowledgediscuss/deeds 15:11, 13 April 2013 (UTC)
- We could have a Facebook widget on our Front Page, that users could click on. I think the code is something like <a title="Tell Facebook" href="http://www.facebook.com/sharer.php?u=http://en.wiktionary.org/&amp;t=Wiktionary">Facebook</a> SemperBlotto (talk) 15:20, 13 April 2013 (UTC)
Proposal of a pronunciation recording tool
Hello, Rahul21, a developer, offers to develop a pronunciation recording tool for Wiktionary, helped by Michael Dale as part of GSoC. The tool would allow users to record and add audio pronunciations to Wiktionary entries while browsing them (see background discussion on Wiktionary-l). Please read and comment on the proposal! Regards, Nemo 22:37, 9 April 2013 (UTC)
A slightly different way to show etymologies derived from Latin verbs
Romance languages use the infinitive as the lemma, but for Latin we use the 1st person singular present. This means we can't write "from Latin cantō" in any of the etymologies at cantar, because the infinitive derives from cantāre. Most entries solve this by just saying "from Latin cantāre, present active infinitive of cantō". But that is rather wordy, more so than what's really needed to get the point across: the word cantar derives from cantāre, but its Latin lemma/paradigm entry is at cantō. For that reason I've started to use another approach, by writing {{term|canto|cantāre|lang=la}}. So it will show "cantāre", but link to canto. Since not many entries have this, I wondered if nobody had considered doing it that way yet, so I'm sharing the idea here. :) —CodeCat 02:22, 11 April 2013 (UTC)
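To make the mechanics of this approach concrete, here is a small Python sketch that builds such a link from a lemma and the form to display. It is only an illustration under stated assumptions: the helper names are hypothetical, and it assumes the page name is the lemma with its macrons removed (cantō is at canto), as in the example above.

```python
import unicodedata

MACRON = "\u0304"  # combining macron, used to mark Latin vowel length


def strip_macrons(word):
    """Return the word without macrons, e.g. 'cantō' -> 'canto'."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch != MACRON)
    return unicodedata.normalize("NFC", kept)


def term_link(lemma, displayed_form, lang="la"):
    """Build a {{term}} call that links to the lemma page but shows another form."""
    return "{{term|%s|%s|lang=%s}}" % (strip_macrons(lemma), displayed_form, lang)


print(term_link("cantō", "cantāre"))
```

The first parameter is the link target and the second is the displayed text, so the two do not have to be the same word form.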
- That’s a good idea. — Ungoliant (Falai) 02:37, 11 April 2013 (UTC)
- I'd done that and waited for someone to complain about it. The case that CodeCat mentions seems ideal for that approach. What about derivations from participle forms? DCDuring TALK 03:30, 11 April 2013 (UTC)
- I've done this for months :) —Μετάknowledgediscuss/deeds 15:00, 13 April 2013 (UTC)
- I just do "from Latin cantō", exactly as you say we "can't". (I guess I've found a way! :-P) The French verb chanter really does come from the Latin verb cantō, so it's straightforward and correct. It's only a problem when people try to gloss cantō as "I sing" (as though they were glossing the specific form) instead of the correct "to sing" (which is how we gloss verbs). —RuakhTALK 16:39, 14 April 2013 (UTC)
- I like CodeCat's suggestion. Also, had I in the past noticed any entry glossing "canto" as "to sing" rather than "I sing", I would have changed it and (though this discussion informs me not to do so) I would have marked the edit as minor, assuming I was uncontroversially correcting a simple error by a random IP unfamiliar with Latin grammar. - -sche (discuss) 00:10, 15 April 2013 (UTC)
- Another issue is the descendant section of Latin verbs. Should, say, video’s descendants be linked to as {{l/pt|ver}} or {{l/pt|ver|vejo}}? — Ungoliant (Falai) 01:02, 15 April 2013 (UTC)
- I have no problem with the concept, but you need to make another template for this purpose instead of overloading the meaning of {{term}}'s parameters. To anyone not familiar with this usage, it's confusing and looks like an error. I've probably accidentally "corrected" one of these before so that the macron and no-macron versions matched. Pengo (talk) 02:12, 7 May 2013 (UTC)
- Why would we need another template to do effectively the same thing? The template doesn't mandate that the linked entry and the displayed term are in any way "the same", and as far as I know other people have been doing this for a long time. For example, you sometimes see definitions like this: # [[break|broken]]. I don't see anything wrong with that in principle. —CodeCat 02:20, 7 May 2013 (UTC)
Appendix:1000 Japanese basic words
This may not be appropriate for the BP but since this is the most visible spot, I want to ask everyone their opinion about Appendix:1000 Japanese basic words and what to do with it. (I wrote something on the talk page too.) It's a good appendix now, but it's "1000 Japanese basic words" and the description is "This appendix is a specific list of one thousand basic words," and yet there are about 700 words in it.
Some background: I don't know the full story but as far as I can tell, in a nutshell, the Japanese Wiktionary was building the list ja:Wiktionary:日本語の基本語彙1000 some time ago, and the editors here decided to copy it. At the time the original list was incomplete. Since then, the original list has grown but en.WT's list has not been maintained. Now, ja.WT's list has surpassed 1000 words and their list says "作業中 現在:989項目 2008年11月16日 一旦、1,000以上挙げ、その後取捨選択するなり基本語彙2,000にタイトルを変更するなりする方針としたいと思います。" which means that their list broke 1000 entries and that they are considering changing the name to 2000 basic words.
We can go two routes: depart from ja.WT and keep it a list of 1000 basic words, or mirror their version, and exceed 1000 words in the process.
I don't have exact numbers, but if you search for "Japanese word list" on Google, our appendix is the first result. That suggests to me that the wider world is making use of it as a resource. While ja.WT's version is good, it lacks essential words such as 可愛い (kawaii), いっぱい (ippai, "very"), たくさん (takusan, "many"), or すごい (sugoi, "very/wow!"). You can't have a 30-second conversation with high school students without using those words. Conversely, ja.WT's appendix has quite specific words such as ミミズ (mimizu, "earthworm") and 十二指腸 (jūnishichō, "duodenum"). Duodenum is a basic word?
How about both routes? I would like to combine the most basic of the "basic" words and the Japanese Language Proficiency Test Level 5 appendix (the lowest level) for a "1000 basic Japanese words" appendix, and maybe mirror ja.WT's appendix on a different page. --Haplology (talk) 05:02, 12 April 2013 (UTC)
- Your last paragraph sounds eminently reasonable, and I fully support that method (although I think perhaps mirroring ja.wikt's appendix is less important, because it would appear that we are a better arbiter of basicness than they are). —Μετάknowledgediscuss/deeds 14:59, 13 April 2013 (UTC)
-
- This appendix is not a very scientific one and was made by amateurs. It's worth adding words to make a thousand, choosing carefully from the JLPT or a frequency list, and/or removing those that are identified as not being basic.
- Valuable time could be spent on making Appendix:JLPT better - fixing the word format and choosing the spelling we actually have here, e.g. we have 上がる but not 上る, or creating the alternative spellings.
- JLPT appendices could be made similar to Appendix:HSK list of Mandarin words with new categories like Category:JLPT/N5 Category:ja:JLPT-5 or similar. --Anatoli (обсудить/вклад) 01:53, 15 April 2013 (UTC)
- I'm glad we all agree. I've been adding common words from the N5 list to the category, and once the category reaches 1000 items, I plan to add them to the appendix and add the sort keys to the categories. I've been through the whole N5 list once and added common words at my discretion (but not all of them), and there are now almost 900 words in the category. I plan to go through N5 again, and also look at the N4 list and try to find any other essential words that may have been missed. The original list is biased toward nouns, so other parts of speech would be good places to look for new candidates. It also ignores casual words like ちゃう, which is also essential to high school students, or pretty much anybody. To anyone who is so inclined, if you see anything that strikes you as essential in the real world, then please add it. --Haplology (talk) 05:42, 17 April 2013 (UTC)
- I have just created new categories. What I meant is something like this: Category:Japanese by difficulty level with five categories. I only added two words as examples: 会う to level 5 (Category:ja:JLPT-5) and 安心 to level 4 (Category:ja:JLPT-4). The actual names of categories and templates, format and links can be discussed. The HSK categories provide a bit more info and look better. Please take a look. --Anatoli (обсудить/вклад) 06:09, 17 April 2013 (UTC)
- Sure, that sounds good. I just have a few questions. So basically this means completing the JLPT appendices project, as well as the 1000 basic words project, and having both exist in parallel? That's what I would hope for, as both projects have already been made, and they serve slightly different purposes. I assume that no new words would be added to the JLPT categories, only the ones already on the appendices? In the process of reviewing the appendices, it sounds like you want some revision to be done to them, such as adding more common forms like 上がる rather than 上る. I agree with that. I just changed "掃除 そうじする to clean" to "掃除 そうじ cleaning", but perhaps "掃除する そうじする to clean" would be better, and have that link to 掃除? I think there is also 近く, so what should be done with that? In the past there was some opposition to creating pages like 近く, but I think there's precedent for pages like that in other languages and there's no policy against them. It's mainly just that the Japanese editors have enough work with lemmas, and if there are going to be forms like 近く with their own entries, I'd rather a bot add them. The L5 appendix was a bit slow to edit, but did not time out or have any problems like that, so I guess there's no need to break it up like L1 (which was too much for the server to display.) What do you think about breaking up appendices? --Haplology (talk) 04:28, 18 April 2013 (UTC)
- Yes, I think both templates and category groups could easily coexist.
- する-verbs, I'd link to lemma but display lemma + する because they are verbs. Having "掃除 to clean" would look weird because 掃除 is a noun. I have adopted this for translations. Same thing for な-adjectives.
- Cleaning sounds good but I don't know if JLPT would prescribe 上る for the tests, not 上がる. JLPT is a bit more strict in nature than 1000 basic words but I have no idea who made original lists, how accurate and up-to-date they are. Should students for level 5 know both forms? We can always have simple entries with links to main entries, even skipping conjugations, etc. to save time. What do you think?
- No strong opinion on 近く but since く-adverbs are simple in structure, I don't see why we should discourage them, also for the sake of back translations from English. No need to create them, if a bot could do it but I wouldn't delete if they exist.
- Breaking up appendices - OK. You already did one. --Anatoli (обсудить/вклад) 04:53, 18 April 2013 (UTC)
-
tt
A lot of editors are used to typing <tt> to make things look typewritery. In HTML5, tt is “entirely obsolete, and must not be used by authors.”[2] The W3C suggests:
Where the tt element would have been used for marking up keyboard input, consider the kbd element; for variables, consider the var element; for computer code, consider the code element; and for computer output, consider the samp element.
It looks to me like code is a good general replacement. More specific semantics can be conveyed with samp, kbd, and var. Continuing to use tt in discussions won’t break anything, but we should replace it in templates and entries, so we don’t have to endure the shame of unnecessary validation errors after the MediaWiki software is brought up to par. —Michael Z. 2013-04-12 17:51 z
By the way, also gone the way of the rotary dial are acronym, big, center, font, strike, and u, and all of those styling attributes on table elements. —Michael Z. 2013-04-12 17:59 z
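As an illustration of the replacement suggested above, the following Python sketch swaps tt tags for code (or another element) in a piece of wikitext. It is a minimal sketch under stated assumptions, not an existing bot or MediaWiki feature: the function name is hypothetical, and a real cleanup would still need a human to choose between code, kbd, samp and var for each use.

```python
import re

# Matches an opening or closing tt tag, with optional attributes.
TT_TAG = re.compile(r"<(/?)tt\b([^>]*)>", re.IGNORECASE)


def replace_tt(wikitext, replacement="code"):
    """Replace <tt>...</tt> tags with the given element, keeping attributes."""
    def swap(match):
        slash, attributes = match.group(1), match.group(2)
        return "<%s%s%s>" % (slash, replacement, attributes)

    return TT_TAG.sub(swap, wikitext)


print(replace_tt("Type <tt>~~~~</tt> to sign your comment."))
```

Passing replacement="kbd" or replacement="samp" covers the more specific cases listed in the W3C advice.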
- What does "obsolete" mean in HTML-world? I went to an HTML class today, and we were using some of these (well, definitely font) without any indication that they could ever be a problem. —Μετάknowledgediscuss/deeds 03:39, 14 April 2013 (UTC)
-
- Font? Ouch – I should have a word with your teacher.
-
- During the 1990s’ browser wars, every browser was making up new features and displaying them differently, and web development was a fragmented nightmare. Since then, the W3C approves the official open standards that make up the web based on feedback from browser developers, and we can mostly write HTML for one standard instead of for five current and twenty-seven past browsers (but don’t get me started on MSIE 6). The wide adoption of CSS, which allows for the separation of presentation from document structure, has led to newer versions of HTML deprecating and obsoleting purely presentational elements.[3] Unfortunately, the nature of wikitext encourages editors to include lots of presentation guff repeated many times in every page, but this is bad practice because it bloats pages and makes maintenance difficult. Like templates, style sheets let us centralize presentation and reduce page bloat.
</pedantry>
-
- Browsers are built for backwards-compatibility, so most of the old elements will still work. But as an organization for openness, we should follow the recommendations of current open standards, and certainly abandon practices deprecated in the last century.
- Thanks for that explanation. Specifically, my teacher recommended using CSS (which I'm learning now), but said that for basic formatting, just using the HTML tags is fine (although it may not be much faster than inline CSS). I agree with replacing them in templates but not giving a damn on discussion pages. —Μετάknowledgediscuss/deeds 04:44, 15 April 2013 (UTC)
-
100 million edits
According to our sources, the 100 millionth edit was made to Wiktionary (all languages taken together, humans and bots included) during Friday April 12. Congratulations to us all! About 20% of the edits have gone into the English Wiktionary. --LA2 (talk) 02:05, 13 April 2013 (UTC)
Foreign word of the day: reconstructed terms, constructed terms and name.
In the vote for creating the FWOTD feature, the points “eligibility of reconstructed languages” and “eligibility of constructed languages” didn’t achieve consensus (except conlangs which don’t meet CFI, which failed) by the end of the vote.
Also, we’ve had a few people complain about the name “Foreign word of the day,” so if anyone wants to suggest a change, feel free to do so.
Summarising, I’m consulting the community on:
- whether terms in reconstructed languages (Proto-Indo-European, Vulgar Latin, Proto-Germanic, etc.) should be allowed to be foreign words of the day;
- whether terms in constructed languages that meet CFI (Esperanto, Ido, Lojban, etc.) should be allowed to be foreign words of the day;
- whether the feature’s name should be changed.
— Ungoliant (Falai) 14:18, 13 April 2013 (UTC)
- I support the eligibility of reconstructed languages, because they are some of our most interesting content. Naturally, for reconstructed terms we shouldn’t require pronunciation and should require a reference from a trustworthy source instead of citations.
- I support the eligibility of constructed languages that meet CFI. Don’t see why not.
- I oppose changing the name. I don’t find it offensive in any way whatsoever.
- — Ungoliant (Falai) 14:18, 13 April 2013 (UTC)
- I support the first two, and I kind of oppose the third because I don't see anything wrong with the current name. In Dutch, there is a nice word anderstalig, but I don't know if English has an equivalent word. Maybe that would be a good word to feature? :) —CodeCat 14:43, 13 April 2013 (UTC)
- I oppose the eligibility of reconstructed languages since they are by definition uncitable. That's why they're not in mainspace, too. I support the eligibility of constructed languages that meet CFI. I abstain on the issue of the name; I don't understand what could be offensive about it, though I can see it might be misleading, but I can't think of a better name besides "non-English word of the day" which sounds dumb. Incidentally, although you didn't ask, I also oppose allowing mentions rather than uses to count as cites in FWOTD nominations. I know that mentions are good enough for RFV when it comes to LDLs, but I think FWOTD ought to have higher standards than RFV/CFI. Note that FWOTD already requires pronunciations, even though nothing at CFI requires them. —Angr 14:51, 13 April 2013 (UTC)
- While I sympathise with your point, this would make it much harder to feature words from languages without contributors who speak them, like Kaingang and Quechua, and it’s already Indo-European dominated enough as it is. — Ungoliant (Falai) 15:16, 13 April 2013 (UTC)
- The trouble with allowing a single mention is that there's no protection against errors. If the single source we use for Kaingang or Quechua has a fictitious entry (whether deliberate or accidental) or even just a typo, then we are at risk of propagating that error if we don't confirm it elsewhere. Bad enough when that happens in any entry, but worse when it happens in an entry being featured on the main page. —Angr 17:08, 13 April 2013 (UTC)
- I vote per Ungoliant, although I also support the eligibility of terms in conlangs, which Ungoliant took no stance on. —Μετάknowledgediscuss/deeds 14:54, 13 April 2013 (UTC)
- I did. — Ungoliant (Falai) 15:16, 13 April 2013 (UTC)
- Sorry. Rectified above. —Μετάknowledgediscuss/deeds 03:37, 14 April 2013 (UTC)
- I vote per Angr. I'm undecided on whether the name needs to change; we don't have a great alternative, but I do understand why people might want to change it.--Prosfilaes (talk) 19:56, 13 April 2013 (UTC)
Not much point in voting against a title if there is no clear proposal for a replacement.
What exactly were the complaints against “foreign?” It’s not exactly offensive, but kind of ignorant when it’s a minority of English speakers who live in countries where other languages are truly foreign. Calling French a foreign language in Canada, for example, is incorrect and at least off-putting to a francophone Quebecker who takes his or her first or only language for granted as native.
What alternatives are there?
- foreign-language word of the day
- non-English word of the day
- other-language word of the day
- alterlingual word of the day (is there a real Latinate word?)
- alloglossal word of the day (ditto Greek?)
- interlingual word of the day
- international word of the day
- global word of the day
- world word of the day
- exotic word of the day
- other word of the day
—Michael Z. 2013-04-13 19:21 z [updated list —Michael Z. 2013-04-14 14:40 z]
- But suppose you are an anglophone Canadian who learned French. If someone asks you “do you speak any foreign language?”, isn’t “French” a correct answer? — Ungoliant (Falai) 19:45, 13 April 2013 (UTC)
- No? I would regard it as sloppy usage of the word "foreign" = from a different country. In any case, suppose you are a francophone Frenchman; why would French be foreign?--Prosfilaes (talk) 19:50, 13 April 2013 (UTC)
- Well, foreign also means “from a different language,” and many Canadians live with only one of the official languages, which is why such misunderstandings can happen.
- "From a different language" is not listed as a definition at foreign, and it doesn't sit right with me when ASL or Native American languages get lumped in as foreign languages, though the lack of a better term often means they do. At Distributed Proofreaders, we got in a habit of using "languages other than English (LOTE)", precisely because they weren't foreign to our site or users.--Prosfilaes (talk) 10:40, 14 April 2013 (UTC)
- Can we agree to enhance the name by moving WT:Foreign Word of the Day to WT:Foreign-Language Word of the Day? We can get used to that in a month or two and see if it still raises readers’ ire. And reconsider renaming if it appears warranted later? —Michael Z. 2013-04-14 17:47 z
- I support "non-English"; "foreign-language" strikes me as having pretty much all the problems "foreign" does.--Prosfilaes (talk) 07:10, 16 April 2013 (UTC)
Regarding the entry foreign, definition 2, example "eating with chopsticks was a foreign concept to him": Certainly, this use of "foreign" is not restricted to other cultures? Things can be "a foreign concept" to a person that has never met that idea before. I think good synonyms are "unfamiliar, unknown, strange", and that these should be added to the explanation. But English is not my native tongue. --LA2 (talk) 23:43, 14 April 2013 (UTC)
Increasing default font-size
I proposed this a couple of weeks ago, and had little feedback. Not sure if everyone doesn’t care or just didn’t notice. So I’m posting this reminder, and will change the site’s default font-size, shortly. —Michael Z. 2013-04-14 02:02 z
- It looks perfectly readable to me so I see no reason to change it. Why do you think it's too small? —CodeCat 03:04, 14 April 2013 (UTC)
- As I wrote in the original post, editors have used Common.css to enlarge the font for 54 languages and scripts, affecting thousands of entries. The discrepancies bug me. —Michael Z. 2013-04-14 05:14 z
- It is odd to me that the existing "default" font size for the site would not be the default for the user's browser, i.e. not medium. But Web designers seem to work upon contrarian principles of their own. Bigger is fine by me, but I hope it can be set to browser default rather than a hard-coded "what looks good on this year's monitors". Equinox ◑ 03:32, 14 April 2013 (UTC)
- I did the math. Browser default. For a preview, copy the first bits from my vector.css. —Michael Z. 2013-04-14 05:19 z
- I will update MediaWiki:Vector.css within the hour. Complaints welcome. —Michael Z. 2013-04-14 15:34 z
- I thought two BP discussions with no opposition would constitute consensus to try out a harmless improvement. Your single subjective opinion after a one-minute look at a major visual change doesn’t constitute any consensus or evidence either. Thanks for speaking for everybody. —Michael Z. 2013-04-14 16:18 z
- I'll be opting out anyway. I didn't like it at all. Mglovesfun (talk) 16:00, 14 April 2013 (UTC)
Could someone actually respond to the evidence I have cited, instead of blowing away a major change based on “I don’t like it,” without even using it? —Michael Z. 2013-04-14 16:19 z
- Sorry, I don't see anything that I would call "evidence". In the previous discussion you gave a list of putative advantages, but seemingly no "evidence" for them. (Perhaps you and I define the term differently?) At any rate, if you want people to reply to something specific, please indicate what. In particular, if you could highlight some part of your argument that would justify increasing the font size even if no one liked the result, that would certainly be interesting! —RuakhTALK 16:45, 14 April 2013 (UTC)
- The biggest objective evidence that our font-size is small is that other editors have been increasing it, to the tune of over 50 CSS declarations in our style sheet, the majority setting the font-size to the browser default. No one has mentioned any disadvantage of setting the font-size to the browser default.
- I’ve put in significant time doing research and testing, tried to outline my reasoning, and did my best to get feedback. Not one objection was made. Now, could someone here at least do me the courtesy of actually trying to use this for an hour or a day, instead of taking one glance, blurting out “I don’t like it” because it is different, and blowing off my effort completely? —Michael Z. 2013-04-14 17:00 z
- Re: "could someone here at least do me the courtesy of actually trying to use this for an hour or a day": http://en.wiktionary.org/wiki/User:Ruakh/common.css?diff=20160735. —RuakhTALK 17:23, 14 April 2013 (UTC)
- Thank you for that. Sorry to get cranky. I included a list of what I see as concrete advantages in my original proposal. I think things can be improved, and I would appreciate critical feedback. —Michael Z. 2013-04-14 17:50 z
- I've been trying out the larger size for the past several days. While it's more legible, there are other drawbacks. While this larger size may correspond to the "de jure" default browser size, it doesn't correspond to the "de facto" default size for web pages. Almost every other text-based website I look at has smaller text, much closer to the "traditional" Vector size. People get used to one font size on webpages and when they encounter something noticeably smaller or (as in the proposed new Vector size) much larger, it looks absurd. And more urgently, if we change the default Vector size here at English Wiktionary we're out of sync with every other Wikimedia project's Vector skin. I know that perfect unity isn't possible across languages, but at least every English-language project's Vector should look like every other English-language project's Vector. If I'm looking at Wikipedia, then at Wikisource, then at Commons, and then at Wiktionary, it's startling when Wiktionary's text is so much larger than everyone else's. And if I didn't know that it's that way because I deliberately set it that way on my own CSS page, I would be baffled and put off by it. —Angr 21:31, 17 April 2013 (UTC)
- Some good points I hadn’t considered in detail.
- Wikimedia branding. Indeed, most Wikimedia projects use 13px font size. I see that zh and ja Wiktionaries use 15px, Arabic, Pashto and Farsi 14px. However, explicit branding elements in the other projects vary a lot. Among Wiktionaries, even the site logos (!), home-page layout, use of tone and colour, icons, etc., vary wildly. The only thing all these sites have in common is the basic MediaWiki interface with grey and white background and blue rules. Also, the favicon is identical on all but cs. and en.Wiktionary. Choosing font-size for branding over readability would be poor prioritizing, when it would make an insignificant difference in the visual identity, but potentially a large one in readability. If we value our uniform branding at all, why don’t we coordinate site design, or unify even the most basic branding elements before compromising readability?
- The appearance of credibility. It’s true that 13px may be the most popular font-size,[12] but that isn’t a “de facto default” in any sense I can think of, nor does being widely used make it the best choice for anything specific.[13] A website doesn’t look smart or credible by picking the most popular font size for no other reason. It does it by considering the factors that font size affects, and choosing an appropriate size for the particular site. Increasing font size for over 50 languages while sticking to a 13px default looks “absurd” to me.
- Readability. As you say, a larger font than 13px is more legible. This is particularly true on both the extra-small and extra-large screens that more readers are using these days. Still more so for many of the language scripts we use, as we have concretely demonstrated in our style sheet.
- Accessibility. Overlaps with the above, but it should be mentioned that many of the designers of the average 13px websites have good eyes, good displays, and are poorly schooled in accessibility and internationalization. Many of these “average” websites are aimed at youthful or moneyed markets. Ours is the broadest possible audience, including non-native readers, aging, vision-impaired, impoverished, having only mobile internet access, etc. Failing to optimize readability harms segments of our audience that many other websites ignore.
- I still think any disadvantages of increasing font size are minor at worst, and far outweighed by the concrete benefits. —Michael Z. 2013-04-21 00:52 z
- Wikimedia projects' appearance varies widely from language to language, but not so much from project to project within a single language. When I had my Wiktionary font size set larger, I found it genuinely distracting to go from Wikipedia to Wiktionary because of the font size difference. It makes you notice the print rather than the content, which is a sign of poor typography. As for column width, as the page you linked to says, the solution is to define columns as being a certain number of ems wide, not to force the text to appear larger. But that too is something that ought to be done to all (English-language) projects' Vector skins, not just Wiktionary. —Angr 22:02, 23 April 2013 (UTC)
Tracking category for missing inflected forms
Feel free to let me know if there is a better way of doing this already in place, but an idea struck me recently upon seeing red links in inflection lines. I think that we should have a system to track these links, since they are either valid missing entries for inflected forms of lemma entries or incorrect inflections being displayed on entries (for example, words lacking plurals or different feminine forms where the editor has not changed the template's default behavior). In both cases, they should be actively dealt with, either by creating pages for missing inflected forms or correcting the inflection templates. This seems like low-hanging fruit, since it is simple work and a motivated editor could do dozens of these in a sitting, or far more with acceleration. It would be relatively simple to use the ifexist parser function so that pages with red links in their inflection templates are put in a maintenance category recording that, so that editors can come along and address them.
As an example of what I am talking about, I made an edit to {{es-adj}}, so that it now puts entries with red-linked feminine singular forms in inflection templates into Category:Missing Spanish feminine adjectives. Have a look at that category to see what I mean. There are 884 of these (as of now) being detected, which means potentially 884 missing entries just in looking at Spanish singular feminine adjective forms alone. Ideally, I think this kind of category could be useful across all of the inflection templates and all of the inflected forms they output, but I wanted to raise the idea here for comment. We may want to have broader categories than the "Missing Spanish feminine adjectives" one I created; maybe all entries with missing inflected forms should go in a single big maintenance category. Is this a useful idea? Dominic·t 07:02, 16 April 2013 (UTC)
- There is one major difficulty with that. To check whether a given page exists is considered "expensive" by the MediaWiki software, and we're limited to about 100 of those checks per page. Once a page reaches that limit, any remaining checks will return "does not exist". So, we can't use this too much on pages because there is a danger that it will break the page if overused. —CodeCat 12:43, 16 April 2013 (UTC)
- Agreed. This is a bot job; we just need to convince somebody like SB to take it on. —Μετάknowledgediscuss/deeds 13:55, 16 April 2013 (UTC)
- I would be more afraid of false positives from people not changing the inflection template defaults if we just created them all at once than I would be of pages which will hit the limit of parser function checks from adding this new one. Do we have any reason to think there would be many, or any, pages that would break? I am fairly sure the limit is actually 500 calls, not 100. That's a lot of inflection templates for one page. Also, once the limit is reached, it does not make the functions return false, creating false positives. It actually just refuses to expand the templates after the limit. Dominic·t 14:53, 16 April 2013 (UTC)
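As a rough illustration of the "bot job" alternative raised above, the check can be done offline instead of with ifexist inside the template. The following is only a sketch in Python: the function name `missing_forms`, the shape of the `inflections` mapping, and the idea of taking page titles from a precomputed set are all assumptions for illustration, not an existing Wiktionary tool.

```python
def missing_forms(inflections, existing_titles):
    """Return {lemma: [inflected forms that have no page]}.

    inflections     -- hypothetical mapping of lemma -> list of inflected forms
    existing_titles -- hypothetical set of page titles known to exist
    """
    missing = {}
    for lemma, forms in inflections.items():
        # A form counts as "missing" when no page with that title exists,
        # which corresponds to a red link in the inflection line.
        absent = [form for form in forms if form not in existing_titles]
        if absent:
            missing[lemma] = absent
    return missing


if __name__ == "__main__":
    # Toy data: "roja" has no page yet, "alta" does.
    inflections = {"rojo": ["roja"], "alto": ["alta"]}
    existing_titles = {"rojo", "alto", "alta"}
    print(missing_forms(inflections, existing_titles))
```

A list produced this way would still need human review, since a red link can mean either a missing entry or a wrong template default, as noted in the original proposal.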
암글
Can somebody delete 암글--wasn't sure how/where to ask? King jakob c 2 (talk) 20:47, 16 April 2013 (UTC)
Done. Thanks. Adding the {{delete}} template is enough. — Ungoliant (Falai) 20:58, 16 April 2013 (UTC)
Template term and lang parameter
I oppose template {{term}} requiring the "lang=" parameter, showing "???" before the term if the lang parameter is not provided. This change seems to have been introduced to the template today or yesterday by CodeCat (talk • contribs). An example of use of template "term" without lang parameter: physics. --Dan Polansky (talk) 08:07, 20 April 2013 (UTC)
- Something like this seems to have been discussed at Template_talk:term#lang. People should not use such obscure pages to discuss significant changes! --Dan Polansky (talk) 08:09, 20 April 2013 (UTC)
I feel the same way. Ƿidsiþ 08:52, 20 April 2013 (UTC)
- Why do you oppose it exactly? Not specifying the language leaves many problems: the link does not link to the correct section, the script template is not applied, and the word is marked in HTML as English (which creates usability problems). I wonder what justification there can be for ignoring those problems. —CodeCat 12:33, 20 April 2013 (UTC)
- This change breaks many, many, many discussion pages. -- Liliana • 12:36, 20 April 2013 (UTC)
- I don't think displaying a small notification really breaks anything. It's just a friendly reminder that something is missing and needs to be corrected. I don't know how to make it less obvious without making it so unobvious that nobody sees it. —CodeCat 12:39, 20 April 2013 (UTC)
- Others' posts should never be edited, even in case of incorrect syntax and such. At best, this should be restricted to the main namespace. -- Liliana • 12:41, 20 April 2013 (UTC)
- We've edited or broken people's posts in the past. Whenever a template is deleted, if that template is used in a past post, deleting it will break the page, but we do it anyway. In some cases we've replaced the template with an equivalent, but in other cases the pages remain broken. For example look at the transclusions of {{hr}}; some were replaced by "sh" but some still remain. Similar with {{zh}}. This isn't really any different. We can't always guarantee backwards compatibility, and indeed we shouldn't try to go too far out of our way for it. —CodeCat 12:52, 20 April 2013 (UTC)
- @CodeCat: Naturally, I am not opposing using "lang=" for non-English languages to add script, and whatnot. I am opposing making "lang=en" mandatory for English. What you wrote does not seem to apply to English terms without lang=: "the link does not link to the correct section, the script template is not applied, and the word is marked in HTML as English". What I am saying is, if there is no lang=, let "term" template assume the term is English, as it did before your edits. --Dan Polansky (talk) 13:09, 20 April 2013 (UTC)
- I think you are a bit mistaken. It always has been mandatory, because specifying lang=en has never, in the history of the template, been equivalent to specifying no language. So it never assumed that the term is English, not before my edits and not after them. That is one of the biggest flaws in this template in particular, which others (which do default to English) never had because they were created properly from the beginning. The result is that we now have thousands of entries that use this template both for English and for many other languages, without specifying which. Simply changing the template so that English is the default is therefore not an option, because it would not be correct for the many thousands of non-English words that lack a language. The only option that I know of is to mark lack of a language as an error so that it be corrected. I am currently running a bot to correct some of the most obvious ones (uses where the {{term}} template is preceded by {{etyl}}, which allows the bot to figure out the correct language), but there are still many many more that need to be fixed. —CodeCat 13:19, 20 April 2013 (UTC)
- Re: "It always has been mandatory, ...": That seems incorrect. If lang= really were mandatory, the template would complain of a missing parameter. The parameter could only have been "mandatory" in a sense that I do not know. --Dan Polansky (talk) 13:24, 20 April 2013 (UTC)
- What I meant is that the template doesn't do what it should do if the language is left out. The correct behaviour, when lang=en is given, is to use Latn as the script, "en" as the language, and link to the English section. But when no language is given, it uses None as the script, "" as the language, and links to no section. Therefore, to correctly link to English terms, the language is mandatory. —CodeCat 13:29, 20 April 2013 (UTC)
- All of that is irrelevant. This is one of our most heavily-used templates, especially by our less-template-sophisticated editors. Changes that significantly affect its behavior should be discussed thoroughly in an appropriate venue before being implemented. Most of the people who use it aren't going to have a clue what the ??? means, and a good many won't know where to go to find out. There should have been some steps taken to educate people before implementing it. Chuck Entz (talk) 16:24, 20 April 2013 (UTC)
- There is a help message when you hover the cursor over it. That may not be entirely obvious, but actually writing the message out would look really bad and would have made even more people angry. The real "education" has been in {{term}}'s documentation, which I presume is the proper place to put it. —CodeCat 17:03, 20 April 2013 (UTC)
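For readers following the technical side of this thread, the difference CodeCat describes between `{{term|word|lang=en}}` and `{{term|word}}` can be modelled in a few lines. This is only an illustrative Python sketch, not the template's actual source: the `LANGUAGES` table, the `TermLink` record and the `render_term` function are hypothetical names, and the `word#Section` link form is assumed for the "link to the English section" behaviour.

```python
from dataclasses import dataclass

# Hypothetical lookup table: language code -> (section name, script code).
LANGUAGES = {
    "en": ("English", "Latn"),
    "nl": ("Dutch", "Latn"),
    "ru": ("Russian", "Cyrl"),
}


@dataclass
class TermLink:
    target: str     # page title, optionally with a #Section anchor
    lang_attr: str  # value used for the HTML language marking
    script: str     # script code applied to the displayed term


def render_term(word, lang=None):
    """Model what a term link resolves to, with and without lang=."""
    if lang is None:
        # No language given: no section anchor, empty language, script "None".
        return TermLink(target=word, lang_attr="", script="None")
    section, script = LANGUAGES[lang]
    # Language given: link to that language's section and apply its script.
    return TermLink(target=f"{word}#{section}", lang_attr=lang, script=script)


if __name__ == "__main__":
    print(render_term("physics", "en"))
    print(render_term("physics"))
```

The sketch makes the point of contention visible: the no-language branch is not the same as the `lang="en"` branch, so treating a missing parameter as English would change what existing non-English uses resolve to.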
Support the change.
No one has changed any discussion pages, but if you want your talk posts to continue looking the same, don’t leave live templates in them. Use subst. —Michael Z. 2013-04-20 23:24 z
- Totally support making lang= obligatory, but we should wait until the bot run is over before displaying the ???s, and not display them at all outside the content namespaces. — Ungoliant (Falai) 23:53, 20 April 2013 (UTC)
- The bot doesn't really have anything to do with the ??? either, the bot works from a category that can be added or removed independent of the question marks. But from the way the bot is running now, it's not really making a serious dent in the amount of pages. It is making the occasional change but it's skipping most of the pages in the list without doing anything (because it sees no change it can make). There were around 45 thousand pages in the list when it started, and I expect it won't be able to get rid of more than a few thousand of them currently; it's at 41 thousand now. —CodeCat 00:14, 21 April 2013 (UTC)
- But in this revision of warlock the term lie has ???s, and after the bot edit it doesn’t. — Ungoliant (Falai) 01:01, 21 April 2013 (UTC)
- That's true, but that's only because the bot has changed something that happened to both remove the ??? and remove it from the category. What I am saying is, the bot works from the category, and the ??? doesn't influence that. If we removed the ??? the category would still be there, and we could also put in ??? and remove the category. —CodeCat 01:06, 21 April 2013 (UTC)
- But the bot does influence the ???s. What I was saying is that we should wait for the bot run to be over before displaying them, because there would be no benefit displaying something that makes our entries look bugged when it’s going to be automatically fixed soon enough. But I changed my mind, since the bot isn’t going to make a serious dent (unfortunately). — Ungoliant (Falai) 01:18, 21 April 2013 (UTC)
- Support considering lang= obligatory (meaning only that it must be present: I think it's fine for it to be explicitly blank, but English should be lang=en), probably oppose whatever "bot run" Ungoliant and CodeCat are referring to (it doesn't seem like it was ever discussed or approved?), weakly support some sort of visual indication of missing lang= once that's rare (though I'd strongly support such a visual indication if it were visible only to admins and opters-in), and oppose distinguishing content namespaces from non-content namespaces in this respect, since that will just make it harder for editors to learn what they're supposed to be doing. —RuakhTALK 00:30, 21 April 2013 (UTC)
- The bot run is adding lang= to uses of {{term}} in etymologies where it can use a preceding {{etyl}} template to determine the correct language. Basically, it's replacing {{etyl|xx|yy}}{{term|word}} with {{etyl|xx|yy}}{{term|word|lang=xx}}. It didn't seem like a very controversial change. —CodeCat 00:35, 21 April 2013 (UTC)
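The replacement described above could be sketched roughly as follows. This is not the bot's actual code, which is not shown in this discussion; the regular expression, the function name `add_lang` and the decision to skip uses that already carry lang= are assumptions made for illustration.

```python
import re

# Match an {{etyl|xx|...}} template immediately followed (optionally after
# whitespace) by a {{term|...}} template that contains no nested templates.
ETYL_TERM = re.compile(
    r"(\{\{etyl\|(?P<src>[^|{}]+)(?:\|[^{}]*)?\}\}\s*)"
    r"\{\{term\|(?P<args>[^{}]*)\}\}"
)


def add_lang(wikitext):
    """Add lang=xx to {{term}} when a preceding {{etyl|xx|...}} supplies xx."""

    def fix(match):
        args = match.group("args")
        params = [p.strip() for p in args.split("|")]
        if any(p.startswith("lang=") for p in params):
            # A language is already specified; leave this use untouched.
            return match.group(0)
        src = match.group("src").strip()
        return match.group(1) + "{{term|" + args + "|lang=" + src + "}}"

    return ETYL_TERM.sub(fix, wikitext)


if __name__ == "__main__":
    print(add_lang("From {{etyl|la|en}} {{term|aqua}}."))
```

A pattern like this deliberately leaves alone any {{term}} that is not directly preceded by {{etyl}}, which is consistent with the report above that the bot skips most pages in the list.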
Perhaps we should have a class of error messages that are hidden from readers but displayed for all logged-in editors. —Michael Z. 2013-04-21 01:18 z
- That might both be a good idea and a detrimental one. {{nl-noun}} shows "error" messages when some of its parameters are missing, and calls on the viewer to provide them. Since those messages were added to the template, I have seen quite a lot of editors - IPs, newly registered and experienced alike - take the messages to heart and provide the forms. We even have an editor, User:DrJos, who registered specifically to provide the forms and has now made it his life's work to fix them all. :) So I would say that's first-hand evidence that this kind of notice not only works, but it even gets IPs to lend a hand. So if we decide to hide these requests from IPs, we will be losing some of the editors who might help out. —CodeCat 01:26, 21 April 2013 (UTC)
- Don't forget that we also serve a lot of site visitors who don't edit and have no idea what "lang=" is. Why should some 10-year-old doing his or her homework have part of the content replaced by ??? so you can send a wake-up call to someone else? Are we the "dictionary that anyone can edit", or "the dictionary that everyone has to edit"? Chuck Entz (talk) 02:49, 21 April 2013 (UTC)
- Re: " […] part of the content replaced by ??? […] ": That's a straw man, since the version with ??? still has all the same content. (The ??? appears before the term, not instead of the term.) Maybe you meant to say that the 10-year-old would think that the ??? had replaced actual content? —RuakhTALK 03:43, 21 April 2013 (UTC)
- My mistake. I had already forgotten what the actual effect was, having only seen it on one page. Although I obviously overstated the effect, it still seems a bit much to clutter the main body of the text used by non-editors with stuff aimed strictly at editors. It might indeed cause concern among non-editors that something was broken that they didn't know how to fix.
- I'm not opposing the eventual implementation of such a change, just the massive scale of the change combined with the lack of effort taken to get consensus and to get feedback about what effect it might have, let alone to prepare people for it. Something that noticeably changes the appearance of a significant percentage of our millions of entries should require more than a general mention of the principle behind it here and there, followed by a discussion on the template talk page that only a very few would even know about. Chuck Entz (talk) 04:15, 21 April 2013 (UTC)
- I have rolled back CodeCat's edits to {{term}} because currently, it seems that only CodeCat and Michael support the ???s, whereas Dan, Widsith, Liliana, Chuck, and I oppose some aspect of CodeCat's edits altogether, and Ruakh and Ungoliant do not support putting in the ???s until non-lang-specified uses become much more rare. That's only 22% of editors in support so far. This is why we need to have BP discussions before making sweeping changes to the interface as readers view it. —Μετάknowledgediscuss/deeds 02:03, 21 April 2013 (UTC)
- Not read the whole discussion, but do we need '???'? Is there any way of making these stick out less like a sore thumb, this is a dictionary after all, readers come here for lexical information, not to correct wiki syntax. PS there is a line in User:Mglovesfun/vector.js that converts {{term|foo}} into {{term|foo|lang=en}}. Mglovesfun (talk) 09:45, 21 April 2013 (UTC)
- I think that so far, the majority of people in this discussion agree that it's a good idea to make sure {{term}} always has a language code. But that immediately brings up the question, how do we get there? Even if people want to add a language where it's missing, how can they do it? The reason why I added ??? was that it would make it obvious to editors that something needs fixing there. Making the problem visible and apparent is the first step towards fixing it, and that has been a real problem before. I also argued that showing a similar message on {{nl-noun}} has indeed helped to make the problem visible and therefore has led to more people fixing it. The bot I am running is helping, but it can only do so much; it has almost passed over all entries with a missing language but it has only managed to fix about 10% of the total (from 45 thousand to 40 thousand). A bot could never fix the majority of the entries that remain. So I suppose the real goal of this discussion is: if at least some of us agree that adding a language in all cases is a good thing, what can we do to make that happen and make it happen more quickly? If adding ??? to the entry is not the right way, then what is? —CodeCat 12:06, 21 April 2013 (UTC)
- If there isn't a bot solution for the remaining 90%, then I guess we'll just have to use MG's JS (or a modified form of it) on every page we're already editing. The reason why the ???s don't work is that instead of solving the problem, they create a new one. It looks messy and unprofessional, and users have to go for an unintuitive tooltip to find what's gone wrong. (Don't get me wrong, I love xkcd, but tooltips are not what people try first upon seeing a cryptic message.) This is not an acute crisis, so if a chronic solution is the best we have, so be it. —Μετάknowledgediscuss/deeds 14:39, 21 April 2013 (UTC)
- What exactly does the script do? Blindly adding lang=en is not correct... if it were, we probably would have done that already. I think there is one approach that we could try in the long term. If we could weed out all the uses that are not English (which are presumably a minority) then it becomes more feasible to add lang=en to the remainder. Using Lua, we might be able to recognise some of the languages, and we can use other means as well. For example, anything with {{polytonic}} as the script is bound to be Ancient Greek (and that template even sets lang="grc" if nothing is provided), so adding lang=grc whenever sc=polytonic is present is safe. Adding lang=got where sc=Goth is also safe, and many other scripts are only used for one language so we can derive the language from the script. We can also look at the characters in the term being linked to. Templates can't recognise which characters a word consists of, but Lua can. So if a word contains, say, Hiragana or Cyrillic, we can be pretty certain it's not English. We could also separate out calls to {{term}} that use Latin characters that are not used in English, like å. Granted, none of those approaches is absolutely failsafe, but it would probably be right more than 99% of the time, and it would make it much easier to chip away gradually at the number until it becomes more manageable. And making a few mistakes (marking a link with the wrong language) is not serious, especially not considering that currently 40000 are marked with the wrong language (it can only get better!). —CodeCat 14:55, 21 April 2013 (UTC)
- I was imagining that most would be English, and then it would be easy for a human to scan it and fix the langcode if necessary. I don't know what percentage the script/character method can handle, but I'm sure it's noncontroversial for you to attempt it. —Μετάknowledgediscuss/deeds 14:59, 21 April 2013 (UTC)
- I don't know how many there would be either, but I can add an invocation to a module (that needs to be made) which would add a category to the page when lang= is not present. That module can then decide to add the page to different categories depending on other factors like the script code or the characters in the word. The number of entries in each category would then be used to gauge what needs to be done. And even if one category contains only a few hundred entries, that's still a few hundred fixed and done. Every little bit helps, and we'll need to do this in little bits. :) —CodeCat 15:05, 21 April 2013 (UTC)
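(Editorial illustration, not part of the original thread.) The sorting idea described above could look roughly like the following. This is only a sketch in Python rather than Lua, and the function name `classify`, the mapping table, and the category labels are all hypothetical; only the two script-to-language pairs (polytonic to grc, Goth to got) come from the discussion. It treats any non-ASCII Latin letter as "not used in English", which is a deliberate simplification.

```python
import unicodedata

# Script codes that, per the discussion, imply exactly one language.
# Only these two pairs are taken from the thread; the table is illustrative.
SCRIPT_TO_LANG = {"polytonic": "grc", "Goth": "got"}


def classify(term, sc=None):
    """Sort a {{term}} call that lacks lang= into a rough bucket.

    Returns a hypothetical category label, not a real Wiktionary category.
    """
    # 1. A script code used for a single language settles the question.
    if sc in SCRIPT_TO_LANG:
        return "lang=" + SCRIPT_TO_LANG[sc]

    # 2. Otherwise inspect the letters of the term itself.
    for ch in term:
        if not ch.isalpha():
            continue  # ignore spaces, hyphens, digits, punctuation
        name = unicodedata.name(ch, "")
        if not name.startswith("LATIN"):
            # e.g. HIRAGANA LETTER ..., CYRILLIC SMALL LETTER ...
            return "non-Latin script"
        if ord(ch) > 0x7F:
            # e.g. LATIN SMALL LETTER A WITH RING ABOVE
            return "non-English Latin letter"

    # 3. Plain ASCII letters: probably English, but a human should check.
    return "undetermined"


if __name__ == "__main__":
    for term, sc in [("λόγος", "polytonic"), ("ひらがな", None),
                     ("må", None), ("term", None)]:
        print(term, "->", classify(term, sc))
```

As CodeCat notes, none of these checks is failsafe; the point of such buckets would be to size each group before anyone edits entries, not to assign languages automatically.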
How about an error message like the one after this term [! Editors: the preceding term template lacks a language code]. Visible to all, relatively unobtrusive, self-explanatory and ignorable. The copy could be made more accessible; it should convey that an improvement is needed but doesn’t affect the accuracy of the information. —Michael Z. 2013-04-21 15:57 z
- How about Greek όρος (óros, “term”) [! Editors: the preceding term template lacks a language code]. I think it belongs after the whole template, because it refers to that construction. If it were before the brackets, it would look more like it was referring to the term itself.
- If we wanted a bit more urgency and context, a background appearing on hover or expand could tie it all together like Greek όρος (óros, “term”) ⊕ Editors: this term template lacks a language code. —Michael Z. 2013-04-21 23:27 z
- Can you point me to one, or tell me how to generate one? I remember something like that, but now when I try to save a module with a script error, I just see the big red box at the top of the page. —Michael Z. 2013-04-22 00:50 z
- You can look in Category:Pages with script errors. —CodeCat 01:03, 22 April 2013 (UTC)
Support making lang mandatory. It may also be possible to include automatic transliteration later. Perhaps rather than "???", it should say "which language???". --Anatoli (обсудить/вклад) 23:52, 21 April 2013 (UTC)
- I don't like calling it an error. For one thing, it's beside the point and adds extra verbiage, but mostly, it gives the impression that things are falling apart. I would suggest following the lead of some of our rf- templates: "This term template is lacking a language code. If you know it, please add it as a lang= parameter". Still verbose, but it would only show on hover. The symbol should be something small and innocuous, like the one Michael suggested above, or maybe a bullet (•). Even the question marks might not be so bad- as a trailing superscript. Or how about: όρος (óros, “term”)[→?] (I'm sure there are attributes that would make it look more like a live control, but you get the idea). Chuck Entz (talk) 02:45, April 21, 2013 (UTC)
- I have no strong opinion on the message and the format of the warning. Whatever the community decides but making the "lang" mandatory is important, otherwise, just use square brackets or something. I also think {{etyl}} should have the second parameter mandatory as well. Otherwise, people just add to English loanwords, even if they mean another language. --Anatoli (обсудить/вклад) 02:56, 22 April 2013 (UTC)
- There are a good number of uses of etyl with a null lang= parameter, as a way of standardizing the language name. I even used to do it myself, before I was aware of things like template overhead. I suppose they would be pretty easy to locate and subst out using a bot, though. Chuck Entz (talk) 03:30, 22 April 2013 (UTC)
I support keeping lang mandatory (i.e., not defaulting to English) for now, fixing transclusions, and then defaulting to English. Or just keeping it mandatory. But I oppose any error message visible to not-logged-in users. This is a technical error, not a content error: it is a missing language parameter in the HTML, not a missing etymology or pronunciation. There's no need for visitors to see the error message.—msh210℠ (talk) 05:12, 22 April 2013 (UTC)
I oppose CodeCat's recent change to {{term}} for "keeping lang mandatory" (msh210) and for using ??? as "just a friendly reminder that something is missing" (CodeCat), as it is a little well- and a lot ill-done. What's that ill-done at all? First of all, void is "just a friendly reminder." Liliana wisely noted: "At best, this should be restricted to the main namespace." So did Chuck Entz.
CodeCat's change made all the past discussions look so ugly, freckled with so many ???, looking like subduing again the "global readership at CodeCat's disposal" (User:KYPark/mulberry, 16 April 2013), doing without due consensus again; again and again in my cases! As I've discussed most for a year with {{term}} heavily used, mine must look ugliest so that CodeCat looks like aiming at me, megalomanically speaking.
Again, I'd attend to Liliana saying to CodeCat: "Others' posts should never be edited, even in case of incorrect syntax and such." This should be so because so vital and prior is the global readership of end users as end judges. Just unjust would be the interference or intervention of unequal, intermediary administrators with "others' posts" being arbitrarily edited. Delete could be the worst edit. My posts are supposed to suffer the worst blocking in effect. I wish CodeCat and others could learn a lesson from this happening.
--KYPark (talk) 12:46, 22 April 2013 (UTC)
- I'm sorry if I can't take your arguments seriously if you're turning this into a personal vendetta against me. Please go and do something more useful. —CodeCat 12:52, 22 April 2013 (UTC)
- How dare you accuse CodeCat of “aiming at” you? Whether the changes to {{term}} were ultimately good or not, she is just trying to improve Wiktionary. The world doesn’t revolve around you. — Ungoliant (Falai) 14:28, 22 April 2013 (UTC)
- If you're legitimately arguing against the change to the term template (belatedly, since it's already reverted), it would be best not to bring your own disputes with CodeCat into it, since I don't think anyone agrees with your assessment of them- some may disagree with her methods, but I don't know of anyone who doesn't sympathize with her reasons for doing what she's been doing. If you're trying to use this discussion as a forum for complaining about that matter, please don't. You'll just get people annoyed at you for cluttering an already-too-long discussion with unrelated issues. Chuck Entz (talk) 14:31, 22 April 2013 (UTC)
Editors are always wanted to do their best, say, even in using so many hard templates. Such is simply an ideal, esp. of wikis, more or less away from the reality in theory and practice. Rational choice theory is a mere theory heavily counter-balanced by bounded rationality.
- The better editorship, the better readership. Both go together in concert. Much easier is to interfere with editorship than readership, ill or well. Liliana would advise CodeCat not to interfere (too much or trivially) with editorship in discussion, and I would with readership. It is regrettable indeed if the past discussions remain freckled with so many a ???, only to be hardly corrected in response to the "friendly reminder".
- If ''[[term]]'' is valid without adding #English, then {{term|term}} is valid as well without lang=en. For someone else to edit to add such additives, esp. in discussions, is overdone or ill-done, I fear, as far as I understand Liliana. Why? We can only talk more or less perfectly, hence either strength or weakness worth to be archived as given.
- Anyway the technical phase of mess and fuss is over; the majority deny mandatory lang=. Yet, so ain't the moral phase behind that, to be taken seriously at least right here. I'd argue it is painfully arbitrary and immoral to ignore the priority global readership and do without the due community consensus, repeatedly. The validity of my argument should not be upset by such a "tail of speech" as taking my own examples, however double-barreled I may look.
- --KYPark (talk) 05:04, 23 April 2013 (UTC)
It depends on too many factors to sum up easily! I prefer to talk in detail, focuslessly, not always wisely, while Liliana may prefer the short cut. The shorter speech, the more penetrating, like the proverb. You could speak only of the first thing first. And we could wisely interpret what is implied below the tip of the iceberg. Anyway, Liliana made me a perfect, most impressive sense, I guess. Fair enough? --KYPark (talk) 06:42, 23 April 2013 (UTC)
"Do you think you could sum [it] up in a simple sentence?" That is, for me to respond to three most unfriendly plus you unwittingly unleveling with them? Even omniscient and omnipotent God couldn't do so in human language of all imperfection, I guess. This ridiculous fuss was caused by making non-mandatory lang= mandatory, arbitrarily, as if the global readership and editorship should be at CodeCat's mercy! Originally, Z wished to do without that boring parameter, and suddenly CodeCat complicated it "horrible" (Z) for some reasons, said and unsaid. This is a genius for making one out of another at will. Such was the case with WT:Beer parlour/2013/March #Wiktionary:Etymology scriptorium/March 2013. Incidentally, the above three were most responsible for so doing. Assisted by Ungoliant, CodeCat did make a horrible ending out of Chuck Entz's unwitting beginning, remindful of the idiom make a mountain out of a molehill. This case implies too much for me to keep from saying much more, yet ...
- --KYPark (talk) 07:38, 28 April 2013 (UTC)
- Re: "Do you think you could sum that up in a simple sentence?". From all available evidence, no. Nothing will ever be said simply that can be bloated with tables, graphics, massive blocks of text taken from other pages, rambling discourses in poor English, etc. CodeCat started moving his more irrelevant topics from the Etymology Scriptorum to his user space, so now he's roaming through the discussion pages looking for any excuse to take potshots at her. Chuck Entz (talk) 15:43, 28 April 2013 (UTC)
- The really sad thing is that I warned him some time ago against annoying everyone else so much that he'd lose sympathy, but it looks like he's doing just that. —CodeCat 16:05, 28 April 2013 (UTC)
Support the change, but "???" looks horrible, I prefer "[language code?]". --Z 12:07, 25 April 2013 (UTC)
- This is not the right place for you to blame me, but perhaps over there. I am just staying here to inform you where you'd better respond and trivially to know from you what is "lolwut" at all in English. No reason to stay any more. --KYPark (talk) 15:18, 28 April 2013 (UTC)
- lolwut is equally English as the next term, deal with it. Also, I support the change. User: PalkiaX50 talk to meh 15:30, 28 April 2013 (UTC)
- Oh really? Ungoliant looks like a genius for pushing WTF, lolwut, etc., to me of en-2 so as to embarrass me and perhaps to delay. Anyway why do you jump in on his behalf? I don't really care you support the change but the arbitrary change at CodeCat's mercy, as if the world should revolve around CodeCat. Do you like that way? --KYPark (talk) 16:04, 28 April 2013 (UTC)
- Oh please, I never said (nor was I implying) that Ungoliant is a genius. Secondly, I am not specifically highlighting to you that I support I just decided to say seeing as others have opposed and supported as well. User: PalkiaX50 talk to meh 16:22, 28 April 2013 (UTC)
- I think I have a solution that would please everyone. With CSS, we can change the formatting of a word depending on whether it has a language or not. So, {{term}} could be changed so that it applies a CSS class to the text when no language has been specified. That way, everyone can decide individually how they want the "error" to appear to them, while the default would just appear as normal. So it would be opt-in and customisable for each user. Is that ok? —CodeCat 14:19, 9 May 2013 (UTC)
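(Editorial illustration, not part of the original thread.) A rough sketch of the opt-in idea, assuming the template's output can be modelled as a small HTML fragment. The function `render_term` and the class names `mention` and `term-nolang` are hypothetical and are not the markup {{term}} actually emits.

```python
from html import escape


def render_term(term, lang=None):
    """Render a mentioned term as an HTML fragment.

    When no language is given, a marker class is added instead of any
    visible text, so the default appearance is unchanged.
    """
    classes = ["mention"]  # hypothetical base class
    lang_attr = ""
    if lang:
        lang_attr = ' lang="' + escape(lang, quote=True) + '"'
    else:
        classes.append("term-nolang")  # hypothetical marker class
    class_attr = " ".join(classes)
    return '<i class="' + class_attr + '"' + lang_attr + ">" + escape(term) + "</i>"


if __name__ == "__main__":
    print(render_term("foo"))
    print(render_term("foo", "en"))
```

With no site-wide rule for the marker class, readers would see nothing unusual; an editor who wants the reminder could add a rule such as `.term-nolang { background: #ffd; }` to a personal stylesheet.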
Portuguese reflexive verbs
I have just added compadecer-se, but have no idea how to show its inflections. There is nothing in Wiktionary:About Portuguese and no obvious templates. The entry in Portuguese Wiktionary has no conjugation table. Any ideas? SemperBlotto (talk) 10:50, 21 April 2013 (UTC)
- I don't know Portuguese, but in general, it is worth considering whether to direct the reader from compadecer-se to compadecer, along the lines of mračit se directing the reader to mračit. Nonetheless, as regards reflexive forms, different languages seem to use different approaches. Portuguese entry dirigir-se directs the reader to dirigir for conjugation, as does encaminhar-se. --Dan Polansky (talk) 14:55, 21 April 2013 (UTC)
- What do you do in cases where the non-reflexive verb doesn't exist? Then there is no entry to direct the reader to. On the other hand, let's imagine dirigir didn't exist and was not attestable, only dirigir-se. Then dirigir-se would have to have a conjugation table. But what should it contain? Suppose that it contains forms with the reflexive particle attached, so that it has te diriges. Then that would violate "all words in all languages" because diriges gets no entry, and that would confuse users who don't realise that "te diriges" is one term. Suppose on the other hand that the table instead displays te diriges, linked separately. Then we're faced with another dilemma: what would the entry diriges contain? It can't say "second person singular present of dirigir-se" because that's not correct, "te diriges" is the second person singular of dirigir-se, not "diriges". On the other hand, it can't be "second person singular present of dirigir" either, because dirigir doesn't exist. —CodeCat 15:37, 21 April 2013 (UTC)
- (after edit conflict) For Czech, I always create a non-reflexive entry even if all its uses are reflexive. Thus, for "mračit se", the definition is at mračit, where "se" is stated on the definition line; "mračit" is always used with "se". As for inflected forms, there would be e.g. mračila. Note that, in Czech, the reflexive particle se or whatever that is is separated from its verb, as in "pořád se na něho mračila", so I do not see it necessary to have mračila se as an inflected-form entry. --Dan Polansky (talk) 15:51, 21 April 2013 (UTC)
- Yes, it's awkward, isn't it. In Italian, we hard code the pronoun in the inflection table (with no wikilink) and wikilink the inflected verb (even in the few cases in which the non-reflexive form doesn't exist (Hmm)). In French, we redirect the "pronoun + infinitive" to "infinitive". SemperBlotto (talk) 15:42, 21 April 2013 (UTC) (See lavarsi and se laver as typical of these)
- In Dutch we don't have separate entries for reflexive verbs either. But that may not really be the best idea for all languages, because in some there is no space to separate the particle from the verb. Spanish and Portuguese are examples, but Catalan also has many pronouns that contract with the verb when next to a vowel (like in French). So Catalan might have adormir-se with the form m'adormo, and the imperative of acostumar-se is acostuma't. —CodeCat 16:46, 21 April 2013 (UTC)
- My practice has been using:
====Conjugation====
See {{l/pt|compadecer}}.
- Listing each combination would be too messy. A verb form like compadeceria can give se compadeceria, compadeceria-se and compadecer-se-ia. — Ungoliant (Falai) 17:09, 21 April 2013 (UTC)
- Hmm, Czech and Dutch verbs don't have entries for reflexive verbs but Polish and German do (not too many). Russian reflexive verbs are included because they are always spelled together and have variations in stress (на́чался or начался́) and the actual particle (-ся and -сь) can be different, -ся - after consonant, -сь - after vowel. I think it would be beneficial to have entries for reflexive verbs in Portuguese and other languages, even if as a soft redirect. --Anatoli (обсудить/вклад) 03:26, 24 April 2013 (UTC)
Should Wiktionary really include entries for characters?
Dictionaries are normally about words, and not the things that those words refer to. A definition on Wiktionary is therefore mainly concerned with giving enough information so that someone who is familiar with the referent knows that the word refers to it. So the goal of Wiktionary is not to describe in detail what the thing is that a word refers to. That's encyclopedic information, and belongs on Wikipedia. When you look at letters and other characters, it's really the same thing. When seen as a character in themselves, they are symbols and aren't really any different from, say, a triangle or a sine wave. They're concepts, not words. Presumably, Wiktionary has decided to include them because they form words, but I'm not sure if that is the best decision. It is definitely lexicographical to say C is pronounced /siː/ and has the plural Cs. But is it really lexicographical to say "C is the 3rd letter of the English alphabet"? I don't think it is, because that definition refers to the symbol C itself, not to its use as a lexical term. Definitions should say what something means, not what it is. C might indicate the third of a sequence, but that's what it means, not what the letter C actually is, so not quite the same. Similar for the etymology: describing where the shape of the letter C came from doesn't strike me as particularly dictionary-worthy. So I would like to ask whether this should be reconsidered? —CodeCat 20:14, 23 April 2013 (UTC)
- Amusingly, WT:CFI doesn't specifically include alphabetic characters. It mentions "characters used in ideographic or phonetic writing" but no mention of syllabic or alphabetic characters is ever made. Do what you want with this bit of trivia. -- Liliana • 20:18, 23 April 2013 (UTC)
- I agree. In addition to not being lexicographic information, I see the following issues:
- Because scripts like Latin and Cyrillic are used in many languages, entries for characters in those scripts end up being excessively large.
- Who is the target audience of character entries? My best guess is people who are starting to learn a language. In that case, it is much better to have per-language appendices containing “entries” for every character of a language.
- — Ungoliant (Falai) 21:09, 23 April 2013 (UTC)
- I'm not sure if my suggestion would really make the pages a lot smaller. In every language, letters still have a pronunciation, which is definitely material for a dictionary. —CodeCat 21:15, 23 April 2013 (UTC)
- I’d move the pronunciations to the appendix pages I suggested as well. — Ungoliant (Falai) 21:19, 23 April 2013 (UTC)
- I don't mind including characters like letters and punctuation that are used linguistically. But I really don't see any lexicographic value in having entries for things like →, ∟, ┌, ▒, and ☺. I once removed ⍾, ⎙, and ⎆ from WT:Wanted entries because they aren't words, but I got reverted. I still don't think they're dictionary-worthy, though. —Angr 22:11, 23 April 2013 (UTC)
- There is a relatively small universe of such terms. With respect to those that are actually used as components of words, that seems to me to be lexical significance enough to keep them - even if most did not have additional meanings capable of being reported in a dictionary. bd2412 T 01:45, 24 April 2013 (UTC)
- I don't see why we shouldn't. They are generally included in single language dictionaries, and they are lexical information, not encyclopedic. Even the non-letter characters I find useful, convenient to be able to look up like words.--Prosfilaes (talk) 08:48, 24 April 2013 (UTC)
- I would rather keep entries for letters of Latin alphabet. Even for →, it is kind of nice to find the Unicode code point for the symbol in Wiktionary. So I would rather keep all Unicode codepoints. --Dan Polansky (talk) 20:00, 24 April 2013 (UTC)
Yes, dictionary entries are for terms, including names, but not for things. Most of the letter entries would remain, because in English, at least, A is the name of the letter A, among a few other senses or subsenses.
But punctuation marks and diacritics certainly aren’t words, nor are mathematical and logical symbols. Just look at any professional dictionary, and see what is included as entries, and what appears in tables and appendices.
A “Unicode code point” isn’t even a character, it is an encoded representation of a character. We don’t have an entry for the code point U+0041, any more than we should have one for Morse code “dot-dash” or for the corresponding signal flag, or the CDC 1604 key-punch card code 31 – these are all ways to encode the letter A, and not lexical entities in themselves. —Michael Z. 2013-04-25 20:39 z
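(Editorial illustration, not part of the original thread.) The distinction between a character and its encodings can be made concrete with a few lines of Python. The helper names `describe` and `encodings` are hypothetical; the Morse value is the "dot-dash" mentioned above, and the rest uses only the standard `unicodedata` module.

```python
import unicodedata

# A one-entry Morse table, just enough to show the letter A
# ("dot-dash", as mentioned in the comment above).
MORSE = {"A": ".-"}


def describe(ch):
    """Return the Unicode code point and official name of one character."""
    return "U+{:04X} {}".format(ord(ch), unicodedata.name(ch))


def encodings(ch):
    """Show several unrelated ways of encoding the same character."""
    return {
        "code point": "U+{:04X}".format(ord(ch)),
        "utf-8 bytes": ch.encode("utf-8").hex(),
        "utf-16-be bytes": ch.encode("utf-16-be").hex(),
        "morse": MORSE.get(ch, "(not in table)"),
    }


if __name__ == "__main__":
    print(describe("A"))
    print(describe("→"))
    print(encodings("A"))
```

The code point lookup that Dan Polansky finds convenient is a one-line query here, which is one way to see the argument that a code point is a property of an encoding scheme rather than of the letter itself.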
- We aren't a printed dictionary; we're a computerized dictionary. There's no concern about how people are supposed to look up →, whether it should go before A or after Z or under arrow, since we don't have to worry about order. We also don't have to worry about space or many other things; we can worry about what people want to look up.
- I don't get your point about Unicode code points; I don't think Dan Polansky wants us to add U+0041, but for A and → and 倀 and ─ and the rest. "A" may be a string of bits referring to Unicode, etc., but for our purposes we can just call it a character or word.--Prosfilaes (talk) 11:03, 26 April 2013 (UTC)
- I don’t get your point about dictionaries. Professional dictionaries don’t refrain from “defining” symbols like arrows (→) because they can’t be printed or sorted – they certainly can – they omit them because they are not words.
- This is exactly my point about Unicode. We include words (technically, “terms” or “lexical items”). Some editors think that having a code point in the ultimate encoding scheme makes a thing a lexical item, but it does not. Or that it proves that these characters are significant lexical entities, because each has a code point: { → ⇾ ➙ ➔ ➛ ➝ ➞ ➡ ➤ ➧ ➨ ➫ ➯ ➱ ➺ ➻ ➼ ➽ ⟶ → }. (They are arguably not even characters in the sense of writing. I can “encode” another three dozen such “characters” with a pen on paper, but that doesn’t make them dictionary items any more than having a Unicode code point does.) No matter how great it is, Unicode is merely one way of representing text, and does not define language. —Michael Z. 2013-04-26 22:30 z
- I think anything used to convey meaning in human language is good. Don't ask me to give a robust definition of that because I can't. Mglovesfun (talk) 22:50, 26 April 2013 (UTC)
- Didn't we have this discussion before, about encoding things like Mʳ that we could actually find encoded that way?
- We don't include words; we include strings of letters or code points. color and colour are different pages, for example, and yet we combine /kɑt/ (caught) and /kɔt/ (caught) and separate /kɑt/ (cot). I don't see any reason to get overpure here; Unicode is the substrate for our system and we should rely on that and use it.
- From another direction, we are a part of Wikimedia; Unicode code points is not something that any other project covers in depth, and thus we should stretch our ambit so that Wikimedia covers everything.--Prosfilaes (talk) 23:40, 26 April 2013 (UTC)
- Michael Z's argument makes a bit more sense to me than Prosfilaes's. While Wiktionary is encoded in Unicode, it's not tied to Unicode; we shouldn't be making editorial decisions based on the encoding we use, we should be independent of it. From a lexical/typographical point of view, "Mʳ" is a capital M with a superscript small R, and that's the way that Wiktionary should treat it as well. I think Michael Z's point about encoding your own codepoints by drawing little doodles on a paper is very interesting, because it makes it clear how detached Unicode can be from the written reality that we are actually trying to document. In our modern society, we've become almost enslaved to our computer's capabilities and what we write is determined in many ways by what a computer is capable of producing. But just 50 years ago, that wasn't the case, and people happily made up new characters and used them in their works. Esperanto (late 19th century) introduced a whole new set of letters with diacritics, and APL (a programming language of all things!) made up a whole set of characters that nobody else used. Going back even further, you see that people made letter types so that they could print what they wrote, even going so far as to make up ligatures that mimicked handwriting. In medieval times the situation was still further removed from Unicode's "reality", where people would happily stack characters on top of each other, write little lines and squiggles all over the text, and made up all kinds of abbreviations which would use whatever formatting they found useful. So if Wiktionary's task is to document usage, then we can't let Unicode decide what to document because it's clear that Unicode is quite far from an accurate representation of the usage our CFI wants us to record and cite. If Unicode and its characters are the lexical reality, then I guess the sky must be made out of 5 megapixels. :) —CodeCat 23:53, 26 April 2013 (UTC)
- From a modern typographical point of view, "Mʳ" is U+004D U+02B3. Both in the computer, and in the way that it's written, it's not a superscript small R, and the font maker will have to deal with that.
- (Esperanto wasn't new letters, then or now; both the typography of the time and Unicode have no problem with arbitrary combinations of accents on existing characters.)
- As I mentioned in my post, we don't handle the language that a computer is capable of recording and playing back; neither /kɔt/ nor, more accurately, [audio (US)] are handled by Wiktionary. As a practical thing, being able to say or play a word into your phone and have it come up with a definition and spelling would be worlds more useful and used than anything based on medieval handwriting.
- In any case, whether or not we should try and handle all the non-Unicode stuff strikes me as irrelevant to the question of whether we should handle the Unicode stuff. Whatever they did in the past is irrelevant to the fact that Unicode is the dominant system today, and no matter what you can imagine creating, people are likely to select Unicode characters and enter them into Wiktionary and not random doodles.--Prosfilaes (talk) 10:51, 27 April 2013 (UTC)
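For readers who want to check the code points mentioned above, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `describe` is only illustrative, and nothing here settles whether such a sequence belongs in the dictionary; it only shows how the string is encoded.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]

# The two-character sequence discussed above: capital M + modifier letter small r.
encoded = "M\u02b3"

for code_point, name, category in describe(encoded):
    print(code_point, name, category)

# Compatibility normalization (NFKC) folds the modifier letter back to a
# plain "r", which is one way software can treat the two spellings alike.
folded = unicodedata.normalize("NFKC", encoded)
print(folded)
```

The second character is reported as a modifier letter (category `Lm`), not as a styled copy of "r", which is the distinction both sides of this thread are arguing about.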
- Unicode’s development follows language, imperfectly, and not the other way around. Unicode is also designed to represent non-linguistic writing, like typographical ornaments, mathematical equations, computer code, and UI elements. We are limited by Unicode in how we can represent the language. But we are a dictionary, not a code book. Our subject is written language. —Michael Z. 2013-04-27 14:45 z
- ☞ Re: Mʳ: this is an encoding error. It contravenes the Unicode standard: "The fact that the latter two letters contain the word “superscript” in their names instead of “modifier letter” is an historical artifact of original sources for the characters, and is not intended to convey a functional distinction in the use of these characters in the Unicode Standard. ¶ Superscript modifier letters are intended for cases where the letters carry a specific meaning, as in phonetic transcription systems, and are not a substitute for generic styling mechanisms for superscripting of text, as for footnotes, mathematical and chemical expressions, and the like."[15]
- I don't see why you see a difference between someone willfully using a number for a letter or a phonetic letter for another letter; "pr0n" is as much an encoding error as "Mʳ". In any case, the point was not to restart the argument, just to remind you that we'd had that discussion before.
- Our organization is not inconsistent; each page is denoted by a string of characters. The occasional redirect is the only break from that.
- Don't tell me not to do something; tell me why I shouldn't do it. Dictionaries frequently include a lot of stuff that's not just words; many dictionaries come with biographical, geographical, and scientific data. Storing Unicode codepoints is something that would have unique value for us, and it would no more make us not a dictionary than including entries on "George Washington" turned other dictionaries into non-dictionaries.--Prosfilaes (talk) 22:46, 27 April 2013 (UTC)
If WT after all is to help readers resolve the semantic ambiguity anyway involved in speech and writing, then the punctuation marks (PM's) should necessarily come in as semantic functors. However, the trouble is that the main pages are designed or structured around words rather than PM's. Then we'd have a few options, say, as follows:
- A main page for each PM in spite of inadequate design.
- A main page for all PM's. In this and next cases, REDIRECT's may be well used.
- WT:Punctuation marks, probably subpaged for each PM.
Why not? It should! Traditionally, lexicographically, and perhaps lexicologically. Generally, it is quite desirable to review anything, esp. from the bottom up, as low as possible. Biased, however, you'd fall into the pitfall or vicious cycle of circular reasoning, as usual. Say, you presuppose, to begin with: "Dictionaries are normally about words, and not the things that those words refer to." This may be enough for you to begin too wrong! See first: A most commonsensical fatal fallacy is such that the word in and of itself does refer or relate to the thing or referent, likely magically, remindfully of "word magic". Unconvinced by my words, be convinced by: This includes the opening quotation of:
In a nutshell:
That is, "Words are magical" only to "the minds of those who use them." Put more precisely, cognitive minds are magical, rather than words, hence cognitive sciences since the late 70s! Recall "The Delta Factor" (1975). This recognition was quite a queer revolution in sheer silence or sheer mystery. All life comes back to the question of our ideas -- the medium through which we relate words to things, ill or well. This is my parody of:
This is the first of the ten quotations that open:
As far as my knowledge goes, you'd better believe me, this is the origin or center of the cognitive earthquake or revolution, vividly evolved since the late 70s but in sheer mystery! All I'm saying is perhaps U're doing too wrong more often than not!
- (This'd best be where CodeCat, etc., would respond to "Why not? It should!" above. Thanks. --KYPark (talk) 08:18, 26 April 2013 (UTC))
Request for comment on inactive administrators
(Please consider translating this message for the benefit of your fellow Wikimedians. Please also consider translating the proposal.)
Read this message in English / Lleer esti mensaxe n'asturianu / বাংলায় এই বার্তাটি পড়ুন / Llegiu aquest missatge en català / Læs denne besked på dansk / Lies diese Nachricht auf Deutsch / Leś cal mesag' chè in Emiliàn / Leer este mensaje en español / Lue tämä viesti suomeksi / Lire ce message en français / Ler esta mensaxe en galego / हिन्दी / Pročitajte ovu poruku na hrvatskom / Baca pesan ini dalam Bahasa Indonesia / Leggi questo messaggio in italiano / ಈ ಸಂದೇಶವನ್ನು ಕನ್ನಡದಲ್ಲಿ ಓದಿ / Aqra dan il-messaġġ bil-Malti / norsk (bokmål) / Lees dit bericht in het Nederlands / Przeczytaj tę wiadomość po polsku / Citiți acest mesaj în română / Прочитать это сообщение на русском / Farriintaan ku aqri Af-Soomaali / Pročitaj ovu poruku na srpskom (Прочитај ову поруку на српском) / อ่านข้อความนี้ในภาษาไทย / Прочитати це повідомлення українською мовою / Đọc thông báo bằng tiếng Việt / 使用中文阅读本信息。
Hello!
There is a new request for comment on Meta-Wiki concerning the removal of administrative rights from long-term inactive Wikimedians. Generally, this proposal from stewards would apply to wikis without an administrators' review process.
We are also compiling a list of projects with procedures for removing inactive administrators on the talk page of the request for comment. Feel free to add your project(s) to the list if you have a policy on administrator inactivity.
All input is appreciated. The discussion may close as soon as 21 May 2013 (2013-05-21), but this will be extended if needed.
Thanks, Billinghurst (thanks to all the translators!) 04:34, 24 April 2013 (UTC)
- Distributed via Global message delivery (Wrong page? You can fix it.)
- Looking at the other projects' policies, and at our own inactive admins, I'd like to propose that we have a policy vote about this. What do you think about removal of adminship from admins who make fewer than 10 mainspace edits in a year? That sounds reasonable (compare with our voting requirements, for example). —Μετάknowledgediscuss/deeds 05:40, 24 April 2013 (UTC)
- As that proposal will override local consensus if it passes, I invite all Wiktionarians to oppose the proposal so we can stay independent and govern ourselves. -- Liliana • 08:04, 24 April 2013 (UTC)
Category:Hungarian nouns suffixed with -acs, et al
Why are these suffix categories sorted by PoS? It's especially confusing that Hungarian prefixes aren't. Would anyone object if I changed them to the same format as most (if not all) other languages? Ultimateria (talk) 14:46, 24 April 2013 (UTC)
- I think we should wait, {{suffix}} allows for this sort of thing (not only {{hu-suffix}}) and last time I talked to User:Panda10 about it, he opposed deleting {{hu-suffix}}, {{hu-prefix}} and {{hu-affix}}. We should at least let some of our Hungarian editors comment; a couple of days is nothing. Mglovesfun (talk) 14:53, 24 April 2013 (UTC)
- Please do not change it. In several cases, the same suffix will create a different PoS and it is best to keep these categories of words separately. --Panda10 (talk) 13:58, 1 May 2013 (UTC)
Multiple user pages using "/"
Greetings. I've noticed that some users have created multiple pages for their user by creating pages with a slash after their user name. (E.g. User:[username]/1000EnglishEntries.) Is this normally accepted, and is there a limit as to how many pages you can have? Thanks. TeragR (talk) 17:16, 25 April 2013 (UTC)
- If the content supports the work of Wiktionary, there is no limit that I am aware of. Any significant volume of content not related to the work of Wiktionary (including maintaining friendly relations useful for that work), whether or not on a subpage, is not permitted. DCDuring TALK 21:09, 25 April 2013 (UTC)
- Normally accepted and no limit. WT:USERPAGE should cover this. Mglovesfun (talk) 21:46, 25 April 2013 (UTC)
- It just says the same rules apply to subpages as to the main user page; that's enough, right? Mglovesfun (talk) 11:32, 27 April 2013 (UTC)
Administrator communication
I'm having trouble with an administrator who often reverts legitimate revisions en masse and deletes entries without bothering to message the user or start a talk page on the matter. I wouldn't be so troubled about it if they were simply an editor, but as an administrator, I expect more from them. I've contacted them, but they deny any culpability. Is there anyone who can intervene and ask them to communicate better with others, both in initiating contact and in conducting themselves in a cooperative manner? Thanks. --Victar (talk) 20:37, 25 April 2013 (UTC)
- We do have the problem of having a very high ratio of pages to patrollers. This leads to curt interaction. You could try posting to the entry talk pages or visiting one of the pages like Wiktionary:About Frankish or Wiktionary:About Proto-Indo-European and leaving a message on a talk page there to determine what problem there may have been with your contribution. DCDuring TALK 21:17, 25 April 2013 (UTC)
- Another thing you could try is recognizing the fact that CodeCat is a knowledgeable, experienced, and respected editor on this project, and you are still relatively new. Showing some politeness, respect, and dare I say even a bit of deference would get you a long way. Simply put, the administrators on this project have all had to put up with new editors who think they know better, which is a tiresome process, and one which is somewhat jading. As far as I can tell, CodeCat has good reasons for their reversions, has been reasonably professional in their conversation with you, and generally met English Wiktionary admin standards that I am comfortable with. -Atelaes λάλει ἐμοί 00:12, 26 April 2013 (UTC)
- Again though, it isn't a matter of the quality of their revisions or even the way in which they communicate; it's the lack of communication that I find troublesome. If an admin is going to delete your work en masse without even discussing it with you, why would anyone want to contribute? If you need more people, this is not the way to attract them. --Victar (talk) 00:33, 26 April 2013 (UTC)
- You don't realise the vast quantity of vandalism, idiocy, ill-informed edits, and good faith errors we have to clean up after and weed through. If we left a personalised message for everyone who made a mistake, it would take far too much time and (wo)manpower. We were all newbies once too, and we learned the inane template system and confusing structure as well. As long as you don't make it into a conflict, it almost certainly won't become one, and you can learn and move on. —Μετάknowledgediscuss/deeds 00:41, 26 April 2013 (UTC)
Victar is supposed to be concerning or complaining, as I used to, about both the administrator's moral(ity) and the editor's morale, mismatched, rather than the technicality of administration. No doubt, it is fatally self-defeating and immoral within the participatory wikis to discourage editors as if vandals anyway. --KYPark (talk) 03:43, 26 April 2013 (UTC)
Number/Numeral categories what's the story?
I'm just wondering since I remember discussion about the categories for numbers and/or numerals a while back. Did any decisions or anything of the like come out of it? I mean, for a given language, what number related categories should we have, and what shouldn't we have? Ever since I noticed the controversy or uncertainty about this issue months and months ago I made sure to ignore any I saw on WantedCats. But I'm just curious atm and probably won't be staying on wikt much longer for the moment today so I was wondering if someone could easily tell me what happened and perhaps even direct me to the relevant discussion if they feel the need. User: PalkiaX50 talk to meh 13:24, 27 April 2013 (UTC)
- As far as I know, the decision was to use numeral when it represented a distinct part of speech, and to use some other part of speech when more appropriate. In particular, ordinal number words are almost never a "numeral" part of speech, they are usually adjectives. The cardinal and ordinal numbers are categorised in their own topical categories, which are based on meaning rather than on part of speech. —CodeCat 13:35, 27 April 2013 (UTC)
Is German infinitive ending -en considered a suffix?
In German, every verb in its infinitive form ends in the morpheme -en (or in its much rarer variants -ern or -eln). Should it still be considered a suffix? Some users seem to think so. For example, the etymology of the verb vernetzen was recently changed from {{prefix|ver|Netz|lang=de}} to {{confix|ver|Netz|en|lang=de}}. I don't think that makes sense, since -en is just a grammatical morpheme that marks the infinitive rather than a lexical morpheme of word formation. Longtrend (talk) 11:35, 29 April 2013 (UTC)
- I’d say it is. Inflectional suffixes are also suffixes. — Ungoliant (Falai) 11:47, 29 April 2013 (UTC)
- I'd say it isn't. Our suffix categories are usually broken up into several subcategories based on usage, so there's Category:German noun-forming suffixes, Category:German verb-forming suffixes and so on. There's also Category:German inflectional suffixes, which would contain things like -em, -en, -te. -en can be used to form new verbs, something like Category:German verbs suffixed with -en is horribly misleading. -en has to be present in a verb, so it's not true suffixation but really more like adapting the word morphologically into a verb. That's quite different. Latin -us is the same; it's not used as a way to form nouns, but rather as a way to make morphologically non-conforming nouns conform to the grammar of the language. —CodeCat 12:54, 29 April 2013 (UTC)
- But that doesn't belong in the etymology. That would mean using {{suffix}} or {{prefix}} for almost every entry in almost every inflected language. Does falar really have a different etymology from falo? One that's worth distinguishing in the etymology section? Chuck Entz (talk) 12:58, 29 April 2013 (UTC)
- I understand your concerns, but when you say "adapting the word morphologically into a verb"....well that's done by means of a suffix, isn't it? When German invents a new verb based on a noun, it sticks the suffix -en on the end of it. So in my opinion, it is a suffix. (In my opinion also, all of these categories are a complete waste of time and energy, but that's a separate issue...) Ƿidsiþ 13:17, 29 April 2013 (UTC)
- And vernetzt is ver + netz +t and vernetzte is ver + netz +te. The inflectional morphology is the result of the conversion to a verb, not the cause. In an analogous English case, would you describe a verb derived from a noun as noun + null ending, since our lemma is the unmarked form? Chuck Entz (talk) 13:55, 29 April 2013 (UTC)
- No, and that is a silly analogy. A better one would be to consider if regular English past-tense forms are formed by adding a suffix -ed. In my opinion (and that of the OED), this is the case (although I see no value in putting them all in a category). Ƿidsiþ 14:40, 29 April 2013 (UTC)
- No it wouldn’t. Falar is not derived from a noun + an inflectional suffix, vernetzen is.
- Maybe it’s better to think of inflectional suffixes as sets, instead of single suffixes. For example, the Portuguese 1st conjugation has {-ar, -ando, -ado, -o, -as, -a, -amos, -ais, -am, etc.}, the 2nd has {-er, -endo, -ido, -o, -es, -e, -emos, -eis, -em, etc.} and the 3rd has {-ir, -indo, -ido, -o, -es, -e, -imos, -is, -em, etc.}. Consider the word monitorar, it is the noun monitor + the 1st conjugation paradigm; since the lemma of Portuguese verbs is the impersonal infinitive, the etymology should, in my opinion, display monitor + the 1st conjugation paradigm’s impersonal infinitive suffix (-ar). — Ungoliant (Falai) 13:27, 29 April 2013 (UTC)
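The "paradigm as a set" idea in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the suffix lists are the partial ones quoted in the comment (each ends in "etc." there), and the names `PARADIGMS`, `lemma_suffix`, `etymology` and `lemma` are hypothetical, not part of any Wiktionary template or module.

```python
# Partial suffix sets copied from the comment above; each list is truncated
# there with "etc.", so these are not complete paradigms.
PARADIGMS = {
    "1st": ["-ar", "-ando", "-ado", "-o", "-as", "-a", "-amos", "-ais", "-am"],
    "2nd": ["-er", "-endo", "-ido", "-o", "-es", "-e", "-emos", "-eis", "-em"],
    "3rd": ["-ir", "-indo", "-ido", "-o", "-es", "-e", "-imos", "-is", "-em"],
}

def lemma_suffix(conjugation):
    """Return the impersonal infinitive suffix, listed first in each set."""
    return PARADIGMS[conjugation][0]

def etymology(base, conjugation):
    """Show the base plus the lemma suffix of the chosen paradigm."""
    return f"{base} + {lemma_suffix(conjugation)}"

def lemma(base, conjugation):
    """Naive concatenation; ignores stem changes such as cabelo -> cabel-."""
    return base + lemma_suffix(conjugation).lstrip("-")

print(etymology("monitor", "1st"))
print(lemma("monitor", "1st"))
```

In this sketch the word is attached to a whole paradigm, and the etymology line merely displays that paradigm's lemma suffix, which is the proposal being made here; whether that suffix counts as derivational is exactly what the rest of the thread disputes.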
- Not in Portuguese, but it ultimately comes from fabula. My point is that it's not the sticking of the inflectional ending on it that made it a verb. Because it became a verb, the inflectional ending was added. Chuck Entz (talk) 13:55, 29 April 2013 (UTC)
- That "inflectional ending" can be considered a suffix though. Ƿidsiþ 14:40, 29 April 2013 (UTC)
- Not really, they are separate things from a formal point of view. In Indo-European at least, much derivation involves adding a suffix to extend the basic stem, which is distinct from the inflectional ending that comes after it. Indo-European words are formed as root + one or more suffixes + inflectional ending. This has become somewhat more muddled in later languages, because many languages have "zero endings" which are inflectional endings that are empty, and also because the distinction between root and suffix is no longer as apparent. So for modern IE languages, it's easier to just consider stem + ending, and treat the stem as the more "invariable" part. In this view, -en is definitely not part of the stem in German, and neither is -ar in Portuguese (although the -a- on its own might be). In German in particular, creating a verb from a noun is often a so-called "zero derivation" where both stems are the same. So there is really no suffixation involved, just a change of endings from the noun set (genitive -s, plural -e(n), dative plural -en) to the verb set (infinitive -en, 3sg present -t and so on). The reason that it appears as suffixation is because the lemma form (nominative singular) of a noun stem generally has a zero ending, whereas the lemma form of a verb stem has an overt ending. But if it had been the other way around (say, nouns had -en in their nominative singular, and the infinitive had a zero ending) then would it still appear as suffixation? I don't think it would. And if you look at Latin or (to a lesser extent) Portuguese, there are few forms that have no ending, so something like "replace -us (la) / -o (pt) with -ar(e)" can hardly be called suffixation by itself. Rather, the real suffixation is adding the verbal derivation morpheme -a- (first declension) to the noun stem, which then creates a verbal stem that requires -r(e) as the infinitive ending. —CodeCat 14:52, 29 April 2013 (UTC)
- Re: “although the -a- on its own might be”: -a- is the suffix -ar’s thematic vowel.
- Re: “"replace -us (la) / -o (pt) with -ar(e)" can hardly be called suffixation by itself”: it can, because suffixes are usually added to the stem, not the whole word. Cabeludo is cabel- (stem of cabelo) + -udo; here the suffix is added to the stem and, similarly, verbs formed from nouns have the conjugation suffixes added to the noun’s stem.
- Even if “suffixation” and “suffix” aren’t the correct terms used by linguists, our etymology sections don’t use those terms. — Ungoliant (Falai) 15:16, 29 April 2013 (UTC)
- Irrespective of whether the suffix was added because it became a verb, or whether it became a verb because the suffix was added (and how can you tell?), a new word appeared and this new word is a previous word + a set of suffixes. You could just say that monitorar came from monitor, but then why isn’t it *monitorer or *monitorir? That’s because it’s monitor + {-ar, -ando, -ado, etc.}, not monitor + {-er, -endo, -ido, etc.} nor monitor + {-ir, -indo, -ido, etc.}. The word was derived with a specific set of suffixes, and this set’s lemma suffix should be added to the etymology. — Ungoliant (Falai) 15:16, 29 April 2013 (UTC)
- That's true, but that works only if the lemma form actually shows this distinction in paradigms. In the case of German, the infinitive suffix doesn't show the verb paradigm. Of course, the paradigm is included in the derivation process, but you don't see it. —CodeCat 15:30, 29 April 2013 (UTC)
- How about verbs that are formed from a noun? As in Dutch schaats (a skate) ==> schaatsenrijden ==> schaatsen (verb). Couldn't you say in that case that affixing -en is a (productive) way of generating new verbs? A productive suffix? Think of faxen or sms'en. Jcwf (talk) 15:40, 29 April 2013 (UTC)
- I'm not saying it's not productive, I'm saying that -en isn't the suffix used to perform the derivation. The verb is more than just its infinitive... alongside schaatsen there is also schaats, schaatste, and so on. How would you say schaats (the verb form) is derived from schaats (the noun)? To say that people first create the infinitive and then replace the infinitive ending with a zero ending would be silly. Treating the infinitive as the lemma form is only a lexicographical convenience but not the reality; verbs can exist without their infinitives, and choosing one of the forms as the lemma is arbitrary. What if Dutch verbs were lemmatised as the 1st person singular? How would we denote the etymology then? When schaatsen is derived from schaats, there is no suffixation involved (or rather zero-suffixation). Rather, we just change the lemma form of one part of speech to the lemma form of another, but the actual derivational process is completely independent of which lemma form you choose, so deriving schaats from schaats is not only a valid alternative, it's the exact same thing. —CodeCat 15:52, 29 April 2013 (UTC)
- I think "schaats" (Dutch noun stem) and "schaats" (Dutch verb stem) aren't the same thing and that we can still see the etymology because "schaats" (noun: runner or blade) still rules out "schaats" (noun: the act/result of skating), while we do have the noun "loop" (a competition in "lopen"; a track to "lopen"). --80.114.178.7 00:18, 4 May 2013 (UTC)
- That said, if "-en" is a suffix, it is a suffix to the verb "schaats" to make the infinitive (and 1st person plural of the present tense &c.), not a suffix to transform a noun into a verb. Perhaps it would be good to have a header "Stem" (like we have "Noun" and "Verb"), if only to have terminology and/or a place to link to (do we need "Noun stem"/"Verb Stem"?). --80.114.178.7 00:18, 4 May 2013 (UTC)
- I agree with Widsith and Ungoliant, it's a suffix. I'm surprised that there's debate on this point. I find it no more or less useful to categorise all German verbs suffixed with -en together than to categorise all English past tense forms suffixed with -ed together, but -en and -ed remain suffixes. (And I can conceive of such categorisation being at least slightly useful, in that there are other suffixes—wandeln in German, dreamt in English—and someone might want to find only -en / -ed words.) - -sche (discuss) 20:21, 29 April 2013 (UTC)
- I definitely don't agree with making categories based purely on allophonic grounds. The ending of wandeln isn't somehow a different one from the -en that most other verbs end in. It's the same thing, just with a different shape depending on the stem. And I am surprised that you think there is no debate. What arguments do you have against the ones I've raised? How does one get schaats (“I skate”) from schaats (“a skate”) by suffixing -en? That doesn't make any sense to me at all. —CodeCat 20:48, 29 April 2013 (UTC)
- If one forms words in Dutch the way one forms them in German, then it seems one takes schaats (“a skate”, n), suffixes -en to get a verb, and conjugates that verb like other verbs that end in the suffix -en, resulting in forms like schaats (“I skate”). That seems as obvious to me as your analysis seems to you, so I don't know if we'll be able to do anything but agree to disagree... - -sche (discuss) 23:24, 29 April 2013 (UTC)
- Your analysis depends on treating the infinitive form as the basis from which other forms of the verb are derived. But it doesn't work like that in reality. The fact that we choose different lemmas in different languages is a reflection of that. For example, in Latin or any of the Balkan languages, would you suggest that people first add a suffix to create the first person singular, and then conjugate that? I'd say that it's more realistic to say that when deriving a new verb, people create the whole verb and the complete set of its forms, and then select the one they need in that particular situation. Seen that way, the process of word derivation is what creates one paradigm from another, rather than just one lemma form from another. That is why I think it's misleading to treat -en as a suffix: it doesn't actually create new lemmas. The true derivational part is the zero suffix, which is attached to a noun paradigm in order to form a verb paradigm. That the lemma of the noun paradigm has no ending while the lemma of the verb paradigm has -en isn't relevant; this could easily change just by selecting another lemma, since it's arbitrary which lemma you choose. —CodeCat 23:44, 29 April 2013 (UTC)
Note that the definition of desinence considers that a desinence is a kind of suffix. Lmaltier (talk) 20:07, 1 May 2013 (UTC)
- I agree that endings like -en are suffixes from the linguistic point of view, but from the lexicographical point of view it's no help to anyone to have Category:German words suffixed with -en. That category currently has just 48 words, all of them infinitives, but in principle it could have the vast majority of German infinitives, plus all 1st and 3rd person plural preterite forms, plus all 1st and 3rd person plural past subjunctive forms, plus all the plural nouns in -en, plus all the dative plurals in -en, plus the -en form of every single German adjective. No one would be able to use a category like that for navigation. —Angr 21:22, 1 May 2013 (UTC)
- It could be useful if it had sub-categories like Category:German infinitives ending with -en, Category:German 3rd person plural preterite forms ending with -en and so on, and just a few words which don't fall into those categories. But probably almost all words in the base category would just have to be moved to (often several) subcategories. --80.114.178.7 00:26, 4 May 2013 (UTC)
[en] Change to wiki account system and account renaming
Some accounts will soon be renamed due to a technical change that the developer team at Wikimedia are making. More details on Meta.
(Distributed via global message delivery 03:31, 30 April 2013 (UTC). Wrong page? Correct it here.)
- For the lazy... Ƿidsiþ 07:46, 30 April 2013 (UTC)
- The developer team at Wikimedia Foundation is making some changes to how accounts work, as part of our on-going efforts to provide new and better tools for our users (like cross-wiki notifications). These changes will mean users have the same account name everywhere. This will let us give you new features that will help you edit and discuss better, and will allow more flexible user permissions for tools. One of the pre-conditions for this is that user accounts will now have to be unique across all 900 Wikimedia wikis.
- Unfortunately, some accounts are currently not unique across all our wikis, but instead clash with other users who have the same account name. To make sure that all of these users can use Wikimedia's wikis in future, we will be renaming a number of these accounts to have “~” and the name of their wiki added to the end of their accounts' name. This change will take place on or around 27 May. For example, a user called “Example” on the Swedish Wiktionary who will be renamed would become “Example~svwiktionary”.
- All accounts will still work as before, and will continue to be credited for all their edits made so far. However, users with renamed accounts (whom we will be contacting individually) will have to use the new account name when they log in. It will now only be possible for accounts to be renamed globally; the RenameUser tool will no longer work on a local basis - since all accounts must be globally unique - therefore it will be withdrawn from bureaucrats' tool sets. Once this takes place, it will still be possible for users to ask for their account to be renamed further here on Meta, if they do not like their new user name.
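The renaming rule described in the announcement can be illustrated with a rough sketch. It assumes that every account whose name exists on more than one wiki receives the “~wiki” suffix; the announcement only says "a number of these accounts" will be renamed, so which accounts keep their names is not confirmed here. The function name and the input format are hypothetical.

```python
from collections import defaultdict

def rename_clashing_accounts(accounts):
    """Map each (username, wiki) pair to its name after the change.

    Assumption (not confirmed by the announcement): every account whose
    username exists on more than one wiki gets "~<wiki>" appended.
    """
    wikis_by_name = defaultdict(set)
    for name, wiki in accounts:
        wikis_by_name[name].add(wiki)

    renamed = {}
    for name, wiki in accounts:
        if len(wikis_by_name[name]) > 1:
            # Clash: append the tilde and the wiki's name, as in the example.
            renamed[(name, wiki)] = f"{name}~{wiki}"
        else:
            # Already globally unique: the name stays as it is.
            renamed[(name, wiki)] = name
    return renamed
```

With the announcement's own example, an "Example" account on svwiktionary that clashes with another "Example" elsewhere would map to "Example~svwiktionary", while a globally unique name is left unchanged.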
- Oh, Christ, am I going to become Equinox~enwiktionary because of that prior Equinox who made one edit on Wikipedia back in 1843? Equinox ◑ 12:54, 30 April 2013 (UTC)
- The Internet was steam-powered in those days. Ƿidsiþ 13:02, 30 April 2013 (UTC)
- Is there a way to avoid becoming Astral~enwiktionary? I'm pretty sure there are other Astrals scattered throughout various Wikimedia projects. Perhaps by renaming my account before May 27? Astral (talk) 02:35, 8 May 2013 (UTC)
- To both Equinox and Astral: Yes, if there is anyone on any other Wikimedia project with the same username, you will both be automatically renamed. You can seek renaming here now (or seek to have the existing other account(s) usurped in your favor now) or have this done at Meta after the change. bd2412 T 03:07, 8 May 2013 (UTC)
[en] Change to section edit links
The default position of the "edit" link in page section headers is going to change soon. The "edit" link will be positioned adjacent to the page header text rather than floating opposite it.
Section edit links will be to the immediate right of section titles, instead of on the far right. If you're an editor of one of the wikis which already implemented this change, nothing will substantially change for you. Scripts and gadgets depending on the previous implementation of section edit links will have to be adjusted to continue working, but nothing else should break even if they are not updated in time.
Detailed information and a timeline is available on meta.
Ideas to do this date all the way back to 2009 at least. It is often difficult to track which of several potential section edit links on the far right is associated with the correct section, and many readers and anonymous or new editors may even be failing to notice section edit links at all, since they read section titles, which are far away from the links.
(Distributed via global message delivery 18:21, 30 April 2013 (UTC). Wrong page? Correct it here.)
- I see this has gone live. I think it's bad for usability. Finding an edit link now requires horizontal scanning, since its position is relative to the header's text length. It used to be easy: absolute far right. Equinox ◑ 18:27, 1 May 2013 (UTC)
- I like this change; I find it easier to find the "edit" links when they're next to their headers; when they floated right, it took me longer to sort out which edit link went with which section on pages that had multiple immediately adjacent headers, e.g. pages with an L2 immediately followed by an empty Etymology 1 section immediately followed by a POS section, especially if only some of those headers were indented by right-floating Wikipedia boxes. (I've also had experience with this leftist placement for a long time, due to de.Wikt using it.) - -sche (discuss) 19:15, 1 May 2013 (UTC)
- Okay, I've fixed (hopefully all of) the resulting breakages in TabbedLanguages, DefSideBoxes, AddDefinition, RhymesEdit, and VisibityToggles. Did anything else break that anyone's aware of? --Yair rand (talk) 19:17, 1 May 2013 (UTC)
- Yep, thanks. Equinox ◑ 19:37, 1 May 2013 (UTC)
- @Yair: Nope, I'm still getting the crap Equinox reported. I can provide diffs if you want. —Μετάknowledgediscuss/deeds 03:40, 8 May 2013 (UTC)
- I don't know if it is just coincidence, but vandalism marked as (Mobile Edit) has gone up alarmingly since this change. SemperBlotto (talk) 14:58, 2 May 2013 (UTC)
- I now see section edit links in the mobile view. Whatever was hiding them before doesn’t seem to work with the new HTML.
- Would it be possible for me to reverse this change? I don't think I like it all that much, it makes pages appear messier. —CodeCat 02:11, 3 May 2013 (UTC)
- Add .mw-editsection {float: right;} to Special:MyPage/common.css. --Yair rand (talk) 03:41, 3 May 2013 (UTC)
- It seems net beneficial to me personally as I formerly made lots of errors clicking on the wrong section link and often found the edit link hidden by project link and similar boxes. It should be easier for newbies too. DCDuring TALK 15:31, 3 May 2013 (UTC)
May 2013
Homophones
homophone provides the following definition: A word which is pronounced the same as another word but differs in spelling or meaning or origin, for example: carat, caret, carrot, and karat. It's important to note the use of as another word in this definition. This means that many pages provide wrong information, e.g. familiarisât mentions familiarisa and familiarisas as homophones, while they are different forms of the same word. It's possible to state that the forms are homophonous, but not to mention them in a section named Homophones. In French, when you list the homophones of sot, you mention saut, seau, sceau, Sceaux, but never sots, because it's the same word. Lmaltier (talk) 06:09, 1 May 2013 (UTC)
- If familiarisât, familiarisa and familiarisas are the same word, are their definitions incorrect? — Ungoliant (Falai) 06:12, 1 May 2013 (UTC)
- No, definitions are correct. It's the same as provide and provides, or cat and cats: inflected forms, different forms of the same word. Lmaltier (talk) 07:20, 1 May 2013 (UTC)
- cat and cats are different words; one is a conjugated form of the other, but they're still different words. Certainly that depends on the arbitrary definition of word, but I don't think my definition is idiosyncratic or unusual.--Prosfilaes (talk) 08:09, 1 May 2013 (UTC)
- So I oppose removing the homophones. They’re different words which mean different things, so if they have the same pronunciation they should list each other as homophones, whether they have the same lemma or not. — Ungoliant (Falai) 19:49, 1 May 2013 (UTC)
- What's included as a ==French== homophone can be decided by the French editors without community input, much as language editors decide on transliteration, hyphenation, and so on. As for English, if (hypothetically, since I can't think of a real example) the present-tense tread and the past trod were pronounced the same in some common accent, I'd consider them homophones in that accent and IMO we should list them as such in the entry.—msh210℠ (talk) 07:29, 1 May 2013 (UTC)
- I don't see why we wouldn't class these as homophones; what's the reason not to? Mglovesfun (talk) 10:37, 1 May 2013 (UTC)
- I agree. No reason to say French sots isn't a homophone of sot. —Angr 11:26, 1 May 2013 (UTC)
- Yeah. Lmaltier, you just arbitrarily pick out one of the many definitions of "word" (= lexeme) and say that the current practice of adding homophones isn't in line with it. But there's no reason to pick out that particular definition in the first place. Longtrend (talk) 11:35, 1 May 2013 (UTC)
- From the PoV of someone not too familiar with French, it is at least minimally useful to know that there are many forms of a given lemma that are pronounced the same. DCDuring TALK 12:01, 1 May 2013 (UTC)
- For a learner of French it would actually be more useful to have a list of plural nouns that are not homophones of the singular. There aren't many; œufs and yeux spring to mind. —Angr 13:15, 1 May 2013 (UTC)
- You are right. No, I don't arbitrarily pick out one definition of word. I just want to use the noun homophone in the sense actually used by almost everybody referring to homophones. I understand that this sense is probably less clear in English, because inflected forms are not pronounced the same, but in languages such as French, different inflected forms are very often pronounced the same. Look at books or websites addressing homophones (such as http://tempsreel.nouvelobs.com/abc-lettres/saut-sceau-seau-sot/S/homophone.html). You'll see that they exclude inflected forms (because they consider they are the same word). I feel that Wiktionary homophone lists often misinterpret the sense of the word homophone, this was why I mentioned the issue. If you disagree, just try to find a book mentioning sots as a homophone of sot (or the same kind of case). Just a few references: http://books.google.fr/books?id=lTb3xSgNPecC&pg=PA3&dq=homophones&hl=fr&sa=X&ei=xGqBUcvtAYGK7Aa67YEw&ved=0CDUQ6AEwAA#v=onepage&q=homophones&f=false states words different in origin and signification. http://books.google.fr/books?id=Z7ZyXZAJgx8C&pg=PP7&dq=homophones&hl=fr&sa=X&ei=xGqBUcvtAYGK7Aa67YEw&ved=0CDoQ6AEwAQ states words which sound the same but have totally different meanings. http://books.google.fr/books?id=0crig9rvzpMC&pg=PA8&dq=homophones&hl=fr&sa=X&ei=xGqBUcvtAYGK7Aa67YEw&ved=0CFwQ6AEwCA#v=onepage&q=homophones&f=false states that homophones have a different meaning and a different spelling. I disagree with this last definition: bear as a noun and bear as a verb are true homophones. Anyway, I think that all these books seem to agree on the fact that homophones have unrelated meanings, which excludes inflected forms. Lmaltier (talk) 19:43, 1 May 2013 (UTC)
- Do those books ever list nonlemma forms? Do they list œufs as a homophone of eux? Do they list aime/aimes/aiment as homophones of M? (In English, bear and bear are usually called homonyms rather than homophones.) —Angr 20:06, 1 May 2013 (UTC)
- An example: http://people.mpim-bonn.mpg.de/zagier/files/exp-math-2/fulltext.pdf mentions œufs as a homophone of eux, and ôte as a homophone of haute, but never mentions inflected forms of the same word as homophones. Lmaltier (talk) 20:15, 1 May 2013 (UTC) No, this is not a good example: this mathematical paper mentions parle and parlent (but its objective is the demonstration of a theorem...). A better example is the linguistic site I mentioned above, which mentions œufs as a homophone of eux: http://tempsreel.nouvelobs.com/abc-lettres/eux-oeufs/E/homophone.html. They provide many homophones, classified by their first letter. Try to find cases such as sot/sots... Lmaltier (talk) 20:30, 1 May 2013 (UTC)
- Neither of those sources makes an attempt to be exhaustive, and they're not going to list forms that native speakers (who are their target audience) will find obvious and trivial—every French speaker knows (if only subconsciously) that virtually every plural noun is homophonous with its singular. But since we're an English-language dictionary, our target audience is English speakers, not French speakers, and our readers can't be expected to just know which inflected forms of words are going to be homophonous and which aren't. (Are aimait and aimer homophones? Without looking it up, I as a French learner honestly do not know.) —Angr 21:30, 1 May 2013 (UTC)
- “words different in origin and signification”: familiarisât, familiarisa and familiarisas have different signification and, judging from the different endings, were derived with (or descend from words with) different suffixes.
- “words which sound the same but have totally different meanings”: familiarisât, familiarisa and familiarisas sound the same and have different meanings.
- — Ungoliant (Falai) 20:33, 1 May 2013 (UTC)
- Are you serious? If you don't want to understand references as I do (and I find my interpretation really obvious), I cannot add anything except that: just try to find a linguistic book or a dictionary including inflected forms of the same word in their examples of homophones. Lmaltier (talk) 21:35, 1 May 2013 (UTC)
- Yes. Evanildo Bechara, Moderna Gramática Portuguesa:
- “Pode haver homofonia em um mesmo paradigma (“sincretismo”), como em cantava, 1.ª e 3.ª pess. do imperfeito, […] ”
- There can be homophony in the same paradigm (“syncretism”), as in cantava, 1st and 3rd person of the imperfect, […]
- He is claiming that cantava (1st person singular imperfect indicative of cantar) is homophonous with cantava (3rd person singular imperfect indicative of cantar). — Ungoliant (Falai) 22:00, 1 May 2013 (UTC)
- http://legacy.earlham.edu/~peters/writing/homofone.htm, Suber and Thorpe, "An English Homophone Dictionary", offers us axis and its plural axes. It's entirely natural to exclude them in French; it would be odd to exclude them in English but it's an extremely rare case, and I'm not familiar with any other language whose spelling is so confused as for homophones to be a major issue. (Well, Chinese, but that's a whole nother ballgame.)--Prosfilaes (talk) 07:39, 2 May 2013 (UTC)
- Your 1st example refers to the noun homofonia, not to the noun homophone (and even in English, I find the translation quite normal; my issue is not about homophony). Your 2nd example is more interesting: this case is so rare in English that it's understandable that it's interesting to mention that axis and axes are pronounced the same. But it seems clear to me that it's outside the scope of many definitions of the word homophone. Nonetheless, Webster's definition also seems to include this case, as a difference in spelling is one of possible conditions, according to this definition. This discussion seems to show that the general idea is rather clear, but that each author interprets the precise sense differently when trying to define it precisely. Lmaltier (talk) 19:58, 2 May 2013 (UTC)
- Homofonia means the quality of words being homophones. — Ungoliant (Falai) 20:09, 2 May 2013 (UTC)
- Actually, axis and axes are pronounced completely differently in English, at least in the General American accent. In axis the 's' has a soft 'ss' sound while axes has a harder 'zz' and the 'e' is slightly longer. In short: "axis" = ack-sis while "axes" (plural of axe) = ack-siz and "axes" (plural of axis) = ack-seez. On-topic, I agree that plurals should be considered different words to clarify homophones to non-French speakers. --Soardra (talk) 20:13, 5 May 2013 (UTC)
- I agree with Prosfilaes and Ungoliant, familiarisât, familiarisa and familiarisas are homophones. - -sche (discuss) 21:06, 1 May 2013 (UTC)
Context labels
Hi! I am adding context labels to my Wiktionary parser. There are 1001 context labels in English Wiktionary which should be added manually to parser by me and my colleagues :) I have several questions:
- Should Template:Karabakh, Template:Kromanti and Template:Tigranakert be moved from Category:Context labels to Category:Regional context labels?
- Is Template:item really a context label template (it now belongs to Category:Context labels), or is it an ordinary (non-context-label) template which adds links to "[talk] and [citations]"?
- If a context label template has two categories, a category and a subcategory (e.g. Template:Ijekavian has the categories "Context labels" and "Regional context labels"), should only the more specific category be kept?
-- Andrew Krizhanovsky (talk) 08:15, 1 May 2013 (UTC)
- They look regional to me.
- {{item}} is not used in principal namespace. Are you parsing outside principal namespace?
- I personally favor having only the more specific category for any large category of "context labels".
- HTH. DCDuring TALK 12:09, 1 May 2013 (UTC)
Helping parsers and scrapers might be a good reason to explicitly use {{context|something}} or {{label|something}} instead of having an open set of labels {{something}}. Would this be helpful for the parser project? —Michael Z. 2013-05-01 15:47 z
Thank you!
Yes, parser will use {{context|something}} but an open set of labels {{something}} will be parsed also.
Templates Karabakh, Kromanti and Tigranakert moved to "Category:Regional context labels".
Dear DCDuring, I didn't catch what is "outside principal namespace"? -- Andrew Krizhanovsky (talk) 18:17, 1 May 2013 (UTC)
- Don’t thank me yet. This is a proposal on the table, but we haven’t moved forward yet.
- DCD wonders if you will parse Appendix:, Wiktionary:, Talk:, or pages in other namespaces. —Michael Z. 2013-05-01 18:45 z
- OK, now I am parsing only main namespace. -- Andrew Krizhanovsky (talk) 21:12, 1 May 2013 (UTC)
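The two label styles discussed above, the explicit {{context|something}} or {{label|something}} form and the open set of {{something}} templates, could be handled by a parser roughly as follows. This is only a sketch and not the actual Wiktionary parser: the function name, the sample label set and the handling of named parameters such as lang=en are assumptions, and nested templates are not handled.

```python
import re

# Matches a simple, non-nested template such as {{context|mathematics}}.
TEMPLATE_RE = re.compile(r"\{\{([^{}]*)\}\}")

# Hypothetical sample of the open label set; the real list is said to
# contain about 1001 labels and would have to be maintained by hand.
KNOWN_LABELS = {"mathematics", "skateboarding", "dialectal"}

def extract_context_labels(wikitext, known_labels=KNOWN_LABELS):
    """Return the context labels found in a definition line, in order."""
    labels = []
    for match in TEMPLATE_RE.finditer(wikitext):
        parts = [p.strip() for p in match.group(1).split("|")]
        name, args = parts[0], parts[1:]
        if name in ("context", "label"):
            # Explicit form: every positional argument is a label.
            # Named parameters (assumed to look like lang=en) are skipped.
            labels.extend(a for a in args if a and "=" not in a)
        elif name in known_labels:
            # Open form: the template name itself is the label.
            labels.append(name)
    return labels
```

The explicit form needs no lookup table, which is the advantage Michael Z. points to; the open form only works for templates already present in the hand-maintained set.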
Someone broke something here. Why is a category "Regional context labels Armenian" appearing in e.g. ճղոպուր (čłopur)? --Vahag (talk) 12:18, 3 May 2013 (UTC)
- It's wiki-magic :( I don't understand how it happens. -- Andrew Krizhanovsky (talk) 08:00, 4 May 2013 (UTC)
Dialects (Context labels)
There are four templates: {{dialect}}, {{dialectal}}, {{dialectal-n}} (not used now!), {{dialects}}. Is there any difference between these templates? -- Andrew Krizhanovsky (talk) 04:30, 2 May 2013 (UTC)
- {{dialect}}, {{dialectal}}, and {{dialects}} all categorise an entry into Category:Language name dialectal terms, but they display different text in contextual descriptions, according to the name of the template. I don't know about {{dialectal-n}}. I'm so meta even this acronym (talk) 12:20, 2 May 2013 (UTC)
- OK. Thank you. -- Andrew Krizhanovsky (talk) 13:19, 2 May 2013 (UTC)
- You're welcome. :-) I'm so meta even this acronym (talk) 21:47, 2 May 2013 (UTC)
- {{context-n}} looks like 'context new' to me. It seems to do the same job but without using brackets. Am gonna rfd it. Mglovesfun (talk) 10:45, 3 May 2013 (UTC)
Yorubic (Religion context labels)
There are no entries with religion context label {{Yorubic}}. Should it be kept or deleted? -- Andrew Krizhanovsky (talk) 16:15, 8 May 2013 (UTC)
- Nominated for deletion. Mglovesfun (talk) 15:05, 16 May 2013 (UTC)
board sports vs. skateboarding
Must we merge two Sport context labels templates: {{board sports}} and {{skateboarding}}? There are only 3 entries with template "board sports" (see list). -- Andrew Krizhanovsky (talk) 07:48, 14 May 2013 (UTC)
- I think so, unless board sports is meant to include surfing and snowboarding. —Michael Z. 2013-05-16 14:46 z
- OK, I see. -- Andrew Krizhanovsky (talk) 10:44, 17 May 2013 (UTC)
video games vs. video game genre
Must we merge: {{video games}} and {{video game genre}}? There are only 4 entries with template "video game genre". -- Andrew Krizhanovsky (talk) 08:20, 14 May 2013 (UTC)
Template:mathematics
Must we move {{mathematics}} from Category:"Topical context labels" to Category:"Mathematics context labels"? -- Andrew Krizhanovsky (talk) 07:56, 14 May 2013 (UTC)
- We have never achieved consensus on how to consistently treat such labels. Originally, we only had "context labels" that indicated limited usage contexts. Then we introduced large-scale use of topical context labels without differentiating cleanly the four cases:
- a term-definition is widely used and understood, but clearly has a topic associated, eg, sum has a definition that belongs to the "topic" 'arithmetic'.
- a term-definition has a topic associated, but is only understood and used in a narrow, usually technical context, eg, affine transformation.
- a term is used by a technical community, but the subject matter is not limited to that community. Examples might be military slang for a civilian.
- widespread use, no specific topic. We act as if most definitions are of type 4, without having defined what "no specific topic" might mean. Most function words would be of type 4. Presumably also most basic verbs.
- Topical context labels should apply to types 1 and 2. Usage context labels should apply to types 2 and 3. Ruakh has defended the use of topical context labels, even for words that are widely understood, presumably including sum (type 1). MZajac has advocated more or less banning the use of topical context labels where the topic did not also provide a usage context (ie, type 1). I don't think there is a consensus. One can find dictionaries that seem to follow either. But I have yet to find a print or online dictionary that seems to impose topical labels on all the term-definitions that could "logically" be assigned to a topic.
- If you look at the topical category Category:en:Arithmetic, you will find several terms that are widely understood and used, but not sum which clearly has a definition that belongs in that topic.
- IOW, this is a can of worms. But it can be swept under the carpet again. DCDuring TALK 11:57, 16 May 2013 (UTC)
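As a rough illustration of the four cases above, the rule "topical labels for types 1 and 2, usage labels for types 2 and 3" can be written down as a small function. Reducing the four types to two yes/no properties (has an associated topic, is restricted to a narrow usage context) is one reading of the list, not an agreed policy, and the function name is hypothetical.

```python
def applicable_labels(has_topic, restricted_usage):
    """Return the kinds of label that would apply to a term-definition.

    Assumed reading of the four types:
      type 1: topic, widely used            -> topical only
      type 2: topic, narrow technical usage -> topical and usage
      type 3: no single topic, used by a
              technical community           -> usage only
      type 4: widespread, no specific topic -> no label
    """
    labels = set()
    if has_topic:
        labels.add("topical")
    if restricted_usage:
        labels.add("usage")
    return labels
```

Under this reading, sum (type 1) would get only a topical label and affine transformation (type 2) would get both, which is exactly the point on which the positions attributed to Ruakh and MZajac differ for type 1.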
- I’m not sure I am clear on the four use cases. Is no. 4 the case where no label is normally applied?
- I think I would add one other, perhaps overlapping with any of 1 to 4: 5. a term whose technical meaning is prescribed by some authority and accepted in its field of usage, even though it may not be easily attested by citations. Examples may include technical or legal definitions. —Michael Z. 2013-05-16 17:12 z
- "Is no. 4 the case where no label is normally applied?" Yes, exactly. Does that make the rest of it clearer?
- "meaning is prescribed by some authority and accepted in its field of usage" That's an interesting situation - not uncommon - that adds an additional dimension, ie, another four types. DCDuring TALK 17:20, 16 May 2013 (UTC)
- DCDuring, we're talking about the categorization of the template, not how it is used. Context labels can be divided into subcategories using |tcat=foo. There's really no consensus over whether to do this or not, and pretty much nobody cares. I started doing it a few years ago and realized there are many more important things I could do. Mglovesfun (talk) 10:48, 17 May 2013 (UTC)
Template:game theory
Must we move {{game theory}} from Category:"Games context labels" to Category:"Mathematics context labels"?
See w:Game theory. -- Andrew Krizhanovsky (talk) 08:27, 14 May 2013 (UTC)
Template:element symbol
This template prints the text "(chemistry)". It would be reasonable to move it from Category:Context labels to Category:Science context labels. -- Andrew Krizhanovsky (talk) 10:48, 16 May 2013 (UTC)
- Agreed.
Done —Μετάknowledgediscuss/deeds 02:41, 17 May 2013 (UTC)
Latin
The template {{la-proper noun-indecl}} belongs to Category:Grammatical context labels. I think it is an error, because:
- context labels are not bound to a specific language (Latin here),
- In two entries (Adam#Latin and Abraham#Latin) this template is located in an unusual place (for context labels).
The template {{la-conj-form-gloss/iacio/context6}} is also a strange context label template, which is not used in any entries. -- Andrew Krizhanovsky (talk) 11:03, 16 May 2013 (UTC)
- Technically, it is a grammatical label, although usually we build these into our declension templates. A weakness of our whole labelling system is that sometimes a label belongs on the headword, but it is technically and visually awkward to put it there. (Also, “grammatical context label” is a nonsense phrase demonstrating that we should stop calling usage and grammatical labels “context labels.”) —Michael Z. 2013-05-16 15:02 z
- It's because {{indecl}} was called by this template; now it isn't, it uses {{qualifier|indeclinable}}. Mglovesfun (talk) 15:03, 16 May 2013 (UTC)
Sursilvan and Surselvisch
Templates {{Sursilvan}} and {{Surselvisch}} print the same "Sursilvan" and there are no links to the latter template. -- Andrew Krizhanovsky (talk) 06:51, 23 May 2013 (UTC)
- They also both default to categorising words as Romansch... so I've gone ahead and hard redirected {{Surselvisch}} to {{Sursilvan}}. - -sche (discuss) 07:37, 23 May 2013 (UTC)
Japanese Romanization
Although I know this is an old topic and something that had already been discussed and decided on back in 2006, I think there was a MAJOR point that wasn't discussed. Most keyboards (at least in America) do not have keys for the extra symbols used in Hepburn romanization, which would make searching difficult. Thus, it makes more sense for words like どうじょう to be romanized as dojo (which is what most people are accustomed to seeing, as it is the form used in most mainstream publications) and to include doujou in the article (which is how you would input it using romaji in Microsoft IME) as well as dōjō (which I've only seen in Google Translate). In addition, it's a hassle for me, and probably others who work on these entries, to manually input the macrons when creating links to romanizations.
In short, what I propose is that the vowel romanizations be mainstream and keyboard-friendly, with the alternate romanizations mentioned (and even hard redirected from), but not linked to. I know that this would lead to some words linking to the same romanization, but the different words can be distinguished on the romaji page, which would add actual functionality to said pages. As it stands right now, they are simply soft redirection pages that serve no real purpose.
Example:
Romanization
dojo (hiragana: どうじょう)
I know that this would require a LOT of work, but I think it will make the Japanese romanization entries more search-friendly and help to bring purpose to those pages beyond a simple redirect. --Soardra (talk) 20:13, 5 May 2013 (UTC)
- I don’t disagree, but I’d like to point out that searching doesn’t appear to be an issue to me. OMM searching for plain dojo finds macronized dōjō on this page, in the search field’s suggestions, in the search results, and in google. —Michael Z. 2013-05-06 00:14 z
- All or most Roman diacritic symbols are not an issue in the Wiktionary or Google search. Adding additional transliterations would add an overhead on editors but eventually there could be a module or a template that does it automatically. I personally don't see the need to add new outdated transliterations.
- As a side-note, Wiktionary transliteration is slightly tweaked and follows the trend of popular dictionaries and practical needs. (The Hepburn standard has various versions as well.) Kana combinations like おう ("o + u") are transliterated as "ō" when it's a long sound (お父さん (おとうさん - otōsan)) or "ou" when it's a verb form (思う (おもう - omou)). いい (ii) can be either "ī" or "ii". Adjective endings are always "ii". Other long vowels are consistently romanised with a macron: ā, ē, ū. Particles "は", "へ" and "を" are "wa", "e" and "o", not "ha", "he" and "wo" as when you type them in Microsoft IME. Microsoft IME is only needed when you need to enter a Japanese text on a computer. For this you need to know simple rules for which input corresponds to which kana letter (before converting to kanji), but there are variants, like tu = tsu, hu = fu, etc. --Anatoli (обсудить/вклад) 02:39, 6 May 2013 (UTC)
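The point made above, that a plain query such as dojo still finds the macronized dōjō, can be illustrated with a minimal Python sketch. This is only an assumption about how diacritic-insensitive matching can work in general, not a description of what the Wiktionary or Google search engines actually do; the function names are hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with combining marks (such as macrons) removed."""
    # NFD splits "ō" into "o" plus a combining macron (U+0304).
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches_plain_query(query: str, title: str) -> bool:
    """True if query and title are equal once diacritics and case are ignored."""
    return strip_diacritics(query).casefold() == strip_diacritics(title).casefold()
```

Under this sketch, "dojo" matches "dōjō", while the IME-style spelling "doujou" does not, since it differs in letters rather than only in diacritics.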
- Hmm, I guess I assumed that you wouldn't be able to search for it using a standard keyboard. I see that this assumption was wrong, but I do, however, still think that there should be centralized romanization pages with variations listed or the current romanizations should be hard redirects to hiragana entries since the most recent ruling turned them into redundant soft redirects. --Soardra (talk) 03:23, 6 May 2013 (UTC)
- I see no problem in creating hard redirects from nonstandard romanisations to standard ones (if terms don't exist in other languages). Soft redirects are not redundant, since there are variant Japanese spellings and the romanisations also happen to be words in other languages. Nobody wished to convert them to hard redirects and the new ruling is the result of long discussions and a consensus. You can put your proposal in Wiktionary_talk:About_Japanese about the romanisation pages. The number of links needed to get to full Japanese entries just seems to be growing: "non standard Roman spellings" -> standard romaji -> katakana/hiragana -> kanji (if exists). Do we really need that many steps? JA editors are better off focusing on the Japanese language, not on all possible misspellings (with various numbers of spaces) in a script which is not used by the Japanese. --Anatoli (обсудить/вклад) 04:03, 6 May 2013 (UTC)
- I think there are a couple different things at work here. Some of the alternate spelling conventions that Soardra mentions are non-standard in terms of both “not according to [official] standards” and “not according to Wiktionary’s selected standards”. For instance, spelling Japanese long vowels without indicating the length in some way (such as dojo for 道場 dōjō) is common in writings by folks who 1) aren't that specific and/or knowledgeable about Japanese and 2) can't be bothered to deal with diacritics / spellings. Neither of these considerations are appropriate for a dictionary, and given that vowel length is phonemic in Japanese, we really shouldn't be including such spellings at all, provided that these common spellings can still be used to find the proper entries -- and this does appear to be the case, thankfully.
- Spelling Japanese in Latin letters in a manner similar to input for the Japanese IMEs provided by Microsoft, Apple, and the various Linux communities works fine for input, but is inconsistent with usage by any English-based learning materials I've seen, and with any official governmental romanization scheme, and also with any academic romanization scheme. One might see Japanese romanized in this fashion, and it's common enough that it has its own moniker in Japanese (ワープロ式 wāpuro-shiki, "word-processor style"), but again, I don't think this romanization scheme has any place in a dictionary (other than the term itself).
- In terms of what we use here, that's explained at Wiktionary:About_Japanese/Transliteration, pointed to from WT:AJA#Transliteration. Perhaps the nub of the real issue here is that we don't make WT:AJA prominent enough for new users?
- (Then again, Wiktionary:About_Japanese/Transliteration is rather horribly out of date and does not describe either what we do here or what I've perceived as general practice for Japanese romanization schemes in general -- it is quite badly in need of a rewrite. Examples: we try to avoid hyphenation in most cases, using spaces instead, and we split suru verbs with a space before the suru, among other issues. I'll set to reworking that as time allows.)
- More general background information is available at w:Romanization of Japanese.
- -- Eiríkr Útlendi │ Tala við mig 00:47, 13 May 2013 (UTC)
- @Mzajac. Eirikr has answered in the first paragraph. I have no strong opinions on redirects from non-standard transliteration but I lean on not to have them.
- @Eirikr. Yes, the page needs rewriting. I actually don't think it's either our practice or other dictionary practice to romanise long vowels as "aa", "ee", or "ii" (unless they are different symbols or part of the inflected adjective form (い-adjectives)). Niigata (not Nīgata) is another notable exception, perhaps for historical reasons. I don't know why 新 in 新潟 is read "nī". Somehow, the exception is made for long "ō" and "ū", very strange, even if "ō" is more controversial: おお (oo) vs おう (ou) (with well-known exceptions - verb endings - おもう (omou) or separate stems - こうま (kouma)). I don't remember the outcome of the last discussion but I consistently use macrons, if they are not exceptions as above. I'm sure one of the versions of Hepburn standards supports this, anyway. We can discuss details on Wiktionary talk:About Japanese/Transliteration --Anatoli (обсудить/вклад) 01:09, 13 May 2013 (UTC)
- @Anatoli, 新 (nī) is an OJP-derived prefix attaching to nouns, meaning "new" or "fresh" or "first", or some variation thereof. The 新#Japanese entry is much in need of expansion. FWIW, I think the romanization Niigata uses the doubled-"i" for historical reasons, making this an(other) exception to the rule. :) -- Eiríkr Útlendi │ Tala við mig 03:40, 13 May 2013 (UTC)
Golin, a Papuan language
Just an observation that gĺ is the only Golin word that we have listed. It might be amusing if it wasn't sad. Pengo (talk) 02:22, 7 May 2013 (UTC)
- Okay? I can't find a list of counts of entries by language, but looking at Category:Nouns by language and Category:All languages, there's no more than 1,500. Given that there's 5,000 living languages, Golin's doing better than many other languages. Moreover, I'm not sure of the value of adding vocabulary that no one is going to look up; the few people who know this language and have Internet access probably own a copy of the existing dictionary, of which we would probably be no more than a pale copy. If that's what you want to do, more power to you, but I don't think departments of anthropology are going to reference Wiktionary.--Prosfilaes (talk) 12:01, 7 May 2013 (UTC)
- I notice that the pronunciation section of nil kabe is n'l kabé. Aren't pronunciations supposed to be in IPA (or SAMPA)? - -sche (discuss) 05:22, 8 May 2013 (UTC)
- Have fixed it up. Not sure if/how the tones should be incorporated into the IPA though, so I've kept it as a separate "tone guide". (source material) Pengo (talk) 07:57, 8 May 2013 (UTC)
Translations lacking transliteration categories
After testing a change to {{t}} (thanks to CodeCat and Yair rand) I have created categories for 36 selected languages so far Category:Translations which need romanization (each category has a subcategory): Abkhaz, Adyghe, Arabic, Armenian, Bashkir, Belarusian, Bengali, Bulgarian, Burmese, Chechen, Georgian, Greek, Hebrew, Hindi, Japanese, Kannada, Kazakh, Khmer, Korean, Kyrgyz, Lao, Macedonian, Malayalam, Mandarin, Mongolian, Ossetian, Persian, Russian, Sinhalese, Tajik, Tamil, Tatar, Telugu, Thai, Ukrainian, Yiddish. I've taken out Mandarin from the template, since translations into traditional version are usually not supplied.
New translations into above languages using {{t}} without transliterations are added immediately, also any edits on English articles with translations will cause the translations to be picked up. It takes some time for categories to be filled for older translations. Please don't add languages you're not going to work with. To add a new language some change to {{t}} is required and two new categories. Any help in adding transliterations is appreciated. --Anatoli (обсудить/вклад) 00:09, 9 May 2013 (UTC)
Special:Contributions/MewBot
CodeCat (talk • contribs) via her bot MewBot is going way too far in my opinion. Violating Wiktionary:Bots#Policy by making controversial edits en masse, such as orphaning {{inv}} before the end of the deletion debate (where it looks very likely to pass as well), removing genders from {{fr-adj-form}} such as this and converting {{m|p}} to {{m-p}} (has this been discussed anywhere or is this a CodeCat personal project?)
I'm well aware of all the good work done by CodeCat, and I understand she has very specific ideas about how she wants Wiktionary to progress, but some of these ideas are controversial and shouldn't be implemented in this way. Mglovesfun (talk) 18:49, 9 May 2013 (UTC)
- About the orphaning of {{inv}}, I kind of did expect someone to speak up about that. But I reasoned, if there had been no deletion debate, I still would have made these changes wherever I found them to bring it in line with Wiktionary practice, and nobody would have minded. I thought, if nobody is going to complain about many incidental edits spread over time, it would be strange if it wasn't also ok to do it all at once. I found it a bit strange that people were saying keep based on the template's widespread usage, which came across as circular reasoning ("we agree with using it, because it's widely used"). I hoped that if it became less widely used, people would judge the template more on its merits and not on its current usage.
- Removing genders from adjective forms was done because the gender information is already in the definition, so duplicating it seemed a bit strange.
- As far as I know, the format of genders was discussed before, in particular in regards to Module:gender and number. I don't remember when exactly but it was shortly after Lua was introduced, and the module has been around for a while now, steadily increasing in usage as more modules make use of it. —CodeCat 19:25, 9 May 2013 (UTC)
- I haven't really kept in touch with policy discussions as of late but I certainly can't remember anything about genders being discussed, and so it should never have been processed by bot.
I think I've stated it before but: we've known each other since 2006, and I know that CodeCat is usually very reasonable, but it seems that there are occasional issues with changes being performed with no consensus behind it. I am confident that CodeCat does not have any bad intent and just needs to be told to take a bit more time with changes. We're not in a hurry or anything. -- Liliana • 21:13, 9 May 2013 (UTC)
First-person singular imperative (Portuguese)
All our Portuguese verb conjugation templates include a first-person singular (affirmative and negative) imperative. I have programmed the bot accordingly. A native Portuguese speaker (ValJor) has pointed out that no such thing exists. This sounds reasonable to me. Shall I modify all the templates (and my bot) accordingly? SemperBlotto (talk) 10:34, 10 May 2013 (UTC)
- Hmm, does Portuguese have a different form for giving oneself orders? I will occasionally shout orders at myself out of frustration using the imperative in English. Does Portuguese have a different form, a non-second person form? Mglovesfun (talk) 10:39, 10 May 2013 (UTC)
- I'm only a pt-1, but if it is similar to Italian, there should not be a first-person singular imperative. I believe that Italians use the third-person when encouraging themselves. SemperBlotto (talk) 10:45, 10 May 2013 (UTC)
- It doesn’t exist. It appears that the person who created {{pt-conj}} invented it. See Talk:cantar, Wiktionary:Beer parlour archive/2012/April#First-person Singular Imperative of Portuguese Verbs and WT:T:APT#Banning first-person imperative.
- MG: Portuguese uses the subjunctive present when giving the first person singular and the third person orders. — Ungoliant (Falai) 12:03, 10 May 2013 (UTC)
- OK. I'll make sure my bot doesn't create any more. Then I'll update the five thousand Portuguese conjugation templates. SemperBlotto (talk) 07:11, 11 May 2013 (UTC)
Module:gender and number
I have worked on this a bit more, and it now supports everything that our templates do, and a bit more as well. This module is already fairly widely used, not just in modules I made, either... others have used it as well. But a few people were wondering about this module and how it works, so I wrote some documentation for it to explain it, and I am now "introducing" it. I think this module can replace the current templates like {{m}}, at least as far as other templates go. If someone writes {{m}} in an entry directly, it can't be replaced, but we could change {{m}} and such themselves so that they use this module rather than the current wiki code.
There is a rather strong point to note, though. There is a slight incompatibility between this module and the templates in the way we have traditionally denoted combinations of gender and number. If you write {{m|p}}, then the templates will "know" not to display a separator between the two, it "knows" that both form a single gender specification. But in the module, m|p means "masculine or plural", and you need to write m-p instead to get the combination. This is done to keep things simpler but it also has another purpose. Gender specifications like m|f|p are ambiguous. Does it mean masculine (singular), feminine (singular) and plural (all genders)? Or does it mean masculine (singular) and feminine plural? Or masculine plural and feminine plural? The older scheme does not distinguish this, while the module does. The difference can be significant as well. Dutch, German and Swedish for example do not have a "masculine plural" or "neuter plural", only a generic plural for all genders, so for them, "masculine or plural" makes sense while "masculine plural" does not. On the other hand, French or Spanish do have "feminine plural", so then "plural" on its own isn't a valid gender. For that reason I created a set of new combined templates like {{m-p}}, and started to add them where appropriate. A few people have complained that this wasn't discussed properly, and I kind of agree, so I am mentioning it here now.—CodeCat 15:41, 10 May 2013 (UTC)
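The distinction described above, between a list of alternatives ("masculine or plural") and a hyphenated combination ("masculine plural"), can be sketched in Python. This is a hedged illustration only, not the code of Module:gender and number; the code tables and function names below are assumptions made for the example.

```python
from typing import Iterable

# Assumed code tables for illustration; the real module may use other codes.
GENDER_NAMES = {"m": "masculine", "f": "feminine", "n": "neuter"}
NUMBER_NAMES = {"p": "plural"}


def describe_spec(spec: str) -> str:
    """Describe one specification: 'm', 'p', or a combination like 'm-p'."""
    words = []
    for part in spec.split("-"):
        if part in GENDER_NAMES:
            words.append(GENDER_NAMES[part])
        elif part in NUMBER_NAMES:
            words.append(NUMBER_NAMES[part])
        else:
            raise ValueError(f"unknown code: {part!r}")
    return " ".join(words)


def describe_list(specs: Iterable[str]) -> str:
    """Each list item is a separate alternative, joined with 'or'."""
    return " or ".join(describe_spec(spec) for spec in specs)
```

With this sketch, `describe_list(["m", "p"])` reads "masculine or plural", while `describe_list(["m-p", "f-p"])` reads "masculine plural or feminine plural", which is the ambiguity that a bare m|f|p cannot resolve.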
- Neat. While making perfect sense within our template system, it also manages to elegantly use the vertical bar symbol (|) here with the meaning familiar to programmers of "or" (e.g. m|p means "masculine or plural") Pengo (talk) 22:02, 11 May 2013 (UTC)
- Oh, that was actually my own shorthand. The module itself is indifferent to the method of separating the individual genders from each other, because it receives them already split, in the form of a list. The module can also be invoked from a template, like this:
{{#invoke:gender and number|show_list|m|f}}
- This will display m, f currently. This means that any code that uses this module will have to perform the split itself. This is intentional, because there is currently a wide variety of different ways to specify multiple genders. {{head}}, {{l}} and {{nl-noun}} use a g2= parameter, {{t}} uses additional unnamed parameters, while {{fr-noun}} uses mf and then "interprets this" accordingly. It would actually not be a very good idea to use "|" as the separator to separate multiple specifications in a single string, because that would interfere with how templates interpret that character and you'd end up having to use {{!}} all the time. If we do decide to use single strings for multiple genders in this module, a comma would probably be a better choice. —CodeCat 22:29, 11 May 2013 (UTC)
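If a single comma-separated string were ever adopted, as floated above, the split that calling code has to perform could look roughly like this Python sketch. The function names are hypothetical and this is not how the module or any existing template works; it only shows the split-then-display idea.

```python
def split_genders(raw: str) -> list:
    """Split a string such as 'm, f-p' into ['m', 'f-p'], dropping blanks."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def show_list(specs: list) -> str:
    """Join already-split specifications for display, e.g. 'm, f'."""
    return ", ".join(specs)
```

A comma avoids the clash with the template pipe character, so no {{!}} escaping would be needed in the calling wikitext.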
- If no one objects, I would like to replace the remaining occurrences of {{f|p}} and such with {{f-p}}, so that we can then look at migrating our templates to this module. —CodeCat 10:32, 15 May 2013 (UTC)
Using talk pages for RFV, RFD, Etymology Scriptorium and Tea Room
I realise that we've tried this before, but I'm not sure why it failed exactly. What I also wonder is why it seems to work better on Wikipedia. Keeping the discussions on the talk pages would have several advantages:
- Things are kept in the place where they are the most relevant.
- The discussions wouldn't be forgotten or missed once they are no longer at the bottom of the page.
- Archiving becomes much easier (which is what I like about Wikipedia's method).
I am wondering what exactly would be needed to make this work. The major downside of talk pages is that any edits to them go unnoticed by the large majority of editors, so keeping things in a centralised place would be good. That's why we have the discussion rooms. Doesn't Wikipedia have bots that automatically add new discussions to the list? —CodeCat 13:13, 12 May 2013 (UTC)
WT:Families
Hey all, just thought I'd make it publicly known (more so) by posting here that I made a few changes to this Families page. It's nothing controversial, I hope; the change I am highlighting can be seen here. It's mainly to go with the fact that I made {{etyl:ngf-sbh}} to replace {{etyl:South Bird's Head}}. User: PalkiaX50 talk to meh 14:30, 12 May 2013 (UTC)
- I'm glad we have a 'regularly-formed exceptional code' for South Bird's Head now, if that's not too much of an oxymoron. :) I've moved the explanation of how to create codes for subfamilies whose superfamilies have codes (and the example, ngf-sbh) into the previous paragraph. I meant to include such a line there when I overhauled the page last year, but as you can tell, I forgot (or did a bad job of it); that's why "For example, the Pama-Nyungan family is aus-pam: "aus" is the ISO 639-5 code for Australian languages" was sitting around after the bit about Germanic for no apparent reason, lol. - -sche (discuss) 17:40, 12 May 2013 (UTC)
- Cool, thanks for that. User: PalkiaX50 talk to meh 18:49, 12 May 2013 (UTC)
Is this code used in HTML lang attributes? If we are just making up language codes, then let’s make up ones that won’t break our web pages.
HTML5 requires a lang attribute to contain a valid language code.[16] ngf-sbh is not valid. ngf-x-sbh would be a valid language tag, as a private-use extension.[17] —Michael Z. 2013-05-16 02:35 z
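The suggested rewrite from ngf-sbh to ngf-x-sbh can be sketched as a small Python helper. It assumes the input is a Wiktionary-style exceptional code of the form "family code, hyphen, made-up suffix"; it is not a general language-tag validator and would mangle real tags such as script or region subtags. The function name is hypothetical.

```python
import re

# Private-use subtags are 1 to 8 alphanumeric characters each.
_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def to_private_use_tag(code: str) -> str:
    """Turn an exceptional code like 'ngf-sbh' into 'ngf-x-sbh'."""
    primary, sep, rest = code.partition("-")
    if not sep:
        return code  # plain code with no made-up suffix
    suffixes = rest.split("-")
    if suffixes[0].lower() == "x":
        return code  # already marked as private use
    for subtag in suffixes:
        if not _SUBTAG.fullmatch(subtag):
            raise ValueError(f"not usable as a private-use subtag: {subtag!r}")
    return "-".join([primary, "x", *suffixes])
```

Under that assumption, aus-pam would likewise become aus-x-pam, and a bare code such as aus is returned unchanged.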
map of American English dialects
Those interested in American English dialects may find this large, detailed map of dialects and their features interesting: [18]. - -sche (discuss) 19:17, 12 May 2013 (UTC)
- Thanks. It seems quite good. There's still a bit more to do. For example, I think that Chicago has a dialect distinct from its surrounding communities, just as Pittsburgh, New York, New Orleans, Cincinnati, and San Francisco do. DCDuring TALK 21:03, 12 May 2013 (UTC)
- I have a challenge to anyone I meet online for them to guess where I grew up based on my accent (hint: it's in the United States). I will give narrow IPA transcriptions to the best of my abilities for any words requested, answer vocabulary questions, and if necessary, record audio. Anyone who wants to try can have a crack at it on my talkpage or by emailing me! —Μετάknowledgediscuss/deeds 21:54, 12 May 2013 (UTC)
- I find it interesting that the map indicates no cot-caught merger in Texas, while w:Texan English says (attributed to what is apparently a reliable source) "The cot-caught merger is found almost everywhere in Texas." Who to believe? —Angr 22:10, 12 May 2013 (UTC)
- UPenn has a map of just that merger, which seems to suggest the two words are distinct in southern Texas, merged in northern Texas (with a split similar to that which Aschmann marks between Inland and Lowland). The merger has also spread over time, so it's possible the different information comes from different times. - -sche (discuss) 02:42, 13 May 2013 (UTC)
small template idea
Would it be acceptable to make any declension templates for displaying definite and indefinite articles with nouns? It is probably better suited for more inflected nouns, though. I bring this up because the German Wiktionary has something like this, for any languages that have articles. --Æ&Œ (talk) 20:26, 15 May 2013 (UTC)
- Can you make an example of what you want? Like a table or something. — Ungoliant (Falai) 03:22, 16 May 2013 (UTC)
Standard spelling of
I noticed something on [[licence]] which I think could be used more widely when handling US/UK/India/etc spellings: the use of Standard spelling of rather than Alternative spelling of. Obviously, a dedicated template would be preferable to the {{form of|...}} that licence uses at the moment, but what do you think of the general idea?
{{alternative spelling of}} would still be used when spellings are equally standard within the same dialect(s), e.g. aarrghh vs argh. {{standard spelling of}} would only be used in entries like disfavor, which is not merely an "alternative" to [[disfavour]] that some people in the US use, but the standard US spelling. In those entries, "standard" would be more accurate and less likely to be misinterpreted — as someone commented earlier, we don't mean "alternative" as a value judgement, but some people perceive it as one, and either (as non-native speakers) go away thinking the lemma is preferred or (as native speakers) get upset that their variant has been "slighted".
We could even use parameters like {{standard spelling of|foo|in=US|in2=Australian}}, rather than context labels, to effect display of Standard US and Australian spelling of and sort entries into Category:American English standard forms (or just Category:American English) etc, with reciprocal qualifier-like templates on the lemmata—like {{British spelling}}, except displaying (British spelling) rather than just (British) so as not to imply the sense that followed was what was restricted to the UK—to sort the lemmata into Category:British English standard forms/Category:British English.
For sets of spellings in which one entry has already been lemmatised (e.g. disfavour, disfavor), we should keep the status quo; for sets where there isn't a lemma yet and content is currently duplicated (color, colour), we could make the oldest entry the lemma, rather like WP does.
Thoughts? - -sche (discuss) 21:45, 17 May 2013 (UTC)
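The display text that the proposed in=/in2= parameters would produce can be sketched in Python. The template does not exist yet, so everything here (function name, argument order, how three or more regions are joined) is an assumption made for illustration.

```python
def standard_spelling_label(target: str, regions: list) -> str:
    """Build text such as 'Standard US and Australian spelling of foo'."""
    if not regions:
        return f"Standard spelling of {target}"
    if len(regions) == 1:
        joined = regions[0]
    else:
        # Assumed style: commas between all but the last two regions.
        joined = ", ".join(regions[:-1]) + " and " + regions[-1]
    return f"Standard {joined} spelling of {target}"
```

For the example in the proposal, `standard_spelling_label("foo", ["US", "Australian"])` yields "Standard US and Australian spelling of foo".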
- We wouldn't need this so much if we marked all the spellings that are less common currently so that we could leave the standard one unmarked. But that is completely unrealistic, at least for the next six months. Thus, if we can improve a set of entries that are alternative spellings of a single underlying term by marking the standard one using {{standard spelling}}, we should do so.
- I can't support categories at this time, because they would be completely misleading for at least a "six-month" transition period until all English spellings were properly marked. We would only have to mark about 2,500-3,000 a day to get this done in six months for English and about 16-19,000 a day to get this done for all languages. DCDuring TALK 23:48, 17 May 2013 (UTC)
- DCDuring, it seems like you've completely misunderstood the entire post. Please read it through again. —Μετάknowledgediscuss/deeds 01:48, 18 May 2013 (UTC)
- No, thanks. DCDuring TALK 02:12, 18 May 2013 (UTC)
- OK then. Assuming that I am the one who has completely misunderstood, care to explain what I got wrong? —Μετάknowledgediscuss/deeds 04:20, 19 May 2013 (UTC)
- @Metaknowledge. How could I know that?
- Perhaps I should have asked more questions about the proposal. Most of my concerns were with the categorization, which would have to be either complete or very well explained to be useful. As I am very skeptical about users reading and understanding our categorization criteria, which are rarely (never?) documented, at best subjective and unsubstantiated, and often whimsical, I focused on completeness.
- Though you didn't say why, you also expressed opposition to categorization. DCDuring TALK 12:32, 19 May 2013 (UTC)
- @-sche: I strongly support all of it except creating new categories. —Μετάknowledgediscuss/Special:Contributions/Metaknowledge 01:48, 18 May 2013 (UTC)
- AFAICT, you both understood the proposal.(?) I've started switching a small number of {{alternative form of}}s to Standard form (pending the creation of a dedicated {{standard form of}} template). I won't create new categories. We could still use the existing categories (Category:American English, etc), but I won't do that without further discussion. - -sche (discuss) 15:51, 19 May 2013 (UTC)
Migrating towards Module:languages
By way of experiment, I have changed {{languagex}}, {{derivcatboiler}} and a few other templates to use Module:language utilities (which is a "gateway" to Module:languages) instead of the traditional language code templates. From what I've seen, this move hasn't broken any more than a handful of pages (which I fixed), and it seems like it was rather easy. I noticed that some of our current templates have already become completely orphaned through this change, including most (if not all) of the proto-language code templates. I expect that the /family subtemplates will also end up orphaned once the software has worked its way through the queue.
So I would like to ask if it's ok to continue with the migration, by changing all remaining uses of the language code templates and their subpages to use the Lua module instead. —CodeCat 19:31, 18 May 2013 (UTC)
- I don't think the module has been updated to reflect the changes in the meantime to various /script and /family subtemplates. Unless I'm wrong, can you deal with that first? —Μετάknowledgediscuss/deeds 04:22, 19 May 2013 (UTC)
- I fixed those yesterday. I edited {{langt}} to check whether the two match, and add the code template to Category:Language codes with desynchronized data if they do not. I don't know how often that category updates, so it's possible that more changes were made in the meantime that are not shown in the category. I suppose that's one reason why we should do this sooner rather than later. —CodeCat 11:48, 19 May 2013 (UTC)
- I've been working on deleting the /names subtemplates. They were never really used for anything to begin with. For the next step I would like to orphan and delete the /family subtemplates. This will be more work because there are a lot more of them, and there is a chance that some of them are still being transcluded. So I propose the following "plan":
- Edit {{langt}} to categorise all language code templates that currently have a /family subtemplate into Category:language code templates with family.
- Go over each of those templates with a bot, checking the /family subtemplate for transclusions. If there are no transclusions, replace the contents of the /family template with [[Category:family subtemplates to be deleted]]. That category will then need to be emptied out by some means.
- If any codes remain in Category:language code templates with family after this is complete, those need to be orphaned manually.
- Is this ok? —CodeCat 14:28, 21 May 2013 (UTC)
- You are preserving the contents of the templates' /names pages in the module, yes? Some of them were used / contained content. - -sche (discuss) 17:22, 21 May 2013 (UTC)
- Yes, their contents were moved over to the module so the names were not lost. I actually think that we can use that information in the future to make things like {{langrev}} automatically generated. It would also be useful to add it to the category of each language. —CodeCat 18:05, 21 May 2013 (UTC)
- All looks good. I just wanted to point out that fixing up the langrev apparatus and Lua-ising it is also very important to me, so I hope someone will attack that soon. —Μετάknowledgediscuss/deeds 02:52, 22 May 2013 (UTC)
- I updated Wiktionary:Languages to use the names from the module now. —CodeCat 18:11, 21 May 2013 (UTC)
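For anyone trying to picture steps 2 and 3 of the plan above, here is a minimal sketch in Python. It is not the bot that was actually run: the wiki is replaced by plain dictionaries, and the function name, the page titles and the sample transclusion data are all hypothetical.

```python
# Hypothetical in-memory stand-in for the wiki; no real bot or MediaWiki API is used here.
DELETION_CATEGORY = "[[Category:family subtemplates to be deleted]]"


def plan_family_cleanup(codes, transclusions, pages):
    """Apply steps 2 and 3 of the plan to a dict of page contents.

    codes: language codes whose template has a /family subtemplate
           (the members of the tracking category from step 1).
    transclusions: maps a subtemplate title to the pages that transclude it.
    pages: maps page title to wikitext; modified in place.
    Returns the codes that still need to be orphaned manually.
    """
    needs_manual_work = []
    for code in sorted(codes):
        title = f"Template:{code}/family"
        if transclusions.get(title):
            # Still transcluded somewhere: leave the page alone (step 3).
            needs_manual_work.append(code)
        else:
            # Unused: replace the contents with the deletion category (step 2).
            pages[title] = DELETION_CATEGORY
    return needs_manual_work


# Made-up sample data for illustration only.
pages = {"Template:en/family": "gmw", "Template:fr/family": "roa"}
transclusions = {"Template:fr/family": ["Template:some-old-template"]}
remaining = plan_family_cleanup({"en", "fr"}, transclusions, pages)
print(remaining)
```

In this toy run only the untranscluded subtemplate is blanked into the deletion category, and the other code is returned for manual orphaning.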
- Just let me know whenever it becomes time to start deleting the language templates themselves ({{aaa}}, etc). For one thing, a lot of direct uses (i.e., "Ghotuo") will have to be modified. For another, I've put a lot of effort into making sure the information in the templates is up-to-date, whereas I've noticed places where the module is not up to date, so I volunteer to delete the templates by hand after cross-checking them and the module against each other, as described on your talk page. - -sche (discuss) 17:22, 21 May 2013 (UTC)
- As I noted above they were cross-checked a few days ago and the module was updated to match the templates. But if someone makes edits to the templates now, the module won't be affected of course. So for now we need to check edits to the Template: namespace regularly to see if anyone made any changes. The names and family templates are probably not used anywhere anymore, but the main (name) and the script still are, so they need to be kept synchronised. I'm not sure what to do with the direct uses. When we eventually get around to it, we can delete the templates that are orphaned, and change the remainder so that they "forward" the call to the module. That would give us some more time to work on them without having to worry about synchronisation issues. —CodeCat 18:05, 21 May 2013 (UTC)
- For the record, I oppose deleting the templates until there is something in place that mimics directly calling them. The ability to type a code on a page and have the language name displayed (in such a way that the text displayed updates automatically if the language is renamed) is too useful an ability to lose. It shouldn't be hard to make such a thing. I imagine one could use whatever bit of {{etyl}} finds out the name associated with an entered code (and strip all the other derivation-related bits). - -sche (discuss) 19:07, 24 May 2013 (UTC)
- There already is a replacement, Module:language utilities. It is a bit more verbose, but language templates should always be substed in entries anyway. —CodeCat 19:35, 24 May 2013 (UTC)
- All family subtemplates have been orphaned and marked for deletion. I've converted more of our templates to use the module, but the software needs some time to catch up because it has affected many pages. For now, we need to check Category:Pages with script errors regularly and fix any entries that appear as the changes are applied throughout the wiki. Most of the script errors so far have been the result of missing or incorrect language codes. I am hoping to get {{languagex}} and {{langprefix}} orphaned soonish. —CodeCat 16:01, 22 May 2013 (UTC)
Langrev
Note: Please do not delete or edit the langrev templates. Doing so would break quite a few important javascript tools, including the translations adder. --Yair rand (talk) 03:21, 22 May 2013 (UTC)
- {{langrev}} itself probably won't be deleted, but it may be reimplemented using a module. If your scripts rely on the presence of the subtemplates, then they should probably be fixed. —CodeCat 09:57, 22 May 2013 (UTC)
- The scripts rely on the subtemplates, not Template:langrev itself. The scripts cannot be "fixed" to work with whatever system the template is reworked to use. --Yair rand (talk) 20:26, 22 May 2013 (UTC)
- Are you sure there is no other way? Having to keep those templates around is going to be a real problem in the long run. —CodeCat 20:29, 22 May 2013 (UTC)
- The language autocomplete system of the translation adder works by pulling up a quick list of (max 3) pages beginning with "Template:langrev/" followed by however much the user has typed in the language field, along with the raw wikitext of those pages (typically two/three bytes per page), thereby getting the user's intended language name along with its language code. This is done every time the user types another character into the field. Minimal data sent and received, no page parsing, or anything remotely expensive. Suppose this was replaced by a Lua database. There are a few options: One, download the entire 0.7MB long module every time someone wants to type in a language to a javascript tool, and base autocomplete off of that. Two, set up a complex Lua module to run through the database and find language names and codes, and have the js tool send a parse request, with a Lua module in it no less, and run with the results. Option three, dump the entire database right into the script, increasing the script size gazillion-fold, and making everyone's page loads go a lot slower. Not great solutions, I think. --Yair rand (talk) 20:56, 22 May 2013 (UTC)
- Why would it not be possible for the script to call a template or module? —CodeCat 21:04, 22 May 2013 (UTC)
- Certainly it would be possible, as I mentioned above, but I suspect it would perform terribly, having to run a full parse of a probably complex module on every keystroke. --Yair rand (talk) 21:21, 22 May 2013 (UTC)
- (Also, I fail to see how keeping the templates could become a problem in the long run. --Yair rand (talk) 21:22, 22 May 2013 (UTC))
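To make the prefix lookup described above concrete, here is a rough Python sketch. The real tool is JavaScript and queries the wiki on each keystroke; in this sketch a small dictionary stands in for that request, and the function name and the sample pages are assumptions made for illustration.

```python
# Hypothetical sample of langrev subpages: page title -> raw wikitext (the language code).
LANGREV_PREFIX = "Template:langrev/"
LANGREV_PAGES = {
    "Template:langrev/Danish": "da",
    "Template:langrev/Dari": "prs",
    "Template:langrev/Dutch": "nl",
    "Template:langrev/Dzongkha": "dz",
}


def autocomplete(typed, pages=LANGREV_PAGES, limit=3):
    """Return up to `limit` (language name, code) pairs whose page title
    starts with "Template:langrev/" plus whatever the user has typed."""
    prefix = LANGREV_PREFIX + typed
    titles = sorted(t for t in pages if t.startswith(prefix))[:limit]
    return [(t[len(LANGREV_PREFIX):], pages[t]) for t in titles]


print(autocomplete("D"))
print(autocomplete("Du"))
```

Each call only needs the matching titles and their few bytes of content, which is the "minimal data, no parsing" property being defended in this thread.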
Category:Asturian verb forms
Are these valid? Do we assume them to be valid? Made by a banned user using an illegal bot, but if they're valid I guess we can't delete them no matter who created them. Mglovesfun (talk) 10:10, 19 May 2013 (UTC)
Category:en:Latvian demonyms
I've just created this category (actually, I meant to work on its Latvian counterpart Category:lv:Latvian demonyms, but I tend to create an English equivalent for a category when I see there isn't one yet), which made me have a doubt about demonyms. Is the term supposed to cover only words that refer to a person born in a specific place -- i.e., only nouns -- or also adjectives that can refer to the place, or to people who were born there? In other words, should only Courlander be placed in Category:en:Latvian demonyms, or do Curonian, Courlandish and Courish also belong there? (Of course, there is also the derived question of whether it is a good idea to subclassify demonyms by larger areas -- 'French demonyms', 'Russian demonyms', 'American demonyms', etc. -- and whether the names of these categories should be simply in the form 'Geographic Adjective + demonyms', or 'demonyms in + Geographic Noun'). --Pereru (talk) 12:08, 19 May 2013 (UTC)
- I dunno. Maybe these could be added to Category:en:Latvia instead of its own category. — Ungoliant (Falai) 21:21, 19 May 2013 (UTC)
Do we really need horizontal rules between language sections?
Our standard practice has always been to add ---- right above a language header. But I don't really understand why. I can imagine that people did it because they liked the visual appearance of the extra line above the header. But it is not really necessary (or desirable) to do it that way; a better way would be to add a top border to h2 through CSS. So should we abandon this practice, or is there another reason? —CodeCat 15:17, 19 May 2013 (UTC)
- Yes, it was added for the visual effect, to differentiate the language headers clearly from POS and other L3 headers. In November of 2005, somebody suggested that it should be handled automatically through CSS rather than manually, and some editors began removing all instances of it. After a few hundred pages had ---- removed, we put a stop to the effort until someone could get CSS to handle the task correctly. Unfortunately, after all these years I just do not remember the details anymore, but I recall that no one was able to figure out how to do it via CSS. I believe that our HTML experts of that time concluded that it could not be done in CSS (but I don’t remember the reasons or anything like that). So we gave up on this idea and reverted all of the pages that had been changed. —Stephen (Talk) 16:02, 19 May 2013 (UTC)
- Yes. But I'm pretty sure we don't need a blank line above and below it. SemperBlotto (talk) 16:06, 19 May 2013 (UTC)
- We seem to use H2 for other things (e.g. "Latest revision" message), so apparently a class would be needed on the "----"-generated H2s. I bet some bots rely on finding the "----", too. Equinox ◑ 16:11, 19 May 2013 (UTC)
- How would you remove the line from the first section of the page? What about pages with only 1 language section? DTLHS (talk) 16:13, 19 May 2013 (UTC)
Since each h2 also has a gray rule below, this is not an ideal visual-design solution to separate it from the content above.
But, off the top of my head:
/* dispense with hr for rules above h2 */
body.ns-0 #mw-content-text > h2 { border-top: 3px double #aaa; margin-top: 2em; } /* direct-child h2’s of the content in the main namespace */
body.ns-0 #mw-content-text > h2:first-child { border-top: none; margin-top: 0; } /* but not the one at the top */
body.ns-0 #mw-content-text > #toc + h2 {border-top: none; margin-top: 0; } /* and if the TOC appears, not the first one after the TOC */
body.ns-0 #mw-content-text > hr { display: none; } /* hide now-redundant hr’s */
The immediate-child > selector ensures that none of this is rendered in MSIE 6 or earlier. Adjusting the margin-top and padding-top of the h2 might improve the visual separation.
Untested. Should be tested with both TOC shown and hidden, because the TOC contains another h2. Probably needs testing in MSIE 7, because I’m not sure that browser has proper support for these selectors. —Michael Z. 2013-05-19 19:14 z
- After a quick test, it seems to work as expected in Safari/Mac. Put the code above in your vector.css to try it out.
- Sure, if no one can think of any disadvantages.
- The hr’s are being used as presentational elements, and with this CSS we can obviate the requirement for manual work and its inevitably inconsistent results, and reduce wikitext clutter. Hr is properly a “paragraph-level thematic break,”[20] so this is not good usage. The spec adds “There is no need for an hr element between the sections themselves, since the section elements and the h1 elements imply thematic changes themselves.” And it could fool the makers of bots and scrapers into thinking they can determine page structure from it, as you mention below. —Michael Z. 2013-05-19 21:45 z
- Even if the actual wikicode for the horizontal line isn't necessary for presentation, it's still a lot easier to parse ---- than to use a regex to match ==(langname)==. Just something to consider. DTLHS (talk) 20:53, 19 May 2013 (UTC)
- I'm aware of that, but a bot really shouldn't be dividing sections based on the presence or absence of ----. After all, an entry may occasionally be missing it, and we don't want that to cause the bot to break things or make bad edits. The ---- is just a convenience but it can never be relied on, the headers are what counts. Furthermore, if a bot does want to divide a page into sections, then it will surely need to know the name of each section. It would be rather pointless otherwise. So even if a bot uses ---- to split the sections, it will need to parse the header anyway to find out what the name of the language is. Having said that, it really isn't all that hard to just parse the headers. I recently made MewBot do it and it was rather easy. —CodeCat 21:02, 19 May 2013 (UTC)
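As an illustration of the point that parsing the headers is not hard, here is a minimal Python sketch that splits an entry into language sections by level-2 headers only. This is not MewBot's code; the regular expression, the function name and the sample wikitext are assumptions for the sake of the example.

```python
import re

# Matches a level-2 header such as "==English==" on its own line, but not "===Noun===".
L2_HEADER = re.compile(r"^==([^=].*?)==[ \t]*$", re.MULTILINE)
# A horizontal rule on its own line; dropped from section bodies, never used for splitting.
HR_LINE = re.compile(r"^----[ \t]*$", re.MULTILINE)


def split_language_sections(wikitext):
    """Return a list of (language name, section text) pairs."""
    matches = list(L2_HEADER.finditer(wikitext))
    sections = []
    for i, match in enumerate(matches):
        # A section runs from the end of its header to the start of the next header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        body = HR_LINE.sub("", wikitext[match.end():end]).strip()
        sections.append((match.group(1).strip(), body))
    return sections


sample = "==English==\n===Noun===\nword\n\n----\n==French==\n===Noun===\nmot\n"
sections = split_language_sections(sample)
print(sections)
```

Because the split depends only on the headers, the result is the same whether or not an entry happens to be missing its ----.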
- I recommend using ~ instead of :first-child, as far more browsers support it. That would be:
/* dispense with hr for rules above h2 */
body.ns-0 #mw-content-text > h2 ~ h2 { border-top: 3px double #aaa; margin-top: 2em; } /* siblings of the first h2 of the content in the main namespace */
body.ns-0 #mw-content-text > hr { display: none; } /* hide now-redundant hr’s */
We really need to add back glosses to pinyin entries
I never do anything with Chinese so I'm not normally affected by the decisions made by its editors, but this is an exception. I wanted to find out what "huo long" meant. No tone marks, just that. So how do I find out what that means? The entries huo and long show a few possibilities for tones, so I choose one and then I'm presented with even more possibilities. It's just too much work to look through them all. In the end, the most helpful thing was when I did a full search for "huo long" and found out that huǒ means fire, which is the most likely meaning given the context. But what about the second word? In the old situation, the search would have been limited to 4-5 entries - one for each possible tone. That's still doable. But now I have to look through dozens of entries, which is tedious and I just thought "I'm not going to bother, this does not work for me". This is a really bad usability issue. We've already had several people complain on the feedback page that the new entry format was useless and that they want the old format back. And I definitely agree. —CodeCat 22:13, 19 May 2013 (UTC)
- 火龙 means fire dragon, I believe. I agree, we've gotten so many complaints from IPs on various talkpages and on the Feedback page that it looks like we're really making a mistake. —Μετάknowledgediscuss/deeds 22:19, 19 May 2013 (UTC)
- Oppose. This is an undue burden on Chinese editors. Anything that puts pressure on us to develop a better way of managing glosses across multiple pages is IMO a good thing. DTLHS (talk) 22:26, 19 May 2013 (UTC)
- Somewhere it says that we favour readers over editors.
- But could something like the following work?: For each Han character (e.g., 火), put the gloss text in 火/gloss. In the romanization entry (huǒ), have {{pinyin reading of|火}} link to the character, but additionally show the text of character/gloss – and in huǒ/gloss aggregate all of the romanized characters’ glosses. In the ambiguous diacritic-less form (huo), aggregate all of the romanizations’ glosses. It would still be more work, but it would eliminate error and duplication if each gloss were typed in only one place. —Michael Z. 2013-05-20 01:42 z
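A rough Python sketch of the aggregation idea above, in which each gloss is stored once per character and the toned and toneless romanization pages are derived from it. The data, the glosses and the function names are illustrative assumptions, not actual entry content or an existing template.

```python
import unicodedata

# Hypothetical data: each gloss is typed exactly once, as on a character's /gloss subpage.
CHARACTER_GLOSSES = {"火": "fire", "伙": "companion", "活": "to live", "龙": "dragon"}
# Hypothetical pinyin readings for those characters.
READINGS = {"火": "huǒ", "伙": "huǒ", "活": "huó", "龙": "lóng"}


def strip_tones(pinyin):
    """Remove combining marks to get the diacritic-less form.
    Caveat: this simple version would also turn ü into u."""
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def glosses_for(romanization):
    """Aggregate glosses for a toned form (huǒ) or a toneless form (huo).
    Returns a dict mapping each matching toned reading to (character, gloss) pairs."""
    wanted = unicodedata.normalize("NFC", romanization)
    result = {}
    for char, reading in READINGS.items():
        toned = unicodedata.normalize("NFC", reading)
        if wanted in (toned, strip_tones(toned)):
            result.setdefault(toned, []).append((char, CHARACTER_GLOSSES[char]))
    return result


print(glosses_for("huo"))
print(glosses_for("long"))
```

The toneless lookup simply gathers every toned reading that reduces to it, so no gloss is ever entered in more than one place.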
- Toneless pinyin should not be used for anything, unless they have become English loanwords. There are just too many tone combinations. Monosyllabic pinyin is used for disambiguation. I know the IP guy - a longtime user. (He/she uses multiple IP addresses but that may be related to his work, home, ipad, whatever IP). He works with Vietnamese and Mandarin. See also Talk:ya3 --Anatoli (обсудить/вклад) 01:45, 20 May 2013 (UTC)
- ^This this this...it's an extra click on from the pinyin entry to get to the real info, so what? Also, IMO nonstandard pinyin entries should not be made, with the exception of the basic "syllables" I guess. User: PalkiaX50 talk to meh 01:57, 20 May 2013 (UTC)
- It's not one click. It's as many clicks as there are Han entries for a single pinyin entry, which in the case of huǒ is six, but for lóng it's 47. Do you really expect users to look through all 47 of them to find the right one? I certainly gave up when I saw that long list. It's even worse for the many users who don't even know the tone, because then they have to look through all the tones' pinyin entries as well and it multiplies. —CodeCat 02:00, 20 May 2013 (UTC)
- It's a hard effort to find a Chinese word matching toneless pinyin. If a word exists, then it's easy to find by pinyin. "huolong" would also yield "huǒlóng" in the search window. Even with the same tones "huǒlóng" may mean not only "fire dragon" but also 火笼 (fire cage), 或隆 (or Long (name)). --Anatoli (обсудить/вклад) 02:13, 20 May 2013 (UTC)
- The answered complaints are at Wiktionary:Feedback#jin4 and Wiktionary:Feedback#li4. I'm sure they are from the same person. I don't think you can get an accurate analysis from a CEDict dump, which was done once many years ago. Single character definitions exist in the translingual sections, Mandarin are badly behind and many have no definitions. --Anatoli (обсудить/вклад) 02:07, 20 May 2013 (UTC)
- I've given it some thought. Given that editors may not catch up with single characters quickly enough, perhaps we could revert the edit of User:MglovesfunBot on monosyllabic toned pinyin entries if they are in demand, like yǎ#Mandarin? I would make entries like ya3#Mandarin redirects to yǎ#Mandarin? The translations need to be used with care, the character translation is not the same as word translation in Chinese. There are many specific Japanese characters, hanzi, which are only used in combinations, pure phonetic hanzi or their "definitions" are hardly used in real Chinese. Still, they are perhaps 95% right and may give an idea of the meaning. --Anatoli (обсудить/вклад) 02:56, 21 May 2013 (UTC)
- Bot-made mass edits like diff, removing glosses from Pinyin entries, have made Wiktionary less usable for readers, for bad reasons, IMHO anyway. Wiktionary would be better off without these edits. --Dan Polansky (talk) 19:46, 21 May 2013 (UTC)
- ya3#Mandarin is a duplication of yǎ#Mandarin but entries in Category:Mandarin pinyin with diacritics (monosyllabic, toned pinyin - 1,403 entries) could be restored before the last edit by User:MglovesfunBot. --Anatoli (обсудить/вклад) 02:38, 22 May 2013 (UTC)
- I don't understand the issue. You can type toneless pinyin into the search box and, if we have an entry for it, it will automatically appear in the drop-down box... ---> Tooironic (talk) 22:43, 25 May 2013 (UTC)
Calques in "derived from" categories
As a result of a recent edit, I noticed that a term that I created, uncanny valley, is categorized (through use of the etyl template) into "English terms derived from Japanese." I'm a little bit uneasy with the idea of a calque like this one being listed as "derived from" the originating language (it seems a little misleading to me), so I thought I'd see if I could find a more official thought on it. —Dajagr (talk) 02:35, 20 May 2013 (UTC)
- Sorry, I disagree. If the Japanese term is a calque from English uncanny valley then I think it is, in a way, derived from English. — Ungoliant (Falai) 21:08, 21 May 2013 (UTC)
- You have the cases reversed; the English is a calque of the Japanese. There's certainly a justification for separating out this type of derivation, but it's not false on the face of it.--Prosfilaes (talk) 21:18, 23 May 2013 (UTC)
Tech newsletter: Subscribe to receive the next editions
- Recent software changes
- (Not all changes will affect you.)
- The latest version of MediaWiki (version 1.22/wmf4) was added to non-Wikipedia wikis on May 13, and to the English Wikipedia (with a Wikidata software update) on May 20. It will be updated on all other Wikipedia sites on May 22. [21] [22]
- A software update will perhaps result in temporary issues with images. Please report any problems you notice. [23]
- MediaWiki recognizes links in twelve new schemes. Users can now link to SSH, XMPP and Bitcoin directly from wikicode. [24]
- VisualEditor was added to all content namespaces on mediawiki.org on May 20. [25]
- A new extension ("TemplateData") was added to all Wikipedia sites on May 20. It will allow a future version of VisualEditor to edit templates. [26]
- New sites: Greek Wikivoyage and Venetian Wiktionary joined the Wikimedia family last week; the total number of project wikis is now 794. [27] [28]
- The logo of 18 Wikipedias was changed to version 2.0 in a third group of updates. [29]
- The UploadWizard on Commons now shows links to the old upload form in 55 languages (bug 33513). [30]
- Future software changes
- The next version of MediaWiki (version 1.22/wmf5) will be added to Wikimedia sites starting on May 27. [31]
- An updated version of Notifications, with new features and fewer bugs, will be added to the English Wikipedia on May 23. [32]
- The final version of the "single user login" (which allows people to use the same username on different Wikimedia wikis) is moved to August 2013. The software will automatically rename some usernames. [33]
- A new discussion system for MediaWiki, called "Flow", is under development. Wikimedia designers need your help to inform other users, test the prototype and discuss the interface. [34].
- The Wikimedia Foundation is hiring people to act as links between software developers and users for VisualEditor. [35]
If you want to continue to receive the next issues every week, please subscribe to the newsletter. You can subscribe your personal talk page and a community page like this one. The newsletter can be translated into your language.
You can also become a tech ambassador, help us write the next newsletter and tell us what to improve. Your feedback is greatly appreciated. guillom 20:28, 20 May 2013 (UTC)
Monkey business in biological class entries.
I have noticed that our entries for fish, reptile, amphibian, etc., all include a collection of pictures of animals from that biological class, but each also contains a picture of a chimpanzee. I understand that the intent is to convey that all of these things are animals, but it seems jarring and unnecessary. bd2412 T 01:55, 21 May 2013 (UTC)
- This is part of the Wiktionary "picture book" project that somebody started (and, I believe, abandoned) several years ago. I think it was intended to be hierarchical, with trees of related pictures. It is probably a dead thing. Equinox ◑ 02:53, 21 May 2013 (UTC)
- Easy to find all the offenders. Looks like an AWB job to dispose of them, but some entries (like szympans) are using it correctly, so it'll have to be a process with human supervision. —Μετάknowledgediscuss/deeds 02:59, 21 May 2013 (UTC)
Quotations and Examples
(I have tried to put something into the Beer Parlour on this topic a few days ago. What did I do wrong, or is it lurking somewhere out of my view?)
I'm a newbie who puts in quotations. In doing so I have been quite confused by the different ways examples are put in, and distressed by their untidiness.
I strongly suggest that provision be made for examples to be treated in the same way as quotations, that is, for them to be hidden by default, but able to be switched by the reader between hidden and shown, by sense or overall. Perhaps they could be coded by a #= prefix unless that's already used. Of course the examples should be independent of the quotations.
The difference between examples and quotations should be emphasized.
An example simply illustrates briefly, in a single phrase or sentence, how the word is used in the local sense.
A quotation is accompanied by details of its source, preferably with URLs, so that the reader can verify its authenticity and follow up on its context. Quotations should be chosen to cover a variety of contexts and dates of publication.
Examples and quotations having different purposes, different readers will choose to see each or both or neither. Indeed the default display should be neither so that the reader sees a relatively compact display of senses on first going to an entry. —This unsigned comment was added by ReidAA (talk • contribs).
- You had posted this on the WT:Grease pit actually. — Ungoliant (Falai) 11:48, 22 May 2013 (UTC)
Requesting a block for IP user:24.135.76.120
On User talk:24.135.76.120 they wrote
meaning "Ivan, Ivan, your w:Ustaše propaganda is extremely shameful[...]" I'm guessing this is a previously blocked user, who came back making some more uproar and causing disturbance. --biblbroksдискашн 15:14, 22 May 2013 (UTC)
- IP has been blocked, and an abuse complaint has been filed with the ISP. --Ivan Štambuk (talk) 01:33, 23 May 2013 (UTC)
- Oh come on really? Sure, block the deluded fuck from here and stop him from spreading more crap and whining on our site. But a complaint to their ISP? Really? Yeah, they're an idiot or something for doing what they did but I think that's a bit much...I don't really care too much I guess but still, merely blocking them solves the issue IMO. User: PalkiaX50 talk to meh 01:41, 23 May 2013 (UTC)
- This same guy has been doing it for a very long time (I think more than two years). The majority of his edits are correct and well formatted, but once there is a disagreement about some topic, even a trivial one like in this particular case (the other IP, which was from Croatia, has wrongly marked kupatilo as a Croatian-only term, which was in turn misinterpreted by him as an act of organized anti-Serbian campaign), he starts spouting abuse. And the particular swear words that he's using are extremely politically charged and insulting, and can classify as hate speech. All I have asked in the complaint is to ask their customer to stop persistently violating netiquette and remind him that the Internet grants neither anonymity nor immunity. --Ivan Štambuk (talk) 04:18, 23 May 2013 (UTC)
- For future reference, the place to post this is WT:VIP.—msh210℠ (talk) 18:35, 31 May 2013 (UTC)
…… symbol in Chinese entries instead of ....
Before I go ahead and rename Mandarin entries with an ellipsis, I'd like to ask.
Are there any serious reasons NOT to use the Chinese ellipsis …… in Mandarin, Min Nan, etc. entries, instead of the Western "..."? It looks better with hanzi, and the spelling like 不是……而是…… is how the Chinese resources describe terms when ellipsis is needed. Pinyin transliteration and entries would still use "...". --Anatoli (обсудить/вклад) 23:26, 23 May 2013 (UTC)
Vote on Wikidata
A vote is currently ongoing on Wikidata concerning the management of the Wiktionary interwiki links by Wikidata. There are also discussions about more advanced concepts using Wikidata. The vote is there. Pamputt (talk) 15:31, 25 May 2013 (UTC)
- Can someone talk me through it here? My instinct is to oppose but I see almost all support votes there. Mglovesfun (talk) 21:03, 26 May 2013 (UTC)
How to format plurals
I have a question regarding pages for plural forms of words. Right below the Noun heading, you would normally put {{en-noun}} (on English pages that aren't for plural words). I was wondering what one is supposed to do when the word is plural. Do you just put '''word'''? Or '''word''' {{plural}}? Thanks. TeragR (talk) 21:13, 26 May 2013 (UTC)
- For the headword, in these cases, I just put the word itself within triple quotes. That seems to work. If it has a gender, then I put that following the headword. SemperBlotto (talk) 21:17, 26 May 2013 (UTC)
- I would suggest {{head|en|plural}} or perhaps {{head|en}}. —CodeCat 21:18, 26 May 2013 (UTC)
- ... which achieves the same thing but causes the wiki to fetch a template and interpret it. SemperBlotto (talk) 21:20, 26 May 2013 (UTC)
- The default is '''word''' or {{head|languagecode}}, whichever you prefer. Sometimes people provide more detail, adding things like {{head|languagecode|plural}} or {{head|languagecode|noun form}} or a gender, as SemperBlotto and CodeCat note, but often the definition-line templates already add the plural or noun form categorisation which is (or would otherwise be) the main discernible benefit to {{head|languagecode|plural}}. - -sche (discuss) 21:22, 26 May 2013 (UTC)
- I don't think that the definition templates should add categories, because different languages may have different needs. For example, {{plural of}} categorises into "plurals" but that would make no sense for a language that has more noun forms than just a plural, and never mind adjective plurals. That is why I prefer {{head|en|plural}}, so that in the event that {{plural of}}'s category is removed in the future, the entry is already prepared for it. Also, {{head}} or some other headword-line template is mandatory for any language but English, because it also sets the script and language of the headword. You might as well use it for English as well, for consistency if nothing else. Another difference is that {{head}} uses the script code's formatting rules, which are more detailed/precise than mere bold text. {{Latn}} for example will format a headword with <strong class="headword" lang="en">word</strong>. —CodeCat 21:30, 26 May 2013 (UTC)
- How long does it take the server to interpret {{head|en}} as opposed to '''word'''? I bet it's microseconds. Mglovesfun (talk) 21:32, 26 May 2013 (UTC)
- It's not as easy to tell because the usage cost of language codes is amortized. The language module is loaded on the first use on a page, but once it's loaded it stays available for that page. So the first use is relatively expensive because the module needs to be loaded, but subsequent uses are very cheap as they are just a matter of a single lookup. —CodeCat 21:36, 26 May 2013 (UTC)
- A lot depends on the language and on the part of speech. In languages other than English, it's a good idea to look at the About page for the language (if there is one), and at other similar entries to see what the accepted practice is. Also look for headword-line templates. The best way to find all of that is to go to the main category for the language. For Spanish, for instance, that would be Category:Spanish language. For other languages, it would be the same, but the language name would be different. Chuck Entz (talk) 23:26, 26 May 2013 (UTC)
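The amortized-cost point made in the thread above (load the language module once per page, then do cheap lookups) can be pictured with a toy model. This is only a sketch: the class, method names and sample data are hypothetical, and it does not reproduce how the wiki software actually loads modules.

```python
class PageRender:
    """Toy model of one page render; not the real wiki/module internals."""

    def __init__(self):
        self._languages = None  # language data is not loaded until first needed
        self.loads = 0          # how many times the expensive load actually ran

    def _load_language_data(self):
        # Stand-in for the expensive step: loading the whole language module.
        self.loads += 1
        return {"en": "English", "fr": "French", "mi": "Maori"}  # hypothetical sample data

    def language_name(self, code):
        # The first call pays for the load; later calls are a single dict lookup.
        if self._languages is None:
            self._languages = self._load_language_data()
        return self._languages[code]


page = PageRender()
names = [page.language_name(c) for c in ("en", "fr", "mi", "en")]
print(names, "loads:", page.loads)
```

Four lookups trigger only one load, which is why timing a single {{head|en}} call in isolation would probably overstate its per-use cost on a page with many headwords.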
{{en-noun}} really needs a form for plurals. It is inadequate and confusing that the second-most common form of English noun is a radical exception in entry format without a clear guideline. —Michael Z. 2013-05-29 14:46 z
- Something like {{en-noun-form}} you mean? —CodeCat 17:13, 29 May 2013 (UTC)
- I mean a parameter. We already have:
- Not sure of the difference between the last two. How about one of:
Māori? Maori?
- See also WT:RFM#Template:mi
This edit prompted me to look into how the {{mi}} language is described here on WT. I'm used to seeing the language name as Māori, with the macron, and that's also how the language is listed at w:Māori. However, I see that the macron-less form is in common use here on WT.
Does anyone know why we don't use the macron consistently here? Is it a matter of legacy spellings? I recognize that the macron-less form is common enough to warrant inclusion; that's not my concern. The language name as listed, however, should be the more "official" spelling with the macron, no?
Curious, -- Eiríkr Útlendi │ Tala við mig 03:48, 28 May 2013 (UTC)
- A look at the interwiki listings on the side of many entry pages will show how few of our L2 headers match the "official" spelling. Since the macron-less spelling is common in English, and also much easier to type, I think we should stick with it- except for the page name for the Maori word, itself, of course. Chuck Entz (talk) 04:20, 28 May 2013 (UTC)
- I'll paste my comment from WT:RFM#Template:mi: I favour "Maori" and oppose a move. "Māori" is, indeed, the native name of the language, but English disuses diacritics, and even before I overhauled it, WT:LANGNAMES noted that Wiktionary avoids diacritics, too. Like CodeCat said [in the RFM discussion], "Maori" is common, in part because many non-specialists write about the language. In contrast, [en.Wikt does use diacritics in other languages names, e.g. ǃXóõ, because] the only people who write about ǃXóõ tend to be specialists who spell it "ǃXóõ". - -sche (discuss) 04:31, 28 May 2013 (UTC)
- Yes and we have French not français (and so on). Mglovesfun (talk) 10:52, 28 May 2013 (UTC)
- That’s not a useful comparison. It appears that Māori is correct formal written English among people who write about the language, while français is just French.
- There are 388 “official” language names with diacritics, out of 7644 in WT:LANGLIST (1 in 20), plus a handful with click characters or apostrophes. I have no way to determine how many diacritical names we have rejected, but we have accepted hundreds, so the statement that “diacritics are avoided” is dubious. Any reason not to remove it? —Michael Z. 2013-05-28 16:11 z
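The count quoted above could in principle be reproduced mechanically. The following is a minimal sketch, assuming a plain list of language names is available; the sample names are taken from this discussion, the helper name is hypothetical, and the test only detects combining marks, so click letters and apostrophes are not counted, in line with the comment above.

```python
import unicodedata


def has_diacritic(name: str) -> bool:
    """Return True if the name contains any combining mark after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", name)
    return any(unicodedata.combining(ch) for ch in decomposed)


# Hypothetical sample; the real input would be the names in WT:LANGLIST.
sample = ["Maori", "Māori", "French", "ǃXóõ", "Guugu Yimithirr"]
with_marks = [n for n in sample if has_diacritic(n)]
print(with_marks)

# Share reported in the comment above: 388 of 7644 names.
share_percent = round(388 / 7644 * 100, 1)
print(share_percent)
```

At roughly 5.1 percent, the quoted figures do work out to about 1 in 20.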
- The official one? The ISO 639-2 and 639-3 spelling is Maori. News.google.com shows a lot of news articles with Maori, more than with "Māori"; I don't think that one spelling can be declared the correct formal form for written English. I'm personally a fan of writing English with English characters, and not with a plethora of diacritics not normally found in English.--Prosfilaes (talk) 00:33, 29 May 2013 (UTC)
Language code migration step 2
The script templates have almost all been marked for deletion, only a few of the most-used ones still have transclusions because the software hasn't updated all the pages yet, but I am working on that. That leaves two points remaining. The first is a relatively small one, the /documentation subpages that a few of the language templates have. They are listed in Category:language code templates with documentation but that list is not yet complete, the category is still updating. I'm not sure if they should be deleted outright without preserving the information somewhere, but I don't know where to preserve it either.
The second point of course is orphaning and deleting the code templates themselves. This will be more difficult because many scripts like the translation adder and entry creator rely on their presence, because they subst them into the entry. Substing is not a problem in itself, the module itself is also substable so it is a matter of changing {{subst:{{{lang}}}}} into {{subst:#invoke:language utilities|lookup_language|{{{lang}}}|names}}. But we will need to update everything that still uses the templates. Unfortunately I don't know where all or even most of the uses are, and I am not very good with JavaScript so someone else needs to look at the gadgets. Bots that use these templates will also need to be fixed. —CodeCat 16:14, 28 May 2013 (UTC)
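For bot or script maintainers, the substitution change described above could be applied to stored wikitext with a simple rewrite. This is a rough sketch, not an existing tool: the function name is hypothetical, and it only handles the case where the substituted template name is itself a triple-brace parameter such as {{{lang}}}.

```python
import re

# Matches {{subst:{{{param}}}}} where {{{param}}} is any template parameter.
OLD_SUBST = re.compile(r"\{\{subst:(\{\{\{[^{}|]+\}\}\})\}\}")
# Replacement text as given in the post above; \g<1> re-inserts the parameter.
NEW_SUBST = r"{{subst:#invoke:language utilities|lookup_language|\g<1>|names}}"


def migrate_subst(wikitext: str) -> str:
    """Rewrite old language-template substs into the module invocation."""
    return OLD_SUBST.sub(NEW_SUBST, wikitext)


print(migrate_subst("* {{subst:{{{lang}}}}}: {{t|{{{lang}}}|word}}"))
```

Other substitutions such as {{subst:PAGENAME}} are left untouched because the pattern requires a triple-brace parameter directly after "subst:".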
- Ok, I've thought about it and I propose the following. To give us some time to work out fixing all the things that need the templates, we can use a temporary solution that is compatible with the module. Every language template's contents is replaced with the following: {{<includeonly>subst:</includeonly>#invoke:language utilities|lookup_language|{{subst:PAGENAME}}|names}}<noinclude>{{langt-deprecated}}. That way, the template still works, but can only be substed so it's not possible to use it in templates anymore. The template {{langt}} is also removed by this edit and replaced with {{langt-deprecated}}, which puts the template in a different category Category:Language code templates to be deleted. conl: and proto: templates can be deleted outright, because I don't think any scripts or bots subst them directly. Is that ok? —CodeCat 20:54, 28 May 2013 (UTC)
- Are you talking about script templates or language templates? I strongly advise against deleting either any time soon. The language templates are used in things other than just substing, and can occasionally be very hard to replace with the module. --Yair rand (talk) 10:50, 29 May 2013 (UTC)
- Can you give an example of where the language templates are currently still used? Template:en is already orphaned so wherever it is, it's not on this wiki. —CodeCat 10:55, 29 May 2013 (UTC)
- They're probably still used by plenty of bots, and WT:ACCEL also still uses them, but those wouldn't be harmed by replacing them with a module invocation as you suggested above. The main issue we need to look out for is scripts/bots that pull out the wikitext directly, which definitely would break as a result. There are probably still a few of those, some of which might not have their code on-wiki. (There are probably also loads of data-miners that depend on them, but I'm not sure if we should take that into account.) --Yair rand (talk) 11:34, 29 May 2013 (UTC)
- I use the language templates for two purposes: (1) if I don't know what language is associated with a particular code, or whether a particular string of three letters exists as an ISO 639-3 code, I go to the template and see; (2) when I'm adding requested translations or assigning translations to be checked to the appropriate sense, I change {{trreq|xyz}} and {{ttbc|xyz}} to {{subst:xyz}} rather than type out the language name, which saves typos as well as trying to remember exactly which name we use for certain languages. (Has the pendulum swung to Maori or Māori this month? How do you spell Guugu Yimithirr and Sḵwx̱wú7mesh? If I just type {{subst:mi}}, {{subst:kky}} and {{subst:squ}} I don't have to worry about it.) —Angr 12:43, 29 May 2013 (UTC)
- To Yair: That is mainly a matter of time and gradual migration, which is why I proposed the interim solution above. On one hand, we don't want to break too many things, but on the other hand, some things will probably not be fixed until they break because we weren't aware of them. Bots probably fall into the latter category as well. So, with my proposal, bots that use the raw wikitext of the template will break, but anything that substs the templates will still work. —CodeCat 15:43, 29 May 2013 (UTC)
- I have deleted the documentation pages and preserved their contents on Wiktionary talk:Languages. —CodeCat 21:23, 28 May 2013 (UTC)
- Please note that I initiated a similar migration on fr.wikt (decision in progress). We plan to use a template {{language name|xxx}} to replace the names in the templates (instead of things like {{ {{{lang}}} }}). We will also keep all old language templates as they may be needed for historical reasons, but in that case we will replace their content with a call to the module to reflect the updated names. Dakdada (talk) 09:27, 29 May 2013 (UTC)
Archaic or Obsolete for spellings with 'æ'
Presently, there are a lot of words marked as obsolete that are spelled with an 'æ' (e.g. most of the English terms spelled with Æ). But I'm not sure that is appropriate because:
According to here obsolete means:
- No longer in use, and no longer likely to be understood. Obsolete is a stronger term than archaic, and a much stronger term than dated.
Whereas archaic means:
- No longer in general use, but still found in some contemporary texts (such as Bible translations) and generally understood (but rarely used) by educated people. For example, thee and thou are archaic pronouns, having been completely superseded by you. Archaic is a stronger term than dated, but not as strong as obsolete.
And dated is:
- Formerly in common use, and still in occasional use, but now unfashionable; for example, wireless in the sense of "broadcast radio tuner", groovy, and gay in the sense of "bright" or "happy" could all be considered dated. Dated is not so strong as archaic or obsolete; see Wiktionary:Obsolete and archaic terms.
I think obsolete is too strong for these spellings, as they will be understood by readers even though they are perhaps not in use. Perhaps archaic is appropriate?
This discussion started on the talk page for diæresis. WilliamKF (talk) 20:09, 28 May 2013 (UTC)
- Can a spelling ever really be obsolete and not even be understood? —CodeCat 13:56, 20 May 2013 (UTC)
- I think we need to change our definition of "obsolete" with respect to spellings. Archaic and dated both say the form is still found/still used, whereas an obsolete spelling is one that isn't used at all anymore. A spelling like sphære isn't "still found in some contemporary texts", nor is it "still in occasional use, but now [merely] unfashionable". It's not used nowadays at all, unless someone is deliberately using outdated spellings for effect. It's obsolete. —Angr 20:17, 28 May 2013 (UTC)
- A spelling doesn't have to be unintelligible to be obsolete, it just has to be unused. (Most soldiers recognise chariots when they see illustrations of them, but chariots are obsolete military equipment because no army uses them.) I have been marking ligature spellings as obsolete whenever no proof that modern authors still use them is available, or as archaic when such proof is available. - -sche (discuss) 23:41, 28 May 2013 (UTC)
- If we were to look at a population larger than that represented by contributors here, say, one that included those with no higher education, would we find that the majority were not confused by digraphs? I think not. It might be trivial for each individual to learn, but I do not think that most in the larger population ever have cause to decode such things and thus would not have had the opportunity to learn. If so, that would favor {{obsolete}} rather than {{archaic}}. OTOH, if we intend to only serve those very much like ourselves, {{archaic}} might indeed be more accurate. DCDuring TALK 12:02, 29 May 2013 (UTC)
What about encyclopædia? I'd say this one is even dated, certainly not obsolete as presently listed. I still feel obsolete is too strong for this category of words given the ease in understanding it. WilliamKF (talk) 13:18, 29 May 2013 (UTC)
Good example. We don’t label categories of words, but words, each on its own qualities and usage. Sphære and encyclopædia do not deserve the same label.
DCD, labels should be chosen based on (our estimates of) word usage in the corpus. Without evidence, the discussion about what education levels are confused by what digraphs seems purely speculative. —Michael Z. 2013-05-29 14:38 z
- I agree. In my example above I picked sphære rather than encyclopædia for a reason: the former is truly obsolete, the latter isn't. As I understand it, though, the OP just picked Category:English terms spelled with Æ as an example of a good place to find spellings marked obsolete; there was no implication that all spellings there are obsolete. Nor are they the only ones; I'd call gaol dated, shew archaic, and queene obsolete (though I see that our entries do not agree with my opinion on all counts). —Angr 14:51, 29 May 2013 (UTC)
Permission for bot edits
Two proposed tasks are:
- convert {{fr-adj-mf}} to {{fr-adj|mf}}. Literally involves replacing one character
- convert {{fr-noun-inv|(gender)}} to {{fr-noun|(gender)|inv}}, fr-noun now accepts inv
Per WT:BOT#Policy am asking for 'consensus' rather than doing these outright. Mglovesfun (talk) 12:26, 29 May 2013 (UTC)
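The two proposed conversions could be expressed as text rewrites along the following lines. This is an illustrative sketch only, not the bot's actual code: the function name is hypothetical, and it assumes the gender argument of fr-noun-inv is a single plain parameter with nothing nested inside it.

```python
import re

# Task 1: {{fr-adj-mf...}} -> {{fr-adj|mf...}} (the hyphen before "mf" becomes a pipe).
ADJ_MF = re.compile(r"\{\{fr-adj-mf(?=[|}])")
# Task 2: {{fr-noun-inv|GENDER}} -> {{fr-noun|GENDER|inv}}.
NOUN_INV = re.compile(r"\{\{fr-noun-inv\|([^|{}]+)\}\}")


def convert_fr_templates(wikitext: str) -> str:
    """Apply both proposed template conversions to a page's wikitext."""
    wikitext = ADJ_MF.sub("{{fr-adj|mf", wikitext)
    wikitext = NOUN_INV.sub(r"{{fr-noun|\g<1>|inv}}", wikitext)
    return wikitext


print(convert_fr_templates("{{fr-adj-mf}}"))
print(convert_fr_templates("{{fr-noun-inv|m}}"))
```

Entries that already use {{fr-adj|mf}} or {{fr-noun|m}} match neither pattern, so running the conversion twice should be harmless.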
- This seems very uncontroversial so I don't think you need to ask permission for it. —CodeCat 15:39, 29 May 2013 (UTC)
- Go for it. Let me know when it's done (need to modify bot that generates adjective/noun forms). SemperBlotto (talk) 15:42, 29 May 2013 (UTC)
- I would like to edit WT:BOT#Policy a little, but then probably every policy we have, I would like to reword a little (or a lot). Mglovesfun (talk) 17:52, 29 May 2013 (UTC)
Support. --Yair rand (talk) 17:56, 29 May 2013 (UTC)
- I abstain; looks good to me, on the face of it. --Dan Polansky (talk) 20:31, 30 May 2013 (UTC)
Migrating the family templates
I have started migrating the family templates as well. I've followed the same procedure and it seems that everything is working correctly now, using the information that was in Module:families. I would now like to orphan all the templates in Category:Family code templates and then mark them for deletion as they are orphaned. Is that ok? —CodeCat 22:07, 30 May 2013 (UTC)
Etymologies of Chinese Characters
Hello all. Wiktionarians have inserted paraphrased interpretations from my etymological dictionary (Kanji Networks) in the Etymology sections of the entries for about 30 Chinese characters. Examples include 二, 三, 四, 六, 七, 八, 九, 十, 中, 右, 百, 千, 年, 服, 者, and 暑. Links to KN also appear on the pages of several characters (林, 森 and 少) the etymologies of which are quite different from the KN interpretations.
I'm fine with the use of the Kanji Networks interpretations in Wiktionary. I'm also fine with the use of differing interpretations. What I'd like to confirm is that users be clear on the source(s).
If it does not represent a conflict of interest for me to build on work that has been started by others, I propose to take a hand in adding a Source note immediately following each KN interpretation, linking to the KN URL for that character. (The assistance of others in the task of expanding the number of characters etymologized will of course be greatly appreciated.) Entries with etymologies differing from those proposed by KN I will not touch in any way; other Wiktionarians may handle their presentation as they see fit, including the question of whether and how to source them.
Please advise whether this kind of participation is in line with Wiktionary policies. Thank you very much. Lawrence J. Howell (talk) 05:03, 1 June 2013 (UTC)
- Leaving aside matters of policies, ethics, and the like, I see a big practicality problem: this is a wiki- anyone can edit any entry at any time. You might put your source note on an entry, and 5 minutes later, someone might completely change and rearrange the entry, making it look like the new version was derived from KN. Unfortunately, not all of those contributions are the sort you would want to be associated with (currently we're doing battle with an anonymous contributor who thinks Manga fan-sites and Bing Translate are authoritative sources!). Or someone else may take that original interpretation, and replace it with thinly-disguised plagiarism from your site. Or it may flip back and forth as different contributors weigh in. It's a lot of work keeping on top of it all. Chuck Entz (talk) 07:48, 1 June 2013 (UTC)
- Thank you very much for the swift and helpful reply, Chuck. It appears I may have underestimated the maintenance aspect. Still, I'd be inclined to give things a shot on an experimental basis if at least a couple of Wiktionarians were to weigh in here on the ethical issue with a "don't sweat it" opinion. I'll hold off and see what develops on that score. Thanks again. Lawrence J. Howell (talk) 08:42, 1 June 2013 (UTC)
- Seems perfectly fine to me. If there are differing interpretations and you want to note the KN one, footnotes might be good; otherwise a reference is probably fine. As for the practicality, one more pair of eyes looking out for vandalism/plagiarism is only going to make things better. Hyarmendacil (talk) 09:49, 1 June 2013 (UTC)
- Thank you, Hyarmendacil, for your input. As no dissenting voices have been raised, I'll set to work. To indicate what I'm doing, and why: At present, the entries of many Chinese characters lack an Etymology section. In the entries that do have such a section, we find "etymology" taken to apply to data as diverse as a) Unannotated graphic evolution (公 王), b) Bare-bones presentations of the phonetic and semantic elements of compounds (花 成 頭 就 機 國 etc.), c) Unsourced speculation (不 有 好 etc.), d) Interpretations contraindicated by historical evidence (see 可, where the explanation doesn't mesh with the attestation of 可 in oracle bone script and the absence of attestation for 奇 until seal script), e) Ancient phonology: Baxter/Sagart's reconstructions are mentioned in a handful of characters (代 的 家 and 海) and Schüssler (Schuessler) is noted in 大. In short, the Etymology sections exhibit little consistency in format, content and purpose. For that reason, I'm creating a distinct section for the data I can supply, and labeling it Phonosemantic Interpretation. After posting this message, I'll begin with the following fifty characters, which those interested may care to inspect: 一 二 三 四 五 六 七 八 九 十 百 千 万 年 林 森 大 小 少 中 右 左 下 上 木 服 者 暑 月 人 的 不 子 女 是 心 我 有 之 天 口 水 日 他 又 山 丸 金 力 无 Comments and suggestions are most definitely welcome. Lawrence J. Howell (talk) 05:15, 3 June 2013 (UTC)
- If the header is to be used, its second word should be lowercase ("Phonosemantic interpretation"; compare "Alternative forms" and "Usage notes"). However, the info added to [[八]] seems like primarily etymological information that could have stayed under an ===Etymology=== header, and if there is info that would not be appropriate under an etymology header, it might be OK to consider it ====Usage notes====.
- Also... I notice this claim in the entry 八: "[eight] may be regarded as having been considered the single-digit number divisible by two the greatest number of times". In what way is eight a single-digit number? In Arabic numerals, which were unknown to ancient Asia? Or in the sense that "八" is a 'single digit', in which case the argument seems circular (if someone says "this single digit was used for eight, because eight is the single digit number divisible by two the greatest number of times", one could ask "why wasn't a double digit chosen?"). - -sche (discuss) 01:47, 4 June 2013 (UTC)
- Thank you very much for the input, -sche. The presentation matters come as welcome advice, and I acknowledge the applicability of your point about 八. Before amending or redoing anything I'll wait a few days in hopes of obtaining further useful comments and suggestions. Thank you again. Lawrence J. Howell (talk) 04:01, 4 June 2013 (UTC)
June 2013
Thoughts from an experienced outsider
As a nine-year Wikipedian and very occasional editor here at Wiktionary, I would like to offer insight into what this community's atmosphere feels like to an outsider. Here are a few thoughts from 42 short hours of interacting with six established contributors (on various pages) with respect to the subject of untranslatable terms:
- Content upon which articles, books, research, and most members of the public express interest is useless
- Content upon which articles, books, research, and most members of the public express interest is irrelevant
- Content pertaining to knowable terms should be omitted because it could be contradicted by unknown information
- Content should be omitted if inspired by the popular press
- Content should be omitted if it requires substantial effort to produce
- The most widely acclaimed translation of a nation's most seminal work is unacceptable as a reference
- Scientific meta-analyses consist of subjective opinion and cannot be used as a reference
- Two books and an expert's paper aren't to be considered durably published references
- Verbal challenges to the veracity of references may not be overcome by verbal verification with one or more native speakers of an obscure language
- Verbal challenges to the veracity of references may not be overcome by directly contacting its author to obtain explicit provenance
These observations are not intended to ridicule the associated contributors in any way as individuals – in fact, I hope to work productively with each and every one of them even on this very issue. However, most new contributors to Wiki-projects cannot endure more than two or three blows like these before getting a pretty sour taste in their mouth or just outright leaving forever. — C M B J 09:08, 3 June 2013 (UTC)
- I think you are from Wikipedia. Having to follow these rules doesn't seem much different from having to follow Wikipedia's rule about "no original research" (which talking to a native speaker would also be, unless it were published and peer-reviewed). Perhaps it's just a matter of understanding why these rules exist, and adjusting to them? Equinox ◑ 09:19, 3 June 2013 (UTC)
- No original research is a fine policy, but that was not a point of concern for three very sound reasons: (a) the content was published by multiple sources and veritably by at least one, (b) its criteria for inclusion is lower due to special considerations for rare languages, and (c) past thinking on the matter was that attestation by a native speaker or knowledgeable individual would be considered appropriate. I am more than able to navigate policies foreign to me and I do not believe unfamiliarity or confusion on my part to have been a factor. — C M B J 09:40, 3 June 2013 (UTC)
- These are very quickly formed opinions and based on comments by individual users rather than the community as a whole. We cannot stop individual users having opinions, nor would we want to. Mglovesfun (talk) 09:21, 3 June 2013 (UTC)
- 'Source' here isn't a good word, we don't source entries, we cite them. If you see Appendix:English dictionary-only terms you'll see we can source quite a lot of words that don't appear to exist. Mglovesfun (talk) 09:22, 3 June 2013 (UTC)
- As per above, I am here speaking on good faith and, frankly, I really don't appreciate being accused of trying to break the project's rules for personal reasons, of being ignorant and uninformed after providing this explanation, or being disparaged for identifying with Wikipedia, or being asked to stop talking (while simultaneously being told "we cannot stop individual users having opinions, nor would we want to"), all because I dared to disagree with unsubstantiated claims. These unprovoked and irrational ad hominem attacks are ironically representative of why I felt that it was necessary to go out on a limb and share my experience in the first place. This environment is toxic. — C M B J 10:18, 3 June 2013 (UTC)
- I think there are two issues going on here, and some of the disagreements may stem from different understandings of what's under discussion. The sources you've cited for these words are fine (at least as far as I'm concerned) for having an entry on words like tingo#Rapa Nui. The sources are perfectly adequate in terms of WT:CFI for less well attested languages. What's more problematic is putting them in the Category:Terms without an English counterpart (or even having that category) because deciding what words do and do not "really" have an English counterpart is highly subjective. At some level of philosophizing, no word in another language has an exact English counterpart because each word in another language will have slight shades of meaning and connotation that the English word doesn't have. I think all of the words listed over at WT:RFD/O#Category:Terms without an English counterpart are worthy of having a Wiktionary entry, provided each is in the correct script with the correct capitalization and as long as confirmation from some published work by a recognized expert in the language (as opposed to the popular press) is provided to confirm the term's existence in the case of less-attested languages, and as long as three cites from durably archived sources are provided in the case of well-attested languages like German. But deciding what goes into the category of "English doesn't have a word for this" is problematic because there are no objective criteria for it. I myself have worked on Sättigungsbeilage, a word with no obvious English translation, and brought it to the attention of word mavens by nominating it for WT:FWOTD, but even I'd be wary of putting it into that category. —Angr 10:37, 3 June 2013 (UTC)
- I wish to associate myself with Angr's comments. DCDuring TALK 11:42, 3 June 2013 (UTC)
- I agree with much of what Angr said and have replied on his talk page to help keep this thread on-topic. — C M B J 12:34, 4 June 2013 (UTC)
- CMBJ, please don't see disagreeing with you as a personal attack, as we can't agree with you purely so you don't feel attacked. Mglovesfun (talk) 11:29, 3 June 2013 (UTC)
- To quote:
- "…here's a tip. If you don't know what you're talking about, stop talking."
- "…become better informed, please!"
- "You come across as a Wikipedian trying to get round the rules for his own personal reasons."
- "I'd suggest if you have nothing relevant to say, say nothing."
- These are not spirited disagreements over the issue at hand. They're wanton personal attacks, and even worse, they're non sequiturs. I am willing to forgive and forget and move beyond them, but I will not for one second tolerate further denigration—especially in a safe place and from an administrator. It's doubly unacceptable. — C M B J 12:05, 3 June 2013 (UTC)
- I stand by all of that. What's wrong with being better informed or not making ill-informed comments? What exactly do you object to? Mglovesfun (talk) 17:02, 3 June 2013 (UTC)
- The insulting tone, the implication that he doesn't know what he's talking about, has nothing relevant to say, and should shut up. I find these statements insulting too and I'm not even the one they're directed at. Any reasonable user would take these as personal attacks. —Angr 18:05, 3 June 2013 (UTC)
- I wish to associate myself with Angr's comments immediately above, as well. Even though I sometimes have delivered abusive comments, I don't think it is good practice, especially directed at a new contributor who is making good faith efforts to contribute. DCDuring TALK 18:25, 3 June 2013 (UTC)
- I mostly agree that WT editors and admins, myself included, sometimes come across too tersely or even insultingly, perhaps as we let our frustrations get the better of us.
- However, the comment above that "You come across as a Wikipedian trying to get round the rules for his own personal reasons" does not itself strike me as all that accusatory -- it is simply a description of what CMBJ's push-back might be viewed as. Within the greater context of CMBJ's interactions, I can see how CMBJ might interpret it as inflammatory, however.
- Online discourse can be difficult. Without all the visual social cues that humans have evolved to give and receive, intent is often hard to discern. -- Eiríkr Útlendi │ Tala við mig 18:39, 3 June 2013 (UTC)
- Just to be clear here, these comments were not a clumsy exchange of benign text that just came out sounding wrong. Mglovesfun was not even a participant in the associated discussion prior to stating, with candor, that I was insidiously "trying to get round the rules" (what applicable rules?) for my "own personal reasons" (what possible reasons?) and that I should "stop talking" until I "become better informed" (about what subject?). There is also no reconciling "we cannot stop individual users having opinions, nor would we want to" with "I'd suggest if you have nothing relevant to say, say nothing" because they're contradictory advice in principle. Moreover, if dissecting and attempting to calmly refute unsubstantiated claims is perceived as frustrating push-back, then something's very wrong here, because that's how consensus is supposed to be developed.
- For what it's worth, I'm thick skinned. I'm not here calling for his de-sysopping. I respect his right to say anything to me, even if accidentally misconstrued or necessarily offensive, and even if in violation of the letter of policy if he genuinely felt it was justifiable for some reason. But that doesn't mean that the other 99 contributors who don't have the nerve to speak up will stick around after such shoddy treatment, which is the focus of this thread. — C M B J 04:26, 4 June 2013 (UTC)
- CMBJ, I recommend reading Wiktionary:Wiktionary for Wikipedians. --Yair rand (talk) 11:48, 3 June 2013 (UTC)
-
- Thanks for the suggestion. I seem to have already picked up just about everything it describes, but it would've undoubtedly been helpful to me not that long ago. Maybe an eventual goal should be to automatically display it in a dismissible sitenotice for unified accounts that have >500 Wikipedia edits and <25 Wiktionary edits. — C M B J 03:35, 4 June 2013 (UTC)
- CMBJ, you came in with your personal project. Many people who join Wiki-projects with broad ideas of how it should be leave frustrated. Your sources do not match up to what we expect for the project, and many of us don't think your new category is useful. Personally, the fact that your signature links back to Wikipedia doesn't inspire me to treat you as other than a Wikipedian tourist.--Prosfilaes (talk) 19:37, 3 June 2013 (UTC)
-
- First of all, this is not my personal project and I did not come here with an ulterior agenda based on broad assumptions of how everything should be. I did, however, come here and find that an expected level of detail was missing in an area that is of particular interest to many readers, and whether that information is most appropriately presented in the form of a category or not is beside the point. The problem here is that, for a new user, participating on this project is painful, and not for reasons that can be explained away as normal responses to personal fault. This is a chronic problem and that is made abundantly clear by the reactions that articulating it has provoked.
-
- Even in your case—and I stress that you're not even involved—the response has been to just further make this about me. The fact that the thought would even cross your mind to view cross-project editors as "tourists" is very telling of the climate here. The fact that you would for some reason consciously treat them differently, and feel comfortable and confident about stating that intention openly amongst peers and moderators—and to justify the behavior of others, no less—is even more telling because these are the very people who should sense the utmost of hospitality and collegiality and support while making their first contributions.
-
- Further to that point, this "we're not Wikipedia" mantra does not resonate with me at all; both projects are funded by the WMF and both projects labor for the same central goal. The fact that their content guidelines differ is not an excuse for abrasive and callous attitudes toward those who are stellar enough to contribute in multiple areas of concentration. — C M B J 03:30, 4 June 2013 (UTC)
-
-
- You can't say "I stress that you're not even involved" to any editor at this point, because you are making negative accusations about the entire project, i.e. all of us. Strictly regarding the project, if the issue is that the environment is toxic and painful, some users may experience that in isolated cases, but overall I think the editors do their best to be civil and helpful. When they fail, it's because of the limits of human nature and of communicating via online forum. That's my opinion. The empirical support would be that 1000 active editors made it through the supposed toxicity somehow. --Haplology (talk) 04:48, 4 June 2013 (UTC)
- Actually, yes, I can, and previously did, and will continue to do so, because my perception of this problem is that it is systemic in nature. This is not an unreasonable assertion because attitudes and norms are contagious social factors. In this case, I am already familiar with the complications that you speak of from Wikipedia and other communities, but it is my view that, with respect to this particular community, they are above and beyond what would be considered normal. It is also more than possible for thousands to unknowingly endure such tendencies and then inadvertently and unintentionally perpetuate them forward without ever taking notice. — C M B J 09:34, 4 June 2013 (UTC)
- I don't view editors who also edit other projects as tourists. I view editors who set their signature to link to another project as tourists. It's an obnoxious habit, and it makes my eyes roll on any project I see it on. And anybody who waves a flag saying "I'm not interested in working with this project" is not really the person to devote extra care on.--Prosfilaes (talk) 04:29, 4 June 2013 (UTC)
- "I don't view cross-project editors as tourists, I just view cross-project editors like you as tourists" isn't exactly making the premise any less malicious. For your information, I personally provide a link to my home wiki to centralize my identity and so that others can receive a timely response to messages. I consider this configuration to be of mutual benefit and so I utilize a dirty workaround to make possible what will likely be a standard MediaWiki feature at some point in the future. Regardless, this has yet once again devolved into ignoring the issue while making this about me. — C M B J 09:34, 4 June 2013 (UTC)
- Granted, Mglovesfun was pretty rude, and your experience overall hasn't exactly been pink horsies and rainbows. Still, you aren't completely blameless, either. Before you came along, the discussion consisted of 5 edits totalling 778 characters. SemperBlotto called it a "useless category", but otherwise the comments centered on practical issues. Pretty mild stuff.
- A week later, you decided to weigh in. Ignoring the entire discussion, you set out to educate us about how your category was the only thing preventing us from descending into a morass of error and mediocrity. You started out with "This concept is itself independently notable and such categorization is necessary for the eventual completeness of our project".
- From your very first sentence, you set yourself to the task of telling us in absolute terms what Wiktionary has to have in order to be any good at all. Your comment later on is telling: "The fact that their content guidelines differ is not an excuse for abrasive and callous attitudes toward those who are stellar enough to contribute in multiple areas of concentration." No false modesty there. The fact that most of us also contribute to Wikipedia seems to have escaped you.
- Except you don’t seem to understand what you’re proposing to change: notability is strictly a Wikipedia concept- our CFI center on usage. What's more, categories aren't content- they're tools for organizing and navigating through the dictionary entries, which are the real content. The completeness of the project has nothing to do with categories.
- We do things differently than Wikipedia not because we don't know any better, but because Wiktionary is a dictionary, and Wikipedia is an encyclopedia. Dictionaries are highly structured and concise- we don't go into much detail, because people come to us for very specific types of information, and everything else is clutter. Your category has all the markings of a typical Wikipedia list article, starting with the interesting concept. As I mentioned above, our categories are mostly for organization and navigation- not for telling a story.
- You then added almost 50 lines of unnecessary examples regurgitated from popular websites, complete with footnotes/bibliographic references, for a total of 6 edits and 5876 characters- 7 1/2 times the size of the entire discussion- before even starting to address so much as a word of what anyone else had said. I'm pretty verbose, myself, but that’s a lot!
- To sum it up: you tried to graft encyclopedic concepts onto a dictionary, jumped into the discussion about it without addressing anything already said, dumped huge amounts of verbiage on us while still missing the point, talked to us like you were introducing civilization to the heathens, and then wondered why everyone got annoyed at you.
- What it boils down to, is this: the category that you thought of as the ideal way to dress up this nondescript little backwater of ours was nominated for deletion as useless by the locals. You seem to have taken this as a criticism of your judgment, and have a very strong emotional vested interest in fighting off the challenge. You don't want to hear that, so you've been repeatedly ignoring the issue while making this about Wiktionary. I would say more, but this has grown to almost half the size of your original post... Chuck Entz (talk) 09:52, 4 June 2013 (UTC)
- We now have one account across Wikimedia Wikis, and come August, the chance that anyone might believe there's two CMBJs editing on Wikimedia will be removed with the renaming of unified accounts. It's not mutually beneficial; it left me on another wiki when I was trying to check your contributions, I would have had to deal with completely irrelevant material if I wanted to leave a message there, and certain users may not be able to leave a message at all. (I know of at least two major editors on Commons that are blocked on en.WP.)
- You keep saying it's not about you, but you were one party in all these discussions. How could we have best informed you that we were deleting the category in all forms? If you can't think of a way, you're saying the members of this Wiki don't have the right to choose what content they find acceptable.--Prosfilaes (talk) 17:46, 4 June 2013 (UTC)
- Discussion is rapidly becoming uncivil and unproductive. Caution is advised. --Yair rand (talk) 10:05, 4 June 2013 (UTC)
- Don't feed the trolls! —This unsigned comment was added by 82.18.16.213 (talk • contribs).
- Agreed. I spent so much time trying to come up with a coherent explanation of the problems I saw in all this, that I just ended up tired and grumpy. I take back the negative tone of my comments, but I don't have time or energy to rework everything right now. It will have to stand in its current ugliness until I can rework it and address the real issues I was trying to get across. Chuck Entz (talk) 12:15, 4 June 2013 (UTC)
- Individually,
- “Granted, Mglovesfun was pretty rude, and your experience overall hasn't exactly been pink horsies and rainbows. Still, you aren't completely blameless, either. Before you came along, the discussion consisted of 5 edits totalling 778 characters. SemperBlotto called it a "useless category", but otherwise the comments centered on practical issues. Pretty mild stuff. A week later, you decided to weigh in.”
- The reason that I weighed in a week later on this matter is because no one had the courtesy to notify me of the deletion discussion. I found it accidentally while navigating for unrelated reasons.
- “Ignoring the entire discussion, you set out to educate us about how your category was the only thing preventing us from descending into a morass of error and mediocrity. You started out with "This concept is itself independently notable and such categorization is necessary for the eventual completeness of our project". From your very first sentence, you set yourself to the task of telling us in absolute terms what Wiktionary has to have in order to be any good at all.”
- I strongly disagree that I ignored this discussion and in fact my original response was intended to address prior concerns (“useless”, “difficult to manage”, “necessarily subjective”) by presenting a cogent case otherwise (“independently notable”, “necessary for completeness”, and as a clarification, “scholarly examples exist”). Moreover, I do believe that this information is necessary for the eventual completeness of this project. I base that view on the observation that many publications have expressed interest in this particular area.
- “Your comment later on is telling: "The fact that their content guidelines differ is not an excuse for abrasive and callous attitudes toward those who are stellar enough to contribute in multiple areas of concentration." No false modesty there. The fact that most of us also contribute to Wikipedia seems to have escaped you.”
- This was not false modesty and this assertion may very well be the most offensive remark made since this ordeal began. The comment does not refer to my self-image but my view of each and every individual who meets this description, many of whom are truly stellar in every sense of the word.
- “Except you don’t seem to understand what you’re proposing to change: notability is strictly a Wikipedia concept- our CFI center on usage. What's more, categories aren't content- they're tools for organizing and navigating through the dictionary entries, which are the real content. The completeness of the project has nothing to do with categories. We do things differently than Wikipedia not because we don't know any better, but because Wiktionary is a dictionary, and Wikipedia is an encyclopedia. Dictionaries are highly structured and concise- we don't go into much detail, because people come to us for very specific types of information, and everything else is clutter. Your category has all the markings of a typical Wikipedia list article, starting with the interesting concept. As I mentioned above, our categories are mostly for organization and navigation- not for telling a story.”
- The only thing I was/am proposing is that this information—which, again, I believe to be necessary for the project's completion—not be needlessly eradicated. The way it is presented makes little difference in my mind, so long as it's easily accessible to readers.
- “You then added almost 50 lines of unnecessary examples regurgitated from popular websites, complete with footnotes/bibliographic references, for a total of 6 edits and 5876 characters- 7 1/2 times the size of the entire discussion- before even starting to address so much as a word of what anyone else had said. I'm pretty verbose, myself, but that’s a lot!”
- These examples were preceded by the question of “what would go in these categories?” and I consider them to have been a decent response. The sources were presented in such a way that would convey their journalistic nature, attributions were provided to avoid plagiarism, and they were formatted in the usual way.
- “To sum it up: you tried to graft encyclopedic concepts onto a dictionary, jumped into the discussion about it without addressing anything already said, dumped huge amounts of verbiage on us while still missing the point, talked to us like you were introducing civilization to the heathens, and then wondered why everyone got annoyed at you. What it boils down to, is this: the category that you thought of as the ideal way to dress up this nondescript little backwater of ours was nominated for deletion as useless by the locals. You seem to have taken this as a criticism of your judgment, and have a very strong emotional vested interest in fighting off the challenge. You don't want to hear that, so you've been repeatedly ignoring the issue while making this about Wiktionary. I would say more, but this has grown to almost half the size of your original post.”
- No, I simply tried to incorporate popular lexicographical information into Wiktionary. I found that information silently nominated for deletion and sprung into action to help save it. I attempted to address the other participants' concerns individually and have continued to do so as best possible. I did and still do take issue with the unwillingness of multiple participants to address cogent counterarguments.
- Again, and as a final note, I want to reiterate and make unequivocal that this thread was not intended to be focused on the RfD. It is not about and was never about me or my opinions here. It is, however, about how toxic this environment feels from the perspective of a new user, which unfortunately has been further echoed by this discussion. — C M B J 11:19, 4 June 2013 (UTC)
- It seems most people here (especially judging from Mglovesfun's comments) just tried to bash the new contributor instead of taking their time and helping him to contribute what he wants in the "right" way (which differs in every project); of course because it's the easiest way to deal with new users. The worst part was the comment by the idiot who accused him of being a troll. Chuck Entz is right about the purpose of the category namespace, and CMBJ is also right that this information is necessary and quite useful. The solution here is that this information should be put in the appendix namespace, as a list, and I think it would become a quite useful one. --Z 12:07, 4 June 2013 (UTC)
- I hereby admit to being rude and promise to try to do better.
- I wholeheartedly accept instances of this gesture as making amends. Additionally, if my own actions led to ill feelings for anyone involved at any point, then I ask forgiveness and offer my commitment to continued cooperation and respect in all efforts that contribute to the advancement of our common mission. — C M B J 11:15, 5 June 2013 (UTC)
- Silent deletions are really infuriating. Every {rfv, rfd}-ed page should have all of its respective contributors notified on their talk page (perhaps by a bot, it's easily automatable). Or even better - through a notification gadget like the one on Wikipedia. --Ivan Štambuk (talk) 14:07, 4 June 2013 (UTC)
-
- That is apparently part of the basic software and is available in user "Preferences". I find it very useful to track the limited number of pages I watch in WP, Species, Commons, and MediaWiki. DCDuring TALK 16:05, 4 June 2013 (UTC)
- I've created a proposal to help prevent this from happening to others in the future. — C M B J 11:10, 5 June 2013 (UTC)
Lua-cising Template:context[edit]
I imagine we want to convert {{context}} to Lua at some point, so I am wondering what the best way would be. Ruakh made a start with creating a replacement some time ago, {{label}}. It's used on a few pages but it's template-based and uses subtemplates instead of "raw" templates. One of the advantages of using subtemplates is that it eliminates any conflicts between context labels and other templates. {{context}} would use any template that had the same name as the label, which often causes problems (the recent issue with {{abbreviation}} is one example). On the other hand, because {{label}} doesn't use the "bare" template as the label, it's not possible to write something like {{intransitive}} by itself, you'd need {{label|intransitive}} instead.
The most straightforward way to convert these to Lua is probably something like Module:languages, with a single data module containing all the information for the context labels, and a separate module to handle the processing and display. I think that the approach used by {{label}}, in which labels always need {{context}} or {{label}} prefixed, is preferred for a Lua implementation. It would drastically reduce the number of context templates we need to maintain, it would remove any conflicts between labels and other templates with the same name ({{plural}} for example!), and it would also prevent any desynchronisation between the templates and the module. For example, if someone creates a new context label, they'd need to remember to also create a matching template, which would not really add much value to the system and just be there for convenience. It makes more sense to not create those templates in the first place and to always require the same template to "initiate" the process. Another advantage is that bots, if they want to parse entries, no longer need a long list of which templates can possibly be used as context labels, because there'd only be one.
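As a rough illustration of the split described above (one data table holding the labels, one routine doing the processing and display), here is a minimal sketch. It is written in Python purely for illustration; the real thing would be a pair of Lua modules. The label entries, the `LANGUAGE_NAMES` lookup, the category wording and the function name `show_labels` are all hypothetical and not taken from any existing module.

```python
# Illustrative data table: one entry per label (all entries are hypothetical).
LABELS = {
    "intransitive": {"display": "intransitive", "category": None},
    "rare": {"display": "rare", "category": "rare terms"},
    "plural": {"display": "in the plural", "category": None},
}

# Hypothetical language-code lookup, standing in for something like Module:languages.
LANGUAGE_NAMES = {"en": "English", "nl": "Dutch"}


def show_labels(lang, labels):
    """Return (display text, categories) for a list of label names."""
    lang_name = LANGUAGE_NAMES[lang]
    shown = []
    categories = []
    for name in labels:
        # Unknown labels are displayed as typed and add no category.
        data = LABELS.get(name, {"display": name, "category": None})
        shown.append(data["display"])
        if data["category"]:
            categories.append(f"{lang_name} {data['category']}")
    return "(" + ", ".join(shown) + ")", categories


text, cats = show_labels("nl", ["intransitive", "rare"])
print(text)
print(cats)
```

Because the labels live only in the table, adding a label means adding one table entry; no matching template has to be created, so a label named "plural" cannot collide with a template of the same name.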
Another change I would like to make while we're at it, is to use the first parameter to specify the language code. It's common for editors to forget to specify the lang= attribute because they're not aware that some context labels add categories. The problem is compounded by the fact that only some labels categorise while others do not so editors need to remember this for every label. {{intransitive}} does not categorise for example, so it's easy to miss this and, when you want to add a second label like {{rare}} (which does categorise), to forget the language. I believe that requiring the language as the first parameter will help with these problems because then it can never be forgotten or skipped, so it makes editors more aware that they need to put something there and that they need to change it when copying content to another language.
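A minimal sketch of what requiring the language as the first parameter could look like, again in Python only for illustration. The set of language codes, the choice of which labels categorise, and the name `parse_label_call` are assumptions made for this example.

```python
KNOWN_LANGUAGES = {"en", "nl", "de"}  # hypothetical subset of language codes
CATEGORISING = {"rare"}               # labels assumed to add a category


def parse_label_call(params):
    """Split template parameters into (language code, list of labels).

    The language code is mandatory and must come first, so it cannot be
    forgotten when a categorising label is added later.
    """
    if not params or params[0] not in KNOWN_LANGUAGES:
        raise ValueError("first parameter must be a language code")
    return params[0], list(params[1:])


lang, labels = parse_label_call(["nl", "intransitive", "rare"])
categorising = [label for label in labels if label in CATEGORISING]
print(lang, labels, categorising)
```

With this shape, a call that starts with a label instead of a language code fails immediately rather than silently skipping categorisation.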
I'd like to hear what you think and if you have any specific points to raise. —CodeCat 16:18, 3 June 2013 (UTC)
- I support everything. My only suggestion is that if we are going to make {{context}} obligatory, we should create a shorthand, like {{x}} or something, redirecting to it. — Ungoliant (Falai) 19:23, 3 June 2013 (UTC)
- We could also use something a bit more descriptive like {{con}}. {{x}} is really vague. —CodeCat 19:27, 3 June 2013 (UTC)
- That’s a language code, and {{c}} is a grammatical label. What about {{ct}} or {{ctx}}? (But let’s not forget that any 2 or 3 letter template is a timebomb waiting for ISO to release it as a language code.) — Ungoliant (Falai) 19:31, 3 June 2013 (UTC)
Sounds good, but let’s abandon the misleading name “context.” Labels like {{pejorative}}, {{plurale tantum}} and {{abbreviation}} have nothing to do with context.
Context means two different things in lexicography. One is the context in which a word appears in its citation, esp. in corpus lexicography. The other is in something called discourse analysis, and seems to be only vaguely related to usage as we consider it here.
We are using this template for both usage and grammatical labels, or tags. —Michael Z. 2013-06-03 20:34 z
- [after edit conflict] We shouldn't be wedded to the somewhat misleading name "context" as we use the beginning-of-the-definition-line position for many things, including topical labels, sense-specific complement information, semantic-grammatical classification (eg, intensifier, modal adverb), as well as register and regional and other context.
- One thing that might be very helpful in the long run would be to build in support for various default types of display for various types of tags. One useful thing would be to differentiate topic from usage context typographically. Another would be to allow semantic-grammatical tags to be non-displaying by default. This might also be useful for maintenance-related tags. I suppose such things could be done using CSS to make it easier for users to use common.css to customize display of such tags. DCDuring TALK 20:40, 3 June 2013 (UTC)
- With all of the labels codified in a Lua table, it should be easier to categorize the labelled entries, as well as to inject CSS classes. Perhaps something like class="label-subject-history" or class="label-grammar-intensifier", so CSS can be used to style or hide individual labels, general classes, or all of them. —Michael Z. 2013-06-03 22:22 z
- We have 935 labels that might want to have their own CSS class or ID. For myself, I would rather be selecting groups, if at all possible. For some types I would think that we not need individual CSS classes. I take it that CSS does not allow one to select members of a class equal to specific text. DCDuring TALK 01:32, 4 June 2013 (UTC)
- Are you asking whether CSS can select and style based on the text of the content? No. But if we are putting that text into the page, then there’s practically no overhead in also putting it into the class attribute.
- Actually, using simple class selectors would require separate classes for the levels of categorization, as class="label label-subject label-subject-history", allowing one to style all labels, or labels in a category, or a specific label. Leaving out the individual label class would save a tiny bit of overhead in loading time and page weight, but I guess it would be insignificant.
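The three-level class scheme suggested above can be sketched as follows. This is an illustrative Python stand-in for whatever the Lua module would emit; the function names `label_classes` and `render_label` and the use of a `<span>` wrapper are assumptions, while the class naming pattern follows the example given in the comment.

```python
from html import escape


def label_classes(kind, name):
    """Build the class attribute: all labels, the label's kind, the specific label."""
    return f"label label-{kind} label-{kind}-{name}"


def render_label(kind, name, text):
    """Wrap the displayed label text in a span carrying the three classes."""
    return f'<span class="{escape(label_classes(kind, name))}">{escape(text)}</span>'


print(render_label("subject", "history", "history"))
# <span class="label label-subject label-subject-history">history</span>
```

A user stylesheet could then target `.label` for every label, `.label-subject` for one kind, or `.label-subject-history` for a single label.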
- Why over-Luaize everything? {{label}} seems like a simple enough solution that should fit all our needs. -- Liliana • 20:42, 3 June 2013 (UTC)
-
- Template:label's subpages all have a common piece of code, which I don't like. It's harder to maintain, if one decides to perform any change then a lot of pages need to be changed. I support the proposed change. --Z 21:09, 3 June 2013 (UTC)
- Although I do think there's a risk of going too far with Lua, {{context}} is one template that absolutely should be Luacized, for performance, for readability, and for correctness. (The demo that I created, and that Liliana-60 copied illegally to {{label}}, is an improvement over {{context}} in all three respects, but a Lua module would be a much greater improvement. I would never have created that demo if I had known that we'd get Lua so soon.) —RuakhTALK 06:44, 4 June 2013 (UTC)
- Illegally? Are you the dictator of Wiktionary? -- Liliana • 09:41, 4 June 2013 (UTC)
- I think Ruakh is referring to the copyright violation. --Yair rand (talk) 09:42, 4 June 2013 (UTC)
Then if nobody minds, I will convert the few uses of {{label}} that are still present back to {{context}}, so that we can work on it and eventually convert {{context}} to the new Lua-powered {{label}} altogether. —CodeCat 11:52, 4 June 2013 (UTC)
I've been working on adding an explicit call to {{context}} to the labels, but there are a lot of them (160 thousand...) so it will take some time even with a bot. The progress is at Category:Context label called directly. I noticed that quite a few pages misuse the templates by using them as something other than a context (like where {{qualifier}} would be better). But I have also realised that there is a more fundamental problem with some of the labels we need to address. Labels can have different "scopes" so to say: it can be used to specify a topic, it can indicate restricted usage (by field, place), and so on. Currently, the labels are just names and do not distinguish between these types, but there can be some ambiguity in quite a lot of cases. For example, it could be desirable to use a label to restrict a term to the topic of a particular country, but all of our country labels are currently used for restricted usage (that is, dialectisms), so this is not possible. If you write {{context|Britain}} then the term is assumed to be a Britishism, even when you really want it to mean that the term pertains to Britain. So you may get something like Category:British Dutch when you really wanted Category:nl:Britain. I'm not really sure how to solve this currently, but I do think it's important. —CodeCat 15:51, 6 June 2013 (UTC)
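To make the ambiguity concrete, here is one possible way the two readings of a country label could be kept apart, by passing an explicit scope. This is only an illustration of the problem, not a proposed solution from the discussion: the `scope` argument, the lookup tables and the function name `category_for` are all hypothetical, and only the two resulting category names come from the comment above.

```python
LANGUAGE_NAMES = {"nl": "Dutch"}              # hypothetical lookup
REGIONAL_ADJECTIVES = {"Britain": "British"}  # hypothetical lookup


def category_for(lang, label, scope):
    """Return the category for a label, given an explicit scope.

    scope "regional": the term is restricted to usage in that place.
    scope "topic":    the term pertains to that place as a subject.
    """
    if scope == "regional":
        return f"Category:{REGIONAL_ADJECTIVES[label]} {LANGUAGE_NAMES[lang]}"
    if scope == "topic":
        return f"Category:{lang}:{label}"
    raise ValueError(f"unknown scope: {scope}")


print(category_for("nl", "Britain", "regional"))  # Category:British Dutch
print(category_for("nl", "Britain", "topic"))     # Category:nl:Britain
```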
- Can you give an example? I’m not sure how the topic of a particular country is a usage, but it might be used only in the academic field of British studies, or have a special meaning when speaking about Britain, or only when referring to a sense of a thing that is in Britain (although the last is properly part of a definition and not a usage). This kind of usage categorization is problematic, because many editors start to categorize things with them rather than terms (like animal was being applied to names of animals). We once had label {{London}} and category:London, but got rid of them because they were just labelling the names of things in London. —Michael Z. 2013-06-10 15:42 z
-
- Aw, crap. —Michael Z. 2013-06-10 15:45 z
- I think bush would be an example. It has a meaning that originated in Australia, but is now used worldwide in that sense. Nevertheless, the word doesn't refer to that same thing outside the context of Australia, so only when speaking of Australia it has that specific meaning. —CodeCat 15:49, 10 June 2013 (UTC)
- The definition already says “area of Australia,” so it is clear what thing is the referent. This is not usage.
- You could refine the usage label as {{chiefly|_|Australian}} or {{originally|_|Australian}} to indicate that it is not only used in Australian English.
- Of course, the lexicographer could also account for nuance: an Australian in Asia may refer to the local countryside as the bush. This could be analyzed as the Australian/Canadian sense of bush meaning “countryside,” and the global sense meaning “Australian countryside.” —Michael Z. 2013-06-10 16:18 z
- We do have a lot of labels like {{chess piece}}, {{enzyme}}, {{organic compound}}, {{logical fallacy}} and so on. Should we get rid of those? —CodeCat 16:20, 10 June 2013 (UTC)
- I would say yes, with care. {{enzyme}} is used to label maltase with biochemistry, but the term is not restricted to biochem. {{organic compound}} labels the widely-known substances like amyl nitrate, ethanol and lactic acid, etc. as technical terms restricted to the field of organic chemistry and puts them into the non-lexicographical category:en:Organic compounds (why do we need to distract readers from the much better w:en:Category:Organic compounds?). Shortcut templates like these just encourage editors to categorize referents instead of labelling usage. A label should be explicitly what it is, so editors can understand what it is and does. Mzajac
- For the record, I have a lot of doubts about bot edits in which "vulgar" is replaced with "context|vulgar". Making these edits from a thread entitled "Lua-cising Template:context" seems rather inadvisable, to say the least. I am disappointed. --Dan Polansky (talk) 18:10, 7 June 2013 (UTC)
Trademark discussion
Hi, apologies for posting this in English, but I wanted to alert your community to a discussion on Meta about potential changes to the Wikimedia Trademark Policy. Please translate this statement if you can. We hope that you will all participate in the discussion; we also welcome translations of the legal team’s statement into as many languages as possible and encourage you to voice your thoughts there. Please see the Trademark practices discussion (on Meta-Wiki) for more information. Thank you! --Mdennis (WMF) (talk)
Universal Language Selector to replace Narayam and WebFonts extensions
On June 11, 2013, the Universal Language Selector (ULS) will replace the features of Mediawiki extensions Narayam and WebFonts. The ULS provides a flexible way of configuring and delivering language settings like interface language, fonts, and input methods (keyboard mappings).
Please read the announcement on Meta-Wiki for more information. Runab 14:07, 5 June 2013 (UTC) (posted via Global message delivery)
- Excellent. We'll finally have an easy way of typing in different languages. --Yair rand (talk) 15:50, 5 June 2013 (UTC)
This seems to be breaking font specification. See Talk:Fraktur, where the specified font in the Fraktur sample only shows up if JavaScript is disabled, in Safari/Mac 6.0.5, Firefox/Mac 21.0, and Chrome 27.0. To sum up:
This works:
style="font-family:UnifrakturMaguntia, UnifrakturCook, Unifraktur, serif;"
Keinen Unparteiiſchen wird der Einwand ungläubiger Theologen: wenn es Typen geben ſolle, ſo müſte ihre Abſicht von den Zeitgenoſſen ſchon erkannt worden ſehn, ſonderlich beunruhigen können.
This fails:
lang="de" style="font-family:UnifrakturMaguntia, UnifrakturCook, Unifraktur, serif;"
Keinen Unparteiiſchen wird der Einwand ungläubiger Theologen: wenn es Typen geben ſolle, ſo müſte ihre Abſicht von den Zeitgenoſſen ſchon erkannt worden ſehn, ſonderlich beunruhigen können.
—Michael Z. 2013-06-14 19:00 z
- Wasn't the point of WebFonts that the user didn't have to have the font installed in order to have it render correctly for him? Because what you have written above after "This works:" is perfectly legible for me, but doesn't appear in Fraktur. Maybe if I tracked down and installed UnifrakturMaguntia on my computer it would, but doesn't that defeat the purpose? —Angr 19:16, 14 June 2013 (UTC)
- I don’t know the precise point of the ULS, but in this case it steals control. If its point is readability, then it should add fallback fonts, not override the editor’s choices. It isn’t a case of the user didn’t have to have the font installed, it’s the user may as well not have it.
- It is also not as smart as it should be, because it is stupid about script tags. If I correctly tag the language-script as German Fraktur with de-Latf, it still prevents correct rendering. —Michael Z. 2013-06-14 20:59 z Also drives me fucking nuts by making a stupid keyboard icon pop on and off constantly while I type in this edit field.
Keinen Unparteiiſchen wird der Einwand ungläubiger Theologen: wenn es Typen geben ſolle, ſo müſte ihre Abſicht von den Zeitgenoſſen ſchon erkannt worden ſehn, ſonderlich beunruhigen können.
Format of articles, why not put the definition in a lede at the top?
Take a look at toches as just one example, wouldn't that article be a lot more useful (and nicer) if right at the top of the page there was a lede that gave the definition? As it is, the definition is dead last to many various and sundry less important things. I'm sure this question has come up, but I did some looking and couldn't begin to find it. 108.54.62.155 17:30, 5 June 2013 (UTC)
- It comes up on WT:FEED quite a lot. There's an argument for it; one argument against it is that when we say 'the definition', many words have more than one definition, and some words have dozens. Since we're a multilingual dictionary we do need to put what the language is. The definition of lit changes a lot depending on what language you're speaking. Inflection is quite important but could I suppose go after the definitions, as could just about everything else. I've seen at least one suggestion to use the ===Definitions=== headers which I quite like. I suppose, one factor that shouldn't be underestimated is how users will get used to the format if they use Wiktionary enough. It's pretty simple; I'd've thought most people can learn it in a matter of minutes. Mglovesfun (talk) 17:43, 5 June 2013 (UTC)
- It would be a good idea. If we were able to separate data from presentation (the two concepts are currently hideously intertwined here), we could easily generate custom layouts with javascript. Until then any change would be counterproductive. DTLHS (talk) 18:03, 5 June 2013 (UTC)
- But remember that in most documents, certainly in web pages, the text is serialized – it has an inherent order. This order will manifest itself in many contexts – excerpts, search results, mobile view, in non-visual browsers including screen readers and braille readers, when our data is reused elsewhere, etc.
As things currently stand, we've got data that's supposed to have a specific structure, but that structure is enforced manually by humans (and by bots), rather than by the system itself. This is rather horrible, in a number of different ways -- it's terribly inefficient, it's error prone, it unavoidably mixes data and presentation in ways that have been recognized to be huge no-nos, and it's very labor intensive. Building a database by hand is not the best way to go about it. ;)
- In a couple of my jobs now, I've had occasion to poke around looking at various terminology management apps. One that was quite interesting was TermWiki, which (as best I understand it) is mostly MediaWiki with the semantic extensions added in (http://semantic-mediawiki.org/, itself relying much upon mw:Extension:Semantic_Forms), one or two other openly-available extensions, and some custom whizbang. I have no idea how much pull or push we have with regard to how the Wiktionary back-end is set up, but Semantic MediaWiki is high on my wish list for this site.
- If any MediaWiki extensions are entirely off the table, I think we should explore building tools ourselves to emulate that kind of automated structure building and integrity management. Why should I be expected to remember all the wrinkles of WT:ELE? That kind of structure is exactly what a database provides, automatically. Users shouldn't have to even be aware of this; it should just happen. We would avoid a huge class of entry maintenance problems if we could do this. -- Eiríkr Útlendi │ Tala við mig 17:31, 6 June 2013 (UTC)
- Basically I agree. On the other hand, you could look at it as being similar to Wikipedia pages about people: they start out with the person's childhood, even though that is often not really what most people care about. Yet it's natural to begin at the beginning. The etymology is sort of a description of how the word grew up, if you will. --Haplology (talk) 15:32, 6 June 2013 (UTC)
- Wikipedia's articles about people start with a summary of one's life and important achievements. It is strange to talk about the etymology of a word for which no definition has been given yet. I've also never understood why pronunciations are placed before the definitions. On fr.wikt we've moved the pronunciation some time ago, but the etymology remains at the top, too (but in our case we still have a general section for all homographs, which is both good and bad). Dakdada (talk) 16:16, 6 June 2013 (UTC)
- I think the original reason for putting the etymology at the top was to distinguish words with different etymologies. I'm not really sure that is the best solution, though. -- Liliana • 16:23, 6 June 2013 (UTC)
- Also print dictionaries traditionally put etymology and pronunciation before the definition, so our order matches what you might see in one of them:
horse [OE. hors] /hɔrs/ n. 1. A hoofed mammal, Equus ferus caballus, often used throughout history for riding and draft work. 2. A piece of gymnastics equipment with a body on two or four legs, approximately four feet high with two handles on top. [...]
- However, all the print dictionaries I currently have to hand do in fact put their etymologies at the end of the entry, not at the beginning. They do all put the pronunciation first, though. Instead of ===Etymology 1===, ===Etymology 2===, and the like, maybe we could find some other heading like ===Word 1=== or ===Form 1=== or ===Lexeme 1=== to use instead. One thing that's bothered me about separating by etymology is that people are often tempted to separate parts of speech this way, for example listing the noun house as ===Etymology 1=== and saying it comes from Old English hūs, and the verb house as ===Etymology 2=== and saying it comes from Old English hūsian. Which is strictly speaking true, but not the way I think we should be using those headings. —Angr 17:08, 6 June 2013 (UTC)
In Japanese, that (breaking an entry down by each etym) does actually seem to be the best organization -- sometimes you have a single "term" as written, but it's actually umpteen different "terms" as spoken, and each has its own etymology. Lumping all of those different readings and definitions together and then trying to explain the etymologies of each after the fact would be horribly confusing. Cf. 愛#Japanese, 大人#Japanese (multiple readings shown, but the entry still needs etyms & pronunciations, etc.), 目#Japanese, and so forth.
- One suggestion would be to amend the CSS or JS to make etymology sections auto-collapse, so users only see the text if they want to (by either clicking or by customizing their CSS/JS). Just the etym text itself, not including the subsections thereunder. Inspecting the rendered page shows that the etym headers are <h3> elements, generally followed by a series of <p> elements until the next header. I know that some of our subsections use collapsible divs, like in {{der-top}}; I have no real idea how difficult it would be to auto-collapse a series of <p> elements based on their relative position in the page. -- Eiríkr Útlendi │ Tala við mig 17:18, 6 June 2013 (UTC)
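As a rough illustration of the "series of paragraphs after an etymology header" idea, the sketch below post-processes a rendered fragment in Python rather than in the site CSS/JS the comment has in mind, so it swaps in a different technique purely to show the grouping logic. It assumes heavily simplified markup (a bare `<h3>` whose text starts with "Etymology", followed directly by bare `<p>` elements); real rendered pages carry extra spans and attributes. The function name and the use of `<details>` as the collapsible wrapper are assumptions.

```python
import re

# An Etymology <h3> header plus the run of <p> elements directly after it.
# Assumes simplified markup without attributes or nested header spans.
ETYM_BLOCK = re.compile(
    r"(<h3>\s*Etymology[^<]*</h3>\s*)((?:<p>.*?</p>\s*)+)",
    re.DOTALL,
)


def collapse_etymologies(html: str) -> str:
    """Wrap the paragraphs under each Etymology header in a collapsed <details>."""
    def wrap(match: re.Match) -> str:
        header, paragraphs = match.group(1), match.group(2)
        return f"{header}<details><summary>show</summary>{paragraphs}</details>"

    return ETYM_BLOCK.sub(wrap, html)


sample = (
    "<h3>Etymology 1</h3>\n"
    "<p>First etymology paragraph.</p>\n"
    "<p>Second etymology paragraph.</p>\n"
    "<h4>Noun</h4>\n"
    "<p>Definition text.</p>"
)
print(collapse_etymologies(sample))
```

Only the paragraphs between the etymology header and the next header are wrapped; the subsections underneath, such as the part-of-speech heading and its definitions, stay visible.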
- I have always thought that XML would be more suited to making a dictionary, because it more strictly separates formatting and structure. It also has the advantage of being able to validate pages and reject them if they are invalid. That would allow the software itself to check the formatting, and add categories for missing inflections and so on. —CodeCat 17:40, 6 June 2013 (UTC)
- That or JSON. Anyway, some things that would have to happen: 1. write a parser / validator (in Lua?) 2. create a new editing interface (javascript) 3. figure out how to make it work with our existing template, Lua and javascript infrastructure. 1 and 2 are easy enough but I'm not sure how 3 would work (can you call templates from a string in Lua?). DTLHS (talk) 19:51, 6 June 2013 (UTC)
- Writing our own parser seems a bit pointless because we would want to make use of something like XML schema for validation, and XSL as well to transform the data into HTML. —CodeCat 19:56, 6 June 2013 (UTC)
- Yes you're right. How do you envision something like this working with our existing infrastructure? Or would we have to scrap everything? DTLHS (talk) 20:10, 6 June 2013 (UTC)
- Just write a new mw:ContentHandler (though this would probably require some coöperation from the WMF…). As for scrapping everything, I would implement migration this way: first split pages into per-language elements containing blocks of wiki markup, and then successively refine the markup schema to capture more of the entry structure, and rewrite pages into the new schema. With current quite consistent usage of templates, I think Wiktionary markup is already quite machine-readable, so actually bots could be doing the conversion. And when they fail, they could just leave wiki markup in place as a fallback. Keφr 20:28, 6 June 2013 (UTC)
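The first migration step described here, splitting a page into per-language elements that still hold raw wiki markup, might look roughly like the following. This is a sketch under stated assumptions: the `<entry>` and `<language>` element names and the `split_by_language` function are hypothetical, and the only thing taken from current practice is that level-2 headers such as `==English==` separate languages on a page.

```python
import re
import xml.etree.ElementTree as ET

# Level-2 headers such as "==English==" separate the languages on an entry page.
# Deeper headers ("===Noun===") do not match, so they stay inside the block.
L2_HEADER = re.compile(r"^==([^=\n]+)==[ \t]*$", re.MULTILINE)


def split_by_language(title: str, wikitext: str) -> ET.Element:
    """Wrap each language section of a page in a <language> element (hypothetical schema)."""
    entry = ET.Element("entry", {"title": title})
    headers = list(L2_HEADER.finditer(wikitext))
    for i, header in enumerate(headers):
        end = headers[i + 1].start() if i + 1 < len(headers) else len(wikitext)
        lang = ET.SubElement(entry, "language", {"name": header.group(1).strip()})
        # The body stays as untouched wiki markup: the fallback mentioned above.
        lang.text = wikitext[header.end():end].strip()
    return entry


page = (
    "==English==\n===Noun===\n# placeholder English definition\n\n"
    "==Swedish==\n===Adjective===\n# placeholder Swedish definition\n"
)
entry = split_by_language("example", page)
print(ET.tostring(entry, encoding="unicode"))
```

Later refinement passes could then replace the wiki markup inside each `<language>` element with finer-grained elements, one kind of section at a time.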
- Something like that would work, yes. One effect of having a strict separation of content and presentation is that we would need to split our templates and modules in the same way. A template or module could, under this scheme, either generate content or display it, but not both. This would have quite a few consequences that we would need to work out. Inflection tables are probably the easiest to do, but other things will need more thought. —CodeCat 20:41, 6 June 2013 (UTC)
- @CodeCat, we already have the database, no? Why recreate it using XML? And if we're about to embark on anything that "would probably require some coöperation from the WMF", shouldn't we first look at things like Semantic MediaWiki, given that they've already done a lot of the work? I know some folks like reinventing the wheel for the thrill of learning how it was done, but I'm more interested in having a back-end that works well, and sooner rather than later. ;) -- Eiríkr Útlendi │ Tala við mig 21:39, 6 June 2013 (UTC)
- XML and a relational database are very different things. Databases are meant for representing raw data without structure, whereas we definitely do want structure. XML is also much easier for people to edit because it's human-readable text, whereas for a database we'd need to make a whole interface as well. —CodeCat 21:50, 6 June 2013 (UTC)
- Really, I think we need to look at what's already out there before embarking on anything this major. What you're proposing sounds to me an awful lot like something that's already been done. To wit:
- We already have the MW database. Readily available extensions already provide much of the functionality required to ensure structural consistency and integrity. See above about the semantic and form extensions.
- Wikitext is also human-readable, and it's much less verbose than XML, and it's already supported.
- Those same readily available extensions also already supply an interface.
- Jumping right into creating a whole huge infrastructure for reworking everything about Wiktionary to use XML and then reworking everything about the UI to deal with that XML, all pretty much from scratch, strikes me as potentially foolhardy. Again, I'm sitting here looking at something that looks an awful lot like a wheel, and that already exists. And then I hear you talking about plans to invent one. :-/
- Note, I'm honestly not trying to be obstructionist, I'm really just trying to make sure that we proceed with our feet on the ground, and in possession of all the relevant facts. -- Eiríkr Útlendi │ Tala við mig 22:17, 6 June 2013 (UTC)
- Why would we need a whole new UI? My intention is that when you click edit, you are served with XML-ified page content instead of wikitext, and you edit it in that form. Then you save, and saving will validate the page and if it's valid, it's done. The editing process itself would not change at all. The only thing that would change is that there is an XML validation step followed by an XSL-driven pre-parser which converts the XML into wikitext (or directly to HTML cache). So from the wiki's point of view we would still store things in pages like we do before, only the source code of those pages would be the more data-oriented XML instead of the current presentation-oriented wikitext. —CodeCat 23:29, 6 June 2013 (UTC)
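A minimal sketch of the validate-on-save step, assuming a made-up rule set: the proposal above talks about XML Schema, which the Python standard library does not validate, so a hand-written check stands in for it here. The element names and the rule that every language needs a definition are hypothetical.

```python
import xml.etree.ElementTree as ET


def validate_entry(source: str) -> list:
    """Return a list of problems; an empty list means the page may be saved."""
    try:
        entry = ET.fromstring(source)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    if entry.tag != "entry":
        problems.append("root element must be <entry>")
    # Hypothetical rule: every <language> needs at least one <definition>.
    for lang in entry.iter("language"):
        if lang.find(".//definition") is None:
            problems.append(f"language {lang.get('name')!r} has no <definition>")
    return problems


good = '<entry><language name="English"><definition>text</definition></language></entry>'
bad = '<entry><language name="English"/></entry>'
print(validate_entry(good))  # []
print(validate_entry(bad))   # ["language 'English' has no <definition>"]
```

A save would be rejected whenever the returned list is non-empty, and the same list could instead drive maintenance categories for entries that are incomplete but not invalid.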
- I'm confused -- it sounds like you're suggesting that editors would still have to remember everything about WT:ELE, and now a lot of stuff about XML, and would still be editing the raw code -- the only addition you describe that seems to make sense is the validation.
- Again, I think Semantic Wiki does everything good that you describe (enforcing structure, validating data, etc.), only without the ugliness -- data is entered using forms, not raw XML.
- Besides which, doesn't MediaWiki filter out or disallow certain kinds of markup in the raw wikitext? Or does that only apply to a small subset of HTML, like <a> tags? -- Eiríkr Útlendi │ Tala við mig 23:36, 6 June 2013 (UTC)
- No, the whole point is that we can ignore ELE because of the XSL processing step. The XSL style sheet will determine which parts of the XML tree go where, so it will rearrange the source to match whatever we like. The order of elements in the source would be dictated by the XML schema but it would not affect the end result. So you could put translations first or last in the source, or even reorder all the languages, and they'd still show up the same on the page thanks to XSL. —CodeCat 23:42, 6 June 2013 (UTC)
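The claim that source order need not affect the rendered page can be illustrated without XSL. The sketch below uses Python's ElementTree in place of a real style sheet, so it shows the idea and not the proposed mechanism; the element names and the display order are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical display order; the element names are illustrative only.
DISPLAY_ORDER = ["definition", "pronunciation", "etymology", "translations"]


def render(source: str) -> str:
    """Emit sections in DISPLAY_ORDER, whatever order they have in the source."""
    entry = ET.fromstring(source)
    lines = []
    for tag in DISPLAY_ORDER:
        for section in entry.findall(tag):
            lines.append(f"{tag.capitalize()}: {(section.text or '').strip()}")
    return "\n".join(lines)


a = "<entry><etymology>E</etymology><definition>D</definition></entry>"
b = "<entry><definition>D</definition><etymology>E</etymology></entry>"
print(render(a))
print(render(a) == render(b))  # True
```

Changing the layout then means editing one ordering list (or, in the proposal, one style sheet) instead of rewriting every entry.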
- Still, 1) editing raw XML == yuck. No thank you. 2) And we couldn't ignore WT:ELE entirely, as we'd still have to be aware of what kinds of content were allowed. 3) Why reinvent the wheel? -- Eiríkr Útlendi │ Tala við mig 23:52, 6 June 2013 (UTC)
- I'm not trying to reinvent the wheel. It's just how I would envision a purely source-based dictionary wiki like we do now. Of course if we drop source editing altogether we can change a lot of things, but I feel that that would not actually be very practical in many situations. Being able to copy parts of a page in one go has its advantages. —CodeCat 23:55, 6 June 2013 (UTC)
- Yes, copying chunks is quite useful. I believe that's still possible using Semantic MediaWiki; I just popped over there, and you can still get at the raw source just fine. Their page on wiki/Semantic_Forms describes some more of what I was thinking -- directing user input in ways that hide away the minutiae of proper wikitext. -- Eiríkr Útlendi │ Tala við mig 00:14, 7 June 2013 (UTC)
- We don't really have very structured data at this point. We really should be using things like {{senseid}} and (hopefully improved versions of) {{etymtree}}/{{findetym}}. Switching straight over to XML or anything else too far from our current format is probably not going to happen, at least not any time soon, for a number of fairly obvious reasons. That shouldn't stop us from making more usable data, or dealing with the main topic at hand, that the definition is too hard to find. We could put cognates in a collapsible box, preferably one much smaller than the normal full-sized 100%-width large-padding boxes that translation tables use. We could make the pronunciation section more compact. We could shrink down the ginormous headers and inflection lines and the space between them. All sorts of things would help. --Yair rand (talk) 00:40, 7 June 2013 (UTC)
- Thanks for mentioning {{senseid}}, I didn't know about that feature. That's exactly the type of thing I've been thinking WT ought to have. I tried it just now with ハウス. --Haplology (talk) 06:13, 7 June 2013 (UTC)
- Thank you Yair, yes, back to the main topic :) -- (rabbit hole well and duly fallen into for the day) -- in addition to the visual reworking of an entry page's style, which should be easy enough to do, another of the suggestions above that we could implement fairly easily would be to use a ===Definitions=== header, clearly pointing the user to the defs. -- Eiríkr Útlendi │ Tala við mig 05:03, 7 June 2013 (UTC)
- Definitions are put in PoS sections which use level >3 headings, so it should be a level >4 heading. I don't agree with this idea, though: we are already over-sectionizing information, which works for bigger entries but just causes problems in most other cases. For example, having a separate section for pronunciation is a good idea for some languages, but doesn't work for most other ones (compare ܟܣܐ with this version). --Z 09:32, 7 June 2013 (UTC)
- @Z, it would make much more sense to put ===Definitions=== at L3 with all the POS under that. That way you only need one defs header. -- Eiríkr Útlendi │ Tala við mig 15:24, 7 June 2013 (UTC)
- @Z, part of the issue with Syriac (and presumably other Semitic languages that don't mark vowels) is that one written form can have multiple spoken forms. Japanese does this too, only in different ways. I've been treating each reading (i.e. spoken form) as a separate etymology, since, frankly, each spoken form has its own derivation, which users may want to know about (I do myself, which is partly why I'm happy to dig up the whys and wherefores and write it all down here). Have a look at 愛#Japanese for one example -- this single written form has three spoken forms (that I know about) that are regular words, each with its own derivation, and then tons of exceptional uses in names. 目#Japanese has four spoken forms, 盆#Japanese has three, and so forth. Hopefully food for thought, at any rate. :) -- Eiríkr Útlendi │ Tala við mig 15:36, 7 June 2013 (UTC)
- Moving pronunciation below definitions is doable, IMHO. However, when I made the proposal in Wiktionary:Beer_parlour/2012/April#Moving_.27pronunciation.27_down_in_ELE, there were no supporting posts. --Dan Polansky (talk) 17:08, 7 June 2013 (UTC)
- Separating readings by etymologies sometimes results in repeating definitions word-for-word such as 旋風 where we have exactly the same definition all of four times. Meanwhile maybe only one of them is common and the others are freaks that only live in dictionaries, and the only note to this effect is hidden in a Usage Note. Sometimes I use {{ja-altread}} to avoid this, but it's cheating. --Haplology (talk) 16:49, 8 June 2013 (UTC)
- That particular entry could certainly use some cleanup, expansion, and clarification (I'll add it to my list).
- In general, though, Japanese presents a bit of an odd case. For folks not familiar with the mechanical issues of Japanese, imagine that eat, consume, and ingest all had their individual pronunciations, etymological derivations, and meanings -- but all shared the same single spelling. Consequently, all would go under the same headword here at Wiktionary. So the question then becomes how to organize that entry.
- Duped defs in and of themselves I don't think are necessarily a problem; if the overlap is complete, a simple "see [some other reading elsewhere on the page]" could suffice. If the readings are really distinct, I think the etyms should be broken out, with proper etymological descriptions given. If the readings are only slightly different, and this produces no difference in meaning, I think {{ja-altread}} is probably the way to go. See 伊邪那岐 for one such example -- this can be read as Izanagi, or Izanaki, with no real change in meaning or derivation. The difference is explained by a simple sound shift, mentioned in the etym. Meanwhile, 紫苑 is a different example, where each of the three different readings has the same def for the proper noun, but other aspects are different -- the Shien reading is never used for the common noun, and shioni is never used in the derived term.
- The combination of the MW back end, the need to accommodate multiple languages per headword, and the oddities of Japanese orthography all lead to a bit of an inelegant matching, but there you have it. :) -- Eiríkr Útlendi │ Tala við mig 18:51, 8 June 2013 (UTC)
- This is a tangent, but one example of semi-synonymous English homographs that comes to my mind (though it isn't a very good example) is board+board. Amusingly, en.Wikt currently conflates the two separate but overlapping etymologies and groups of senses which that string of letters has, but if you can read German, you can take a look at how I handled them on de.Wikt. - -sche (discuss) 19:25, 8 June 2013 (UTC)
- That's interesting (though I struggle to follow German). There are indeed two separate origins, but I'm not convinced that such a clear case can be made for separation of modern meanings since the two words had already become conflated in Old English. Dbfirs 12:02, 10 June 2013 (UTC)
- There should at least be a note somewhere explaining how "a length of a piece of wood" came to mean "food". As the board entry currently stands, that development is completely unclear, probably leaving any reader quite puzzled. -- Eiríkr Útlendi │ Tala við mig 17:54, 10 June 2013 (UTC)
Make Template:alternative form of explicitly say that a US/UK spelling's definitions are found in the other entry
- Continued from User talk:Dbfirs#Double_links.
Recently, I noticed several edits like this, where a second link to sense of humour is put immediately after the templatised one, explicitly stating that the definitions of sense of humor are found in [[sense of humour]]. Is this desirable? Dbfirs thinks it is, because "users will expect to see definitions for a valid word in their region, so the repetition makes it clearer that the definitions are only a click away". I think it isn't, because users are no more likely to expect content on favour/favor than on their preferred spelling of kinnikinnik/kinnikinnick/kinnickinnick/etc, and no less likely to understand how a soft redirect works on one of those pages than on the other. What do you think? If it is desirable, can the extra text be added by {{alternative form of}} itself (or by whichever other template we use to redirect US/UK spellings; e.g. {{form of|Standard form}} or *{{standard form of}}), rather than being added by hand? - -sche (discuss) 22:12, 6 June 2013 (UTC)
- Redundant links with different link text are confusing for users.
- The link text is also wrong, because English spellings cannot be divided into binary US and British sets. Humour is a British and Canadian spelling, while curb is a Canadian and US spelling. Unless you disavow the classic linguistic meaning of British English, in which case humour is a UK, Irish, Canadian, Indian, South African, Australian, and New Zealand spelling.... —Michael Z. 2013-06-06 22:48 z
- Yes, you're right, it's more complicated than I anticipated. Perhaps the only way to deal with the variations is by adding individual usage notes? Dbfirs 05:37, 7 June 2013 (UTC)
- No, those entries are perfectly clear as they stand. It's when we provide a soft redirect from a correct spelling (for one region) to an incorrect spelling (for that region) that we need clarification. Dbfirs 21:49, 7 June 2013 (UTC)
- So how come we have a “soft redirect” from sense of humor to the simple definition at sense of humour? But full entries at humor and humour, where two separate, moderately complex pages for the spelling variations of a single actual English term will be forever impossible to keep in sync?
- Humour and humor haven't been merged yet only because no one has gotten to them yet. As I wrote in the Tea Room, I've been reducing such content duplication for about a year now, but it's hard to find entries: when I started, there were 13 pairs of supposedly synced entries (findable via Category:English synchronized entries and via HTML comments), and not a single one actually was or had even recently been synced. Unfortunately, there are many pairs like humo(u)r that are such terrible messes that they're not even categorised.
- Postscript: I just consolidated humo(u)r and colo(u)rful, leaving only colour/color (which are only synced because I synced them) as a monument to failure, a reminder of naïveté, a proof for anyone who, in the future, ever finds it hard to believe that people would have attempted something as foolish as the total duplication and perfect synchronisation of content across dozens of pages. (Let all who doubt view the entries' edit histories and see that Wiktionary did once try that, and that the entries did indeed spend much of their time out-of-sync.) - -sche (discuss) 02:56, 8 June 2013 (UTC)
- I still prefer separate entries, but I have to admit that the problem of synchronisation is almost insurmountable until we find a way to include a single set of definitions in both entries. Meanwhile, sche's amendments have produced unambiguous entries. Is everyone happy that we standardise on this format, with the optional addition of individual usage notes where the spelling situation is complicated? Dbfirs 12:14, 8 June 2013 (UTC)
- I hope that your thoughts allow for the occasional possibility that there are regional differences in meaning and usage that happen to correspond to the regional prevalence of the spelling. I hope we don't end up discouraging contributions because of seeking 'coordination'. DCDuring TALK 15:18, 8 June 2013 (UTC)
- Indeed, there are subtleties that are best explained by separate entries, but the consensus of editors seems to be that the problem of trying to keep these synchronised is insurmountable. (I agree that synchronisation is a problem.) I like your suggestion below. Do you think the experts here would be happy with the links? Should I try a few to see what people think? Dbfirs 08:24, 10 June 2013 (UTC)
- There is Wikipedia:American and British English spelling differences. We could provide, under Usage notes, a section-specific link to the section covering specific classes of "UK-US"-type (pace MZ) spelling differences. We could even have templates like {{en-spelling -or-our}} that provided the section link and could be updated when we get our own vastly superior coverage of this and similar matters next month. DCDuring TALK 14:30, 7 June 2013 (UTC)
I can’t imagine how “subtleties [...] are best explained by separate entries.” Even a series of blatant contrasts in usage are difficult to assimilate while flipping between two, three, or four variant entries, or having them open in separate browser windows. And how is the reader under this heavy cognitive load supposed to discern “subtleties” from synchronization errors?
Labor, labor, Labour, and labour are not four different words. No other dictionary, print or electronic, considers moving them to separate pages, because that would be a disservice to their readers and reduce the value and utility of their resource. —Michael Z. 2013-06-10 14:58 z
- Explaining meaning is the most important task here. Usually differences are of interest to linguists. A meaning that is present in one spelling, but not in the other, belongs to the entry in that spelling in which it exists. Usage examples for a sense are often regional as well, though that would argue for revising them rather than duplicating content. DCDuring TALK 16:37, 10 June 2013 (UTC)
- When a non-linguist hears an unfamiliar usage of /ˈleɪbər/ on the radio, she has to learn of the existence, locate, and click through four separate articles, then compare all their senses in her head before she can be confident that she has determined this one word’s meaning. That’s a failure of the dictionary. —Michael Z. 2013-06-10 17:03 z
- I don't disagree with the general advantage of combining senses across alternative spelling entries. I'm just suggesting that we need to not prevent or even discourage user input on whatever page they prefer. Their choice of one page rather than another might be useful data about the actual distribution of the sense or they might be knowledgeable about the term's usage. I fear the consequences of the accretion of layers of rigidity and complexity leading to a gradual and premature ossification of Wiktionary, especially in English.
- It's an empirical matter as to which layout might give the best results for various classes of users with various look-up needs. I am extremely skeptical that we will manage a major advance with a part-time crew of amateurs and no empirical data on how our current and potential non-linguist users use and could use Wiktionary. Wikipedia hews close to the style of an encyclopedia in most regards. Just as the QWERTY and telephone-style keyboards largely define many, many user-input interfaces, the limited variation in styles of dictionaries provides the total range of base interfaces we can use. Feasible paths to desirable innovations seem to me to be limited to incremental changes.
- For incremental changes, we don't even know whether our existing users would prefer that the main entries typically (subject to variation by type of alternative spellings and even by individual term) be the "US" or "UK" (or "North American") variants. Suggested proxies for the missing empirical data are the relative sizes of English-speaking populations in countries assumed to prefer one or the other, the preferences of our contributors generally, the preferences of our contributors who make the first entry in one of the main alternative forms, usage in the large controlled corpora like BNC and COCA, and usage in the various Googles, especially News, which enables regional differentiation. DCDuring TALK 17:32, 10 June 2013 (UTC)
- (tl;dr summary: this is a non-issue that can be sidestepped anyway in most cases)
- I doubt many senses are attested only in certain US or UK spellings. Some UK/US authors use US/UK spellings, and UK/US works are often copyedited when published in the US/UK, so even British senses of -our or -ise words are usually attested in American spelling, too.
- Even if a sense is attested only in a word's US spelling, I doubt the same word has a second sense attested only in its UK spelling. Thus, in most cases, we can sidestep the issue by making whichever spelling has the peculiar sense the lemma: then the sense is on the "right" page, and all the senses are on one page.
- If a word does exist which has some senses only when spelt -ise and other senses only when spelt -ize (can anyone think of such a word?), then I still think the content should be centralised: I agree with Michael that scattering not-entirely-overlapping sets of definitions over different pages obfuscates, rather than explains, a term's meaning. (It requires casual readers to, first of all, realise that the content is spread out across multiple pages, and then to compare the pages to see which spellings have which senses.)
- Capitalisation is an at least slightly different issue. What I've most often seen done, and what I do, when a word has different (and overlapping) senses when capitalised bzw. uncapitalised, is what is done on gypsy/Gypsy (each term has a definition line linking to the other). I try to employ similar linking from singular entries to plural-only senses (see message, messages). I suppose such linking could also be employed between -ise and -ize words, etc, but that seems unideal: less ideal than simply putting all the definitions on one page.
- DCDuring's comment of 17:32, 10 June 2013 makes me think we may be talking past each other / about two different things, however. I tend to accept that new users should be allowed to add content to whichever page they like; if the content they add to an alt form is already found in the lemma entry, their edit can be rolled back; if the content isn't found in the lemma entry, it can be moved thither. (In this as in many other things, Wiktionary might benefit from adopting "Stabilversionen", so that we could clean up noobs' misformatted entries at leisure, and be less rigid about formatting in the meantime.) - -sche (discuss) 17:43, 10 June 2013 (UTC)
- I see the basic issue as what is a term and its lemma (WT:Lemmas is sorely lacking). We have separate entries for spellings and capitalizations (while keeping together completely unrelated other-language terms that share an orthography). But a dictionary entry is properly for a term in a particular language, and offers a survey of its variable properties like spelling and especially capitalization.
- From the readers’ point of view, they may need to find a lemma entry based on any alternative, regional or historical spelling, and especially capitalization, which can be absent in their source – as in spoken form or most subtitles and captions – and can vary freely – as at the beginning of a sentence, in title caps, or in all-caps texts.
- The one lemma entry should guide the reader as to the normal or preferred spelling and capitalization, or its range of usage. So in addition to usage and grammatical labels, either entries/headwords or particular senses may require an indication of the usual or preferred spelling and capitalization, plus the variations of these in different regions, over time, or in different contexts. —Michael Z. 2013-06-10 19:32 z
So, can these notes – as found in entries favour and tumour – be removed? (Or at least replaced with a template so they can be easily found and altered?)
In “UK and Canada spelling of favor”, it is clear what the link takes the reader to.
In “UK and Canada spelling of favor. (For definitions, see the American spelling.)” it is unclear what “the American spelling” is referring to. There is no indication to the reader why there are two choices, and no evidence of what two things they lead to. —Michael Z. 2013-06-12 14:35 z
- Also confusing: “the American spelling,” when the target headword is labelled US, alternative in Canada. Labels on links should use the same terminology as labels in entries. Links should not be redundant, and their destination should be clear. A better alternative might be one of:
- But we have enough difficulty labelling headwords. It’s a mistake to also start labelling links in headword lines too.
- If you want to explicitly mark the relationship, then add an “Alternative forms” header. —Michael Z. 2013-06-12 15:17 z
- Yes, I'd already conceded to the majority view, so I've removed my addition. I've also replaced "Canada only" with "Commonwealth", even though I don't really like the label. What alternative would include NZ, Oz, SA etc? Dbfirs 16:13, 14 June 2013 (UTC)
- In that case, I fully agree with you. Can we change the template to say "British English" rather than "UK" when the parameter "from=British" is used? Dbfirs 08:02, 16 June 2013 (UTC)
Adding explicit "context" before context templates on definition lines
MewBot (talk • contribs) has started to replace {{vulgar}} with {{context|vulgar}} and the like on definition lines. An example edit: diff. Apparently, no one has protested so far. Let this be a record in Beer parlour that this is ongoing. I have doubts about advisability of these replacements, but do not see them as obviously wrong. I am disappointed that this was not expressly discussed.
Before the example edit:
# {{pejorative}} A man who uses services rendered by [[whores]].
After the example edit:
# {{context|pejorative}} A man who uses services rendered by [[whores]].
--Dan Polansky (talk) 18:23, 7 June 2013 (UTC)
- Can you be specific about these doubts, or are you just concern trolling? DTLHS (talk) 18:26, 7 June 2013 (UTC)
- I do not know what is "concern trolling". The new text is longer and looks more messy in the wiki text, for one doubt. WT:AGF. --Dan Polansky (talk) 18:28, 7 June 2013 (UTC)
- Okay, as per concern troll: "Someone who posts to an internet forum or newsgroup, claiming to share its goals while deliberately working against those goals, typically, by claiming "concern" about group plans to engage in productive activity, urging members instead to attempt some activity that would damage the group's credibility, or alternatively to give up on group projects entirely." --Dan Polansky (talk) 18:29, 7 June 2013 (UTC)
- Fair enough. For what it's worth, I also wonder about how productive it is to add everything under {{context}} if that template is eventually going to be orphaned or changed to something more broad like "label". DTLHS (talk) 18:37, 7 June 2013 (UTC)
- Subjectively speaking, I just don't like the new wikitext. I suppose there must be some advantage, or else the bot would not be doing it. But I am not very clear about what the advantage is. --Dan Polansky (talk) 18:42, 7 June 2013 (UTC)
- I am mainly doing it as a first step so that the initial problems have been worked out, and we can work out what (concretely) to do next. It may be somewhat redundant but it's not harmful either, and it eases the conversion later. One of the main problem points for having a template for every label (which I listed above) is that there is an automatic conflict between any label and a template with the same name. This created problems like {{acronym}}, which was originally not a context template and which was causing script errors because some pages used {{context|acronym}} as a label (correctly). Another example is our inability to use "law" as a label because of {{law}}, having to resort to "legal" instead. Removing this point of conflict, even if we don't do anything else, is therefore a definite benefit.
- The advantage of adding {{context}} now is that once all calls are through {{context}}, we can start editing the label templates themselves without worrying about breaking this backwards compatibility, because we can assume that they are always called by {{context}}. This makes it possible to do the transition in small steps gradually, without any fear of breaking anything with one huge step. Also, the next step I was intending to do was to add lang= to the template where it's missing, as the proposal above (the name of the final template hasn't been settled yet) would have a mandatory language code as the first parameter. It will also catch many errors. This replacement is a lot easier to do if there is only one template to consider than if there are hundreds. —CodeCat 18:52, 7 June 2013 (UTC)
- I see. The ultimate verbosity fest that you are planning is this:
# {{context|pejorative|lang=en}} A man who uses services rendered by [[whores]].
- Yuck. --Dan Polansky (talk) 19:04, 7 June 2013 (UTC)
- No, not the ultimate. It's just an intermediate step in the process. Once all labels are called via {{context}} and they all have a language, converting all of them to another call (using another template) becomes trivial. I am just doing groundwork right now. Also see my post below. —CodeCat 19:07, 7 June 2013 (UTC)
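For illustration only, the staged conversion described in this thread could be sketched roughly as below. This is not MewBot's actual code: the function names, the regular expressions and the `KNOWN_LABELS` set are assumptions made up for the example, and `{{lb}}` is only one of the names floated in this discussion for the final template.

```python
import re

# Hypothetical set of label templates that are currently called directly.
KNOWN_LABELS = {"pejorative", "vulgar", "rare", "archaic"}

# A definition line that starts with one directly called template, e.g. "# {{vulgar}} ...".
DIRECT_LABEL = re.compile(r"^(#+\s*)\{\{([^{}|]+)\}\}")

# Any {{context|...}} call without nested templates.
CONTEXT_CALL = re.compile(r"\{\{context\|([^{}]*)\}\}")


def wrap_in_context(line):
    """Step 1: {{pejorative}} -> {{context|pejorative}} on definition lines."""
    def repl(m):
        name = m.group(2).strip()
        if name not in KNOWN_LABELS:
            return m.group(0)  # not a label template, leave it alone
        return m.group(1) + "{{context|" + name + "}}"
    return DIRECT_LABEL.sub(repl, line)


def add_lang(line, lang):
    """Step 2: add lang= where it is missing."""
    def repl(m):
        params = m.group(1).split("|")
        if any(p.startswith("lang=") for p in params):
            return m.group(0)
        return "{{context|" + "|".join(params + ["lang=" + lang]) + "}}"
    return CONTEXT_CALL.sub(repl, line)


def to_lb(line):
    """Step 3: {{context|a|b|lang=xx}} -> {{lb|xx|a|b}} (hypothetical final form)."""
    def repl(m):
        params = m.group(1).split("|")
        langs = [p[len("lang="):] for p in params if p.startswith("lang=")]
        if not langs:
            return m.group(0)  # wait until a language code is known
        labels = [p for p in params if not p.startswith("lang=")]
        return "{{lb|" + "|".join([langs[0]] + labels) + "}}"
    return CONTEXT_CALL.sub(repl, line)
```

Each step only has to recognise one shape of call, which is the point made above about doing the transition gradually: once step 1 has run everywhere, steps 2 and 3 never need to know the list of individual label templates.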
- CodeCat’s proposal included: “I think that the approach used by {{label}}, in which labels always need {{context}} or {{label}} prefixed, is preferred for a Lua implementation.”
- If people read and supported the proposal, and they did, they also supported mandatory {{context}}. — Ungoliant (Falai) 18:46, 7 June 2013 (UTC)
- You seem to be referring to #Lua-cising Template:context. There, two people supported:
- "I support everything. ..." --Ungoliant
- "Sounds good, but let’s abandon the misleading name “context.” ... —Michael Z. 20
- On the day on which the post was made, the bot started replacing. But whatever. The point right now is, if someone can explain the benefits, they should do so. --Dan Polansky (talk) 18:52, 7 June 2013 (UTC)
- I don't really see the point of the effort under challenge except to remove an implementation barrier to deprecating all direct use of context tags. Presumably that is being done to bring Luacization within the capabilities of the technical resources that we have. If Luacization requires millions of extra keystrokes, thousands of them mine, then it doesn't seem like such a good idea to me. A philosophy of adding keystrokes where not necessarily essential seems exactly the opposite of what we need. I thought that Luacisation would not be used as a reason for such an obvious regression. I thought a basic principle of user interface design is: NO REGRESSIONS. DCDuring TALK 18:55, 7 June 2013 (UTC)
- I reject any notion that simply adding more text is automatically a regression. Besides, "context" can be renamed to something shorter eventually. DTLHS (talk) 19:01, 7 June 2013 (UTC)
- Indeed, and that was already discussed as well. Proposals were {{label}}, {{x}}, {{ctx}}, {{ct}} or {{c}}, although DCDuring you yourself said that we should abandon the name "context". Once the language code templates are out of the way, we will have a lot of freedom in naming because we can use almost any two-letter name. How would {{lb|nl|rare|archaic}} be in contrast with the current {{context|rare|archaic|lang=nl}}? It's quite a bit shorter. —CodeCat 19:05, 7 June 2013 (UTC)
- You are misreading my lack of enthusiasm for the name "context" as some kind of implicit support for adding to the typing burden. One of the great virtues of the {{context}} system is that it allows folks to add whatever they thought was appropriate for the entry. The implementer of the system would periodically review the labels added and make a new direct context label where appropriate, sometimes making it a redirect. On the NO REGRESSION principle, I assumed that this behavior would be continued as a matter of course. That requires that there be a template to support labels not yet implemented as direct labels. That is the sole essential use at present for {{context}}. It seems that the current conception is to discard direct labeling for programming convenience. I would be inclined to add Category:Entries with redundant context templates and eliminate all those instances where context preceded a label which has its own template. DCDuring TALK 19:35, 7 June 2013 (UTC)
- That part would always work. Any new proposal would certainly allow you to use "new" labels, as well as to make labels aliases of one another (something we currently use redirects for). The difference would be that the list of "recognised" labels (which get a category or a link or something else) is stored in a Lua module rather than in templates. So you could still type {{lb|nl|I'm a label}} and it would work. And you could still then, later on, define a new label called "I'm a label" and the entry would then use that. —CodeCat 19:41, 7 June 2013 (UTC)
I support this. CodeCat is eliminating all of the conflicts and open-ended problems with {{context}}. Huzzah! 🐱
DCD, the concept of invoking a template without typing its name is completely flawed. An open-ended set of keywords is not templates, it is parameters. Abandoning the idea is not a regression. Being able to move forward is progress. —Michael Z. 2013-06-07 19:48 z
- Another way to see it is in terms of "namespaces". Not namespaces as they are in the wiki but a bit more abstract, they are the set of all possible names. Currently, the namespace that contains all possible context labels is the set that is the union of: 1. all templates that are context labels, plus 2. all other names in the Template: space that do not exist. This has created many problems because as Michael said it's an open-ended set of names for a namespace that is not open-ended, because it already contains other templates. Every template you create restricts the space of possible context labels further and there have already been conflicts. {{plural}} is not a context label template, but most of its current transclusions are via {{context|plural}}, and they just happen to work, kind of, by chance (but {{context|plural|rare}} will break!). And then there is the problem I mentioned with {{law}} and {{acronym}}. Ruakh's original {{label}} proposal as well as my own both avoid this by making the namespace for context labels distinct from that of template names. This does mean that you have to call the template explicitly each time, but I think that is a small price to pay compared to the issues we otherwise have (and have had). —CodeCat 19:54, 7 June 2013 (UTC)
- I understand the point about restrictions on available context names, though it seems rather rare in practice. I'm not in a position to evaluate what is or is not possible technically. What I see is simple regression from an entry-content-contributor PoV. For templates that are used commonly, like context or its successor, a one-letter name or alias, e.g. {{c}}, would seem best. At more than 300K uses before the conversion of all the directly called template labels, it would seem to have earned the right to usurp that name from the "common gender" template.
- The principle of liberating text strings from their least efficient uses has even broader application. Why not reserve all one- and two-letter strings for things that are directly input by humans. The principle has already been established by such templates as {{m}}, {{l}}, {{g}}, {{a}}, etc. Perhaps it is time to revisit the use of two- rather than three-letter language codes to liberate more one- and two-letter codes for those doing manual data entry. {{n-g}} shows another approach for two-part names. DCDuring TALK 20:26, 7 June 2013 (UTC)
- The gender and number templates may disappear in the future as well, as we now have a module for them. There are still many cases where they're present directly in entries, but we can find solutions for that. So that will free up {{c}}, {{f}}, {{n}}, {{p}} and so on. Currently, {{head}} already uses the module, but none of the others do, and it will take a while to migrate all of our headword-line templates over, so we can't use {{c}} just yet. I do prefer {{lb}} over {{c}} though, and we can't usurp {{l}} because that template is probably used even more widely than {{context}}. —CodeCat 20:48, 7 June 2013 (UTC)
- It's slightly longer, but I'd like to make a plug for using {{lbl}} instead of {{lb}} -- {{lbl}} is an obvious mnemonic for "label" for English speakers, while {{lb}} is obtuse enough that I had to reread a few paras above to remind myself of what it was supposed to stand for. -- Eiríkr Útlendi │ Tala við mig 21:15, 7 June 2013 (UTC)
- It should be possible to put Mewbot to work on renaming the 28K instances of {{c}} and get that done quickly. {{temp}} has fewer than 19K instances. All of these are far fewer than such templates as {{head}} and the reigning champion of both raw count and redundancy(?): {{Latn}} @ 2,958K. {{Latn}} is an example of something that could easily be much longer as it is not very commonly typed in by humans. DCDuring TALK 21:18, 7 June 2013 (UTC)
- It's not that easy. Only a small proportion of all uses of {{c}} is actually from direct transclusions in entries. Most of them are called through other templates like {{t}}. So we would need to track down all the templates that use genders in such a way and change them, but that is quite a big task because it has to be done by hand, and there are many languages whose nouns might use gender templates. It would take a few weeks to track them all down and fix them, assuming of course that there's a consensus for such an operation. We can't rename {{Latn}} because its name follows an ISO standard for script codes. —CodeCat 21:55, 7 June 2013 (UTC)
- Just to make it explicit: I, too, support what Mewbot is doing. It's good for the reasons Ruakh outlined earlier and the reasons CodeCat outlined recently. It's an important first step in updating our context templates, which currently have to use highly complex, recursive code to account for the possibility that they might be called directly, or called from {{context}}, or called from another template, or followed by |another template}}, or followed by |something that isn't a template}}, or... - -sche (discuss) 02:16, 8 June 2013 (UTC)
- I don't see why it is that the replacement for context can't simply compare a given label with a list of pre-existing valid labels and operate as if context had been called explicitly. Lua should make that much easier. Furthermore and more importantly, I think more attention needs to be paid to the process by which new labels are added. Our systems are increasingly effectively closed to many types of user input due to creeping top-down templatism - exactly contrary to wikiness. DCDuring TALK 17:41, 10 June 2013 (UTC)
- I'm not sure what you mean by your first point. A replacement for {{context}} can do just that, and it will have to because that's how it already works currently. If you call {{context|label}} then currently it will look for Template:label and transclude it if it exists, otherwise it will just show the text "label". The Lua form will do the same, except that instead of looking for Template:label it will look in a Lua table of labels. —CodeCat 17:49, 10 June 2013 (UTC)
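As a rough sketch of the lookup behaviour described above (a table of recognised labels, aliases instead of redirects, and a fallback that simply shows the text for anything unrecognised), something like the following would do. It is written in Python rather than Lua purely for illustration; the table contents, the category strings and the function names are all invented for the example and do not reflect any actual module.

```python
# Hypothetical label data; in the proposal this would live in a Lua module.
LABELS = {
    "rare": {"display": "rare", "category": "rare terms"},
    "law": {"display": "law", "category": "law"},
}

# Aliases play the role that template redirects play today.
ALIASES = {"legal": "law"}


def resolve_label(name):
    """Return the data for a recognised label, or a plain-text fallback."""
    canonical = ALIASES.get(name, name)
    data = LABELS.get(canonical)
    if data is None:
        # Unrecognised label: show the text as typed, add no category.
        return {"display": name, "category": None}
    return data


def render_labels(lang, *names):
    """Mimic {{lb|lang|label1|label2}}: return display text and categories."""
    resolved = [resolve_label(n) for n in names]
    text = "(" + ", ".join(r["display"] for r in resolved) + ")"
    categories = [lang + ":" + r["category"] for r in resolved if r["category"]]
    return text, categories
```

Because the labels are keys in a table rather than page names in the Template: namespace, a label such as "law" cannot collide with an unrelated template of the same name, and defining a previously unrecognised label later only means adding a table entry.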
North Frisian (Mooring Dialect)
Is this an acceptable level 2 language header? See greewe as an example. If not, didn't we used to have a bot that sorted these things out? SemperBlotto (talk) 08:37, 8 June 2013 (UTC)
- Wiktionary:Todo/invalid L2s picks them up. PS there is an unresolved one. Mglovesfun (talk) 08:39, 8 June 2013 (UTC)
- Editing this entry, and leaving a comment on the creator's talk page, has raised a question in my mind. We are currently moving away from directly called context tags towards using parameters of {{context}} for everything. Does this mean that new directly called context tags are not to be created? If the creator of greewe wants to start a category for the Mooring dialect of North Frisian, can he start a new template {{Mooring}} to allow dialect terms to populate a Category:Mooring North Frisian? Or are such templates now deprecated, and everything is done by parameters of {{context}}? If the latter, what edits would have to be made to {{context}} to get it to know that anything labeled {{context|Mooring|lang=frr}} goes into the appropriate category? —Angr 11:27, 8 June 2013 (UTC)
- Currently, {{context}} still relies on the presence of directly-named templates because that's how it works. The label templates always had this dual way of using them, and can still be used that way, except that it's now discouraged to use them directly. So you still need to create {{Mooring}} for now, but you should use it as {{context|Mooring|lang=frr}}. It's likely that sometime soon the templates will be changed so that only the latter works. —CodeCat 12:15, 8 June 2013 (UTC)
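For what Angr is asking about, one way a table-driven {{context}} could tie a label and a language code to a category is sketched below. This is a guess at a possible design in Python, not how the template works today (per CodeCat, it still needs Template:Mooring); the mapping `LABEL_CATEGORIES` and the function `context` are hypothetical.

```python
# Hypothetical mapping from (label, language code) to a category name.
# Only the Mooring/frr pair comes from the discussion above.
LABEL_CATEGORIES = {
    ("Mooring", "frr"): "Mooring North Frisian",
}


def context(label, lang):
    """Return the displayed label text and the categories it would add."""
    text = "(" + label + ")"
    category = LABEL_CATEGORIES.get((label, lang))
    # No entry for this label/language pair: show the label, add nothing.
    categories = ["Category:" + category] if category else []
    return text, categories
```

Under these assumptions, `context("Mooring", "frr")` would display `(Mooring)` and add Category:Mooring North Frisian, while the same label under another language code would add no category.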
Phonosemantic interpretation[edit]
Lawrence J. Howell (talk • contribs) is adding information under the nonstandard 'Phonosemantic interpretation' header. I don't know what to do; is this information relevant, if so, how do we include it? Mglovesfun (talk) 11:38, 9 June 2013 (UTC)
- I would categorize it under "wild theories with nothing to back them up". -- Liliana • 11:47, 9 June 2013 (UTC)
- He's also linking copiously to his own site (with his name on it), which sells books. I smell spammer. Equinox ◑ 12:13, 9 June 2013 (UTC)
- IMHO, these should not be added as long as they are sourced from no more than a single source. --Dan Polansky (talk) 12:20, 9 June 2013 (UTC)
- He asked about it here, but none of you commented. I wasn't sure what to say, myself, so I kept waiting for someone to weigh in. Regardless of the merits, he did ask. Also, I wouldn't call it spamming, since we have been using information from his site. Chuck Entz (talk) 14:43, 9 June 2013 (UTC)
- Yes, he has asked; I am not crying foul. I merely doubt that speculative information sourced from a single source is suitable for Wiktionary. --Dan Polansky (talk) 18:22, 9 June 2013 (UTC)
- Hello all. Thank you for the comments above, which are of the sort I expected might be generated in response to my original inquiry. Allow me to address them here.
The major sticking point appears to be the one expressed by Dan Polansky, who believes that single sourcing for this data is inadequate. As it happens, this research overlaps with that of an unimpeachable source for Old Chinese studies, Axel Schuessler, who discusses the relation between particular sounds and meanings in his ABC Etymological Dictionary of Old Chinese. I've uploaded a brief article detailing these overlaps, which anyone interested can find by searching my name and the name of the book. If dual citations are necessary, there are, among the characters in common use today in China and Japan that I cover, approximately 2,500 corresponding to the half-dozen sound/meaning relations specifically described by Schuessler. This corpus would seem to be an unobjectionable starting point for the Wiktionary entries.
Turning to other concerns mentioned, it appears something needs to be said about the links. The editor who originally used my material linked with a (sitename).com format, providing unsolicited free advertising for my site. I was not comfortable with retaining that format for entries I was redoing, and chose to use only the family names of the individuals standing behind the claims, hiding the site name in the link coding. My intent was to be less, not more obtrusive, but the decision seems to have brought on the law of unintended consequences.
Finally, DCDuring mentioned the possibility of inserting data in a Usage Notes section, a suggestion that - -sche too had made earlier. I look forward to hearing from experienced editors about how best to go about doing this, especially in respect to coordinating the cites from my data and from Schuessler's. This would have the added benefit, I suppose, of allaying concerns about link spam. Thank you for your consideration. Lawrence J. Howell (talk) 07:16, 11 June 2013 (UTC)
- This is utter phonesthemic nonsense, largely based on outdated reconstructions. I suggest simply reverting these additions. (reverting the reverting) Wyang (talk) 09:33, 11 June 2013 (UTC)
- @Wyang, you say this is nonsense, but I don't really know how to weigh your comment -- based on what? Lawrence has pointed us to a scholarly work, providing a way for us to verify that someone out there has this theory, and ostensibly to follow up on that author's sources as well. (Google Books, for the interested.) On the opposing side, we have you and Angr, and while I respect both of you as WT editors, I have no real idea what your academic background might be, nor any real idea what underlies your calling this "nonsense". I don't have any solid basis for evaluating your comment.
- Could you unpack things a bit? Maybe even point us to sources disagreeing with this Schuessler person? Curious, -- Eiríkr Útlendi │ Tala við mig 17:38, 11 June 2013 (UTC)
- I don't have access to sources, but the statement "Old Chinese Initial /*k-/ lends semantic value Frame. Final consonant /*-t/ lends semantic value Cut/Divide/Reduce." at [36] was the first thing that set off my bullshit detector. It is axiomatic in phonological theory (the field my Ph.D. is in, since you asked about academic background) that phonemes have no inherent meaning (e.g. [37]), and although sound symbolism has some reality, it generally holds across languages rather than being language-specific, and it tends to correlate sounds with vague physical properties like "big/small", "pointy/rounded" etc. rather than very specific concepts like "frame" and "cut/divide/reduce". —Angr 19:39, 11 June 2013 (UTC)
- Schuessler (2007) is a pioneering work in Chinese etymology and is in general reliable, but the "phonosemantic interpretations" posted here are far from what was written in that publication. Schuessler tentatively proposed some Old Chinese prefixes and suffixes (not initials, vowels and final consonants), some of which are reconstructable at the Proto-Sino-Tibetan level, and identified some phonesthemic patterns, such as *m– for "darkness", *–m/p for "closure", which are patterns generally found translingually, especially in the E/SE Asia region. However, the theory advanced by User:Lawrence J. Howell is that every Old Chinese monosyllabic word can be reanalysed in terms of initial, vowel and final consonant, in that all or part of its phonological shape resulted from its meaning, which is of course untrue for any language without an established pattern of such word formation. For example, for the word for "two" (二, *nij-s in Baxter-Sagart (2011) or *njis in Zhengzhang (2003)), based on the outdated reconstruction *ȵi̯ær by Karlgren (1957), he proposed that it can somehow be reanalysed as the diliteral root *n–r, which encoded meanings in its individual consonants (suppleness + continuum). This is obviously false since the word came from Proto-Sino-Tibetan *g/s-ni-s ("two"), and it's not even a marginally accepted theory in Chinese or Proto-Sino-Tibetan linguistics that the word for "two" was constructed from "phonemes expressing suppleness and continuum". Linguistics is not my profession; my area is in phylogenetics or hominid evolution rate. Wyang (talk) 23:48, 11 June 2013 (UTC)
- Thank you both, that's exactly the kind of detail I felt I was missing previously. @Angr, your mention of specificity articulates an unvoiced worry in the back of my head about how over-specific a couple of these phonosemantic interpretations have been. @Wyang, your comments are particularly damning in pointing to where Lawrence is diverging from the actual sourced material.
- @Lawrence, what can you say in your defense? As it stands, I am now strongly in support of removing your additions as apparent bogosity. -- Eiríkr Útlendi │ Tala við mig 00:09, 12 June 2013 (UTC)
- @Eirikr: Thank you for steering the discussion in a productive direction. On the other hand: In my defense? Bogosity? Wow.
@Wyang: Your presentation of the contents of the ABC Etymological Dictionary of Old Chinese merits close inspection.
Schuessler tentatively proposed some Old Chinese prefixes and suffixes (not initials, vowels and final consonants)... (Tentatively? But let me not get sidetracked.) Please refer to p. 27 of the dictionary, Section 2.9 Meaning and sound. You'll find that your assertion does not square with the terms the author employs in his discussion of the nexus of particular sounds and meanings (quotation marks omitted): OC words; final -*p; final *-m; stem initial *-m; roots; stems; initial *w-; variants with other vowels; initial *l-; initial consonant; start with *n-.
... every Old Chinese monosyllabic word can be reanalysed in terms of initial, vowel and final consonant, in that all or part of its phonological shape resulted from its meaning. Regrettably, you misstate my theory, which properly accounts for the presence in OC of loan words and terms originating in onomatopoeia.
You take issue with my interpretation of initial *n- term 二. Perhaps my interpretation of a particular character is dissonant. But try coming at things from the opposite direction. In other words, begin with the normative and sort out the exceptional. You do not care for my data, so let's follow Schuessler who, on the page noted above, writes Words for 'soft, subtle, flexible', including 'flesh; female breast' start with *n-...
Turn to p. 395 of the dictionary. Over the following dozen pages, scattered among the loan words you will find enough examples of terms beginning with *n- and associated with soft/subtle/flexible that Schuessler felt justified in making the statement quoted immediately above. If you have a bone to pick with that conclusion, the author would be the person with whom to remonstrate.
As for me, among the thousands of characters I interpret, I may unwittingly be offering scores of problematic examples. I will readily acknowledge those when presented with compelling arguments. For one, I will revisit 二, and thank you for that.
The ultimate point here, however, is that Schuessler identifies what he calls phonesthemic or phonoaesthethic phenomena in OC, where certain meanings are associated with certain sounds. Nothing you have written counters that fact or undermines his finding.
Thank you, Wyang, for assisting the Wiktionary community to perceive where the lines are drawn in this matter. Lawrence J. Howell (talk) 05:51, 12 June 2013 (UTC)
- I realised I missed the bit on phonesthemic patterns but you beat me by about ten minutes in submitting the reply. I've slightly revised my post. It's true that such patterns are found in Old Chinese, but these patterns are of limited derivational consequences in OC as the majority of OC lexicon does not conform to the semantic expectations from generalised phonesthemic patterns (as proposed in KN). In addition, these patterns are generally true for other E/SE languages as well (or even wider, see Phonosymbolism and the Verb cop). Postulating that words are consistently derived using this principle is a bit like positing a proto-phoneme initial *f– for English, denoting "movement", which is responsible for deriving Modern English words like fare, fast, fight, flee, flow, fly. Even if a large proportion of Old Chinese appears to be of non-Sino-Tibetan origin, a large part of which does not even seem to be related to anything else found in languages in the vicinity (!), there is no need to resort to such extensions of the sound symbolism principles, especially when the model is 1) unrecognised elsewhere; 2) nonspecific and ambiguous in the semantic descriptions; 3) relying on outdated reconstructions; 4) gives numerous misfits apart from the true sound symbolisms (Sorry). For OC words with established ST comparanda, listing the PST etymology is sufficient. Wyang (talk) 06:40, 12 June 2013 (UTC)
- Thank you for your thoughtful, pointed response, Ywang. It'll be a day or two before I can reply properly. Lawrence J. Howell (talk) 07:49, 12 June 2013 (UTC)
- Oops, talk about replying properly. Sorry, Wyang! Lawrence J. Howell (talk) 08:06, 12 June 2013 (UTC)
- I have no knowledge of Chinese and can't make any judgement on this issue, but I'd like to point out that only a few days ago there was a rather heated discussion about bashing newbies - and utter phonesthemic nonsense, wild theories with nothing to back them up and spammer seem rather harsh judgements to make without actually asking the editor what he has to say. Isn't everyone supposed to AGF? Lawrence has been polite and has cited his source(s); so it would be good if we could be polite back. Hyarmendacil (talk) 05:52, 13 June 2013 (UTC)
- @Hyarmendacil: Thank you for the call to civility. You missed my favorite, though: The implication that this is a tribunal, and that my options are to clear my name or face sanctions. I know it was written tongue-in-cheek, but still ... And bogosity? Perhaps after this thread winds down someone will be kind enough to assign it a bogosity level rating, so we all know where we stand.
As for the editors you quote: Wyang has managed not only to regain his equilibrium but to perform an immensely constructive service for the Wiktionary community: Confirming that a reputable authority maintains the existence in Old Chinese of phonoesthemic (phonosemantic) patterns. (That information, I might add, has been in circulation since 2007, contained in a readily available book that has received both scholarly and popular acclaim. For that reason, the knee-jerk rejections caught me by surprise.) His example holds out hope that the other editors too may eventually come around and offer positive contributions here.
@Wyang: OK, so we're agreed on the existence of phonesthemic patterns in OC. Great start.
Can you tell me the basis for your statement ... the majority of OC lexicon does not conform to the semantic expectations from generalised phonesthemic patterns ...? AFAICT, the majority of OC lexicon conforms surprisingly well.
... these patterns are generally true for other E/SE languages as well ... I understand you to be offering this as a rationale for not applying the patterns to Wiktionary's entries for Han Chinese characters. I take it to be, on the contrary, an excellent reason to apply the patterns across the Wiktionary board.
Skipping to your last point before circling back, I would second your idea about adding a PST etymology (source?) for each character. But that doesn't have to come at the expense of listing the true sound symbolisms.
Now to your four points, in order.
1) Your indication that the model is unrecognized elsewhere. (Shrugs shoulders.) Speaking here to the Wiktionary community: Bearing in mind that this discussion is, in the end, about whether or not to add certain data to Wiktionary, and presuming consensus can be obtained for adding consensus true sound symbolisms (minimal definition: Contained in both Schuessler and KN): Is lack of precedent a make-or-break issue?
2) ... nonspecific and ambiguous in the semantic descriptions ... Can you elaborate? Earlier in this thread certain aspects of my data came under suspicion for being overly precise.
3) Outdated reconstructions. I don't dispute your use of the descriptor outdated reconstructions; scholarly reconstructions of OC have progressed considerably since Karlgren. Nonetheless, the considerable handicap of ORs did not prevent my research collaborator and me from identifying the phonosemantic/phonesthemic patterns in OC noted at KN, which overlap with those of Schuessler. Also, it is entirely possible that the ORs should be credited for enabling us to see just a bit more deeply than Schuessler in certain cases. For example on p. 21 of the dictionary he presents a chart with a small number of examples of labial initials connected with the meanings swell, protrude, prominent, bloom, bud etc. These are all, the KN data indicates, manifestations of the single concept Spread, encompassing a number of related terms many times larger than the number of characters in the chart. Also in this context, I could discuss at some length the Cut/divide/reduce aspect of final *-t (ABC: sometimes transcribed as *-t, occasionally as *-ts) that was called into question earlier in this thread, but maybe some other time.
An additional point with regard to reconstructions is their mutable nature. It's 1992, and William Baxter has the stage pretty much to himself with A Handbook of Old Chinese Phonology. 1999, however, brings competition in the form of Laurent Sagart's The Roots of Old Chinese, a work glowingly reviewed by Wolfgang Behr. Sergei Starostin's reconstructions are coming out, too. Fast-forward to 2007 and here's Schuessler with his ABC Etymological Dictionary of Old Chinese, giving us four differing sets of OC reconstructions by contemporary scholars (not counting works published in China). What are conscientious editors of an online reference source such as Wiktionary to do? The solution seems to have been to ignore them all, a shrewd move as it turned out because just four years later Baxter and Sagart (neither scholar having been satisfied with his earlier work) bestow upon the world their collaborative Old Chinese reconstruction files. My point is, of course, that when it comes to OC reconstructions, the goalposts rarely remain in place for long. And so one may be excused for regarding them with a jaundiced eye.
4) Misfits. These are, as I see them, the undesirable flip side of the ORs which, as I describe above, have their strong point too. I look forward to identifying misfits and amending their interpretations as necessary. The process, if it is to be carried out on Wiktionary, depends on the issues presented below. (/@Wyang)
Now, as a practical matter, and returning to the topic that has brought us here, I'd like to ask the Wiktionary community whether there is a consensus to add OC reconstructions to the Han character entries. If so, whose? Baxter/Sagart's? Starostin's? Schuessler's? Someone else's? Some combination of two or more?
Scenario One: No consensus for adding scholarly OC reconstructions. In this case, I propose adding KN data as it stands, accompanied by footnotes or usage notes with verbiage such as: Reconstructions based on B. Karlgren; Interpretations by H & M; No scholarly approbation implied.
Scenario Two: Consensus to add scholarly OC reconstructions from one or more source. In this case, what objection might there be to adding interpretations to the entries for those characters in which KN data overlaps with Schuessler's, providing two cites for each entry, which should satisfy the concerns of Dan Polansky and others who share them? Lawrence J. Howell (talk) 08:17, 13 June 2013 (UTC)
- My point was that the majority of your additions could not be found anywhere other than your site, not in any publications, including Schuessler (2007). Looking at your most recent edits, almost none of these character etymologies can be backed by Schuessler, and so is your methodology of phoneme decomposition and semantic value association. Even if a few are also identified by Schuessler as true sound symbolisms, the fundamental reliance of KN data on an outdated reconstruction has made such analyses unreliable. The multitude of reconstructions is not an excuse for choosing an obsolete one; in fact recent reconstructions have been surprisingly convergent (as evident in the case for "two" above), for example the reconstruction of the uvular series, lateral initials and voiceless sonorants, none of which is reflected in KN.
- Let me use an example to illustrate this - 馬 ("horse", OC /*mˁraʔ/). Below is the passage from Schuessler (2007) (no copyright infringement intended):
- and this is your added content:
- How is your theory that the OC word for "horse" was perhaps derived phonesthemically from the phoneme initial /*m-/, signifying "concealment", or your theory that the word was perhaps onomatopoeic in origin, backed by Schuessler (2007) or other publications? If not, isn't the above paragraph entirely your envisagement against established consensus? Wyang (talk) 02:12, 14 June 2013 (UTC)
- ABC and KN both maintain the existence of phonesthemic patterns in OC. How Schuessler arrived at that conclusion and how he chooses to shape the material is his prerogative. Likewise for KN. Remember, the only reason Schuessler has been brought into the discussion is to address concerns about single sourcing (NB: sourcing of hermeneutic principles, not one-by-one interpretations).
More (much more) below. Lawrence J. Howell (talk) 04:43, 16 June 2013 (UTC)
- @Wiktionary editors: Unless I am greatly mistaken about how things work around here, the community will at some point be shifting into decision-making mode to determine: Do KN interpretations belong in the dictionary? As a quick reference for those to be involved in that process, allow me to contribute a recapitulation.
・I have come to the dictionary with the intention of helping to improve the presentation of existing material.
・I stated my relation to that material, offered for inspection a sample of entries in a format I believe is compatible with Wiktionary style, and requested feedback.
・I waited until responses petered out, implemented formatting improvements that had been suggested for the sample entries, and uploaded similarly formatted new entries.
・Apparently unaware of the thread I had initiated here in the Beer Parlour, an editor called attention to my uploading activity and asked what should be done about it.
・A half-dozen other editors swiftly converged, casting aspersions on my motives and vilifying the idea of phonosemantic principles being operative in Old Chinese.
・One editor asked why none of these issues had been raised in the initial post. (No response.)
・Everyone disappeared save a single editor whose rhetoric thus far has included:
Vituperation: utter phonesthemic nonsense
Volte-face: It's true that (phonesthemic) patterns are found in Old Chinese ...
Untenable claim: Schuessler tentatively proposed some Old Chinese prefixes and suffixes (not initials, vowels and final consonants).
False attribution: ... the theory advanced by User:Lawrence J. Howell is that every Old Chinese monosyllabic word can be reanalysed in terms of initial, vowel and final consonant, in that all or part of its phonological shape resulted from its meaning ...
False analogy: Postulating that words are consistently derived using this principle is a bit like positing a proto-phoneme initial *f– for English, denoting "movement", which is responsible for deriving Modern English words like fare, fast, fight, flee, flow, fly. (No, actually, it is not a bit like it at all. The subject is OC, not Old English, and nobody is claiming that universals across languages maintain applicability along all categories.)
Non sequitur: ... almost none of these character etymologies can be backed by Schuessler ...
This is kettle logic, AKA chucking out arguments in hopes that one of them will stick (or create the desired impression, which can be almost as good).
This is not to assert that Wyang is completely off target. For example, with reference to OC reconstructions, and for what it's worth, it's safe to say that the opinion ... recent reconstructions have been surprisingly convergent ... is mainstream in Sinologic circles. S/he also notes the advances resulting in the reconstruction of the uvular series, lateral initials and voiceless sonorants; what if any influence they may bear on KN interpretations is a matter for study.
As for the thrust of Wyang's rhetoric, however, s/he appears to be arguing that the current state of OC reconstruction (partially/largely) invalidates the KN interpretations. Also (and Wyang will correct me if I'm wrong), it appears the intent is to persuade the community to exclude the interpretations from Wiktionary; expending such energy on the thread makes little sense otherwise.
I'll continue the debate with Wyang as long as necessary, but I wonder if it would be too much to request some form of indication that the community is working toward a resolution of this issue. To that purpose, interested editors may wish to look at the explanations found in the Etymology sections of Han Chinese characters (ones not taken from KN). A few dozen entries will be enough to create a valid if minimal sample. For each, do you find that the citations make the origin of the explanation evident? If not, is that not a problem? If so, do the sources conform with the inclusion standard to which KN data is being held? I refer especially though not exclusively to that prickly issue of multiple sourcing. I believe you'll agree that the answers to these questions are of great relevance.
There's more to say, but I think it's time for the community to return to the stage.
Real life considerations dictate that I'll be able to offer nothing beyond cursory remarks in the next few days, and none at all in the week following. I will however be back at the end of the month, so please do carry on without me. Thank you all for your consideration. Lawrence J. Howell (talk) 04:43, 16 June 2013 (UTC)
WT:ELE#Definitions[edit]
The following sentence has been out of date since before I started editing here: "Each definition may be treated as a sentence: beginning with a capital letter and ending with a full stop."
In fact I can trace it all the way back to User talk:Mglovesfun/Archives/1#formatting (2009). Definitions of non-English terms are formatted without a full stop or an initial capital letter (with the obvious exception of words that always require a capital letter like Spain) and English definitions have full stops and initial capitals. Can we finally update WT:ELE to cover this?
Furthermore, a separate but much more minor point:
::'''to end''' (''third-person singular simple present'' '''[[ends]]''', ''present participle'' '''[[ending]]''',
''simple past'' '''[[ended]]''', ''past participle'' '''[[ended]]''')
It shouldn't have ended twice as {{en-verb}} doesn't show that anymore. Since en-verb only categorizes in the main namespace, we can just use it directly. The entire section Headword line could use the templates directly, but I don't think I could do it and pass it off as an uncontroversial edit, so here it is. Mglovesfun (talk) 12:48, 9 June 2013 (UTC)
- Also just noticed the particle 'to' in 'to end' which we no longer use. Mglovesfun (talk) 12:49, 9 June 2013 (UTC)
- I noticed that quite a few pages have some kind of substituted version of templates. I don't think that's a good idea because then things like this happen. It's better to put the actual template in there, so that they always match whatever the template really looks like. —CodeCat 12:53, 9 June 2013 (UTC)
- If anyone's drafting a vote, another problem is that ELE currently allows an explicitly "unlimited" variety of headers. - -sche (discuss) 04:01, 10 June 2013 (UTC)
- I'd draft a vote if people said in this thread they'd broadly support it. Mglovesfun (talk) 08:36, 10 June 2013 (UTC)
- *Nudge*. Mglovesfun (talk) 09:58, 11 June 2013 (UTC)
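The current practice Mglovesfun describes at the top of this thread (English definitions as sentences, non-English definitions without a full stop) could be checked mechanically along these lines. This is a rough Python sketch; the function name `definition_style_ok` is made up, and it deliberately does not test the initial letter of non-English definitions, since words like Spain always take a capital.

```python
def definition_style_ok(definition, lang):
    """Check a definition line against the practice described above.

    lang is a language code; "en" is treated as English, anything else
    as non-English.
    """
    text = definition.strip()
    if not text:
        return False
    if lang == "en":
        # English: initial capital and a closing full stop.
        return text[0].isupper() and text.endswith(".")
    # Non-English: no closing full stop. Capitalisation is not checked,
    # because proper nouns such as "Spain" are legitimately capitalised.
    return not text.endswith(".")
```

So `definition_style_ok("To finish.", "en")` and `definition_style_ok("to finish", "fr")` would both pass, while `definition_style_ok("to finish.", "fr")` would not.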
South Picene alphabet[edit]
The South Picene language (spx) has Old Italic (Ital) as its alphabet. However, not every character used to write it is encoded by Unicode; it lacks one of the characters transliterated as ‘í’ and the word separator (looks like a vertical ellipsis), and the characters for ‘ú’, ‘t’, ‘f’, ‘o’ and the other ‘í’ are rather different. I propose we change its script to Latn until the Unicode coverage of the South Picene alphabet is adequate (compare how we treat Iberian and Egyptian.) — Ungoliant (Falai) 05:03, 10 June 2013 (UTC)
- Support. The Noric language also uses a variety of Ital, but I didn't even try to use it at Artebudz as Ital is LTR by default, and the Noric inscription is RTL; also, the letter shapes are different. —Angr 08:24, 11 June 2013 (UTC)
- If we had some kind of "Wiktionary font" (I think that was discussed previously) we wouldn't have to deal with this kind of problem, as we could just devise our own encodings in the PUA. But oh well. -- Liliana • 19:31, 15 June 2013 (UTC)
bad-iw filter[edit]
Why not just block all bad interwiki edits (there aren't all that many), but with a note explaining why the edit has been blocked? It will stop good-faith bad edits and vandalism, and experienced editors who make a typo (I've done it) will be able to correct their work with minimal fuss. Mglovesfun (talk) 16:20, 11 June 2013 (UTC)
- Good idea. — Ungoliant (Falai) 16:30, 11 June 2013 (UTC)
- Here's an example of a bad-iw edit that isn't a bad-iw edit and shouldn't be blocked. —Angr 19:55, 11 June 2013 (UTC)
- Can’t we change the regex so it doesn’t trigger a bad-iw for pages with the prefix “Unsupported titles/”? — Ungoliant (Falai) 10:31, 12 June 2013 (UTC)
- The "bad-iw" filter already excludes those, since it only looks at mainspace entries. As far as I know, it pretty much exactly matches the rules that have been used by bots like Interwicket and Rukhabot to correct iw entries. There are a few cases such as straight-versus-curly apostrophes and variations in rules for representing Hebrew lemma entries that make for a few mismatches between WTs that might cause problems. I'm not sure how Rukhabot and the bad-iw filter deal with those. Chuck Entz (talk) 13:36, 12 June 2013 (UTC)
- Oops! Didn't read closely enough. Yes, I'm sure we could exclude Unsupported titles (I thought we already did). Chuck Entz (talk) 13:43, 12 June 2013 (UTC)
- The filter blocks all edits that add (or, in some cases where diff messes up, retain) bad interwikis. I would agree with blocking an edit that only adds a bad interwiki (and I doubt it'd be too hard to write one), but I can't agree with blocking an edit like diff, which adds a lot of info besides the bad interwiki.—msh210℠ (talk) 07:23, 16 June 2013 (UTC)
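A rough Python sketch of the kind of check discussed in this thread, for anyone who wants something concrete to argue over. It rests on assumptions that are not taken from the real filter: that a "bad" interwiki is one whose target title differs from the page title, that `LANG_CODES` is the list of recognised language prefixes (here a tiny invented subset), and that pages under "Unsupported titles/" are exempt as Ungoliant suggests. The names `bad_interwikis` and `should_block` are hypothetical.

```python
import re

# Matches links of the form [[code:Target]].
INTERWIKI_RE = re.compile(r"\[\[([a-z-]+):([^\]|]+)\]\]")

# Hypothetical subset of language prefixes, for illustration only.
LANG_CODES = {"de", "es", "fr"}


def bad_interwikis(title, wikitext):
    """List interwiki links whose target differs from the page title."""
    bad = []
    for code, target in INTERWIKI_RE.findall(wikitext):
        if code in LANG_CODES and target != title:
            bad.append(code + ":" + target)
    return bad


def should_block(title, old_text, new_text):
    """Block only edits that introduce a bad interwiki not already present."""
    if title.startswith("Unsupported titles/"):
        return False  # exemption suggested in the thread
    already_bad = set(bad_interwikis(title, old_text))
    added = [link for link in bad_interwikis(title, new_text)
             if link not in already_bad]
    return bool(added)
```

Comparing against the old text means an edit that merely retains an existing bad interwiki is not blocked. The narrower rule msh210 proposes (block only when the bad interwiki is the sole change) is not modelled here.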
Problems w/recent changes?[edit]
Maybe it's just my internet, but I haven't been able to get to recent changes all day yesterday or today. At all, and it's the only page that's not loading (I'm on Chrome on Mac OSX 10.6.8). Anyone else having problems with it? Thanks. --Neskaya … gawonisgv? 17:14, 12 June 2013 (UTC)
- OK for me (Win 7, FF 21, Vector). DCDuring TALK 17:22, 12 June 2013 (UTC)
- Must just be me. Thanks though. --Neskaya … gawonisgv? 00:16, 17 June 2013 (UTC)
Watchlist wishlist[edit]
There is a WMF project called [Watchlist wishlist]. It may be a pipedream, but it is at least one step closer to reality than the complaints about our wishlist that sometimes surface here. I would like to collect any thoughts that folks here have about how to make watchlists more useful for us. I have already mentioned there the great utility of limiting watching to sections (in particular language L2s). We also already have the problem with editing the watchlist and even using large watchlists on record.
The project has a list of some suggestions that may be thought-starters. DCDuring TALK 18:13, 13 June 2013 (UTC)
- I would like to be able to automatically watch all entries in a given language, and perhaps sort them under separate tabs within the list so I can quickly browse it. —CodeCat 18:42, 13 June 2013 (UTC)
- Me three :).
- Opening the Watchlist on mobile devices (excluding iPads) leaves much to be desired. It crashes in various browsers on simple operations, such as resizing. The Watchlist specific to mobile phones is useless, even if it doesn't crash.
- My Watchlist is too big, so I can't edit it raw, even on a desktop computer. --Anatoli (обсудить/вклад) 03:05, 14 June 2013 (UTC)
- I hadn't thought about mobile and I don't think anyone else had mentioned that yet.
- Someone had suggested category-specific watchlists, which could include the language categories. That seems second-best to something "section"-specific, ie, only changes in L2 sections for one's selected languages. Our page architecture of having multiple languages on the same page does make things harder and out of sync with what WP typically needs.
- @CodeCat: I don't understand "sort them under separate tabs within the list so I can quickly browse it". If you are watching an entire language, what sorting would you want? Where are the tabs coming from? DCDuring TALK 04:27, 14 June 2013 (UTC)
Template:param
My post at Wiktionary:Beer parlour/2013/March#Use of Template:param in template documentation didn't garner any responses at all, so fair warning: unless someone complains, I will "soon" implement the recommendation I suggested at Template talk:param. - dcljr (talk) 23:44, 13 June 2013 (UTC)
Manual transliteration and transliteration from modules
After I added manual transliteration to the translations (to a few selected languages where it's possible), some editors started removing previously added manual transliteration. I'm against this practice. The transliteration may be out of date but it can easily be updated from preview (User:Conrad.Irwin/editor.js). Note that auto-transliteration is only added when it's missing. If people wish to add the new transliteration, then perhaps a bot could do this - as a once-off job - overwrite existing transliteration and add where it's missing. Perhaps one of User:Conrad.Irwin/editor.js or User:Kephir/gadgets/xte could do that?
What's the general opinion about this? I also think there should be the transliteration written in entries and translations. How it gets there - manually or via a bot is another thing. Can someone create a bot to update/insert transliterations or modify the scripts, so that auto-translit is written to translations if it's not supplied manually (this condition is important)? --Anatoli (обсудить/вклад) 06:57, 14 June 2013 (UTC)
- I think that auto-transliteration should always override manual transliteration. Manual transliteration will not coincide with auto-transliteration only if an editor made an error in transliterating. By forcing auto-transliteration we can neutralize such errors. Consider this: historically different editors have used different transliteration schemes for Armenian on Wiktionary. By adding auto-transliteration to {{hy-noun}} and the rest I made sure Armenian is transliterated consistently. We should do the same to {{t}}, {{l}}, {{term}} and others.
- Anticipating your objection, that in Russian we show stress in the transliteration and so it does not coincide with auto-translit, I say we should show the stress on the Russian word (like this, {{ru-noun|head=соба́ка}}) and let auto-translit pick up the stress from there.
- Using a bot to upload transliterations is not a good idea, IMO. The bot would need to rerun every time we decide to modify a transliteration scheme. On the other hand, with auto-transliteration you need only change Module:Armn-translit once. --Vahag (talk) 09:42, 14 June 2013 (UTC)
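The two positions so far (manual transliteration takes precedence and auto-transliteration only fills gaps, versus auto-transliteration always overriding) can be illustrated with a small sketch. It is written in Python for illustration only; the real transliteration lives in Lua modules, and the toy mapping table, function names and `auto_overrides` flag below are hypothetical and cover just the letters of the example word.

```python
import unicodedata

# Toy mapping for illustration only; not the scheme used by any Wiktionary module.
TOY_TABLE = {"с": "s", "о": "o", "б": "b", "а": "a", "к": "k"}


def auto_translit(text):
    """Transliterate letter by letter; unmapped characters pass through unchanged.

    The combining acute accent (U+0301) used as a stress mark is not in the
    table, so it is carried over onto the Latin vowel, as suggested above.
    """
    out = "".join(TOY_TABLE.get(ch, ch) for ch in text)
    return unicodedata.normalize("NFC", out)


def shown_translit(text, manual=None, auto_overrides=False):
    """Pick the transliteration to display.

    auto_overrides=False: manual wins, auto only fills in when manual is missing.
    auto_overrides=True:  auto always wins, so one change to the table
                          updates every entry at once.
    """
    if manual and not auto_overrides:
        return manual
    return auto_translit(text)
```

With `auto_overrides=True`, a stressed headword such as соба́ка yields a stressed transliteration without any manual input, and changing the scheme means editing only the table rather than rerunning a bot over every entry.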
- Transliteration is supposed to convey orthography. Maybe we should consider dropping foreign-style stress marks from transliterations in the few languages where we use them, and only indicate stress in the pronunciation, where it properly belongs. —Michael Z. 2013-06-14 14:59 z
- Perhaps, but it might come in handy in cases where there are homographs with differing stress, so it's obvious which one is meant. Chuck Entz (talk) 15:12, 14 June 2013 (UTC)
- Our transliteration system for Burmese is pronunciation-based, not orthography-based. In probably 75% of the cases the pronunciation-based transliteration can be correctly mechanically predicted from the orthography, but in the remaining 25% of the cases it can't and will need to be done manually. The alternative would be to switch over to an orthography-based romanization of Burmese, which I would actually be in favor of but which met some opposition a few years back. —Angr 16:03, 14 June 2013 (UTC)
- I hope this is not intended to apply to uses of {{term}} in etymology sections. In etymology sections for terms mostly transmitted over time in writing I think a pronunciation-based transliteration system can be quite misleading. For example, the writers who took Greek terms into Latin followed practices that must have fit their modified pronunciations and created precedents that are followed to this day, ie υ (upsilon) -> "y", not "u". I don't know for how many situations this objection is relevant beyond what I've drawn from. DCDuring TALK 17:00, 14 June 2013 (UTC)
- Actually, I think we're supposed to follow the standards for transliteration of entries for transliterations in etymologies. This isn't really followed all that much, because a lot of editors just use the transliteration in the source they got the etymology from, and have no clue what the Wiktionary practice is. Chuck Entz (talk) 17:13, 14 June 2013 (UTC)
- What I wish we did in etymology sections is the same thing most English-language dictionaries do, namely present all foreign words in transliteration. We could still link to the original-script page, of course: if we say that raj comes "from Sanskrit {{term|राज्य|rājyá|lang=sa|sc=Latn}}" rather than "from Sanskrit {{term|राज्य|tr=rājyá|lang=sa}}", for example, it displays as "from Sanskrit rājyá", saving space and not confronting readers with possibly unfamiliar Devanagari, while still linking to the Sanskrit entry. I've tried that on one or two pages, but it always gets reverted. —Angr 17:53, 14 June 2013 (UTC)
- I wonder what percent of our actual and likely future users prefer and find more useful English Etymology sections the way we do them to an alternative presentation having no non-English script, just transliterations (with no cognates visible by default). Are we doing all this just for a small population of scholars and for machines that will render it all more useful for humans? DCDuring TALK 18:03, 14 June 2013 (UTC)
- I think it's a good idea. A better form is mentioning the transliteration first and putting the term in its native script(s) after it, in parentheses, e.g. "rājyá (राज्य)", as we do in Wikipedia. --Z 18:15, 14 June 2013 (UTC)
- What group of users do you think prefer it that way, rather than the current way or a presentation with no non-Latin script? DCDuring TALK 18:26, 14 June 2013 (UTC)
- English Wiktionary is for English-speaking readers, most of whom can't read non-Latin scripts. A small group who are familiar with that non-Latin script prefer to see the term in that way, and others prefer Angr's suggestion, I think. But I think saving space is not a big enough advantage to justify removing the native script completely. I think my suggestion would be preferred even by readers whose native script is not Latin; it's kinda hard for the reader to switch to a non-Latin script while reading an English text. --Z 18:47, 14 June 2013 (UTC)
- It’s bad enough already when a reader follows a link राज्य (rājyá) to a heading “Sanskrit,” and has to read down past Etymology and Adjective to find the precise text “राज्य (rājyá).” In many entries they may have to scroll just to see the headword. (Maybe our headwords should be a head of a language entry instead of just of the page.)
- Presenting transliterations only would force the reader to additionally extrapolate that what they clicked on is a derivative representation of राज्य. And I don’t see the point of linking rājyá (राज्य) to represent राज्य (rājyá) – these should be consistent, and I think the current use of brackets to indicate that the transliteration is a representation derived from the original, in both entry and link, is clearest. —Michael Z. 2013-06-15 16:55 z
Synonym internationalization
How do you translate a page if each synonym should lead to a different word in the other language?
Say Foo has two meanings: (1) food and (2) disgusting.
But in OtherLang the word for food is ol:Phould, while the word for disgusting is Fouyah.
What should be done in those cases? Pashute (talk) 07:07, 14 June 2013 (UTC)
- Are you talking about automated translation of entire pages? If so, you don't, it doesn't work yet. To get anything but a very poor translation, you need a human being in there somewhere. Tell me if I've missed the point. Mglovesfun (talk) 08:42, 14 June 2013 (UTC)
- To see how we handle the translation of polysemous words (words with multiple meanings), look at get#Translations as an example. Each meaning has its own separate translation box. —Angr 09:59, 14 June 2013 (UTC)
Hebrew and Aramaic terms here inside English wiktionary
It seems there has been much work done on Hebrew and Aramaic here in the English Wiktionary. May I ask why? What is the rationale? (Perhaps exactly the above issue, but if so it is a very bad solution, and differs from all the other many languages referred to from here) ... Pashute (talk) 07:10, 14 June 2013 (UTC)
- Second thought - perhaps it was meant for Talmudic and Kabalic or biblical translations - and if so: Why not open a separate wiktionary exactly for that, and move the terms to there? There are many benefits: There could be words specific to the time that are not used anymore, there could be a special entry for words that have changed meaning from the modern language, or from the other versions of the language (say between Biblical and Talmudic Hebrew - there are many examples of that...)
So actually this is another topic: Ancient languages...
Back to my question here: Why are there Hebrew and Aramaic terms here in the English Wiktionary, what is the rationale of those who worked on it extensively, and is it possible to change this without losing their obviously hard work? Pashute (talk) 07:17, 14 June 2013 (UTC)
- The rationale is that these languages exist and contributors have decided to give their time and effort to make these entries, which are perfectly valid and so haven't been deleted. Mglovesfun (talk) 08:41, 14 June 2013 (UTC)
- Pashute, the point of English Wiktionary is not only to list English words, but to list all words in all languages. Words in languages other than English are provided with English translations. We don't just have Hebrew and Aramaic, we have Swedish, French, German, Arabic, Hausa, Swahili, Zulu, Persian, Sanskrit, Burmese, Indonesian, Chinese, Japanese, Russian, Navajo, Quechua, and thousands of other languages. There is already a Hebrew Wiktionary, and its point is also to list all words in all languages—but with Hebrew glosses for words in languages other than Hebrew. —Angr 09:48, 14 June 2013 (UTC)
- I agree with above, except that AFAICT hewikt lists only Hebrew words.—msh210℠ (talk) 07:28, 16 June 2013 (UTC)
- Good heavens, you're right. Is that their policy, or is it just that no one's gotten around to adding words in other languages yet? I thought "all words in all languages" was the goal of each Wiktionary and would be dismayed to learn that certain Wiktionaries had decided not to accept that. —Angr 14:19, 16 June 2013 (UTC)
- Yes. (Specifically, it is their policy.) Keφr 15:04, 16 June 2013 (UTC)
- Yes, en.Wiktionary has been visited by a few exiles from that Wiktionary who lamented that policy and the (in their opinion) unresponsive and unreasonable admins who used their power to ban anyone who opposed it. (e.g. in 2007) - -sche (discuss) 16:47, 16 June 2013 (UTC)
- I am shocked. What an abominable policy! — Ungoliant (Falai) 16:56, 16 June 2013 (UTC)
- I wonder if we could submit this to the Wikimedia foundation? We could do it on the grounds that they are discriminating against Hebrew speakers, since they do not have access to foreign translations in the way that speakers of other languages do. I doubt that the foundation wants to sponsor that. —CodeCat 17:03, 16 June 2013 (UTC)
- It should be brought up at Meta (m:Requests for comment; if there is obvious misuse of sysop/crat rights, m:Stewards' noticeboard) (preferably, by active users of he.wikt). --Z 17:09, 16 June 2013 (UTC)
- Yes, it would be preferable for active users of he.Wikt to open any RFC; they would also know better if there has been any recent admin/crat action (the discussion I linked to above being from 2007). Of course, if "Hebrew only" has been he.Wikt's policy long enough that opponents of it no longer edit there, that could complicate matters. - -sche (discuss) 17:57, 16 June 2013 (UTC)
- I don't think it's necessary to find editors from he.wiktionary. After all, this concerns a policy that goes (IMO) against the neutral and open spirit of Wikimedia projects, and I think we, being also editors of Wikimedia projects, are entitled to have a say about it even if we do not edit there. Consider for comparison if Wikipedia adopted a policy stating that articles about things in the English-speaking sphere were inherently more notable than other things. I'm sure we'd have something to say about that even if we didn't edit there. —CodeCat 18:05, 16 June 2013 (UTC)
- You wish to argue that hewikt — whose editors are Hebrew speakers — is discriminating in its editor-written policies against Hebrew speakers. Seriously?—msh210℠ (talk) 04:33, 17 June 2013 (UTC)
- Having a monolingual dictionary is abominable? It's a different goal than having a pan-lingual dictionary, is all.—msh210℠ (talk) 02:31, 17 June 2013 (UTC)
- Not a goal that a Wiktionary should force onto its contributors, IMO. You can still have a Hebrew dictionary and allow foreign language entries. Look at how good our coverage of English is. — Ungoliant (Falai) 02:42, 17 June 2013 (UTC)
- And you can have a pan-lingual dictionary and allow a numerical (Roget-style) thesaurus. But we've decided not to, and that's something we force on our contributors. They've decided to limit the focus of their project, much as we have. I don't see the problem with it, at all.—msh210℠ (talk) 04:14, 17 June 2013 (UTC)
- Wiktionary projects are supposed to contain all words in all languages,[38] and I don't think decision about changing this goal can be made by user community. --Z 05:56, 17 June 2013 (UTC)
- That text was added by Dominic, an enwikt (and enWP) denizen. AFAICT he acted alone (and from an enwikt perspective) in making that edit — though of course we can ask him. I have no reason to believe that it reflects the Foundation's official view.—msh210℠ (talk) 06:48, 17 June 2013 (UTC)
- It is still obviously against the nature of WMF projects. We may not force the contributors to focus their contributions on what we prefer; everyone must be free in contributing as far as possible. --Z 07:18, 17 June 2013 (UTC)
- Allow me to introduce you to WT:CFI. I can't tell whether you're being disingenuous or you really don't see the analogy between us and hewikt.—msh210℠ (talk) 16:22, 17 June 2013 (UTC)
- The purpose of WT:CFI and whatever WT:... else is to indicate what is an improvement and what is not. Adding a non-Hebrew entry to he.wikt is nothing but improvement of this project of WMF, and people should be free to improve WMF wikis, that's all we tried to tell you. Anyway, let's stop this discussion, it doesn't belong here in en.wikt and is none of our business, he.wikt editors should decide about it. --Z 16:57, 17 June 2013 (UTC)
- Arguably, including a numerical thesaurus is an improvement of enwikt, and including information about every startup music band is an improvement of enWP. Anyway, that's all I meant also: that hewikt editors should decide on this. That is, I didn't mean that I agree that hewikt should not have foreign entries: only that the outrage against it and comments denouncing it, above, are uncalled for. Glad we quasi-agree. :-)—msh210℠ (talk) 17:04, 17 June 2013 (UTC)
- I’m not familiar with Roget’s Thesaurus, but it does sound like it’s similar to our Wikisaurus project. In any case, I’m not sure submitting this to the WMF is a very good idea. It sets a bad precedent, and soon enough other Wiktionaries will start demanding that we change our practices too (and you know how some people feel about our logo and SOP-deletion.) — Ungoliant (Falai) 09:32, 17 June 2013 (UTC)
- Request for comment sounds about right. No reason to prejudge or get ahead of ourselves. Mglovesfun (talk) 18:08, 17 June 2013 (UTC)
Ancient languages
Is it possible to open a wiktionary for an ancient language, that is now being studied extensively? Such as Medieval Latin, Talmudic Aramaic, Zohar Aramaic etc. ? Pashute (talk) 07:17, 14 June 2013 (UTC)
- The short answer is yes, but this isn't really the place to ask. There's a Wikimedia incubator for wikis which are not ready to go 'live' yet. Mglovesfun (talk) 08:38, 14 June 2013 (UTC)
- la: (Latin Wiktionary) already exists. Mglovesfun (talk) 09:52, 14 June 2013 (UTC)
- The short answer is no; although there are a few Wikimedia projects in ancient languages, the only projects that can still be created for ancient languages are Wikisource and Wikiquote (since they don't provide original content). New projects that provide original content, such as Wikipedia and Wiktionary, can be created only in living languages with native speakers. But words in those languages can certainly be added to English Wiktionary, and indeed we already have many words in Category:Latin language and Category:Aramaic language (as you noticed in your post above this one). —Angr 09:57, 14 June 2013 (UTC)
- Yes, for dead languages only Wikisource is allowed according to the new policy. --Z 10:15, 14 June 2013 (UTC)
- Fair enough. Glad to hear it, actually! Mglovesfun (talk) 10:17, 14 June 2013 (UTC)
- Not that new a policy; it's been that way for almost six years. —Angr 10:52, 14 June 2013 (UTC)
[edit]
At per there are two etymologies, a preposition derived from Latin and a pronoun/adjective coined in 1979. The entry also includes a statistics section noting that it was the 760th most common word prior to 1923.
Obviously those statistics cannot be for the senses coined fifty-five plus years after 1923, so should the statistics section be moved to a L4 heading at the end of Etymology 1 rather than a L3 heading at the end of the English section? Thryduulf (talk) 07:37, 15 June 2013 (UTC)
- If it's just purely based on frequency, then no. Because it's probably just searching the equivalent of the regex per( |\.|,|;|:) preceded by a space. In human terms, a space, the three letters per followed by a space, or a period, or a comma, or a semicolon, or a colon. Mglovesfun (talk) 10:58, 15 June 2013 (UTC)
- I have yet to see any comprehensive statistics about the frequency of usage of meanings of spellings. That seems beyond the capability of corpus analysis at this time. It wouldn't even seem possible in principle without some kind of standardization of meaning. PoS-level and Etymology-level statistics might be possible. But, for example, COCA's PoS reporting seems not ready for prime time.
- Perhaps we can find some studies of small sets of words that report such frequency information so that we have good reason to show additional statistics and decide how to show them. DCDuring TALK 12:14, 15 June 2013 (UTC)
- I'd guess \b[Pp][Ee][Rr]\b. In any event, I agree with Mg: the frequency stats are independent of sense so should be listed ===thus===.—msh210℠ (talk) 07:39, 16 June 2013 (UTC)
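To make the two guesses concrete, here is a small Python sketch comparing them on a made-up sentence. Neither pattern is known to be what the frequency list actually used; the leading space in the first pattern follows the "in human terms" description above, and the sample text and function name are invented for the example.

```python
import re

# Mglovesfun's guess: a space, "per", then a space or one of . , ; :
SPACE_DELIMITED = re.compile(r" per( |\.|,|;|:)")

# msh210's guess: "per" in any letter case, delimited by word boundaries.
WORD_BOUNDARY = re.compile(r"\b[Pp][Ee][Rr]\b")


def count(pattern, text):
    """Count non-overlapping matches of a compiled pattern in text."""
    return sum(1 for _ in pattern.finditer(text))


# Invented sample text, not taken from any corpus.
sample = "Per annum, the fee is 5 per cent; each person pays per head (per se)."

print(count(SPACE_DELIMITED, sample))  # misses "Per" and "(per"
print(count(WORD_BOUNDARY, sample))    # catches every standalone "per"
```

Either way the count is purely string-based: neither pattern can tell the preposition from the 1979 pronoun, which is the reason given above for keeping the statistics at the level of the whole English section.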
Progress with Template:context
All uses of context labels have been converted so that they explicitly call {{context}} now. That has allowed me to significantly rework the template and, maybe the most important, to get rid of the recursion. So all the numbered context templates should now be orphaned (it will take the software a few days to catch up, I expect). The new version of the template now uses a few new helper templates, {{context/show}}, {{context helper}} and {{context test}}. {{context/show}} is called for every numbered parameter that is passed to {{context}}, and is responsible for showing the label and transcluding the label template when it exists. {{context helper}} is called by the labels themselves. Because the recursion is gone, the label templates no longer need to be passed all the remaining labels. However, they still need to know the next label, because that determines whether or not to show a comma after the label. Some labels explicitly omit the comma as well. So, labels are now passed only the next label that follows them, but they do not display it; they only use it to determine what separator to show.
The labels originally called {{context {{{sub|}}}| where the parameter "sub" was supplied by {{context}} or its numbered varieties. I thought it would be useful to co-opt that mechanism for another purpose. That is what {{context test}} is for. When {{context/show}} needs to see whether a template is indeed a context label (because of the naming conflict that still exists), it calls the label template and passes it sub=test. This causes the label to call {{context test}} rather than {{context helper}}, and it will return the text "valid context label". {{context/show}} then checks for that text and considers the label valid if so. —CodeCat 14:51, 16 June 2013 (UTC)
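The non-recursive flow described above can be modelled in a few lines. The sketch below is in Python rather than template syntax and is only an approximation of the behaviour as described: the label data, the function names, and the treatment of unrecognised labels as plain text are assumptions for the example, not the actual template code.

```python
# Hypothetical label data; "omit_comma" models labels that suppress the comma.
LABELS = {
    "informal": {"text": "informal", "omit_comma": False},
    "chiefly": {"text": "chiefly", "omit_comma": True},
    "British": {"text": "British", "omit_comma": False},
}

VALID = "valid context label"


def separator(label, next_label):
    """Decide what follows a label, using only the *next* label."""
    if next_label is None:
        return ""
    if LABELS.get(label, {}).get("omit_comma"):
        return " "
    return ", "


def label_template(name, next_label=None, sub=None):
    """Stand-in for a label template such as {{informal}}."""
    if sub == "test":
        # The {{context test}} path: report validity instead of rendering.
        return VALID if name in LABELS else ""
    # The {{context helper}} path: render own text plus the separator.
    return LABELS[name]["text"] + separator(name, next_label)


def context_show(name, next_label):
    """Stand-in for {{context/show}}: called once per numbered parameter."""
    if label_template(name, sub="test") == VALID:
        return label_template(name, next_label)
    # Assumed fallback: an unrecognised label is shown as plain text.
    return name + separator(name, next_label)


def context(*labels):
    """Stand-in for {{context}}: a flat loop, no recursion."""
    parts = []
    for i, name in enumerate(labels):
        next_label = labels[i + 1] if i + 1 < len(labels) else None
        parts.append(context_show(name, next_label))
    return "(" + "".join(parts) + ")"
```

For example, `context("informal", "chiefly", "British")` gives `(informal, chiefly British)`: each label sees only its successor, and "chiefly" drops its comma.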
- Thanks for all your work, and, perhaps more importantly, for your initiative. One question: It looks as though {{context labelcat}} still works fine. Is that your intent for the future? If not, what are you thinking of doing with its current uses?—msh210℠ (talk) 04:22, 17 June 2013 (UTC)
- I don't really know. What purpose does it serve exactly? —CodeCat 12:23, 17 June 2013 (UTC)
- [e/c] It displays the context template's label and categorizes the entry in the context template's category, but does so without parentheses or italics. It's used in usage notes ("Considered {{informal|sub=labelcat}} when construed with for" or whatever) and in some templates.—msh210℠ (talk) 16:46, 17 June 2013 (UTC)
- On pages like [[clamour]], it's used by {{alternative spelling of}}. On pages like extract the urine and [[enchanted]], it's used by context labels via syntax like {{obsolete|sub=labelcat}}, which allows people to put the labels in usage notes (and suppress their parentheses and italics) rather than on the definition lines and/or in the POS sections where the labels would in many cases be more at home. On pages like [[gáagii]], it's used (by context labels) in etymologies, where a dedicated, categorising etymology template vaguely like {{borrowing}} might be better. - -sche (discuss) 16:42, 17 June 2013 (UTC)
- So... do we really need it? Or is there another, better way of doing it? —CodeCat 18:06, 17 June 2013 (UTC)
- IMO, we don't need it. It would make more sense for {{alternative spelling of}} to apply categories and display text on its own, without invoking {{British}} etc via {{context labelcat}}. Many uses of the template in usage notes should be deleted in favour of regular uses of context templates on sense lines. Even if a few uses are left over once that's done, I expect they'll be very few (because the template only has <150 mainspace uses even now), and to replace them, people could just write out 'jocular' by hand without encasing it in brackets and appending |sub=labelcat, and add any necessary categories manually. (The few etymologies which use it to describe parts of words as onomatopoeia could use a dedicated onomatopoeia template and/or write out any necessary categories manually.) - -sche (discuss) 20:09, 17 June 2013 (UTC)
- I agree with -sche (just above) that the uses in etymology sections can be converted to 'manual', though template use is certainly more editor-friendly and it'd be a shame to see the template go. But {{alternative spelling of}} and {{eye dialect of}}'s use of it is one that allows those templates to include any regional context tag, including such as are created in the future, and IMO that's an important feature of those templates which we should definitely not lose. So either {{context labelcat}} or an equivalent (i.e., some other template that does what it does, reading any (at least regional) context tag, displaying its label, and categorizing) is necessary.—msh210℠ (talk) 05:49, 18 June 2013 (UTC)
- Question: is there any reason one might want {{context labelcat}} to not work? I mean: It sounds from the above that {{context}} is now stable. In that case, {{context labelcat}} should be fine. Is that correct? If not, what further changes might be desired to {{context}}?—msh210℠ (talk) 05:49, 18 June 2013 (UTC)
- I did some checking and it turns out that {{rare|sub=labelcat}} produces the same as {{rare}} itself. So we don't really need the extra template. —CodeCat 12:26, 18 June 2013 (UTC)
- Perfect. Thanks for checking; it looks like you're right. If that's to be true for the foreseeable future — is it? — then we can simply redirect {{context labelcat}} to {{context helper}} and no further work is necessary for this.—msh210℠ (talk) 17:10, 18 June 2013 (UTC)
do you support Wiktionary:Non-free content criteria
- Previous discussion (example of non-free file): File talk:Far Side 1982-05-28 - Thagomizer.png.
In the past, one user proposed to delete the small number of "fairly used" copyrighted files en.Wiktionary hosts locally, citing the fact that en.Wikt did not have an EDP (exemption doctrine policy, allowing copyrighted images to be hosted locally and fairly used) of the nature required by the WMF. In response, I drafted Wiktionary:Non-free content criteria, based on Wikipedia's EDP, but heavily adapted to Wiktionary. The deletion discussions were closed with the files kept... and discussion of our EDP petered out. We are still listed on meta:Non-free content as having a "draft proposal only [on which] consensus has not been reached". So: do you support the non-free content policy I drafted? And/or would you propose a slightly or significantly different policy? Or do you think en.Wikt should not host non-free files locally under any circumstances? Let's see if we can get consensus for an EDP (whether it's my draft or not), or if consensus is that we shouldn't host non-free files. - -sche 21:17, 16 June 2013 (UTC)
- Do we need a formal vote or will a poll on this page suffice? DCDuring TALK 22:16, 16 June 2013 (UTC)
Support
Support DCDuring TALK 22:16, 16 June 2013 (UTC)
Support. — Ungoliant (Falai) 22:51, 16 June 2013 (UTC)
Support. Points 2 and 3 are important (no free equivalent; minimal usage). We ought to be able to do our job as a dictionary with little or no use of non-free content, and that will remove a source of possible trouble. Equinox ◑ 00:23, 17 June 2013 (UTC)
Support (current version or substantively similar).—msh210℠ (talk) 04:29, 17 June 2013 (UTC)
Support. User: PalkiaX50 talk to meh 04:59, 17 June 2013 (UTC)
S —Michael Z. 2013-06-17 15:40 z
Support. --Haplology (talk) 14:22, 18 June 2013 (UTC)
Support. The fact that we may only very rarely have a reason to use such material is no reason not to have a policy for its use where such use is legal. bd2412 T 14:45, 18 June 2013 (UTC)
Oppose
Oppose In my opinion we don't need fair use images at all. From what I see, we only have one right now, so... yeah. -- Liliana • 12:27, 17 June 2013 (UTC)
Abstain
Abstain no strong feelings. Minimal usage is a good idea in case there's a MediaWiki ban on all such images (or files, not necessarily images) so we can remove them quickly if we need to. Mglovesfun (talk) 11:29, 18 June 2013 (UTC)
Arabic dictionary (Sakhr) down but its data can be useful
It used to be the best online Arabic dictionary. It's the only comprehensive dictionary that consistently provided pronunciation (with vowel points) for most of the words. Others have so far failed to do it. I've been in contact with them in the past. When it went down, I contacted them; they replied a year ago that they were still fixing it. So far, no progress. I hope we can get hold of the data and import it into Wiktionary. I made another contact today in the hope that they can release the data:
Dear Sir/Madam,
One of the links above says "under construction", the other never returns anything.
The dictionary has been offline for quite a long time. Will it ever be back up again? Is there another site where the dictionary is working?
If there are no resources to restore the dictionary, are you able to release the data, so that it can be used elsewhere?
There are two possibilities: the English Wiktionary or OMITTED. If your data is in a readable format, it can be reused, so that learners of Arabic could still use it.
Please let me know if you're able to release your data and on what terms or please advise about the progress with restoring the dictionary.
Signed --Anatoli (обсудить/вклад) 03:59, 18 June 2013 (UTC)
- Who wrote this dictionary, and is it an original work? Or did they compile it from various sources? DTLHS (talk) 04:29, 18 June 2013 (UTC)
- The approach to creating this dictionary was similar to that of Wiktionary, EDICT (ja) and CEDICT (cmn). Various users added their contributions, but I'm not quite sure, as the volume was quite big, so they may have had some initial data from somewhere. I could find a lot of various words there in their lemma form with Arabic short vowels written, so that a person knowing the letters could read them. It wasn't too smart, as it didn't separate various senses, but every word's translation was split into parts of speech (Arabic word). A user with very basic knowledge of Arabic could find what they were looking for.
- I'm waiting for their response but wanted to advise that some major importing work may be forthcoming, and to ask whether there are any licensing issues. Also, please say if anyone has found any other decent comparable resource (I doubt there is one). العربية (Arabic) - WordReference Forums is as close as you can get to it; it has sample sentences, but only some words have marked pronunciation. The main hurdle in learning to read Arabic well is not the alphabet but the missing letters: one has to know those words, grammar and patterns. --Anatoli (обсудить/вклад) 04:55, 18 June 2013 (UTC)
Bot(s) to fix Category:Context label misused wherever possible
That's it really. Seems uncontroversial enough so I'll get on with it soon unless someone objects. Mglovesfun (talk) 11:28, 18 June 2013 (UTC)
- How do you mean "misused"? Do you mean used without explicit {{context}}? Without lang= tag? Wrong section? Used other than in a definition line? DCDuring TALK 14:02, 18 June 2013 (UTC)
- Is that a misuse? I thought that usage and grammatical labels could be applied to entire entries or individual senses. —Michael Z. 2013-06-18 15:26 z
- Yes, that was my understanding as well. I put it on the headword line if there are multiple definitions/translations and they all have the same transitivity. SemperBlotto (talk) 15:30, 18 June 2013 (UTC)
- Should headword templates incorporate this for basic info, like (in)transitivity of verbs? Or should they be able to accept any usage or grammar label as a parameter? I suspect the most important consideration here is a consistent UI for editors. —Michael Z. 2013-06-18 15:50 z
- That would be good. {{fr-verb}} accepts type= (but not for (in)transitive); {{it-verb}}, {{de-verb}}, {{es-verb}} and {{pt-verb}} don't (yet) support this. SemperBlotto (talk) 16:03, 18 June 2013 (UTC)
- But verbs are not intransitive or transitive. Senses of verbs are. So this information belongs on the definition lines. —CodeCat 16:14, 18 June 2013 (UTC)
- And so, presumably, nouns are not masculine, feminine or neuter, countable or uncountable, only their meanings are. SemperBlotto (talk) 16:17, 18 June 2013 (UTC)
- I don’t understand the nuance of the meta-semantics, but if a label applies to all senses of a term, then isn’t it clearer for the reader if the label is applied at the headword? I believe print dictionaries do it thus. Are our usage and grammatical labels clearly in a different class from mainly-headword labels like m, uncountable, plural only, plural ---, or superlative most ---? —Michael Z. 2013-06-18 18:03 z
- Because no one has done so explicitly, I'm objecting to this "fix", whether by bot or not. (I agree with Mzajac and SB.)—msh210℠ (talk) 17:13, 18 June 2013 (UTC)
- Semper's right on this one, in French and I believe some other Romance languages, there are verbs that can only be used transitively or only used intransitively. I picked the wrong fix. Mglovesfun (talk) 18:06, 18 June 2013 (UTC)
- How about just qualifier? The templates {{transitive}}, {{intransitive}} and {{reflexive}} don't categorize anyway. Mglovesfun (talk) 18:18, 18 June 2013 (UTC)