Wiktionary:Beer parlour/2012/September

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.


A question on Latvian "neologisms", take 2

In Latvian, a number of words that didn't exist before were introduced into the language in the mid-to-late 19th century by certain authors (among whom the most important was A. Kronvalds); for instance, ķermenis (body) and ziedonis (springtime). I placed them in a new Category:Latvian neologisms, where I am classifying them by author, but User:EncycloPetey pointed out on my talk page that these aren't really neologisms, since they've been around for over a century. I had already worried about that before (I asked this same question in the Information Desk once; see here). Do you guys have any good suggestions as to what I should name these categories, without the word "neologism"? (In the original discussion, someone had suggested "Latvian 19th-century coinages", but that doesn't include all the words, since a few of them were actually invented at the beginning of the 20th century.)--Pereru (talk) 03:54, 3 September 2012 (UTC)[reply]

The terms that you refer to were introduced by the Young Latvians, right? How does Young Latvian coinages sound? Or maybe Latvian National Awakening coinages? —RuakhTALK 04:17, 3 September 2012 (UTC)[reply]
Mostly, yes, but there are exceptions -- notably J. Neikens and K. Mīlenbahs. The Young Latvians were a political/cultural movement born in opposition to the Baltic Germans and their ideas (e.g., that the Latvians were not "a people", that they were simply a peasant class, etc.). It of course included a strong element of valorisation and modernisation of the language. But there were those (like Neikens and Mīlenbahs) who were interested in the language without the politics. You could say that the Young Latvians (and also, to a lesser extent, the Jaunā strāva) were a vital part, perhaps the visible cause, of a climate of growing interest in the Latvian language and culture, but they were not the only ones involved. --Pereru (talk) 14:23, 3 September 2012 (UTC) A further exception is G. F. Stenders, who coined new Latvian words in the late 18th century, over half a century before the Young Latvians began their activities. --Pereru (talk) 01:46, 4 September 2012 (UTC)[reply]
Would "Latvian linguistic-nationalist coinages" be too strong? —RuakhTALK 12:01, 4 September 2012 (UTC)[reply]
That seems like a value judgment that it's not a dictionary's job to make. How about "Latvian 18th- and 19th-century coinages"? —Angr 20:11, 4 September 2012 (UTC)[reply]
Well, Pereru initially objected to "Latvian 19th-century coinages" on the grounds that some of them are actually early-20th-century coinages. I don't think "Latvian 18th- and 19th-century coinages" is much improvement in that regard. As I understand it, the common factor in these coinages is that they were part of an attempt to replace foreign words with words formed on native Latvian roots. You see the same thing in other linguistic-nationalist movements. I'm not sure it's really a "value judgment", in that SFAICT the term doesn't seem to have specific positive or negative connotations, only people who feel positively or negatively about linguistic nationalism itself. —RuakhTALK 21:54, 4 September 2012 (UTC)[reply]
Maybe they could be divided into three categories, one for 18th century coinages, one for 19th century coinages, and one for 20th century coinages? (And then a fourth for 21st century coinages, while we're at it?) The thing is, the term "nationalist" does have negative connotations, at least for a lot of people. —Angr 22:06, 4 September 2012 (UTC)[reply]
Maybe I'm wrong, or atypical, but I feel that while "nationalist" frequently has negative connotations (albeit not as bad as "jingoist"), "linguistic nationalist" does not. Or maybe it's that "nationalist" doesn't have negative connotations when used in academic contexts, which are the only contexts in which "linguistic nationalist" appears. —RuakhTALK 22:14, 4 September 2012 (UTC)[reply]
I almost feel like suggesting "Latvian words coined by Latvian writers" or something like that? The idea was not only to replace foreign words (which they did -- ķermenis (body) was formed on the basis of Old Prussian to replace the germanism ķerpers; cf. German Körper), but also to create necessary "modern" words that didn't exist yet in Latvian, or were expressed in awkward ways -- words like "history", "science", "linguistics", "verb", "hospital", "hotel", "institution", etc... As a first step to get rid of "neologism", I renamed the subcategories with "words" instead (i.e., Category:Latvian words coined by J. Neikens, etc. rather than Category:Latvian neologisms coined by J. Neikens, etc.), so that only the superordinate category retains the word "neologism." I liked the idea of "Latvian 17th-century coinages", "Latvian 18th-century coinages", "Latvian 19th-century coinages", but which superordinate category should contain them? I suppose you wouldn't want them in Category:Latvian neologisms, right? --Pereru (talk) 17:33, 5 September 2012 (UTC)[reply]
Here's an idea. How about calling the overarching category for these words "Latvian ex-neologisms" or "Latvian old neologisms"? In principle, these words were neologisms when they were first proposed; they subsequently lost this status as they became widely accepted, but indeed that is how they started out. Other possibilities that occur to me: "Latvian post 16th-century neologisms" (or "coinages"); "Latvian old literary neologisms" (or "literary coinages"). Would any of these labels seem appropriate to you all? --Pereru (talk) 21:54, 5 September 2012 (UTC)[reply]

Request!

So, I'm inept at generating things from the database, but it would be really, really helpful to me if someone could generate a wikilinked list of pages that include/match /\{\{infl\|hil/ from a recent db dump, or wait until the next one and generate it for me then. Conrad had generated this list for me several years ago, and it's a list of pages that I eventually need to update to match current templating for Hiligaynon, but I'm not capable of generating the list myself. It could be dumped at User:Neskaya/Hiligaynon infl if anyone would be so kind. Thanks in advance. --Neskayagawonisgv? 02:28, 4 September 2012 (UTC)[reply]

None have {{infl|hil, which is due to recent bot activity changing infl to head; I've updated that page with a list of (I hope) all entries with {{head|hil. Striking this section (which, anyway, should probably have been at the GP).​—msh210 (talk) 20:01, 4 September 2012 (UTC)[reply]
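For anyone repeating this kind of request later, the filtering step can be sketched roughly as follows. This is only an illustration: it assumes the dump has already been read into (title, wikitext) pairs by some other tool, the function names are made up for this note, and the sample entries are hypothetical. It matches both the old {{infl|hil and the newer {{head|hil.

```python
import re

# Matches the opening of {{infl|hil...}} or {{head|hil...}}. The word boundary
# stops longer language codes that merely begin with "hil" from matching.
HIL_HEADWORD = re.compile(r"\{\{(?:infl|head)\|hil\b")


def hiligaynon_entries(pages):
    """Return the sorted titles of pages whose wikitext matches the pattern.

    `pages` is any iterable of (title, wikitext) pairs; reading the dump
    itself is deliberately left out of this sketch.
    """
    return sorted(title for title, text in pages if HIL_HEADWORD.search(text))


def as_wikilinked_list(titles):
    """Format titles as a wikitext bullet list of links."""
    return "\n".join(f"* [[{title}]]" for title in titles)


# Hypothetical sample data, standing in for pages read from a dump.
sample = [
    ("balay", "==Hiligaynon==\n{{head|hil|noun}}"),
    ("tubig", "==Tagalog==\n{{head|tl|noun}}"),
    ("adlaw", "==Hiligaynon==\n{{infl|hil|noun}}"),
]

print(as_wikilinked_list(hiligaynon_entries(sample)))
```

The printed text could then be pasted into a user subpage such as the one named in the request.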

Scribbling

From the end of August, the new Scribunto extension to MediaWiki has been available on the Test2 Wikipedia. I've spent some time Scribbling some of the English Wikipedia's templates to see whether and what improvements to page rendering speed will be obtained. You can see some of the results at w:Project:Village pump (technical)#Scribunto comparisons.

Now my thoughts turn to Wiktionary. Do we have any particularly complex templates containing {{#if:}}, {{#switch:}}, {{#expr:}}, and so forth? I have half a mind to Scribble {{en-verb}}, or perhaps the Classical Greek third declension, but if there's something better to attempt that you know of, please speak up. Bear in mind that I'm not about to tackle the Scribbling of every Scribblable template on every project single-handed.

Scribunto is scheduled (at least) to hit this wiki in a day or so.

Uncle G (talk) 18:58, 4 September 2012 (UTC)[reply]

The main advantage I see in this is that it would reduce the number of inflection templates and the parameters they need. If we have string functions in Lua, then we would be able to automatically remove things like infinitive endings, and add different ones based on the original (so no more -ir, -ar, -er templates for Romance languages, just adding {{es-conj}} with no parameters could suffice). —CodeCat 21:18, 4 September 2012 (UTC)[reply]
I think we should try not to overdo attempting to pick what the user is likely to want. On mostly consistent languages, (especially eo and io), string manipulation could eliminate the need for supplying parts of the page title as arguments. --Yair rand (talk) 21:47, 4 September 2012 (UTC)[reply]
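The string-manipulation idea in the two comments above can be sketched as follows. The sketch is in Python rather than Lua purely for illustration, the function name is invented for this note, and it covers only the present indicative of fully regular Spanish verbs; it is not how any actual {{es-conj}} works.

```python
# Present-indicative endings for regular Spanish verbs, keyed by the
# infinitive ending that is stripped from the page title.
PRESENT_ENDINGS = {
    "ar": ["o", "as", "a", "amos", "áis", "an"],
    "er": ["o", "es", "e", "emos", "éis", "en"],
    "ir": ["o", "es", "e", "imos", "ís", "en"],
}


def conjugate_present(page_title):
    """Derive the six present-tense forms from the infinitive alone.

    The page title is the only input, so no stem or ending has to be
    passed in as a parameter.
    """
    stem, ending = page_title[:-2], page_title[-2:]
    if ending not in PRESENT_ENDINGS:
        raise ValueError(f"not a recognised infinitive: {page_title!r}")
    return [stem + suffix for suffix in PRESENT_ENDINGS[ending]]


print(conjugate_present("hablar"))
```

Irregular and stem-changing verbs would still need explicit data or overrides, which is the caution raised in the reply about not guessing too much on the user's behalf.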
  • @Yair rand: I'm not convinced that {{context}} and {{Xyzy}} should be Luacized. Both of them require large amounts of data. Currently, their data are supplied by helper templates (e.g. {{medicine}} in the case of {{context}}; e.g. {{he/script}} in the case of {{Xyzy}}); Lua modules don't seem to have an analogous mechanism, so presumably all of the data would simply have to be hardcoded into the module. (Right?) I admit that I don't know very much about Scribunto, but it seems unlikely to me that that would be a win. —RuakhTALK 21:34, 4 September 2012 (UTC)[reply]
  • Addendum: Actually, I now see that this section seems to argue that we really should include data-tables in our modules. I mean, it gives that as an alternative to having a Lua function call out to a real template, and my above comment was already taking for granted that that's a bad idea, but still. —RuakhTALK 21:41, 4 September 2012 (UTC)[reply]
  • There's already no need for {{context 1}}, etc. — it's a long-obsolete approach — but the last time the subject came up, you said we might as well not bother fixing it until the Lua extension came out, so here we are. :-P   —RuakhTALK 21:57, 4 September 2012 (UTC)[reply]
    Hm, so I did. I really don't know about the performance difference between Lua and ParserFunctions. If Lua is always/mostly faster than parserfunctions, and transcluding is at least as fast, then I don't know how any rewrite of context could be as fast as having it in Lua. If transcluding is slower, on the other hand... --Yair rand (talk) 22:12, 4 September 2012 (UTC)[reply]
  • I talked a bit about this topic on IRC a while ago, and yes, Lua would mean we should probably move all our language and script templates and whatnot into huge Lua modules, which can be parsed far more efficiently than transcluding thousands of pages like what water is doing right now. -- Liliana 20:27, 5 September 2012 (UTC)[reply]
  • Sorry, but that statement is pretty meaningless. [[water]] is the most template-intensive page on the wiki, by a wide margin, but optimizing [[water]] at the expense of every single other page would still probably be a net loss, overall. (I'm not saying we shouldn't optimize [[water]]. But it's a special case, not a representative one, and that should be a consideration in any optimization of it. For example, instead of using {{t}}, maybe it should be using some sort of custom template that demands that all of its information be supplied via parameters rather than via template lookups.) —RuakhTALK 01:50, 6 September 2012 (UTC)[reply]
A system for generating Hebrew conjugation tables would be awesome, by the way, though I have no idea how that could work.
{{isValidPageName}} could become no longer reliant on hacks, {{head}} could be made lighter (?), transliterations (including {{IPA}}-to-SAMPA) could be automated until the transliterator extension is installed (though I don't know if it will ever be installed at this point). --Yair rand (talk) 21:47, 4 September 2012 (UTC)[reply]
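The "data table in the module" approach discussed in the bullets above, as opposed to one helper template per language such as {{he/script}}, might look something like this. It is a Python stand-in for what would be a Lua table, the names are invented for this note, and the handful of entries is only a sample; whether this is actually faster than transclusion is exactly the open question in the thread.

```python
# One table held in the module replaces a separate helper page per language.
# Values are ISO 15924 script codes; the selection of languages is arbitrary.
SCRIPT_BY_LANGUAGE = {
    "he": "Hebr",
    "el": "Grek",
    "ru": "Cyrl",
    "lv": "Latn",
}


def script_for(lang_code):
    """Look up the script for a language code, or None if it is not listed."""
    return SCRIPT_BY_LANGUAGE.get(lang_code)


print(script_for("he"))
```

A page with thousands of translations would then do thousands of dictionary lookups against one loaded table instead of thousands of separate page lookups.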

I've created WT:LUA so that we can develop and discuss best practices. Maybe all further discussion should be on its talk page? —CodeCat 21:52, 4 September 2012 (UTC)[reply]

O, frabjous day. DCDuring TALK 01:07, 5 September 2012 (UTC)[reply]
  • So, was Uncle G simply mistaken when he said Scribunto was set to be installed here today? Because the roadmap that he linked to says only that 1.20wmf11 would be installed today; that's already happened, and the item on the roadmap is marked with a "done" checkbox, but we still don't have Scribunto. —RuakhTALK 19:36, 5 September 2012 (UTC)[reply]
    • They forgot about us... :( —CodeCat 22:15, 5 September 2012 (UTC)[reply]
    • I'm sorry for the misunderstanding. Yes, Uncle G was mistaken. Wiktionary did not get Scribunto today. I'm going through the documentation to make it all reflect reality better and be clearer; there is no timetable, at present, for deployment of Scribunto to wikis other than mediawiki.org and test2.wikipedia.org, and I will make our roadmaps and mw:Lua scripting reflect that better, then report here with any more info. Sumanah (talk) 23:42, 5 September 2012 (UTC) (Wikimedia Foundation's Engineering Community Manager)[reply]
      • And one more thing -- meta:Tech/Ambassadors is the beginnings of a draft regarding our Tech Ambassadors Network: The Tech Ambassadors Network (TAN) is a community of liaisons among developers and local Wikimedia wikis.....One goal of the ambassadors network is to make sure that users are notified of technical discussions and possible changes that impact them. The other goal is for users to get involved, as peers in the development process, so that they can inform and guide software development, not just provide feedback after it's done....If you've ever asked "Why wasn't I consulted about this feature?", you should become an ambassador. Anyone here who wants more notifications and discussion on Lua and other upcoming features should consider subscribing. And comments on the TAN idea are also welcome on the talk page. Thanks. Sumanah (talk) 23:46, 5 September 2012 (UTC)[reply]

NOTE: official request for this to be enabled: https://bugzilla.wikimedia.org/show_bug.cgi?id=40031 - Amgine/ t·e 00:11, 6 September 2012 (UTC)[reply]


Babel templates for template proficiency

I've created a set of templates, {{User template-1}} and so on, so that users can show how well they know template coding. I hope it's useful, and hopefully people will include it even if they don't know any at all, so that others know not to ask them. :) —CodeCat 23:23, 4 September 2012 (UTC)[reply]

Would that indicate how well they know MediaWiki, or how well they know Wiktionary? Like, if someone has written and edited fifty million templates on en.wiki, and knows the ins and outs of every parser-function even better than the parser itself does, and then they come here, should they have a {{User template-0}} because they don't even know the difference between {{term}} and {{onym}}, let alone the effect of modifying {{he/script}} and {{etyl:he-IL}}? —RuakhTALK 23:29, 4 September 2012 (UTC)[reply]
I don't really know. Someone who has little knowledge of Wiktionary specifics could still very easily make an inflection table for example, so they could start off as level 2. I suppose part of proficiency in templates is knowing how all the templates they themselves use as well, to know the 'whole picture'. —CodeCat 23:33, 4 September 2012 (UTC)[reply]

Legal Fees Assistance Program

The Wikimedia Foundation is conducting a request for comment on a proposed program that could provide legal assistance to users in specific support roles who are named in a legal complaint as a defendant because of those roles. We wanted to be sure that your community was aware of this discussion and would have a chance to participate in that discussion. If this page is not the best place to publicize this request for comment, please help spread the word to those who may be interested in participating. (If you'd like to help translating the "request for comment", program policy or other pages and don't know how the translation system works, please come by my user talk page at m:User talk:Mdennis (WMF). I'll be happy to assist or to connect you with a volunteer who can assist.) Thank you! --Mdennis (WMF) (talk) 15:32, 5 September 2012 (UTC)[reply]


Speedying merged transwikis

I think that if a nonadmin merges the useful content of a transwiki: page into the mainspace, he should then be able to tag the former with {{d}} instead of {{rfd}}. Thoughts?​—msh210 (talk) 22:15, 6 September 2012 (UTC)[reply]

If we take content from a page, aren't we required to attribute that content to its copyright holders? That generally means that deletion is impossible. —RuakhTALK 22:38, 6 September 2012 (UTC)[reply]
What do you mean? Of course it's possible. The original edit history is copy-pasted into the talkpage of the article that incorporates it. (If I'm missing something here, do tell.) --Μετάknowledgediscuss/deeds 23:24, 6 September 2012 (UTC)[reply]
If by " [] it's possible. The original edit history is copy-pasted into the talkpage [] ", you mean "it's possible: the original edit history can be copy-pasted into the talkpage", then yes, I suppose that's true. But do people actually do that? For example, you recently expanded [[cadet#Etymology]] with attribution to [[Transwiki:Cadet (genealogy)]] — and then, with your blessing, the latter was deleted. As far as I'm aware, a non-admin can no longer track down the list of contributors. Is this because you (and others) searched for copyright implications and determined that there were no problems? (Information per se, of course, is not copyrightable, so this is plausible.) Is it because you thought that copyright implications were addressed in some way? (If so, I think you were mistaken.) Or is it because no one thought about copyright implications at all? No one made any suggestive comments in any of these directions, so it's hard to know. —RuakhTALK 00:39, 7 September 2012 (UTC)[reply]
I've seen it done before, and I rather assumed that the deleting admin would copy-paste it again. I obviously am not one of those more knowledgeable about copyright, so I do it just in case. Minor note: I assume that for your edit summaries you are copy-pasting Μετάknowledge instead of typing it out on a Greek keyboard. Either way, I really don't mind if you call me Metaknowledge or Meta or MK or really anything else that's obviously me, if it makes it easier for you. --Μετάknowledgediscuss/deeds 00:54, 7 September 2012 (UTC)[reply]
Re: first sentence: I'm guessing — and it's quite possible that I'm wrong, and that you've seen exactly what you describe — but I'm guessing that what you've seen is the remnants of really old transwikis, from before transwikis were an actual software feature. Back then, transwiki-ing was performed by a bot that copied the contents of a Wikipedia article over to Wiktionary, and copied the history of the article over to the corresponding talk-page.   Re: minor note: Yes, I'm copy-pasting from the edit-window. Don't worry, I don't feel obligated to do it; it had literally never once crossed my mind that you might object to one of the alternatives. :-P   —RuakhTALK 01:49, 7 September 2012 (UTC)[reply]
That's probably it. I should never expect human diligence when bot code is much more likely. Re Msh210's original post, I think it's fine as long as the said copy-pasting occurs, unless, of course, somebody has a reason why we needn't do so. --Μετάknowledgediscuss/deeds 02:17, 7 September 2012 (UTC)[reply]
Well, when what's to be merged is information (but the wording is completely changed), I doubt it's necessary to worry about copyright. (IANAL.) And that's often, but certainly not always, the case if we already have an entry and are merely adding in some additional information from the WP article. (When we don't already have an entry, we move the Transwiki: page anyway, so this is moot.) I agree with you (Ruakh) that a deleted page's history is probably insufficient attribution under WP's licenses. (Again, though, IANAL.)​—msh210 (talk) 05:58, 7 September 2012 (UTC)[reply]
Re greaser: I have now done what I only hesitated to do because there was already a Talk:greaser: I have preserved the history of Transwiki:Greaser (derogatory) at Talk:greaser via move. - -sche (discuss) 03:02, 7 September 2012 (UTC)[reply]
It cannot be the case that we are obliged to "preserve edit history" if we extract ideas (not copyrightable) or snippets ("fair use") from a to-be-deleted transwiki, any more than if we extract such material from a WP article which is subsequently deleted or an interwiki definition that is subsequently deleted from the originating wiktionary edition. This would be significantly more burdensome than "fair use" of ordinary copyrighted material.
Whether -sche copied two consecutive sentences from the transwiki into [[greaser]], I couldn't tell from the preserved edit history, which few indeed would know to look for at the Talk page. OTOH, I am grateful that it wasn't preserved in the principal namespace page history. It is already hard enough to track down specific changes to an entry. This seems a great deal of effort to preserve a figleaf of strict compliance with the MWF(?) license. If we really need to do it, there should be an automated way to do it.
Perhaps we should be more diligent in heading off transwikis by participating in the process at WP. DCDuring TALK 13:53, 7 September 2012 (UTC)[reply]
I'm not sure "fair use" applies to a case like this, where our goal really is to completely supersede the original work. Basically, my point is that we need to take the attribution requirement seriously, because we want to insist that other sites take it seriously. We send our legal ninjas after mirrors that don't properly attribute their content to Wiktionary; we should be just as diligent in attributing our content to those who contribute it.
When we determine that what's taken really is not subject to copyright — all that's taken is information/ideas, rather than the original expression of that information/those ideas (caveat: the choice of which information to present can be copyrightable expression, so merely changing the superficial appearance may not be sufficient) — then there's no legal obligation to preserve the edit history, but I think we should make explicit mention, in discussions and in edit summaries, of this determination. It's not enough for an editor to be privately, secretly aware that (s)he's done the right thing; (s)he needs to mention it, so that others learn, "Oh, yes, this is something we have to pay attention to." Editors often get this wrong, so even when we get it right, we should "show our work" so that everyone can see that we've gotten it right (or at least that we believe we have).
Re: "OTOH, I am grateful that it wasn't preserved in the principal namespace page history": Me, too. I really hate it when the histories of two pages are merged. But, when we take copyrighted content from one page to another, our edit-summary should always indicate the source. In a case like [[greaser]], if the content was copyrighted, then that attribution could have been made by pointing to the talk-page.
RuakhTALK 15:34, 7 September 2012 (UTC)[reply]
I may have misunderstood the purpose of preserving edit history. I thought we were talking about respecting the original contribution of contributors. If I understand you correctly, the principal purpose is to enable us to show that such contribution is not a verbatim copy of copyrighted text by exposing in the edit history all the text ever added. Is that right? DCDuring TALK 17:31, 7 September 2012 (UTC)[reply]
Firstly — IANAL, but I don't think we need to take pains to prove that we're not stealing copyrighted text; if it ever somehow came to such a point, the original deleted text is still in the database, accessible to admins and subpoenas. So as far as that goes, I'm just saying that we should make a point of explicitly mentioning that we're not stealing copyrighted text: i.e., of making clear that this is an issue we're on top of.   Secondly — IASNAL (S = "still"), but I think that the attribution requirement only requires a list of contributors. Obviously our usual mechanism for presenting the list of contributors is our edit-history, which also gives access to much more detailed information (who contributed what and when), but a mirror could choose to satisfy the requirement simply by listing all the usernames that appear in the edit-history, plus any edit-summaries that give attribution elsewhere. Minus, perhaps, any usernames that appear only on edits that the mirror determines were rolled back, if they want to be really parsimonious. (Until quite recently, I thought differently — to the point that I thought it was a copyright problem if we had to hide certain old versions that included valid contributions — but I now think I was mistaken.)   Thirdly — no thirdly. IASSNAL. If you want three points, consult an attorney. :-)   —RuakhTALK 18:03, 7 September 2012 (UTC)[reply]
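The "list of contributors" reading in the second point can be sketched as below. This only illustrates the mechanics described in that comment and is not a statement of what any licence requires; the revision format (dicts with a "user" key and an optional "reverted" flag) and the function name are assumptions made for this note.

```python
def attribution_list(revisions, skip_reverted=False):
    """Return the distinct usernames in a page history, in order of first edit.

    `revisions` is an iterable of dicts, oldest first, each with a "user" key
    and optionally a "reverted" flag. With skip_reverted=True, a user is
    dropped only if every one of their edits is flagged as reverted.
    """
    contributors = []
    for rev in revisions:
        if skip_reverted and rev.get("reverted", False):
            continue
        if rev["user"] not in contributors:
            contributors.append(rev["user"])
    return contributors


# Hypothetical page history.
history = [
    {"user": "Alice"},
    {"user": "Vandal", "reverted": True},
    {"user": "Bob"},
    {"user": "Alice"},
]

print(attribution_list(history))
print(attribution_list(history, skip_reverted=True))
```

Edit summaries that give attribution elsewhere would still have to be carried along separately, as the comment notes.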

Word of the day

Due to school and some other stuff, I won't be able to edit as frequently anymore. I would appreciate it if someone could take over the WOTD. —Internoob 02:23, 7 September 2012 (UTC)[reply]

Dude, I wish I could. Who did it last time we had an impending gap? --Μετάknowledgediscuss/deeds 03:23, 7 September 2012 (UTC)[reply]
EncycloPetey set WOTD for a long time, then Widsith did it until work started keeping him too busy. Then I did it until I burnt out (though I've been setting December 2012's words). Ruakh helped out, and then Internoob took over. And that's the history of WOTD. - -sche (discuss) 03:43, 7 September 2012 (UTC)[reply]
The beauty of the WOTD system (and the reason I have opposed efforts to make it year-dependent) is that if no-one sets a word on a particular day, the previous year's word is shown (a failsafe). - -sche (discuss) 20:04, 7 September 2012 (UTC)[reply]
If needed, I could perhaps fill in for a month, but that month would be November. Changes in my work and life offline mean that I'm usually too busy to commit to that kind of long-term work anymore. --EncycloPetey (talk) 05:17, 18 September 2012 (UTC)[reply]
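The failsafe described above follows directly from storing words by month and day rather than by full date. A minimal sketch, with an invented class name and made-up sample words, not the actual WOTD templates:

```python
from datetime import date


class WordOfTheDay:
    """Words are keyed by (month, day) only, never by year."""

    def __init__(self):
        self._by_day = {}

    def set_word(self, month, day, word):
        # Setting a word overwrites whatever was stored for that calendar day.
        self._by_day[(month, day)] = word

    def word_for(self, when):
        # The year is ignored, so an unset day shows the last word stored for it.
        return self._by_day.get((when.month, when.day))


wotd = WordOfTheDay()
wotd.set_word(9, 7, "example")          # set during 2011, say
print(wotd.word_for(date(2012, 9, 7)))  # nobody set 2012: last year's word shows
```

Making the key year-dependent would turn the unset case into a blank, which is the objection raised in the comment.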

Foreign Word of the Day code

The vote for having Foreign Word of the Day on the Main Page passed, but the code that was mentioned in the “Voting on” section of the vote was bugged. So... should the bugged (but voted on) code be added to the Main Page, or the fixed (but not voted on) code?

Also, can we add a request for nominations to the Main Page for a week (see here)? This wasn’t mentioned in the vote so I’m asking here. Note that if we limit ourselves to at most two FWOTDs per language per month there are already more than enough valid nominations for a month. — Ungoliant (Falai) 19:37, 7 September 2012 (UTC)[reply]

Definitely don't add the buggy version. If you're confident that the fixed version still satisfies the spirit of the vote, and differs from the buggy version only by fixing the bugs, then sure, add that instead. —RuakhTALK 20:00, 7 September 2012 (UTC)[reply]
I think if the fixed code is just that, fixed, then there is no problem with adding it. But I wonder about the request for nominations. Do we really need to ask for even more of them? We do have quite a few already. —CodeCat 20:27, 7 September 2012 (UTC)[reply]
There also were some minor cosmetic changes. The problem isn’t quantity, but language scope. Since most valid nominations were nominated/cited by Metaknowledge and me, most of them are Germanic, Romance and Oceanic. We need more than that. — Ungoliant (Falai) 20:50, 7 September 2012 (UTC)[reply]

Quoth DCDuring "Do we want to be in the position of carrying water for trademark holders?" Quoth bd2412: "This a burden we should not assume. Trademarks are impermanent. Trademark registrations are for limited periods, and are lost if the registration is not renewed or if the mark is not used for some period of time. Furthermore, most every word in the lexicon has been used by someone at some point as a "trademark". Are we then responsible for noting that ace is a trademark for a brand of bandages, and for a hardware store; that dove is a soap maker, and a chocolate bar; that planters are nuts, Peter Pan is a peanut butter, Pam is a cooking spray, and falling star is a wine? The most we can and should say is that a word (such as aspirin or escalator) originated as a trademark, without commenting on its current status."

I very much agree with these sentiments. However, as Template talk:trademark records, Template:trademark was kept based on this vote from 2007. Given that community opinion in more recent discussions has changed from what it was in 2007, should another vote be held? Or is it enough, given the inconclusiveness of that vote, to have this BP discussion of whether or not to delete {{trademark}}, without a vote? - -sche (discuss) 02:59, 8 September 2012 (UTC)[reply]

Note that this is a discussion of whether or not to indicate that words are trademarks. This particular discussion has no impact on which words we include. - -sche (discuss) 03:03, 8 September 2012 (UTC)[reply]

What is the WT way of RR?

According to w: Revised Romanization of Korean #Consonant letters, the Korean consonant string /ㄷㅎ/, as in 닫히다 and 받히다, the passive forms of 닫다 "to close" and 받다 "to butt," respectively, shall be romanized into either /t/ or /th/ or /ch/, embarrassingly. --KYPark (talk) 12:55, 8 September 2012 (UTC)[reply]

Indeed there are two valid ways of the dualist RR:

  1. 1:1, hence simple, clear-cut, academic, orthographic transliteration of written Korean. (Personally I've preferred this academic way since 2006.)
  2. 1:1+, hence complex, confusing, secular, orthoepic transcription of spoken Korean. (WT admins often press me to use this confusing way, hence my agony.)

The WT authorities concerned are cordially asked to make clear to Korean editors including myself which is the definite unnegotiable standard, before and after this moment.

--KYPark (talk) 13:38, 9 September 2012 (UTC)[reply]

See also User:KYPark/rrok and find which of the four romanizations, if given alone, enables you to make the best guess of the corresponding hangul orthography unavailable. May I take this opportunity to argue strongly that though imperfect, this at least should be the minimum essential standard? --KYPark (talk) 15:02, 9 September 2012 (UTC)[reply]

We traditionally use the system of RR that you call "orthoepic", which is what I believe is standard RR, and matches all the official use of RR that I have seen in Korea. You call it "confusing", but to a native speaker, it should be very simple (I find it to be the most useful myself, and not difficult in the least). Also, I find your misleading parenthetical notes about which is more "academic" to be very disingenuous, and your continuing to use your preferred system (after several Hangeul-reading editors have told you that Wiktionary does not use that system) to be frustrating. --Μετάknowledgediscuss/deeds 15:45, 9 September 2012 (UTC)[reply]
You dare to blame me for being "very disingenuous" meaning even "deceptive."
  • RE ""confusing"" : Refer to the passage on top, ending with "embarrassing" instead this time, wishing the confusion implied there to be resolved. Please resolve it.
  • RE ""academic"" : It is not my wording but the Korean government's allowing the way I prefer to be used for "academic" purposes, from the dualist perspective. Therefore, this should be explicitly ruled out, and on this base only you should ask me to stop it. Otherwise you press me unprincipled, roughly based on "tradition" and your personal judgment. I ask you to go beyond that "which is what I believe is standard RR" as well as the "reasonable doubt."
Remember I am responding to your previous advice for me to bring the question of using that academic way to this forum. And I added my personal agony. But now you look like focusing on personal attack on me, I fear, making a "disingenuous" liar of me because of that minor side effect. Is this a fair play? Are you focusing on the main point in question? BTW, it's too late; I have to go to bed.
--KYPark (talk) 17:07, 9 September 2012 (UTC)[reply]
닫히다 and 받히다, 닫다 and 받다: dathida, bathida, datda, batda. —Stephen (Talk) 17:44, 9 September 2012 (UTC)[reply]
KYPark, I can't believe you're accusing me of personal attacks after all the rude things you've said about me and other editors in the Etymology Scriptorium. In any case, the "base" upon which I ask you to cease is community consensus. Only I and Stephen have commented so far in this thread, and I have expressed my preference, while Stephen has given transliterations that align with that preference. I really cannot imagine how this is causing you "personal agony". --Μετάknowledgediscuss/deeds 20:52, 9 September 2012 (UTC)[reply]
Let me move leftmost...
To Stephen

On what grounds is the /ㄷㅎ/ to /th/ other than /t/ and esp. /ch/, though WP treats all as equal, as suggested on top? Did WT ever make /th/ an explicit rule? I wonder if RR ever made it very clear, but the so-called orthoepic way of RR is basically but roughly based on the so-called "Phonetic Hangul." Otherwise, unprincipled! Accordingly, RR must take it that:

        Canonic      Phonetic
        Orthography  Orthoepy   Remarks  
Hangul  닫히다       다치다    
Roman   dadhida      dachida    dathida ?? 

So I suspect WP and WT arbitrarily introduced /th/ to align with /kh/ (ㄱㅎ) and /ph/ (ㅂㅎ) which may be also arbitrary. Go and see the following real-life examples, incidentally 3:3, half and half! Anyway, this may well sum up the state of confusing WT traditions. See also more confusions in a nutshell I worked on hard. Naturally you find no Sino-Korean but native Korean examples of /-ㄷㅎ-/.

Hangul  Phonetic  Arbitrary? RR  Principled RR
백화점  (배콰점)  baekhwajeom
역할    (여칼)    yeokhal
입학    (이팍)    iphak
백합    (배캅)                   baekap
특히    (트키)                   teuki
낙하산  (나카산)                 nakasan

May I take this opportunity to ask you if you made this mistake last year purely by accident? Your frank witness would greatly influence the standard RR to be or not to be refined. Regards.

낮 말은 새가 듣고 밤 말은 쥐가 듣는다
naj mal-eun saega deudgo bam mal-eun jwiga deudneunda

--KYPark (talk) 04:29, 10 September 2012 (UTC)[reply]
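For readers following the thread, here is a minimal Python sketch of the letter-by-letter ("orthographic") romanization illustrated by the proverb above. It is not an official or Wiktionary-endorsed implementation. It assumes the standard Unicode arithmetic for splitting a precomposed Hangul syllable into initial, medial and final letters, writes ㄹ as "l" everywhere, and inserts a hyphen only where a syllable beginning with silent ㅇ follows a final consonant (as in mal-eun). The function name `transliterate` and the three tables are hypothetical names chosen for this illustration.

```python
# Letter-by-letter romanization sketch (illustrative only, see assumptions above).
S_BASE = 0xAC00  # code point of the first precomposed Hangul syllable

# Romanized letters in Unicode jamo order; index 11 of INITIALS is silent ㅇ.
INITIALS = ["g", "kk", "n", "d", "tt", "l", "m", "b", "pp", "s",
            "ss", "", "j", "jj", "ch", "k", "t", "p", "h"]
MEDIALS = ["a", "ae", "ya", "yae", "eo", "e", "yeo", "ye", "o", "wa", "wae",
           "oe", "yo", "u", "wo", "we", "wi", "yu", "eu", "ui", "i"]
FINALS = ["", "g", "kk", "gs", "n", "nj", "nh", "d", "l", "lg", "lm", "lb",
          "ls", "lt", "lp", "lh", "m", "b", "bs", "s", "ss", "ng", "j", "ch",
          "k", "t", "p", "h"]


def transliterate(text):
    """Romanize each Hangul syllable letter by letter, ignoring sound changes."""
    out = []
    prev_had_final = False
    for ch in text:
        offset = ord(ch) - S_BASE
        if 0 <= offset < 19 * 21 * 28:
            initial, rest = divmod(offset, 21 * 28)
            medial, final = divmod(rest, 28)
            # Assumption: hyphenate silent ㅇ only after a final consonant.
            if initial == 11 and prev_had_final:
                out.append("-")
            out.append(INITIALS[initial] + MEDIALS[medial] + FINALS[final])
            prev_had_final = final != 0
        else:
            out.append(ch)  # spaces and non-Hangul characters pass through
            prev_had_final = False
    return "".join(out)


print(transliterate("닫히다"))
print(transliterate("낮 말은 새가 듣고 밤 말은 쥐가 듣는다"))
```

Under these assumptions the first call gives dadhida, the form KYPark argues for, and the second reproduces the romanized line quoted above. The pronunciation-based forms discussed in this thread (dathida, dachida) cannot be derived this way, because they depend on sound-change rules that a letter-by-letter mapping ignores.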

You asked how to transliterate those letters and I told you. If you have found a mistake, you should fix it, just like I fix your mistakes. You are still griping and complaining about RR after these six long years. Enough already! We use the Revised Romanization here. If you cannot do it, then do not put transliterations. We will do the transliterations for you if they are so difficult. Please note: this paragraph is the answer to your questions. This paragraph does not contain a question. —Stephen (Talk) 09:43, 10 September 2012 (UTC)[reply]
To fix mistakes is one thing; to fix what are mistakes at all is another. Don't mix both here, as the latter solely matter here. As perhaps one of the most influential here, you should focus on all I ill or well bring to everybody's attention. For that's what we have to collaborate to resolve. Please don't focus on attacking me unproductively, uselessly to the community. I really hate your emotionally pointed point of view. Didn't I say "Regards" of goodwill to you? Is this really the way you would respond to that? Will you remain such an awful enemy to me forever, for what at all? Anyway please answer the question at the end.
--KYPark (talk) 10:41, 10 September 2012 (UTC)[reply]
RE: "Please note: this paragraph is the answer to your questions."
This is not really the answer to my question beginning with "On what grounds is the /ㄷㅎ/ to /th/ other than /t/ and esp. /ch/ ..." but in effect an escape from it, I fear. Please advise me the reason for nothing but /th/ for /ㄷㅎ/. Sincerely. --KYPark (talk) 11:16, 10 September 2012 (UTC)[reply]
I have no idea what you are talking about. I don’t know what you mean by mistakes versus mistakes at all. I don’t attack you productively or unproductively, there is no emotion in what I write. You asked how to transliterate those letters and I told you. There’s an end to it. If you did not want to hear my answer, then you should not have asked. We discussed grounds and reasons six years ago, I am not going to rehash it because I already know that you will never accept the community decision. I am not going to argue with you about it. I have already said all that I am going to say on the matter. —Stephen (Talk) 11:28, 10 September 2012 (UTC)[reply]
There would be no worse way of speaking ill of me in this collaborative community than saying that "you will never accept the community decision." You make a self-righteous man of me, while I am seriously asking what or which is exactly "the community decision" at all regarding the very Janus-faced RR in disguise. And you just escape from answering a simple question I ask you, to my great regret, not to mention from any critical review of my reasoning based on the relevant RR principle (say, phonetic hangul basis) and real-life practices in a nutshell. I am really disappointed. Enough is enough; let's stop here. I can lead a horse to water but I can't make him drink! --KYPark (talk) 13:23, 10 September 2012 (UTC)[reply]
To Metaknowledge

I always wish you to remain more careful and consistent. You doubt my "personal agony" which was explicitly given above: "press me to use this confusing way hence to my agony" which is manifold: (1) to be pressed to use the confusing way, (2) which itself in turn is embarrassing to become the source of mental stress that is simply undecidability, and (3) maybe blocked if I keep going "my way" in spite and instead of your "preference" warning!

Honestly I suspect you of trying to find some reason for blocking me long. Irrelevantly, indeed, you bring even WT:ES events into BP. Yeah, my karma or what I have ill or well done cannot be undone. I wonder if it helps resolve this agenda.

I make it a rule to attack ahead of no one, but perhaps tougher than an eye for an eye when attacked, as I did in ES. But ideally I prefer cosmic w:moral reciprocity such as do unto others as you would have them do unto you or 역지사지, as I recently created specially to this end. [ I correct "anyone" to "no one" above. --KYPark (talk) 04:03, 11 September 2012 (UTC) ][reply]

I raised this agenda step by step, starting from quite a simple straightforward question. It could have ended up with a simple answer. But no one answered quite a while. So I escalated again and again. At last Stephen answered very simply but quite doubtfully to me, as I suggested to him above.

You mentioned your "preference" in reply, which is not the right answer, I fear. The question is "What is the WT standard of RR?" especially "unnegotiable." If you as an admin repeatedly impose your mere "preference" on me, in effect you threateningly do unto me, as you once warned me a long blocking, as I am easily scared by you, you know!

I greatly regret I have to say all these, regardless of the main point. You should know how hard I am preparing myself for its resolution; this byproduct is never ever what I want.

--KYPark (talk) 04:29, 10 September 2012 (UTC)[reply]

To be maximally clear: I will not block you over transliteration. I will not block you for a long period of time (i.e., more than a couple days) without bringing it up here first. I think that adding a Korean hanja box to the English section of an entry on purpose and defending it is disruptive, and because you have been warned, I will block you for a short period of time if I see you do that again. That's all. Honestly, I would much rather never block you. I apologize sincerely if I have scared you.
Nothing here is "unnegotiable", but my preference happens to be the same as what almost all Korean entries and translations on Wiktionary follow for RR. That is why I have been modifying your transliterations. --Μετάknowledgediscuss/deeds 04:45, 10 September 2012 (UTC)[reply]
I hope you guys calm down. Here's my thoughts:
Using a standard or most common transliteration is very important but not critical. KYPark, please reconsider this carefully. What seems to be good and natural to YOU, may not be good to others and may put off users, editors. RR has become very popular among Korean learners, that's the fact.
Losing a passionate editor, a native speaker or a person with a good knowledge of a language over transliteration is not wise. I had to compromise over this before. Perhaps we can compromise too. One option is to leave all entries and translations WITHOUT transliteration altogether until the dispute is resolved. Korean is not a language, which has a long tradition of using one standard transliteration, so a compromise might be possible, I think.
My preference is that we use Revised Romanization of Korean (RR) but I'm not sure if the Wikipedia article describes it completely in English (or any other web-site) and all the nuances are subject to the decision of Wiktionarians. My knowledge of RR is not so great but it's definitely better than McCune–Reischauer (MR).
Sorry if I sound confusing. I may have done a bunch of inconsistencies as far as Korean transliteration is concerned. BTW, I think Stephen was occasionally using a very relaxed Korean transliteration, so I wouldn't worry about why he transliterated the way he did. We don't have a Wiktionary document describing in a clear way how we should transliterate, so we get all sorts of varieties. As I said, transliteration is important but it's more important that the native script is written correctly. --Anatoli (обсудить/вклад) 05:50, 10 September 2012 (UTC)[reply]
To Anatoli

Surprisingly, you include everything most vital in your first paragraph, as I note and respond as follows:

"KYPark, please reconsider this carefully."

I did "reconsider this carefully" over and over again, at least no less than half a dozen years, as explicitly referenced and discussed above. It's now your turn to do hopefully that much.

"What seems to be good and natural to YOU, may not be good to others."

This sounds the very way of my motto I've implicitly raised recently. This sounds the very goal of my motto I've implicitly praised recently all the way and perhaps forever!
But this is not exactly mine, to be precise. What you see "popular" is what I see "secular" or likely unacademic or unworthy of Wiktionary in my critical view.

"RR has become very popular among Korean learners, that's the fact."

Your view is supposed to be generally right. But you are right as far as RR is nothing but the secular or popular orthoepic transcription of spoken Korean at the cost of the academic orthographic transliteration of written Korean. But the truth is RR is not so monist but dualist indeed! I'd rather blame the Korean government authorities concerned for misguiding the world as if such secular then popular orthoepy were the standard RR.

These are all I'm saying here forever. Kind regards.

--KYPark (talk) 09:50, 10 September 2012 (UTC)[reply]

I don't have an opinion on the matter at hand, but I will say that you do love to create drama. Most people would be content with simply making their point, without pages of hyperbole, marginally relevant or irrelevant data, tables, charts, graphs, illustrations, etc. Not you. Sometimes you have a valid point buried under all of that, but I suspect that pretty much none of those not actually participating in the debate will ever see it; it's just too easy to page past all the eye-glazing prolixity. Chuck Entz (talk) 13:50, 10 September 2012 (UTC)[reply]
Also, it is easy to find a stick to beat a dog. You come to find still another fault with me. Again, enough is enough; let's just focus on the very RR in question, shall we? --KYPark (talk) 14:32, 10 September 2012 (UTC)[reply]
Okay, it's the way the rest of us want to transliterate Korean, so that's the way consensus goes. If you're not interested in discussions of your communication style, I think that's what you're going to get.--Prosfilaes (talk) 22:56, 10 September 2012 (UTC)[reply]
Haha, what a classic circular argument! (^_^)
  1. "What is the Wiktionary way of RR?" (or "the very RR in question")
  2. "Okay, it's the way the rest of us Wiktionary community want to transliterate Korean, so that's the way Wiktionary community consensus goes."
Apparently, you don't like the likely well-established and well-known difference between:
  1. orthographic transliteration of written language, and
  2. orthoepic transcription of spoken language.
The RR mode of Stephen's "community decision" or your "consensus" is not transliteration but transcription, to be precise, to avoid so widespread confusion even in this relevant discussion. Whence, the first button may have been wrongly fastened, I presume; what has been seen cannot be unseen.
--KYPark (talk) 04:03, 11 September 2012 (UTC)[reply]
KYPark, your arguments are very hard to read, sorry. Not sure if it's your English or mine. Even after careful reading I don't know what we are arguing about. Specifically, what examples of transliteration are you advocating? Are you sure that Stephen is insisting on the transliteration example he used (it may have been his mistake)? If you are FOR the use of RR, then what are the specifics? There's a lot of emotion here but little technical and little to the point. Can you start with consonants in each position? We have a consonant table here: RR transcription rules. What part do you disagree with?

Importing the table:

previous ending ↓ / next initial →
      (vowel)  g    n       d    r       m    b    s    j    ch    k    t    p    h
k     g        kg   ngn     kd   ngn     ngm  kb   ks   kj   kch   k-k  kt   kp   kh, k
n     n        n-g  nn      nd   ll, nn  nm   nb   ns   nj   nch   nk   nt   np   nh
t     d, j     tg   nn      td   nn      nm   tb   ts   tj   tch   tk   t-t  tp   th, t, ch
l     r        lg   ll, nn  ld   ll      lm   lb   ls   lj   lch   lk   lt   lp   lh
m     m        mg   mn      md   mn      mm   mb   ms   mj   mch   mk   mt   mp   mh
p     b        pg   mn      pd   mn      mm   pb   ps   pj   pch   pk   pt   p-p  ph, p
ng    ng-      ngg  ngn     ngd  ngn     ngm  ngb  ngs  ngj  ngch  ngk  ngt  ngp  ngh

--Anatoli (обсудить/вклад) 05:21, 11 September 2012 (UTC)[reply]
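As a rough illustration of how the imported table can be consulted, here is a small Python sketch that stores its rows and looks up the romanization options for a given final consonant and following initial. The cell values are copied from the table above. The names `NEXT_INITIALS`, `TABLE` and `junction` are hypothetical, and the unlabeled first column is assumed to be the case where the next syllable begins with a vowel.

```python
# Lookup sketch for the final/initial junction table (illustrative only).
# "(vowel)" labels the first column, assumed to mean a following vowel.
NEXT_INITIALS = ["(vowel)", "g", "n", "d", "r", "m", "b",
                 "s", "j", "ch", "k", "t", "p", "h"]

# One string per table row; cells are separated by "|", and a cell may list
# several alternatives separated by commas.
TABLE = {
    "k":  "g|kg|ngn|kd|ngn|ngm|kb|ks|kj|kch|k-k|kt|kp|kh, k",
    "n":  "n|n-g|nn|nd|ll, nn|nm|nb|ns|nj|nch|nk|nt|np|nh",
    "t":  "d, j|tg|nn|td|nn|nm|tb|ts|tj|tch|tk|t-t|tp|th, t, ch",
    "l":  "r|lg|ll, nn|ld|ll|lm|lb|ls|lj|lch|lk|lt|lp|lh",
    "m":  "m|mg|mn|md|mn|mm|mb|ms|mj|mch|mk|mt|mp|mh",
    "p":  "b|pg|mn|pd|mn|mm|pb|ps|pj|pch|pk|pt|p-p|ph, p",
    "ng": "ng-|ngg|ngn|ngd|ngn|ngm|ngb|ngs|ngj|ngch|ngk|ngt|ngp|ngh",
}


def junction(final, next_initial):
    """Return every romanization the table lists for this final + initial pair."""
    cells = TABLE[final].split("|")
    cell = cells[NEXT_INITIALS.index(next_initial)]
    return [option.strip() for option in cell.split(",")]


# The disputed case in this thread: final "t" (ㄷ) followed by "h" (ㅎ).
print(junction("t", "h"))
```

For the pair under dispute the table itself lists three alternatives (th, t, ch), so a lookup alone cannot settle which one applies to 닫히다; that choice needs an additional rule or convention.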

In the beginning, I ask "what is the WT way of RR?" say for 닫히다. Please romanize it for me using the above table. --KYPark (talk) 05:44, 11 September 2012 (UTC)[reply]
I would romanise as dathida (although I used datida, never dachida). The table doesn't describe initial consonants or consonants between vowels (there are other tables on the Wikipedia page). The initial ㄷ (same as between vowels) is transliterated as "d". Using "t" and "ch" for ㄷ + ㅎ is confusing. We should stick to "th". I'm against using dadhida or tadhida, that's definitely not RR. --Anatoli (обсудить/вклад) 05:54, 11 September 2012 (UTC)[reply]
Definitely, you are split among the three options, being more or less embarrassed.
  1. Why would you change your mind from datida to dathida?
  2. Is it because Stephen said so above?
  3. Why do you say "we stick to dathida"?
  4. What if neither Stephen's dathida nor your datida is outside of the RR proper? Should we still stick to it?
  5. Why do you rule out dachida for which I argued from the "phonetic hangul" perspective?
  6. Do you say that dadhida is not RR proper? Just forget tadhida which is absolutely wrong.
--KYPark (talk) 06:25, 11 September 2012 (UTC)[reply]
I have been doing occasional Korean translations for a few years and if asked seriously would insist on dathida - that's my understanding of RR but I may be wrong. I haven't changed my mind recently but I at some stage got misled by a textbook which used a variety of the MR method. It used just "t", "k" and "p" in such cases. Stephen's answer doesn't have to do with this. Did he mention this example? You have too many questions, some sarcastic, from which I can't tell what YOUR preferences are. If you prefer phonetic Hangeul, why do you insist dadhida? What DO you insist on? dadhida and dachida are opposite extremes. Are you trying to confuse me or are you trying to turn everyone against you?
I don't understand: "# What if neither Stephen's dathida nor your datida is outside of the RR proper? Should we still stick to it?"
Why do you say "we stick to dathida"? You tell me why not. --Anatoli (обсудить/вклад) 06:45, 11 September 2012 (UTC)[reply]
Did he mention this example?
Yes he did indeed! And this dathida of his is practically all that this long discussion has answered me properly though implausibly. In response, I argued against it as unfounded at all, while for dachida as founded on the phonetic hangul. In retrospect, this was the climax of this discussion! So I'm afraid you miss that. Thus may I ask you to read through all the discussion very carefully, before you complain you cannot understand what I am talking about. Nonetheless, this discussion is not enough. It entails a six-year old prehistory, as referenced above. To become a really responsible commentator, you have to read through those references, at least.
I have to say it again but your English and your manner of expressing yourself are hard to understand. It was also obvious from Stephen's comments. In reply to your lengthy writing where your main point was thoroughly buried, he just mentioned what our convention is. Instead of explaining clearly what you want, you give numerous examples. Yes, it's our convention and common practice. I think I'll leave this unproductive discussion but prepare some tables later on for a discussion. --Anatoli (обсудить/вклад) 23:30, 11 September 2012 (UTC)[reply]
In response to your complaining of difficulty in understanding me, I asked you a simple question. And you answered:
I would romanise as dathida (although I used datida, never dachida).
That is, dathida > datida > dachida, in short for 닫히다. The RR proper, however, almost certainly requires this be upside down, say, dachida (my guess) >>> dathida (Stephen's and newly yours). That is, Stephen and you and all the silent friends of yours are most likely wrong, I fear!
Now I have a way to prove it definitely, while all of you may have none, hopelessly. Or, You all may have one but say none so as not to prove You all are wrong yourselves. Metaknowledge's mentioning "preference" at last suggests You have one or the like proof. Otherwise, You all are simply unqualified here!
You complain of my hard English: "It was also obvious from Stephen's comments." Do you take him as your guiding light? But I said "To fix mistakes is one thing; to fix what are mistakes at all is another." And he said "I don't know what you mean by mistakes versus mistakes at all." Do you also have the same or similar trouble with understanding my passage as such? Please answer frankly so that I could review and refine my English for this community!
Thus I newly begin to wonder if the native English speaker can always understand English better than a level-2 English speaker like myself. Furthermore, not only English understanding but everything else....
Next to one simple question, I asked six discreet numbered questions, expecting you to answer each individually but in vain. This confusing way of Q&A is helpless and hopeless, I fear and regret, for precise approaches to problem-solving, isn't it?
--KYPark (talk) 08:30, 12 September 2012 (UTC)[reply]
In reply to your lengthy writing where your main point was thoroughly buried, he just mentioned what our convention is.
In reply to my first single sentence, "he just mentioned" not "what our convention is" but what his preference or preoccupation is, I fear. Should it be our explicit convention, it may be simply wrong, as I explained so long so far. Now it's your turn to examine if and how ours is wrong. There is no use to blame me and cover but uncover it!
Instead of explaining clearly what you want, you give numerous examples.
"Clearly what I want" is to clearly show or suggest to you, through "numerous examples," what may be wrong in your ways of thinking and doing, say, not transliterating written Korean but quite exceptionally transcribing spoken Korean into Roman, possibly to the great disadvantage to Korean! Nonetheless, my original question was very simple and humble indeed. You all have helped it evolve into all this complexity. It's rather your karma, I fear.
Yes, it's our convention and common practice.
I remember you plausibly doubted it at first. I wonder what has changed your mind suddenly to be so firmly convinced of what is so doubtful to me. The single example 닫히다 is enough for me to doubt your "convention" from the bottom up!
I think I'll leave this unproductive discussion but prepare some tables later on for a discussion.
I sincerely wish you every good luck, for it could throw a bright light on this all darkness. Cheers.
--KYPark (talk) 11:43, 12 September 2012 (UTC)[reply]
If you prefer phonetic Hangeul, why do you insist dadhida?
The "phonetic hangul" is not a matter of preference but how Korean sounds. It is the very basis (perhaps in disguise) of a mode or face of Janus-faced RR, you know. Another mode or face is say dadhida I prefer. Put otherwise, I never prefer phonetic hangul but prefer such romanization that can do without it. It is hard even to Koreans, and perhaps too hard for foreign editors of Korean. They have to check it for every new entry. Do you do that nuisance? For what? Just for trivial phonetic ease at the cost of everything else! (cf. )
I don't understand: "# What if neither Stephen's dathida nor your datida is outside of the RR proper? Should we still stick to it?"
Because of my poor English or carelessness, I am awfully sorry to make such a simple stupid mistakes that "neither ... nor" be "either ...or" or even "both ... and". Again very sorry, but please think again.
--KYPark (talk) 07:44, 11 September 2012 (UTC)[reply]
Please answer in particular what if both dathida and datida were not RR. Should we still stick to /th/ or dathida?
--KYPark (talk) 08:26, 11 September 2012 (UTC) --KYPark (talk) 08:45, 11 September 2012 (UTC)[reply]
Were is subjunctive; if it were not so, then it would not be so, but that has little impact on what we should do here where it is so.--Prosfilaes (talk) 09:38, 11 September 2012 (UTC)[reply]
Then I ask otherwise. Suppose dathida is not RR. Then what should we do? Should we give it up or not? --KYPark (talk) 09:47, 11 September 2012 (UTC)[reply]

Self-help

Motivations
  • He who knows does not speak, he who speaks does not know. 知者不言 言者不知. -- Lao Tze
  • The cruelest lies are often told in silence. -- R. L. Stevenson
  • In the end, we will remember not the words of our enemies, but the silence of our friends. -- Martin Luther King Jr.

Serbo-Croatian translations

For a while now I've been working on combining Bosnian, Croatian, and Serbian translations in trans tables and I have a few questions I probably should have asked from the beginning:

  1. Is this necessary? It seems to me that translations into languages that don't exist in Wiktionary are confusing, so that's the main reason I'm doing this.
  2. Since we don't have a consensus on unified Serbo-Croatian, could our current practice of unifying be changed in the future? And wouldn't that make current efforts futile?
  3. How should we treat Ekavian/Ijekavian spellings? I think what I did at woodpecker (not that it was my idea) is probably good (although more time-consuming).

Ultimateria (talk) 22:18, 8 September 2012 (UTC)[reply]

I dispute that "we don't have a consensus on unified Serbo-Croatian". We do. The issue has however gone a bit quiet recently. The consensus is to unify them, we don't have a vote saying this, we do have Category talk:Croatian language. The consensus among editors is to unify them. The objections come from non English Wiktionary editors. Mglovesfun (talk) 22:23, 8 September 2012 (UTC)[reply]
I strongly believe that "we do have some consensus on unified Serbo-Croatian" but those mentioned, that contest it, opine that it is solely on the unified L2 headings. I had some crosswiki unpleasantries with one of them. Anyway, I don't think that combining the translation entries is the best idea until we find an alternative for links to the B/C/S wiktionaries that are abandoned with such a practice. One proposal given here (in the WT:BP#Bot_to_handle_.7B.7Bt.7D.7D_and_its_ilk. section) is to have a separate {{t}} for sh and to amend the script for adding translations. But since at this point I am not techie enough to do this on my own, I hope someone jumps in. FWIW, I don't know if we lose much with jettisoning those links since I think that many of the entries don't exist in B/C/S wikts and judging by the activity on those wikts and especially by the recent reluctance of the HRT to forcibly subtitle the movies from Serbia and BiH, these entries are likely not to exist... even in the quite distant future. As far as the Ekavian and Ijekavian accents are concerned, I think it is unnecessary to additionally clutter the transl tables since this info is usually already contained in the entries and it isn't that hard to decipher with a little background knowledge of sh. --BiblbroX дискашн 12:42, 9 September 2012 (UTC)[reply]
Yes, we do have consensus but efforts were mainly spent on entries, not translations. I have been fixing some (usually it also involves adding missing Cyrillic script) but it's not my main focus. I don't add diacritics (tones), though.
@Ultimateria. Your method of distinguishing Ekavian/Ijekavian is common but I think listing them together may also be OK, perhaps a slash "/" instead of a comma would do the job. You need the {{qualifier}} when there are actual usage differences (Bosnian only, Croatian only, Serbian, Bosnian/Serbian, etc.) I don't agree that Ekavian/Ijekavian should be handled by entries only. This way one variant will definitely be disadvantaged in translations and e.g. to Croats having just Ekavian in translations will look like we are really introducing a new standard, IMHO. --Anatoli (обсудить/вклад) 05:32, 10 September 2012 (UTC)[reply]
Thanks for the input, everyone. The reason I thought we had no consensus was because the only vote I could find on the issue was this: Wiktionary:Votes/pl-2009-06/Unified Serbo-Croatian, which has no consensus.
@Biblbroks: I don't think a bot would be able to combine the B/C/S entries, would it? Wouldn't it be easier to implement some change (like linking to bs:wikt, hr:wikt, and sr:wikt from the Serbo-Croatian translations) if the translations were already unified, don't you think? Ultimateria (talk) 06:02, 10 September 2012 (UTC)[reply]
Bosnian, Croatian and Serbian translations are still being added; {{bs}}, {{hr}} and {{sr}} still exist. Mglovesfun (talk) 08:47, 10 September 2012 (UTC)[reply]
@Anatoli: I surely didn't mean to exclude Ijekavian translations or make them somehow unallowed, I was just expressing my opinion that putting accent (Ekavian/Ijekavian) qualifiers in translations is superfluous when it is often very easily discernible from the word itself and it really overcrowds the tables, IMO. Your proposal with separating Ekavian/Ijekavian variants with slashes seems more than sufficient to me. As far as preferring one standard over other is concerned it really depends on those inserting the translations and consequential ratio of translations belonging to one or some other standard, doesn't it? Why does it have to do with labeling the translations with such-and-such accent? As for labeling the entries and/or translations with "Croatian only" etc. I have nothing against it.
@Ultimateria: yes, exactly that is what I had in mind when mentioning the proposal of having a separate {{t-sh}}. That way we can have links to B/C/S wiktionaries if new translations are added. Of course, as I said, the script for adding translation should be amended, especially if we wouldn't like to start a bot of converting the common {{t}} to the specific t-sh one. I can't think of any better solution to this except for manual linking of every translation to every wikt and that could be tiresome and surely more prone to errors. Anyway, maybe for now when one combines the B/C/S translations into one sh, they could check if entries in appropriate wikts exist and link them if they do just not to lose any link. And yes, I don't think a bot could easily combine B/C/S translations.
@Mglovesfun: If we don't want B, C, and/or S translations being added I can't think of a better way than to amend the translation script to automatically add the translation under sh. Maybe with an additional option to add the "Bosnian only" for example (which maybe already exists with the "Qualifier" field). And I don't think that templates {{bs}}, {{hr}} and {{sr}} should be declared obsolete. They convey some info, don't they, the same as {{qualifier|US}} does. --BiblbroX дискашн 18:53, 10 September 2012 (UTC)[reply]

I've created this template based on a Wikipedia equivalent. It's meant to be used in entries that include references, but do not specify with <ref> tags which particular information in the entry is being referenced to which source. This is a problem in particular with entries for reconstructed entries, which often do have references but there is no way to see what information in the entry has a source and what doesn't. Hopefully this template will allow us to spot such issues and fix it. —CodeCat 10:41, 9 September 2012 (UTC)[reply]


Wikidata is getting close to a first roll-out

(Apologies if this message isn't in your language.)

As some of you might already have heard Wikimedia Deutschland is working on a new Wikimedia project. It is called m:Wikidata. The goal of Wikidata is to become a central data repository for the Wikipedias, its sister projects and the world. In the future it will hold data like the number of inhabitants of a country, the date of birth of a famous person or the length of a river. These can then be used in all Wikimedia projects and outside of them.

The project is divided into three phases and "we are getting close to roll-out the first phase". The phases are:

  1. language links in the Wikipedias (making it possible to store the links between the language editions of an article just once in Wikidata instead of in each linked article)
  2. infoboxes (making it possible to store the data that is currently in infoboxes in one central place and share the data)
  3. lists (making it possible to create lists and similar things based on queries to Wikidata so they update automatically when new data is added or modified)

It'd be great if you could join us, test the demo version, provide feedback and take part in the development of Wikidata. You can find all the relevant information including an FAQ and sign-up links for our on-wiki newsletter on the Wikidata page on Meta.

For further discussions please use this talk page (if you are uncomfortable writing in English you can also write in your native language there) or point me to the place where your discussion is happening so I can answer there.

--Lydia Pintscher 13:15, 10 September 2012 (UTC)[reply]

Distributed via Global message delivery. (Wrong page? Fix here.)

Does the first point mean we won't need interwiki bots anymore? I like that. :) —CodeCat 19:23, 10 September 2012 (UTC)[reply]
The first point specifies "language links in the Wikipedias"; so, taken literally, it means only that the Wikipedias won't need interwiki bots anymore. :-P   —RuakhTALK 19:52, 10 September 2012 (UTC)[reply]
It would be really nice to not need Interwiki bots anymore. Someone should go and ask about whether the Wiktionaries get this lovely benefit too. (Not it. I personally think that Ruakh should go ask, or point her back to this discussion so she can take part in it.) --Neskayagawonisgv? 09:00, 11 September 2012 (UTC)[reply]
I heard something about this a few months ago. The effort is intended to solve a number of Wikipedia-specific problems, and I don't see us benefitting from it. The interwiki links on Wikipedia are topically based, but the topics are not written so as to match up one-to-one. There's a similar problem between the Wikipedias and Wikispecies. For example, there might be a single article on the English Wikipedia covering the plant order Amborellales, family Amborellaceae, and genus Amborella all at once, but the French Wikipedia has two articles to cover these three taxa, but Wikispecies has a separate entry for each one. How do you manage the interwikis in such a case? Likewise, if an article is incorrectly linked, unhelpful bots quickly propagate the wrong links, and if someone corrects the problem, the bots quickly re-add the incorrect links. We don't have this issue, because all of our interwiki links are dependent upon identical spelling, and have nothing to do with which topic is being covered. As I say, nothing I've heard about the project seems designed to solve any problem except those peculiar to Wikipedias. --EncycloPetey (talk) 05:14, 18 September 2012 (UTC)[reply]

Discussions to modify Wiktionary:Criteria for inclusion.

Metaknowledge (talkcontribs) believes that any changes to Wiktionary:Criteria for inclusion need to be discussed here, and that such changes should not be proposed at Wiktionary talk:Criteria for inclusion. Personally, I could maybe understand wanting a link to be given here, but I think the best place for the actual discussions is definitely there. What do others think? —RuakhTALK 20:02, 10 September 2012 (UTC)[reply]

A link seems like the right approach. DCDuring TALK 20:28, 10 September 2012 (UTC)[reply]
Yes, a link is good... anything to ease some of the burden from BP, which is already quite chaotic and busy. —CodeCat 20:30, 10 September 2012 (UTC)[reply]
A link from here IMO is both necessary and sufficient. If we want to make it unnecessary (move discussion to WT:CFI's talkpage altogether), we should IMO achieve consensus to that change here at the BP, considering how entrenched "all normative policy discussions are at BP or V" is: possibly even we should vote on it, but I don't think so (at the moment). I, for one, support that change (assuming there's consensus for it here: i.e., I'm willing to be part of such consensus).​—msh210 (talk) 01:20, 11 September 2012 (UTC)[reply]

Since everyone seems to agree that links are necessary, here are two:

§ "Text for COALMINE." — a proposal to actually describe the COALMINE rule in WT:CFI, rather than just mentioning that such a rule exists and pointing readers at the vote-page that approved it.
§ "Formatting of misspellings." — a proposal for various small adjustments to the description of how misspellings should be formatted; most significantly, to remove the explicit [[ and ]], and to specify lang=en.

RuakhTALK 01:25, 11 September 2012 (UTC)[reply]

Er, I'm sorry I gave you that impression. I think discussions are better held here, but a new stub section with a link to the relevant section of the talkpage in question is definitely good enough. Thanks for posting the links above — that's exactly what I would want. --Μετάknowledgediscuss/deeds 01:35, 11 September 2012 (UTC)[reply]

Index words by pronunciation?

Someone on the feedback page mentioned that they were looking for a word they had heard, but weren't able to find it because of the irregular spelling of English. That made me wonder... if someone knows how to pronounce the word, why shouldn't they be able to look it up that way? It certainly would be useful (as long as you know a pronunciation scheme) and I don't think there are that many sites out there that do this yet. I don't really know how to make such a feature myself, but presumably a bot would periodically go through a database dump, look at which pages contain pronunciation information, and update pages listing them accordingly. We should probably have a variety of indexes, one for each pronunciation scheme: IPA, enPR... maybe X-SAMPA too? (It's kind of redundant to IPA) —CodeCat 13:44, 11 September 2012 (UTC)[reply]
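As a rough illustration of the bot idea above, here is a minimal Python sketch. It assumes pronunciations appear in wikitext as `{{IPA|/.../|lang=en}}`, that the dump has already been loaded into a dict of page title to wikitext, and that sorting by plain code point is acceptable as a placeholder order. The names `extract_ipa` and `build_index` are hypothetical, not part of any existing tool.

```python
import re
from collections import defaultdict

# Assumed template shape: {{IPA|/ˈvaɪəl/|lang=en}}
IPA_TEMPLATE = re.compile(r"\{\{IPA\|([^}]*)\}\}")
STRESS_MARKS = ("ˈ", "ˌ")


def extract_ipa(wikitext):
    """Return phonemic transcriptions (without slashes) found in the wikitext."""
    found = []
    for match in IPA_TEMPLATE.finditer(wikitext):
        for param in match.group(1).split("|"):
            param = param.strip()
            # Keep only parameters of the form /.../ and skip ones like lang=en.
            if len(param) > 2 and param.startswith("/") and param.endswith("/"):
                found.append(param.strip("/"))
    return found


def build_index(pages):
    """Map each transcription (stress marks removed) to the page titles using it.

    Returns a list of (transcription, titles) pairs sorted from the start of
    the transcription, like an ordinary word index.
    """
    index = defaultdict(list)
    for title, wikitext in pages.items():
        for ipa in extract_ipa(wikitext):
            key = ipa
            for mark in STRESS_MARKS:
                key = key.replace(mark, "")
            index[key].append(title)
    return sorted(index.items())


pages = {
    "vial": "* {{IPA|/ˈvaɪəl/|lang=en}}",
    "vile": "* {{IPA|/vaɪl/|lang=en}}",
    "through": "* {{IPA|/θɹuː/|lang=en}}",
}
for transcription, titles in build_index(pages):
    print(transcription, titles)
```

A real index would need a deliberate symbol order rather than code-point order, and one such index per scheme (IPA, enPR, and so on).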

But... we've gotten more complaints about our "complicated" pronunciation systems than we get about not being able to find words by means of pronunciation. This plan really doesn't make much sense to me until the world learns IPA in grammar school, or something. --Μετάknowledgediscuss/deeds 13:58, 11 September 2012 (UTC)[reply]
The Rhymes: namespace could be used to find words this way. If they know the pronunciation of the word, maybe they know another word that rhymes with it (y'know, unless the word in question is orange or purple), then they can look that word up, click on "Rhymes", and get to the list of words that rhyme with it. —Angr 14:21, 11 September 2012 (UTC)[reply]
(after edit conflict) It would be useful to those who know one of the pronunciation schemes, and could be easily ignored by everyone else. It would be a bit of a crude approximation, given the difference in phonological patterns between languages (and even regional varieties), but definitely worth the trouble. It also would give people an incentive to learn the pronunciation schemes. Chuck Entz (talk) 14:28, 11 September 2012 (UTC)[reply]
Just out of curiosity, has anyone ever heard of any utility for converting audio to an IPA strict phonetic transcription? I imagine it would be a lot harder to do than one might think, but really handy for some applications. Chuck Entz (talk) 14:42, 11 September 2012 (UTC)[reply]
The problem is: how likely is it that someone who doesn’t know how to spell many words knows IPA? — Ungoliant (Falai) 15:07, 11 September 2012 (UTC)[reply]
Is this really intended to be of utility to some large population of users? Does it have to be? We have plenty of low-utility entries (eg most or all of our English digraph entries), which the persons entering found rewarding. Some of our translation tables seem to be populated by folks practicing translation or copy-and-paste skills. If the byproduct of folks amusing themselves has some use and doesn't bring our benefactor's servers to their knees, why not? For many classes of contribution it seems that the contributors would not be willing or able to contribute anything else, so we should just be grateful for what we get.
This index project might get us some more contributors who had some linguistic skills who might be able to improve other aspects of our entries. DCDuring TALK 15:36, 11 September 2012 (UTC)[reply]
I'd be happy to work on a pronunciation index, but how should such a thing be organized? In what order should the symbols be sorted? What index structure would provide the most utility? DTLHS (talk) 17:51, 11 September 2012 (UTC)[reply]
As I mentioned above, I think this should be done as part of the existing Rhymes: namespace. The sorting of symbols can be the same as there. It occurs to me that even for words that don't have rhymes, there's nothing stopping us from creating a page with only one entry on it. For example, we can create Rhymes:English:-ɜː(ɹ)pl̩ even if it contains no entries but [[purple]]. —Angr 18:14, 11 September 2012 (UTC)[reply]
Just to be clear... I never said this had to be only IPA. I also don't understand Ungoliant's argument that people who don't know how to spell don't know IPA. I don't think that's even relevant. If someone doesn't know the spelling of one word, a pronunciation-based index can help them find the word. I disagree with adding them to the Rhymes namespace, because such indexes are not rhymes. Rhymes are identical at the end of the word from the stressed syllable onwards, while these indexes are meant to be like any other word index (sorted by the start of the word) but using a representation other than orthography to organise the terms. Under a pronunciation-based scheme, vial and vile would be placed near each other, irrespective of the transcription (they'd be together in both enPR and IPA because they both start with the same sounds). —CodeCat 21:21, 11 September 2012 (UTC)[reply]
What Ungoliant appears to be saying is what I'm saying — this is really not useful, because people are just confused by any pronunciation systems that use characters or diacritics they are not intimately familiar with. If you disagree, I can dredge up such comments (from WT:FB and elsewhere). DCDuring, meanwhile, seems to be thinking that we might as well waste our time on this anyway. I would have to say that I, at least, find pages like of fascinating and educational, and I am a native, articulate speaker of English. Low-utility? Not IMO. --Μετάknowledgediscuss/deeds 01:00, 12 September 2012 (UTC)[reply]
Actually I think something like this is very useful, but not as much if we use a pronunciation scheme. My point was that people who know IPA (or any other pronunciation scheme) tend to be the very educated and/or linguistically inclined, and these tend to be much more familiar with how words are spelled. I also think that having this in any namespace other than Main will make it harder to find (thus less useful), because non-contributors usually reach other namespaces via links, while a system for searching by pronunciation would be better used via search. Something like this or similar would be more useful, IMO. It’s only an idea, I’m not trying to hijack your suggestion or anything. If we end up using an index, we should add a link to it from the “There were no results matching the query.” page. — Ungoliant (Falai) 01:31, 12 September 2012 (UTC)[reply]
To answer the question about "who would know IPA but not know how to spell many words," I would imagine a large proportion of non-native speakers learning English. Compare non-native speakers learning Chinese. Siuenti (talk) 20:55, 17 September 2012 (UTC)[reply]
I think you guys are coming at this from the wrong way. You're thinking like linguistic folk and not like normal folk. Normal folk don't know IPA and, furthermore, they don't care about it. The New Oxford American Dictionary which comes with every mac (in the States) can be searched in this way. I can type throo in the search box and get the choices of throb, through, and throw. I don't know the inner workings of the code of wikt, but when making a dictionary app on a mac, this is eath to do. It's simply a matter of putting in the possible misspellings as part of the entry. I know this for that I'v been putting together my own wordbook. Since it is only for me, I also put in translations so that I can note that way as well. So, there would hav to be a marker for a list in the html code when we put in a word that the search engine would look thru. Then it would be up to the editors to put in possible misspelling and fonetic spellings (that don't show up on the entry page like the alt forms). For byspel, under --phonetic--, one would list fonetic; then phonetic would show up in the list of choices for the users if they type fonetic. This could also be noted for translations ... but I don't know if we want to go that far as the list that the users see could get unwieldy. --AnWulf ... Ferþu Hal! (talk) 11:30, 18 September 2012 (UTC)[reply]
As a recent purchaser of a new Mac, I can concur that Macs in the States come with a Dictionary app that allows you to search the New Oxford American Dictionary. It does not include a list of misspellings, but uses coding similar to that used when you type something into the Search window on Wiktionary. It finds items that start with the same characters, and suggests an alternative of similar spelling if that sequence of characters is not found. It therefore does nothing that we don't already do. It does allow you sometimes to find a misspelled word phonetically, but sometimes it doesn't. If you type "shure" (to find sure), it returns share instead, as a closer character match. It's not searching by pronunciation.
Again, it's in the inner workings. If you want to build your own dictionary app, you can download XCode from Apple ... which I hav done ... and it's done by listing it in the search parameters ... In Dictionary 2.2.3 (New Oxford American Dictionary), I type in shure and I get the choices (in alphabetical order) of share, Shari, shear, shire, shirr, shore, sure. So the way to do it is to do it likewise ... that is ... hav a field in the html code to put in other likely spellings (or even typos). I'd say note it with some care so as not to overwhelm the user with choices but it would direct someone who types seperate to separate. I think this would be much easier than trying to type in IPA symbols. Keep in mind, when normal folks talk about finding words by "pronunciation" they're talking about ways to type in words noting the standard 26-letter alphabet as noted in English ... no schwas, no š, ž or any other funky letter from another language. These mean nothing to the average, nativ English speaker (which is my gripe with the way wikt translits Russian ... it notes some funky int'l or interlingual notation and not English translit which render it useless to the average, nativ English user [I know of what I speak, I work'd with translit in the Army and when I show them the translit here to my friends, they're baffled by it). The average, common, nativ English speaker thinks in terms of how the alphabet is noted in English and that is how any scheme for inputting words should be tailor'd to. --AnWulf ... Ferþu Hal! (talk) 16:09, 22 September 2012 (UTC)[reply]
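The "field of likely spellings" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the entry data is a hand-written dict, and `build_lookup` and `search` are hypothetical names, not anything that exists in MediaWiki or in Apple's Dictionary app.

```python
def build_lookup(entries):
    """Map every headword and every listed alternative spelling to headwords.

    `entries` maps a headword to the hidden list of likely misspellings or
    phonetic spellings that editors have supplied for it.
    """
    lookup = {}
    for headword, variants in entries.items():
        for spelling in [headword, *variants]:
            targets = lookup.setdefault(spelling.lower(), [])
            if headword not in targets:
                targets.append(headword)
    return lookup


def search(lookup, typed):
    """Return the headwords to offer for what the user typed (may be empty)."""
    return lookup.get(typed.lower(), [])


# Hypothetical data, using the examples from the discussion.
entries = {
    "separate": ["seperate"],
    "phonetic": ["fonetic"],
    "sure": ["shure"],
    "shore": ["shure"],
}
lookup = build_lookup(entries)
print(search(lookup, "seperate"))
print(search(lookup, "shure"))
```

Because one spelling can point at several headwords, the list shown to the user grows with every variant added, which is the "unwieldy list" concern raised above.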
I notice that the same dictionary app allows you to search the Oxford American Writer's Thesaurus, Apple Dictionary, and Wikipedia, but not Wiktionary. --EncycloPetey (talk) 01:49, 19 September 2012 (UTC)[reply]
True. Only the powers that be at Apple know why. I can only guess that it might be part of the contract with the OED folks so that they can load the NOAD ... or some type of licensing problem ... That's only a guess. However, you can leave feedback with them at: http://www.apple.com/feedback/macosx.html --AnWulf ... Ferþu Hal! (talk) 16:09, 22 September 2012 (UTC)[reply]

accelerated creation for Georgian

Well, we have a great tool for the accelerated creation of English plurals and I want it to work for Georgian nouns too. What should I do?

I have modified {{ka-noun}} (here it is) a little and got this: მეტალი. Although it works, that small icon just after the plural form bothers me.--Dixtosa. 15:34, 12 September 2012 (UTC)[reply]

  • Accelerated entry-creation is all JavaScript (see User:Conrad.Irwin/creation.js); the only change you should need to make to the template itself is to add "hooks" for that JavaScript. For example, {{en-noun}} wraps the link to the plural form in <span class="form-of plural-form-of lang-en">...</span>. I suggest you try simply wrapping your plural form in <span class="form-of plural-form-of lang-ka">...</span>, and seeing if that gives the right result. (If it doesn't, then we may have to create custom JavaScript.)
    Also, please fix your signature to link more prominently to your user-page or talk-page.
    RuakhTALK 18:39, 12 September 2012 (UTC)[reply]
Thanx, span trick worked.--Dixtosa. 13:26, 13 September 2012 (UTC)[reply]
It's not a trick so much as how the whole thing is designed to work. Mglovesfun (talk) 11:41, 14 September 2012 (UTC)[reply]

Introducing WT to twitter

Hello Wiktionarians. I started a twitter account called WiktionaryUsers which, naturally, is all about Wiktionary. The goal is to inform people about the site, and the long-term goal is to get more people to know about the site, and hopefully get some more contributors. I've got lots of things to tweet about the site already, as I'm very familiar with it, but thought some of you guys could help me out with other things I could put on the tweets. Some ideas floating around are the history, some important features, best entries, a little bit about the users, why we're such an awesome site, WOTD feeds, interesting technical data about it, lesser-known areas of the site, how the readers can contribute etc. But if you want anything particular up there, let me know. I don't expect I'll be able to tweet about WT all the time, so maybe some users can share duties with that (I don't know how to do this personally, I'm a twitter noob). Anyway, have a look at it, and maybe we can make this a new area of interest. Regards, --Wikt Twitterer (talk) 19:43, 12 September 2012 (UTC)[reply]

Wonderfool, is that you? --Vahag (talk) 22:38, 12 September 2012 (UTC)[reply]
I loled. Equinox 12:06, 13 September 2012 (UTC)[reply]
I don't have anything against a twitter account, but... what would we do with it? Does Wiktionary ever have anything newsworthy to mention? It's more or less the same old every day, there is nothing to announce. And if we do need to announce something, we put it on the main page or WT:NFE. —CodeCat 17:58, 13 September 2012 (UTC)[reply]
Word-of-the-day? Foreign-word-of-the-day? Collaboration-of-the-week-if-we-ever-start-that-up-again? —RuakhTALK 18:58, 13 September 2012 (UTC)[reply]
Ok, I can understand that, but does WOTD need a twitter feed all by itself? I thought twitter was for things like status updates (I've never used it). And what's collaboration of the week? —CodeCat 19:03, 13 September 2012 (UTC)[reply]
Re: what Twitter is for: It can be for lots of different things. I've never used it, either, but it's basically a blogging platform with a very strict character limit. If someone is "following" our Twitter feed, then our tweets would show up in their Twitter feed. (Or something like that. I probably have the details wrong. But it would let Twitter users get our WOTD delivered to them somehow.)   Re: what collaboration of the week was: see Wiktionary:Collaboration of the week. —RuakhTALK 19:38, 13 September 2012 (UTC)[reply]
In the 1990s, "blogging" was invented, which allowed people to post messages for a given date, diary-style. Twitter is the same thing, but it came later, and it has a retarded, pointless character limit relating to outdated mobile telephone limitations. Equinox 23:01, 13 September 2012 (UTC)[reply]
It's more like a chat room than a blog. A chat room with many monologues. Njardarlogar (talk) 09:33, 14 September 2012 (UTC)[reply]

Question: Are there "TwitterBots"? If so, can somebody run them? This would actually be worth something if it autoposted the WOTD and FWOTD. --Μετάknowledgediscuss/deeds 00:12, 14 September 2012 (UTC)[reply]

I'm sure there are. See https://dev.twitter.com/docs/api/1.1/post/statuses/update. —RuakhTALK 01:12, 14 September 2012 (UTC)[reply]
Writing a Tweetbot that published WOTD would be trivial (except, I suppose, for the fact that the definition might be longer than 140 characters, which would need a little special handling). I could set one up as a test, if people thought it was a good idea. Making it "official" might be a bad idea though. Smurrayinchester (talk) 13:37, 14 September 2012 (UTC)[reply]
I assume that we would want human intervention in preparing the tweet. The purpose of the bot would just be to post the appropriate day's tweet at the appropriate time. Oh, and perhaps to register the page with a URL-shortening site, so that e.g. http://en.wiktionary.org/wiki/comeuppance becomes e.g. http://bit.ly/foOB4r. —RuakhTALK 13:48, 14 September 2012 (UTC)[reply]
Twitter deals with URL shortening automatically these days. Human intervention wouldn't be necessary - it's perfectly possible to write a program which goes to template:Word of the day, copies the definition (if necessary, truncating it with a (more)) and links to the right article entirely automatically. Today's WOTD might look like "comeuppance (n): A negative outcome which is justly deserved. en.wiktionary.org/w..." while something longer like Olympiad would be "Olympiad (n): (historical) A period of four years, by which the ancient Greeks reckoned time, being the... (more) en.wiktionary.org/w...". Human intervention would be handy if we always wanted Twitter-ready definitions, but it would be more hassle. Smurrayinchester (talk) 13:57, 14 September 2012 (UTC)[reply]
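The truncation step described here could look roughly like the following Python sketch. It only formats the text; fetching the WOTD template and posting to Twitter are left out. The function name `format_tweet` is hypothetical, and the sketch counts the link at its literal length, which is an assumption, since Twitter's own shortening may count links differently.

```python
LIMIT = 140          # character limit discussed in the thread
MORE = "... (more)"  # marker appended to a truncated definition


def format_tweet(word, pos, definition, link, limit=LIMIT):
    """Build 'word (pos): definition link', truncating the definition to fit."""
    head = f"{word} ({pos}): "
    full = f"{head}{definition} {link}"
    if len(full) <= limit:
        return full
    # Room left for the definition after the head, the marker,
    # one separating space and the link.
    room = limit - len(head) - len(MORE) - 1 - len(link)
    if room <= 0:
        raise ValueError("headword and link alone exceed the limit")
    cut = definition[:room].rstrip()
    return f"{head}{cut}{MORE} {link}"


short = format_tweet(
    "comeuppance", "n",
    "A negative outcome which is justly deserved.",
    "http://en.wiktionary.org/wiki/comeuppance",
)
long_definition = (
    "(historical) A period of four years, by which the ancient Greeks "
    "reckoned time, being the interval from one celebration of the "
    "Olympic games to another."
)
long = format_tweet(
    "Olympiad", "n", long_definition,
    "http://en.wiktionary.org/wiki/Olympiad",
)
print(short)
print(long)
```

The long definition above is paraphrased for the example and is not a quotation of the entry. Cutting mid-word is a simplification; a tidier version would back up to the previous space.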
I am an occasional user of a Twitter feed. WOTD and similar items are ideal for Twitter. The link is the important thing. What could be generated automatically by having headword and first definition, truncated as necessary to fit the 140 character limit, would be adequate and somewhat useful for folks who don't really have time for us otherwise. When they have time or our tweet piques their interest, we may lure them to the page itself. But as tweeting is usually from mobile devices, a link to a full entry often creates an unsatisfactory experience (slow loading, unreadable text). Perhaps mobile wiktionary would be better for linking. BTW, long cognate lists don't do much for users on mobile devices. DCDuring TALK 14:14, 14 September 2012 (UTC)[reply]
Re: automatic URL shortening: Oh, good.   Re: "it's perfectly possible to write a program [which …]": Well, obviously. I'm just saying that I don't think that's really what we want. A lot of care and custom writing goes into the WOTD presentation, with the result that the text in the WOTD template does not always directly match what's in the entry. A truncated WOTD def, with a "more" link to the entry as though it were a truncated entry-def, could be confusing and counterproductive. —RuakhTALK 14:33, 14 September 2012 (UTC)[reply]
I strongly recommend that anyone who is going to do this be at least an occasional Twitter user. There is no point in having the entry handcrafted by someone who doesn't get what it is like on the receiving end. A tweet will never do justice to the full WOTD. The link is the important part. The tweet message can either be custom designed to pique interest or it can be a simple truncated definition which may be less effective on average. Do we have anyone knowledgeable about tweeting who would like to handcraft tweets for WOTD? If we do, aren't they lucky? If we don't, do we think we would be better off with something that reminded users of our existence? I don't know because we seem often not really to care about such things. DCDuring TALK 15:42, 14 September 2012 (UTC)[reply]
If we're writing the tweets by hand, there's no need for a bot. A simple program like Tweetdeck lets you write tweets and then posts them automatically at a given time. I use Twitter quite a bit - while I don't think there's any kind of general "Twitter style" (as long as the definition is punchy and the tweet includes a link to the entry, it should be fine), I wouldn't mind writing at least a couple of weeks worth of tweets. Smurrayinchester (talk) 20:19, 14 September 2012 (UTC)[reply]
That would be great. How can we open an "official" account for this with a name that looks official? Is there any way to track clicks from Twitter to WOTD so we know what folks like? DCDuring TALK 20:55, 14 September 2012 (UTC)[reply]
Eventually we might know enough to determine whether daily handcrafting might be worth someone's continued daily effort. DCDuring TALK 20:58, 14 September 2012 (UTC)[reply]

Sort order in Category:vro:Months

Why does {{list:Gregorian calendar months/vro}} put all month names under letter V in Category:vro:Months? Same seems to happen to Category:vro:Days of the week, but not with their en counterparts. Malafaya (talk) 22:16, 14 September 2012 (UTC)[reply]

It's in {{list helper}}, when cat is given, it automatically sorts by cat which is wrong, it should be {{PAGENAME}}. Can I let someone else fix it, I have been known to screw this sort of thing up before. Mglovesfun (talk) 22:21, 14 September 2012 (UTC)[reply]
 Fixed. For future reference, this kind of request belongs at WT:GP. --Μετάknowledgediscuss/deeds 23:10, 15 September 2012 (UTC)[reply]

Wikidata and Wiktionary potential

There is probably something in the details that makes it unfeasible, and there is probably someone who has suggested something similar before; but here it goes:

The way Wikidata works, has tremendous potential for Wiktionary. By letting each data set ("item") correspond to a particular meaning of a word in a particular language, entries on several wiktionaries can be created/maintained/deleted in one go.

E.g. if this page represents the meaning "chemical element" (in other words, a noun), then, if you add the translation muileh for the Hsilgne language, there will be created entries on the Hsilgne Wiktionary for the words of all the other languages that are already in the table, and the page muileh will be created on all the wiktionaries that are now in the table.

Wikidata could also create and maintain automatic synonym lists as well as translation tables. Njardarlogar (talk) 17:20, 15 September 2012 (UTC)[reply]

Each Wiktionary has traditionally been a separate domain, subject to its own policies and conventions. When we hold votes here, they have no effect on the Nynorsk Wiktionary. Most data is actually a lot more subjective than the optimistic WikiDatists seem to believe. For example, we held a vote to use the IPA symbol ɹ instead of r in English pronunciation sections, but I'm willing to bet that in other Wiktionaries, that's not the custom. To cite another example, we consider Serbo-Croatian to be a unified language, but other Wiktionaries disagree. How could we deal with that problem? I think that this could be a very helpful development, but I need to see how the independence of each Wiktionary would be respected and how these issues would be handled. --Μετάknowledgediscuss/deeds 17:31, 15 September 2012 (UTC)[reply]
Translation tables would be tricky, at best. Translations are not one-to-one; they are not always symmetric; and some words translate as whole phrases rather than as words. Further, synonyms lists are subjective. It is possible, using a thesaurus, to follow from synonym to synonym and discover that "good" and "evil" are connected by a chain of synonyms. Some languages also have more words for a particular concept, and some languages have fewer. In English, we recognize a certain fixed set of colors as the natural basic colors. In some cultures, pairs of colors we consider to be distinct are not distinguished at all from each other, but are considered together as a single color. Without a common set of underlying ideas into which to separate language, translations won't be possible from a central dataset. --EncycloPetey (talk) 01:41, 19 September 2012 (UTC)[reply]

Trigonometric functions

The symbols for the trigonometric functions, reciprocal trigonometric functions, et alia are currently in a mess. Entries like tan, tanh, and arctan are variously in the English or Translingual sections (or both), and some of those Translingual sections have the English pronunciation and even translation tables! I don't really mind which language we put them in, but I'd like to standardise them. --Μετάknowledgediscuss/deeds 17:22, 15 September 2012 (UTC)[reply]

How translingual are they? Would you write e.g. tanh in French, in German, in Mandarin...? Equinox 23:26, 15 September 2012 (UTC)[reply]
The answer to that is — probably. At w:zh:正切, it looks like they're using the same symbols as in English, which looks pretty weird imbedded in a Chinese article. At w:fr:Function trignométrique, they note that multiple terms are used including what seem to be some local variants. At w:de:Tangens und Kotangens, they mention the standard terms first, but then say that tg and ctg may be found in "älterer Literatur". Based on that evidence, what do you think? --Μετάknowledgediscuss/deeds 23:52, 15 September 2012 (UTC)[reply]
Sounds as though they ought to be translingual entries. I had a very quick Google around to see whether they are used like English nouns ("the tanhs of x and y") but that doesn't seem common. Equinox 00:02, 16 September 2012 (UTC)[reply]
Should I leave tg in the Czech section and create a duplicate Translingual section, or should I instead just change the header and tidy it up a bit like what I plan to do with English arctan? --Μετάknowledgediscuss/deeds 00:26, 16 September 2012 (UTC)[reply]
If it helps, when I was learning math in Brazil, we used tg, ctg, tgh, ctgh, and arctg (or atg). Also, American sin was sen (e.g., sen 2x = 2 sen x cos x). So I'm not sure these abbreviations are really translingual. --Pereru (talk) 22:47, 16 September 2012 (UTC)[reply]
Well, they don't have to be used in every, or even in most, language(s) to be Translingual. Compare something like 汉#Translingual, which is obviously not used in Brazilian Portuguese (nor German or French, either). --Μετάknowledgediscuss/deeds 22:54, 16 September 2012 (UTC)[reply]
I've seen tg in English too, though rarely. (I'm a mathematician, FWIW.)​—msh210 (talk) 22:20, 20 September 2012 (UTC)[reply]

Portuguese pronunciation of names of Portuguese municipalities

Hi. Just to let everyone know that I've just finished uploading to Wikimedia Commons the audio files with the 306 names of the 308 Portuguese municipalities (two of them have the same name). I think they will be a good supplement for the Wiktionary and the "Pronunciation" parts. More experienced editors may now pick up these words, in some cases used in other contexts. Thank you. FilipeFalcão (talk) 22:09, 15 September 2012 (UTC)[reply]

Thanks! I’ll take a look. — Ungoliant (Falai) 01:41, 16 September 2012 (UTC)[reply]

Great, Ungoliant! Hope you find everything in order.

Forgot the link to Wikimedia Commons. Enjoy! FilipeFalcão (talk) 09:03, 16 September 2012 (UTC)[reply]

You may also find the IPA "Pronunciation" for the 308 Portuguese municipalities here: Anexo:Lista de concelhos por NUTS, taken from the Wikipedia en. Robot, anyone? FilipeFalcão (talk) 06:23, 18 September 2012 (UTC)[reply]

many, more, few, etc.

Shouldn't words like these be in a category named Quantifiers (under Determiners) rather than directly in the category Determiners? --Pereru (talk) 22:53, 16 September 2012 (UTC)[reply]

I wouldn't mind, but words can be nested very deeply in the determiners category (for example, numerals should go under quantifiers if it's created). By trying to organise everything, we might end up making so many small categories that they become almost useless. —CodeCat 23:34, 16 September 2012 (UTC)[reply]
Additional hard categorization (not template-based) does little harm and might help for articles on Determiners within and across languages. It should be easy enough to execute for little effort for closed categories, though not for numerals, which we treat as an open class for some reason. DCDuring TALK 23:58, 16 September 2012 (UTC)[reply]
They should definitely be in Category:English determiners: we use ===Determiner=== as a part-of-speech heading, just like ===Noun=== and ===Verb=== and so on, and ==English== entries with that header should all belong to Category:English determiners (or potentially, in some cases, Category:English determiner forms, if it existed, which it doesn't). Perhaps they should also be in Category:English quantifiers, just as some English adverbs are in Category:English modal adverbs and some English nouns are in Category:English uncountable nouns, but it's not an either/or thing. —RuakhTALK 13:54, 18 September 2012 (UTC)[reply]
Is it always necessary to put entries into the category that matches their part of speech heading? Yesterday I added several entries to Category:Zulu concords and Category:Zulu subject concords, but the part of speech heading for those entries is ===Prefix===. I think if we can narrow down the part of speech further, it isn't always necessary to keep them in the more general category too. I guess it depends on whether we expect someone to ever want to look at the category and see a list of all determiners, even those that are in the subcategories (in the case of Zulu concords, I doubt anyone would find that useful). —CodeCat 14:06, 18 September 2012 (UTC)[reply]

I blocked him last night because he has been adding nonsense categories to entries, which I consider disruptive editing and not really fitting for an admin. But it seems to me that he didn't really get the point, and turned it into a personal attack against me instead (see his talk page). Now User:Dick Laurent (who has made some similar questionable edits in the past) has unblocked him. I'd like to call more attention to this and would like to know if he could be de-admin-ed. I realise that he is our only editor for Armenian, but I don't think that should be an excuse for vandalising Wiktionary. —CodeCat 17:50, 17 September 2012 (UTC)[reply]

You can't desysop me: read this. --Vahag (talk) 18:02, 17 September 2012 (UTC)[reply]
I think even Vahag and Dick would admit they're both at times disruptive editors. They also do a lot of good work. I've been willing to overlook their blatant nonsense because of the amount of good work they've done for this wiki. That won't necessarily last forever though. I'm not saying there shouldn't be any limits to the amount of bullshit they can do and get away with. Mglovesfun (talk) 18:05, 17 September 2012 (UTC)[reply]
I think you’ve probably misread Vahag. Yes, occasionally he gets into a little minor disruptive editing, but it’s always easy to fix. It’s just his humor. He’s especially fond of ethnic humor, to do with blacks, gays, gypsies, etc, and sometimes he goes all sexual or something. For all I know, he’s a gay black gypsy. Or not, who knows? But he’s an exceptional wiktionarian of long standing. I should think we could wink at his antics as they appear from time to time. He is like a national treasure...an eccentric one. —Stephen (Talk) 18:18, 17 September 2012 (UTC)[reply]
I think you have a point; some people think his antics are jokes, others don't. Of course the main namespace isn't a good place for jokes, either. Mglovesfun (talk) 18:22, 17 September 2012 (UTC)[reply]
I like most of his jokes, but he should undo them after a day or two. Expecting others to find and clean up things like art history being categorised under Category:gay:Sciences is irresponsible. It is very sad that we once needed a vote to delete a joke. — Ungoliant (Falai) 20:10, 17 September 2012 (UTC)[reply]
He should undo them after a day or two??? Admins shouldn't be making joke edits in mainspace at all, ever. —Angr 20:50, 17 September 2012 (UTC)[reply]
Of course he shouldn’t. But the very least he could do is clean it up. — Ungoliant (Falai) 21:19, 17 September 2012 (UTC)[reply]
Yeah, within ten seconds, not within a day or two. And if he does it a third time after being warned twice, he gets desysopped, and if he persists he gets blocked indefinitely. I don't see how we can trust a jokester with admin rights not to play a "joke" like... well, I don't want to say what sort of possibilities occur to me, because of w:WP:BEANS. —Angr 21:30, 17 September 2012 (UTC)[reply]
Good thing we don't have gay.wiktionary.org yet. -- Liliana 20:31, 17 September 2012 (UTC)[reply]
I don't find Vahagn's edits disruptive. His jokes are fun and he doesn't really mean bad. --Anatoli (обсудить/вклад) 23:01, 17 September 2012 (UTC)[reply]
So you think that Wiktionary is improved by those edits? —CodeCat 23:16, 17 September 2012 (UTC)[reply]
If Vahagn would undo his edits and just leave them in the edit history as jokes, I'd overlook them, as Stephen suggests... but adding false categories to entries and just leaving them is insidiously disruptive — consider that it took three years for someone to notice and fix the usage notes at [[երկկտտացնել]]. - -sche (discuss) 23:42, 17 September 2012 (UTC)[reply]
@CC: We can't look at those offensive edits & comments in isolation. When we balance all those Old Armenian entries with a few obvious jokes… I would have to say that I'd vote against desysopping him. (And I'd also vote against constantly blocking WF as soon as he's sighted, but I have a feeling less regulars agree with me about that.) --Μετάknowledgediscuss/deeds 02:16, 18 September 2012 (UTC)[reply]
It seems irresponsible to allow him to be whitelisted, let alone be a sysop if he is going to PoV push in principal namespace. I don't see how this Usage note is in the slightest bit acceptable. Is it really worth having a language "covered" if the coverage means having offensive material that uses Wiktionary to vituperate against some software vendor? DCDuring TALK 02:33, 18 September 2012 (UTC)[reply]
Since when are offensive slurs against gays appropriate here?--Prosfilaes (talk) 03:19, 18 September 2012 (UTC)[reply]
I, at least, come from a culture in which offensive slurs given in a non-serious manner are not considered to be offensive. I have gay friends who apologize for being gay and in turn make fun of straights and straight friends who apologize for being straight and in turn make fun of gays — and none of us get offended. From his comments, I have to assume that VP hails from such a (sub)culture as well (do I stretch AGF too far?). That said, I totally agree that leaving this kind of stuff for years in the mainspace is past the limits of reasonableness, but @DCDuring I have to add that <0,5% of his edits seem to be of this kind, so saying that a bunch of offensive edits are more important than all those entries seems ridiculous. --Μετάknowledgediscuss/deeds 04:19, 18 September 2012 (UTC)[reply]
I think the key word there is friends. Moreover, the example at երկկտտացնել was clearly a hostile attack, not non-serious joking. These are block-on-sight edits on most of our neighbor wikis; I hate to say that a certain number of good edits allow you to commit vandalism in mainspace.--Prosfilaes (talk) 04:32, 18 September 2012 (UTC)[reply]
If you hate to say it, don't say it. The way I see it, whatever improves the dictionary overall is OK, and at the ratio of excellent edits in lesser-known languages to mainspace vandalism that Vahag carries out, I see no reason not to revert the vandalism and otherwise carry on with our business. BTW: I am in at least one of the minority groups that Vahag makes fun of, so I understand, but I don't let it bother me. --Μετάknowledgediscuss/deeds 05:16, 18 September 2012 (UTC)[reply]
In the ex-USSR homophobia is almost normal, they have a long way to go before it's not considered normal. Think of US or Europe in the 60's ("Mad Men" TV drama shows what the prejudices are like in modern Russia, Armenia, etc.). Vahagn makes silly jokes about gays, Jews, other ethnicities or races on talk pages and most people understand that he is just being funny. It's the first time I've seen these comments in the main space. I don't think he is a homophobe or bigot, though. He gets along with User:Dick Laurent who is gay. No, that edit wasn't normal but I don't think he should be desysopped. Let's wait and see what he has to say. No-one's perfect but we're here to help each other. --Anatoli (обсудить/вклад) 04:43, 18 September 2012 (UTC)[reply]
For one thing, words are not automatically offensive slurs unless they accurately reflect the feelings and prejudices of the speaker. Vahag and Dick are good friends and they throw around these jabs with each other just to be playful. He does it because it is not the way he really feels. I don’t believe Vahag has a mean bone in his body. For another, I think some of the crimes that some of you are alluding to (but dancing around and avoiding any attempt to be specific) may actually be proper...for example, that flamingoes are categorized as the gay bird. I’ve known some gay people to be avid collectors of pink-flamingo statues that stand grotesquely on the front lawn. I don’t know what the logic is, but I can see how flamingoes might be considered the gay bird (not that flamingoes are gay, but that gays are partial to them). As for Canada being categorized as a gay country, maybe it has to do with the Canadian tolerance. The U.S. certainly is not a gay country in that regard...the U.S. is more on a level with Iran. Now whether it is useful or desirable to have categories like that, I can’t say. They don’t have much meaning for me, but maybe they would for some people. On the other hand, he might be simply highlighting the silliness of those particular categories by applying the categories to some words...I know that Vahag didn’t create all of those gay categories. I only looked at a couple, such as Category:gay:Sciences, and someone else created the ones I looked at. Vahag is the best of the best, better than the rest. —Stephen (Talk) 12:22, 18 September 2012 (UTC)[reply]
@Stephen: gay is the language code for the Gayo language, so Category:gay:Sciences is just like Category:es:Sciences, except that it's for Gayo instead of Spanish. Someone who adds an English entry to Category:gay:____ is either making a mistake or engaging in vandalism. —RuakhTALK 15:04, 18 September 2012 (UTC)[reply]
The issue isn't whether Vahag is really an evil bigot or not (he doesn't seem to be), but a lack of judgment and of tact. Being in a position of authority brings with it the responsibility to set an example, and to be careful about how one's words and actions reflect on the project. Vahag is no doubt a first-class editor, and probably a nice guy, but as an admin, he has real problems. By engaging in the very sort of actions that we routinely respond to by reverting edits and blocking the perpetrators, he's setting himself up for accusations of hypocrisy. What's more, he's unnecessarily putting the rest of us on the spot by forcing us to decide between loyalty to a long-time and productive editor (and for many, a friend), and even-handed enforcement of the rules. That goes double for the openly-gay admins here. I may be reading too much into it, but the fact that none (as far as I'm aware) have weighed in on this speaks volumes. We may decide to tolerate his behavior in light of his overall contributions to the project, but we should not minimize the damage he does. Chuck Entz (talk) 13:30, 18 September 2012 (UTC)[reply]
We have tolerated his bad behavior in Wiktionary space for quite some time. But I don't see how it can be tolerated in principal namespace, the public face of Wiktionary, the product of our efforts, our supposed service to mankind, the justification for the free resources we receive.
It is all the worse if he is our sole contributor in his language. How can we trust his entries? Who is looking at them? DCDuring TALK 14:42, 18 September 2012 (UTC)[reply]
  • Speaking as one of the openly gay admins (there are several, and obviously I don't speak for the others), while I don't feel strongly about Vahag's homophobic and anti-Semitic remarks for my own sake, what really bothers me is that he posts them in public forums that newbies and non-Wiktionarians are liable to frequent. Stephen's theory that "words are not automatically offensive slurs unless they accurately reflect the feelings and prejudices of the speaker" only works in an environment where people have some way of knowing the feelings and prejudices of the speaker. In an environment such as this one, where that's not the case, the effect of his remarks is to promote a culture that tolerates homophobia, anti-Semitism, and so on. Why would we want to deter potential contributors? Or put another way — even if the current gay admins are willing to put up with this B.S., why should we demand that all potential gay contributors do so? —RuakhTALK 15:04, 18 September 2012 (UTC)[reply]
Ah, so it’s a language code. How confusing. That explains all of those recent categories that struck me as so odd. As for Vahag, I simply don’t see the offense in him. To me it’s like that well known comedy, Much Ado About Nothing. Or what Pontius Pilate said, something about Ecce homo. —Stephen (Talk) 15:15, 18 September 2012 (UTC)[reply]
It's not about him; it's about what he's doing. It's about "This neologism was coined by brain-damaged homosexual morons from the Armenian company Bi-Line" being exactly the type of stuff that brings ill-repute on Wikis; it's unjustified opinion that attacks a company and it's homophobic to boot. Ignore the person, and look at what an outsider sees when they see that. Look at what an outsider sees if they see this thread and realize that we're not de-sysopping the guy who wrote that.--Prosfilaes (talk) 19:35, 18 September 2012 (UTC)[reply]
  • I'm an openly gay admin, and I've weighed in on this. The fact that his "jokes" are often homophobic has nothing to do with anything; I'm not personally offended by them. But to make them in article space at all (even if he did revert them instantly, which he doesn't) is conduct unbecoming of an admin and frankly makes me no longer trust him with the tools. In other words, I would support desysopping if it came up for a vote. I think it's very odd that we block good, constructive editors for failing to adhere strictly to our byzantine formatting requirements, but allow a blatant vandal to continue not only editing, but also being a sysop. —Angr 17:14, 18 September 2012 (UTC)[reply]

If an administrator vandalizes Wiktionary, the administration privileges should be taken away. Making a joke in the main namespace about dogs with short hair is going too far. I don't see why there is even a discussion here. --BB12 (talk) 08:01, 19 September 2012 (UTC)[reply]

If I had done the same thing (I mean the doubleclick's case) in kawiki the admin crew (nay, super-duper-buper-powerful-extra-mega-yotta-zetta-pretta-exa strict hive XD) would not have just blocked me further than forever (yeah, they do have some Chuck Norrises there :D), but definitely put me on trial :D. It's very good to hear that some people do have a sense and ability of perception of humor. As for this matter, indeed, such disruptive edits affect the reader and highly(!) decrease the trustworthiness of this project. --Dixtosa_HERE IT IS xD_ 17:01, 19 September 2012 (UTC)[reply]
I understand your feelings but Vahagn has many years of great editing behind him. Old-timers sometimes relax, they just need to be reminded and told not to do so. --Anatoli (обсудить/вклад) 23:36, 19 September 2012 (UTC)[reply]
  • I've created a vote-page proposing that he be de-sysopped: Wiktionary:Votes/sy-2012-09/User:Vahagn Petrosyan for de-sysop. I welcome any improvements to it. (Alternatively, if people prefer, I'd be open to voting on some sort of warning: just a vote that his actions have been grounds for de-sysopping, but delaying de-sysopping until such time as he is seen to continue them.) —RuakhTALK 17:47, 19 September 2012 (UTC)[reply]
    Thanks. Recurrent offensive edits - especially in the main namespace - cannot be tolerated in an international project where people of various backgrounds attempt to collaborate in creating a valuable dictionary and are reading it. I, for one, have never found any of VP's "jokes" funny in any respect. I can't judge on his ordinary edits but, like DCDuring rightly asked, how can we trust his entries when he is the sole contributor in his language and leaves a ... (well, what is this doubleclick thing? PoV-push for sure. Hardly a joke, not even one with bad taste) ... standing for more than three years? -- Gauss (talk) 18:32, 19 September 2012 (UTC)[reply]
  • FWIW, I might find his comment funny in certain contexts, but not in the main namespace and certainly not as made by an administrator. I personally might take into consideration an apology in this discussion, though given the gravity of what he has done, my standards for that apology would be quite high. --BB12 (talk) 19:43, 19 September 2012 (UTC)[reply]
    Given his response to the opening of this discussion above, I would say that he doesn't take this very seriously. If we just get a provocation in response to such a complaint, I would not expect an apology. DCDuring TALK 19:50, 19 September 2012 (UTC)[reply]
    I did consider that to be funny, but perhaps you are right. That would be too bad :( --BB12 (talk) 19:56, 19 September 2012 (UTC)[reply]

A poll is a step in the right direction. Hopefully, it can establish a precedent on such matters. Jokes in the main name space make Wiktionary a joke itself. Njardarlogar (talk) 20:34, 19 September 2012 (UTC)[reply]

  • I say judge him on the quality of his edits and nothing else, and give him another chance. If he continues to make disruptive edits then I think a full block may be reasonable. By the way I'm another openly gay admin (as if there were only one in a language-related site...!) ---> Tooironic (talk) 23:22, 19 September 2012 (UTC)[reply]
Thanks! A manager who sacks qualified, experienced employees because of their views, infrequent mistakes, whatever, is a bad boss. I see many are quick here to desysop Vahagn for a bit of silliness. If we desysop him, he may leave altogether or drop editing Armenian like Stephen stopped editing Russian. Even if it's considered serious by some, it's very rare. Wiktionary without foreign languages has little value. Vahagn has been making great contributions in Armenian and Russian for many years and we should value him for that. --Anatoli (обсудить/вклад) 23:36, 19 September 2012 (UTC)[reply]
He's not an employee. He's a manager. Managers have much more responsibility. Still, it seems reasonable to give him another shot provided a proper apology is forthcoming (which does not seem likely to happen). --BB12 (talk) 03:19, 20 September 2012 (UTC)[reply]
His views are irrelevant, the fact is that he consistently vandalises Wiktionary. Wiktionary is not a company, it's created and maintained by volunteers. We cannot hire someone to babysit Vahagn's edits. If you think his edits are that valuable, you should step up and volunteer for that job yourself.
He was warned already in 2010 that vandalising entries could get him desysoped. And what does he do? He vandalises some more, of course. Which is what he'll continue doing no matter how many warnings he'll receive. It's a waste of time, he should be desysoped and/or blocked already. Njardarlogar (talk) 09:06, 20 September 2012 (UTC)[reply]

Vahag, I share the concern DCDuring and Gauss have expressed, that because you're the only contributor in your language, it's impractical for us to judge the validity of your Armenian edits. And as I said earlier, edits like those you made adding non-Gayo words to Gayo categories are insidiously problematic because they're unlikely to be noticed for years if not noticed immediately. As others have said, I find such edits only barely offensive, but that's because I know they are the doing of one editor: I understand how offended a casual reader of Wiktionary who saw them and interpreted them as the opinion of our site community could be. Such edits are immature, and they make Wiktionary look immature.
Your edit to "երկկտտացնել" was more problematic, but it was also made three years ago. I realise such edits are block-on-sight offences if committed on other wikis, or by newbies here, but as Metaknowledge says, you're the contributor of many, many other entries which I'm willing to assume are good. Thus, I would prefer it if you simply agreed not to make POV edits like this, and agreed to undo edits like these immediately after you make them, with the understanding that we'll desysop you if you don't. - -sche (discuss) 04:48, 20 September 2012 (UTC)[reply]

  • Of course, desysopping him won't affect his ability to contribute good work on the Armenian front. It might piss him off, of course - but then, if he goes on a vandalism rampage he can be blocked like any other user. SemperBlotto (talk) 16:18, 20 September 2012 (UTC)[reply]
If he is desysopped, then he can't really remain on the whitelist either, because he might be motivated to more than his usual modest level of same. DCDuring TALK 18:25, 20 September 2012 (UTC)[reply]


Calm down, angry villagers, I want to say a few words:

  • I've made 54,905 edits to this wiki of which only about 30 are what you call "vandalisms". I prefer the term "Easter eggs". And they all eventually got reverted.
  • The joke in "երկկտտացնել" was problematic and not funny, I admit. Shouldn't have done that.
  • I honestly don't believe users who want to punish me have the interests of the community at heart. It's a petty revenge, for mocking them personally. Shalom, Ruakh.
  • If I'm desysopped, I won't quit editing Armenian and Russian. I'll think less of the community, that's all.
  • Finally, I don't hate gays, Jews, blacks, gypsies, etc. Especially because none of those exist in my country. I'm the most tolerant guy in the world. I only hate Canadians. Then again, everybody hates those pinko bastards. --Vahag (talk) 11:44, 21 September 2012 (UTC)[reply]
Re: "I honestly don't believe users who want to punish me have the interests of the community at heart. It's a petty revenge, for mocking them personally": That's not true — fortunately for you. Because if it were true, then that would be compelling proof that your so-called humor has been seriously offending people, to the point that they felt the need to take "revenge". If you do honestly believe that that's the case, then what the Hell is wrong with you? Why, after finding that you were genuinely offending people, would you double down on that? (FWIW, despite your word "honestly", I don't think that you honestly believe this. I don't see how you could. So you don't need to answer the "what the Hell is wrong with you?" question.) —RuakhTALK 12:07, 21 September 2012 (UTC)[reply]
I said above I'm not personally offended by your "Easter eggs", and I'm not. The problem is not that the comments are homophobic/antisemitic/anti-Turkish/whatever, the problem is that long-term editors -- and especially admins -- shouldn't be making joke edits in mainspace at all. It doesn't matter how many good edits you have; they don't make up for the joke edits. Suppose a Habitat for Humanity volunteer had helped build 500 houses, but committed arson and burned down one house: he'd still be guilty of arson, no matter how many other houses he had worked on constructively. —Angr 18:15, 21 September 2012 (UTC)[reply]
Unless new, compelling evidence comes to light I will be opposing Wiktionary:Votes/sy-2012-09/User:Vahagn Petrosyan for de-sysop. The reason is none of his supposed bad behavior is admin related. I also notice he's never unblocked himself. The admin work I've seen him do is good work; good speedy deletions, good reversions, so I can only see negative effects of de-sysopping. Mglovesfun (talk) 19:56, 21 September 2012 (UTC)[reply]
If he's never unblocked himself, it's because he's never been blocked for more than a few hours. You yourself blocked him for only 15 minutes for this trio of hilarity. Me, I'd have blocked him for a week for that. It's true that desysopping isn't really the ideal response to this behavior, since it isn't admin-related. The appropriate response to the behavior is blocking him for a month or so, but I don't trust him not to unblock himself if he's blocked for that long and still has the tools to do so. Nevertheless, I would be willing to oppose both desysopping and blocking if he would (1) credibly apologize and acknowledge the behavior is unacceptable for any editor and unbecoming of an admin, and (2) promise to never make joke edits in mainspace again. If he does that, I'll be happy for him to remain an admin. If he breaks the promise, he gets blocked. If he unblocks himself, he gets desysopped. —Angr 21:14, 21 September 2012 (UTC)[reply]
Posting joke entries is like spray-painting a joke on someone else's store window: it may be hilarious, but it's still wrong. As for "punishment", the primary purpose for de-sysoping isn't punishment, it's used to prevent damage to Wiktionary, either through precluding abuse of admin functions (including the ability to unblock oneself), or through disassociating Wiktionary from inappropriate activity by an admin, who would otherwise be perceived as an official representative of Wiktionary. As for "revenge", I agree with Ruakh that admins shouldn't be "mocking" anyone personally. NPOV is one of the pillars of Wikimedia, and making fun of anyone or anything in mainspace is a violation of NPOV. The only reason more gay editors have come out against such edits is that they don't have appreciation for the humor obscuring the basic issue. Oh, and for the record- I'm not gay, and my position on this isn't due to any personal reasons. Chuck Entz (talk) 20:03, 21 September 2012 (UTC)[reply]

In what appears to be intended as an apology, Vahagn Petrosyan says, "I don't hate gays, Jews, blacks, gypsies, etc. Especially because none of those exist in my country." What country doesn't have gay people? Or "etc."? Is this part of the joke that continues on to bad-mouthing Canadians, where he demonstrates that he has completely missed the point? And when he says, "If I'm desysopped, I won't quit editing Armenian and Russian. I'll think less of the community, that's all." I would think that "the most tolerant guy in the world" would think more of the Wiktionary community for desysoping him on this matter of principle, not less. Finally, his comments that desysopping is for revenge and that his vandalism should be called "Easter eggs" show that, indeed, he doesn't understand that the issue at stake is the integrity of Wiktionary that the administrators need to uphold. --BB12 (talk) 07:35, 22 September 2012 (UTC)[reply]

In Soviet Russia, the Russians believed that there were no gay Russians. In many cultures of Asia, especially Southwest and South Asia, lots of the citizens believe that they have no gay people there and that gays are a product of Western decadence. Armenians still believe that they have no gays. It’s not a joke, it’s a different culture and a different point of view. You think Armenians should know that there are gays there and it is shocking if they do not, and I think you should know that not every culture believes what you believe and it shocks me that you do not. Many cultures recognize a difference in gay people, but see it in a completely different way.
If Vahag were desysopped, I would think that Wiktionary had come to be populated by fools and idiots. This entire matter is a nonstarter and has been blown up out of all proportion. Vahag is not the one with bad judgment...the rabid witch hunt I see unfolding in this discussion is piled high with examples of horrible judgment, and I think anyone who continues with this nonsense is doing a lot of harm to the project and no good whatsoever. Shame on you guys. If you are not ashamed of yourselves, I’m ashamed of you. —Stephen (Talk) 09:36, 22 September 2012 (UTC)[reply]
So discussing what should be done to a repeat vandal who doesn't consider what he does problematic is a witch hunt? You can take your attitude, and shove it. Not every culture believes what you believe; some believe that with great power comes great responsibility, that only the most trustworthy should be given power, and that people who vandalize can't be trusted.--Prosfilaes (talk) 10:09, 22 September 2012 (UTC)[reply]
@Stephen, your assumption about what I think is wrong. It is possible that Vahagn has never noticed the gay pride events in Russia and that indeed he thinks there are no gay people there (if that's his country). But this is not about gay people. Vahagn says also that Roma and Jews do not exist in his country. And that "etc." do not exist there, either. This is part of his joke. The issue is vandalism by an administrator who thinks apologizing for inappropriate behavior is a joking matter. --BB12 (talk) 15:59, 22 September 2012 (UTC)[reply]
I don't agree with most of what Stephen said, but this is feeling a little over-the-top. One of the reasons that I don't want to be more involved with 'pedia is the ArbCom and the witch hunts that happen there. Let's not let it get like that here.
BB12, what you're saying is patently unreasonable, and you must not be putting yourself in his shoes at all if you think that you would care more about the community after the community disowned you. In all honesty, I know I would be pissed off if there was a vote almost wholly composed of people yelling at me, which seems to be what we're going to witness.
Chuck, I think you're taking this too far. If you can't accept even a joke comment (not a joke vote, just a comment) on WT:V, then I really think you've taken this farther than the mainspace edits, which are really the only thing we should be judging.
Angr, your suggestion sounds similar to the second option Ruakh added at Wiktionary:Votes/sy-2012-09/User:Vahagn Petrosyan for de-sysop. I recommend that you and others look at it, because I think it is a viable way to give a warning without desysopping. --Μετάknowledgediscuss/deeds 16:08, 22 September 2012 (UTC)[reply]
I could support Proposal 2 if Vahagn apologizes for vandalizing Wiktionary in the past and promises never to do it again in the future. —Angr 19:30, 22 September 2012 (UTC)[reply]
A formal warning should have been given ages ago. This whole case is way overdue. Given the extensive period of time this vandalism has gone on, it seems like unnecessary bureaucracy to give a formal warning now. It's not like anyone should be surprised that vandalising a dictionary could have repercussions. Njardarlogar (talk) 17:16, 22 September 2012 (UTC)[reply]
@Meta, I'm not sure what you're referring to. I'm addressing issues brought up by his comments here, not objecting to anything he might have said on WT:V. I agree that his comments in the Wiktionary namespace aren't the problem. The activity that we might want to dissociate ourselves from in this case is vandalism of mainspace entries- especially with content that some might find offensive. That part of my comment was more of a general statement about the philosophy of desysopping and blocking, rather than advocating any course of action. I'm trying to counter the "boys will be boys" attitude and the minimizing of the seriousness of the joke edits, but I haven't made up my own mind as to whether desysopping is a good idea, so I certainly am not advocating it- yet. Chuck Entz (talk) 18:44, 22 September 2012 (UTC)[reply]
@ Meta, I think I am putting myself in his shoes. If this happened to me, and I didn't give a proper apology, then I would think better of the community if it desysopped me. (Desysopping is not the same as disowning, which is blocking.) This is not a witch hunt. People are bending over backwards to try to welcome Vahagn in, but he simply wants to make a mockery of Wiktionary. --BB12 (talk) 19:11, 22 September 2012 (UTC)[reply]

Comment period on the Wikimedia United States Federation

There is a proposal for an umbrella organization for chapters and other groups in the US called the Wikimedia United States Federation. A draft of the bylaws is now up at meta. There will be an open comment period on the bylaws 17 September, 2012 to 1 October, 2012. The comments received will be incorporated into the bylaws and they will be put up to a ratification vote from 8 October, 2012 to 15 October, 2012. --Guerillero (talk) 21:56, 17 September 2012 (UTC)[reply]


Categories for undefined languages

We have many categories (such as Category:Proto-Central-Eastern Malayo-Polynesian language) for languages for which we are lacking a definition. Is this OK? Should we delete the offending categories, put them to RfV, or force the user who added the category to add the language definition? SemperBlotto (talk) 13:59, 19 September 2012 (UTC)[reply]

Do we have a cleanup category for this? Would it be possible for {{langcatboiler}} to check for a redlink on the language name, with an optional switch to suppress it in irregular cases- or does the lack of string functions preclude that? Chuck Entz (talk) 14:13, 19 September 2012 (UTC)[reply]
It could check, but there could be false negatives if the entry exists, but not in English. —CodeCat 14:53, 19 September 2012 (UTC)[reply]
The bigger problem is, there is no valid language family Category:Central-Eastern Malayo-Polynesian languages, so there should not be a proto language with this name. -- Liliana 16:42, 19 September 2012 (UTC)[reply]
I've been creating entries for the languages' names as I've been checking the names themselves (proposing renames when needed) to verify that they are attested, but it's slow going. - -sche (discuss) 19:06, 19 September 2012 (UTC)[reply]
The real problem is that the {{proto}} template allows any possible name, even those that don't have a language code (for reasons such as this). So people will take that chance to just make up whatever proto-language they can think of. We've already had to fix/remove several Proto-Altaic etymologies in the past. So I've been thinking we should get rid of {{proto}}, and use {{etyl}} with {{recons}} instead. —CodeCat 19:24, 19 September 2012 (UTC)[reply]
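
The check discussed above (flag a language category when the language name has no definition, without being fooled by a page that exists but has no English section) can be sketched roughly as follows. This is only an illustration in Python, not how {{langcatboiler}} works or could work in template code. The function names, the `page_sections` mapping, and the sample languages "Examplish" and "Samplese" are all hypothetical; a real check would need actual page data.

```python
def language_name(category_title):
    """Return the language name from a title like 'Category:X language', else None."""
    prefix, suffix = "Category:", " language"
    if category_title.startswith(prefix) and category_title.endswith(suffix):
        return category_title[len(prefix):-len(suffix)]
    return None


def undefined_language_categories(categories, page_sections):
    """List language categories whose language name lacks an English entry.

    page_sections maps a page title to the set of language sections on that
    page (a hypothetical data structure assumed for this sketch).
    """
    flagged = []
    for category in categories:
        name = language_name(category)
        if name is None:
            continue  # not a language category, nothing to check
        # A bare existence test would miss pages that exist only in other
        # languages, so look for an English section specifically.
        if "English" not in page_sections.get(name, set()):
            flagged.append(category)
    return flagged


# Hypothetical sample data, for illustration only.
page_sections = {
    "Examplish": {"English"},
    "Samplese": {"Indonesian"},  # page exists, but has no English section
}
categories = [
    "Category:Examplish language",
    "Category:Samplese language",
    "Category:Proto-Central-Eastern Malayo-Polynesian language",
    "Category:English determiners",
]
flagged = undefined_language_categories(categories, page_sections)
```

With this sample data, the second and third categories are flagged: one because the page has no English section, the other because the page does not exist at all.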

/LangCode:Dialectal/ or /Language dialectal terms/

Sorry if it has been discussed many times. Well, I have been thinking of "LangCode:thing" as identical to "LangCode-language terms relating to thing". Compare en:wine, en:wines, en:law, en:card games, etc. So one could easily understand en:dialectal as English terms relating to dialectal (yeah, grammatically incorrect but it doesn't matter), therefore one would expect the category to include: region, dialect, dialectal, nonstandard, maybe some regions regarded as speaking dialectal language and other bullcrements :D

Dixtosa_HERE IT IS xD_ 17:18, 19 September 2012 (UTC)[reply]


Mobile version questions

Mobile for me means Safari on an older iPod Touch.

  • Why don't quotations, etc. collapse on the mobile version like they do in the big-screen version?
  • Many (most?) tables (e.g. translations) are too wide for the mobile version and there's no way to scroll horizontally. If you view the mobile version on a regular-sized display, you get a horizontal scroll bar when the window is small.
  • It doesn't appear that you can edit from the mobile version. Are there plans to make that possible? I know it would be clumsy, but it would still be nice if it were possible.

-- dougher (talk) 00:01, 20 September 2012 (UTC)[reply]


Border between Scottish English and Scots

The dividing line between English and Middle English is typically given as 1470-1500, but do we have any similar guidelines for deciding whether something falls into Scottish English or Scots? I've just added collie-shangie - the citation from Burns is presumably in Scots, but Queen Victoria is surely using (quite literally) the Queen's English, with the other citations falling on a spectrum between the two. Which one should it be categorised under (or should it be marked as both)? Smurrayinchester (talk) 09:58, 20 September 2012 (UTC) (Moved from Tea Room)[reply]

Not sure about Burns and Munro, but the middle three are most definitely English. --WikiTiki89 (talk) 10:04, 20 September 2012 (UTC)[reply]
I'd say Burns and Munro are definitely Scots, and Scott seems to be quoting Scots speech.--Prosfilaes (talk) 10:13, 20 September 2012 (UTC)[reply]

macrons in Nahuatl

What's our policy on macrons in Nahuatl entry titles? I was under the impression it was the same as our policy on macrons in Latin entry titles, i.e. [[ahuacatl|āhuacatl]]. That is our practice in almost all of the entries I've seen. I suppose we just need a WT:ANAH/WT:ANCI to discourage edits like this. - -sche (discuss) 15:03, 21 September 2012 (UTC)[reply]

Of course it depends whether Nahuatl uses macrons, or macrons are only used to indicate vowel length in text books (and so on). I asked Stephen G. Brown this question once while I was doing my best to clear out Wiktionary:Todo/redirects with macrons, he seemed to think Nahuatl does use macrons. Mglovesfun (talk) 16:20, 21 September 2012 (UTC)[reply]
When I asked him an Adyghe question, he deferred to the practice of ady.WP and ady.wikt; perhaps he only favoured macrons for Nahuatl then because nah.wikt and nah.WP use them in entry titles. - -sche (discuss) 18:30, 21 September 2012 (UTC)[reply]
The reference is in User talk:Stephen G. Brown/2010. Obviously, Stephen G. Brown can speak for himself. Mglovesfun (talk) 18:33, 21 September 2012 (UTC)[reply]
Whether we do or don't use macrons, we should mass-create soft redirects (i.e. alt form entries). - -sche (discuss) 18:42, 21 September 2012 (UTC)[reply]
Not using macra, at least for Classical Nahuatl, seems like the wiser course. The classic texts tend not to include them, and our current practice would be difficult, and I think unuseful, to undo. --Μετάknowledgediscuss/deeds 19:52, 21 September 2012 (UTC)[reply]
What is our current practice? Mglovesfun (talk) 20:04, 21 September 2012 (UTC)[reply]
As Sche said (and as you can see by taking a quick peek yourself), current practice for Nahuatl is "macra in headwords, but not pagetitles". --Μετάknowledgediscuss/deeds 15:45, 22 September 2012 (UTC)[reply]
This is a side note, but unless you want to object, I'm going to note in macron that macrons is much more common as an English plural than macra.--Prosfilaes (talk) 08:42, 23 September 2012 (UTC)[reply]

Lucifer

I am pretty sure Special:Contributions/Choochoochoose is Luciferwildcat. Equinox 18:45, 21 September 2012 (UTC)[reply]

...aka User:Acdcrocks, aka User:Gtroy. Just kill him on sight. -- Liliana 19:04, 21 September 2012 (UTC)[reply]
 Blocked. Equinox (isn't that tomorrow?), just use WT:VIP if you see him again. I don't think anyone would dispute that it was just LW again, being even more obvious than WF for some reason. --Μετάknowledgediscuss/deeds 19:50, 21 September 2012 (UTC)[reply]

Tentative Alternative forms content format policy

(This question is related to a project to write a validator for the enwiktionary dump; see User:Fedso for details.) I'm trying to define a format for entries under Alternative forms headings (nothing is specified in Wiktionary:ELE for Alternative forms). Currently (I'm actually testing with a dump from May), out of a total of 59863 entries the formatting breakdown is:

  • 29828 * [[wikified]] {{qualifier}}?
  • 19495 * {{l}} {{qualifier}}?

the rest (10540) are templates, wikified terms and plain text combined in various ways: {{term}}, {{sense}}, {{l-nn}}, {{l-nb}}, {{onym}}, {{forms}}, {{nn-inf}}, {{pedlink}}, {{zh-ts}}, {{R:Webster 1913}}, {{alternative form of}}, {{seeCites}}, {{soplink}}

I think only formats that result in the same output (one wikified term per line with optional qualifier) should be allowed. Any comment?

— This unsigned comment was added by Fedso (talkcontribs) at 20:30, 21 September 2012 (UTC).[reply]

I don't think we should be ultra-strict on this issue. I think {{l}} is preferable to a wikilink because of the automatic script support and linking to the correct section (even more true for people who use tabbed languages, or so I'm told). Mglovesfun (talk) 20:33, 21 September 2012 (UTC)[reply]
I often put multiple forms on the same line if the differences between them are quite trivial, especially (but not exclusively) if there are other forms whose differences are less trivial. Also, I often leave a form unlinked (using {{onym|en||form}} instead of {{onym|en|form}}) if the target page is actually just a redirect to the current page. I don't think I'd support a policy of forbidding either of those options. —RuakhTALK 21:15, 21 September 2012 (UTC)[reply]
I found a {{onym|en||form}} example (pit-yacker), it definitely looks more elegant than a link to itself. Is it possible to obtain the same result with a different template? (I'm searching but without success). I'm asking because from the documentation it seems {{onym}} was meant for a different use, moreover there is a "nominated for deletion" message.
How do you usually format a multiple-form entry? Fedso (talk) 00:05, 22 September 2012 (UTC)[reply]
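The proposed rule (one linked term per line, with an optional qualifier) lends itself to a mechanical check. Below is a minimal sketch in Python, assuming only the two dominant formats from the breakdown above. The regular expressions, the function names, and the assumption that {{qualifier}} and {{l}} take simple pipe-separated arguments are illustrative and are not taken from the actual validator.

```python
import re
from collections import Counter

# Optional trailing {{qualifier|...}} (assumed to contain no nested templates).
QUALIFIER = r"(?:\s+\{\{qualifier\|[^{}]*\}\})?"

# "* [[term]]" or "* [[target|label]]", then an optional qualifier.
WIKILINK = re.compile(
    r"^\*\s*\[\[[^\[\]|]+(?:\|[^\[\]]+)?\]\]" + QUALIFIER + r"\s*$"
)

# "* {{l|lang|term}}", then an optional qualifier.
L_TEMPLATE = re.compile(
    r"^\*\s*\{\{l\|[^{}|]+\|[^{}]+\}\}" + QUALIFIER + r"\s*$"
)


def classify_alt_form_line(line: str) -> str:
    """Return 'wikilink', 'l-template' or 'other' for one list line."""
    if WIKILINK.match(line):
        return "wikilink"
    if L_TEMPLATE.match(line):
        return "l-template"
    return "other"


def format_breakdown(lines):
    """Count how many lines fall into each format class."""
    return Counter(classify_alt_form_line(line) for line in lines)
```

Under these patterns, a line carrying several forms or a line such as `* {{onym|en||form}}` is classified as "other", so the two practices described in the replies would be flagged by a strict policy and would need explicit exceptions.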

Complex heading layout

After reading various help pages and looking at existing layouts I came to the conclusion that headings can be positioned with reasonable freedom, but this discussion Wiktionary:Beer_parlour#Q_about_header_levels_in_exceptional_cases has raised a doubt about the actual correctness of a complex layout like the following (and the correctness of some examples as well):


----

==English==

===Etymology 1===

====Alternative forms====

====Pronunciation====

====Noun====

=====Synonyms=====

=====References=====

====Verb====

=====Translations=====

====Coordinate terms====

===Etymology 2===

====Alternative forms====

====Pronunciation====

====Noun====

====Verb====

=====Coordinate terms=====

===References===

===See also===

===External links===

===Whatever===

Is this valid? (sorry I keep forgetting the signature) Fedso (talk) 20:14, 22 September 2012 (UTC)[reply]

Looks right to me. You know about WT:ELE, right? —Angr 20:25, 22 September 2012 (UTC)[reply]
The Coordinate terms of the verb (etymology 1) should be at L5 (=====) rather than L4 (====). - -sche (discuss) 21:26, 22 September 2012 (UTC)[reply]
Also, what do you mean by "whatever"? Some entries use "Scientific name", but they don't use it in that place, and IMO they shouldn't use it at all (I move the scientific name into the definition whenever I find such an entry). Some entries also use "Statistics" and "Trivia" but, again, there's discussion (see WT:ID#Trivia) of (re)moving these, too. - -sche (discuss) 21:30, 22 September 2012 (UTC)[reply]
I agree with -sche in every respect on this, except that I sometimes treat scientific names as synonyms, especially if there are multiple ones or the scientific name is of some level above genus. Coordinate terms are with respect to definitions and should be a level below the PoS of the definitions. However incomplete our explanation may be, there is a kind of logic to it. One undesirable result of our practice is the occasional placement of alternative forms somewhere other than at the top of the L2 section. I can't think of a good alternative to our present practice for that either.
I have never understood whether or in what sense the same logic applied to certain Chinese and Japanese entries. DCDuring TALK 23:12, 22 September 2012 (UTC)[reply]
I think that is right, except that when the pronunciation is the same across all Etymologies, it can be placed as an L3 above Etymology 1. Ƿidsiþ 08:58, 24 September 2012 (UTC)[reply]
I could have been clearer: with "Whatever" I wanted to say that it is possible to add custom headings, as stated in WT:ELE:
  • "Other sections with other trivia and observations may be added, either under the heading "Trivia" or some other suitably explanatory heading. Because of the unlimited range of possibilities, no formatting details can be provided."
About the Coordinate terms, I thought that having it above a PoS heading, while complying with the nesting principle from WT:ELE, makes it possible to assign a header that is usually L4-5, like Coordinate terms, to more than one PoS, which is useful to avoid cut&paste in languages like Japanese. I was hoping to get a clarification, but 2 are firmly against this practice and 2 don't find any problem in doing that... and you are all admins so I suppose you do know how to edit! I think we have a problem; do you mind if I propose a vote to get a definitive answer? Fedso (talk) 20:30, 24 September 2012 (UTC)[reply]
Certain languages need significantly different entry structures, for at least some entries. Chinese and Japanese come to mind. If you are interested in Japanese, you might try WT:AJA. DCDuring TALK 22:12, 24 September 2012 (UTC)[reply]
I read it superficially but enough to notice that it doesn't even mention "Coordinate terms". Anyway you gave me an idea, I'll code the validator so that different languages can have different heading structures and, for English language, I'll force the placement of semantic relations headings to be below PoS headings. It will make the validator more complex but also more flexible and with 450+ languages I'm afraid I'll need it! Fedso (talk) 23:19, 24 September 2012 (UTC)[reply]
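The English-language rule described here (semantic relation headings must sit one level below a PoS heading) can be sketched as a small check over the heading outline. This is only an illustration in Python: the heading sets `POS` and `SEMANTIC` are deliberately incomplete, and all names are hypothetical rather than the validator's own.

```python
import re

# "==Title==" style wikitext headings; the level is the number of "=" signs.
HEADING = re.compile(r"^(=+)\s*(.+?)\s*\1\s*$")

# Illustrative, deliberately incomplete sets (assumptions, not from WT:ELE).
POS = {"Noun", "Verb", "Adjective", "Adverb"}
SEMANTIC = {"Synonyms", "Antonyms", "Coordinate terms", "Hypernyms", "Hyponyms"}


def parse_headings(wikitext):
    """Return a list of (level, title) pairs in document order."""
    headings = []
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
    return headings


def misplaced_semantic_headings(wikitext):
    """Return semantic headings that are not directly under a PoS heading."""
    ancestors = []  # stack of (level, title) for the enclosing headings
    problems = []
    for level, title in parse_headings(wikitext):
        # Drop headings that are not ancestors of the current one.
        while ancestors and ancestors[-1][0] >= level:
            ancestors.pop()
        if title in SEMANTIC:
            parent = ancestors[-1] if ancestors else None
            if parent is None or parent[1] not in POS or level != parent[0] + 1:
                problems.append((level, title))
        ancestors.append((level, title))
    return problems
```

Run on the layout posted at the top of this thread, the check reports only the L4 "Coordinate terms" heading under Etymology 1, which is the heading -sche pointed out. Per-language rules could be supported by passing different `POS` and `SEMANTIC` sets for each language.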

Anagrams


The blacklist of websites

Why is this virtuous website blacklisted?:D

cais-soas.com --Dixtosa_HERE IT IS xD_ 10:16, 23 September 2012 (UTC)[reply]

[1] Equinox 12:28, 23 September 2012 (UTC)[reply]

User:Nadando

...is now User:DTLHS, and DTLHS has explained that he no longer has access to the Nadando account and no longer wants sysop rights. Thus, I propose we remove User:Nadando's sysop bit, per the standard reason of "the actual person no longer uses the account, and a vandal could possibly hack into it". - -sche (discuss) 16:41, 23 September 2012 (UTC)[reply]

As I understand it, the situation is that DTLHS (talkcontribs) claims to be the same person as Nadando (talkcontribs), and claims that (s)he no longer has access to the Nadando account. I think these claims are true — there was a gap of a few months between Nadando's last edit and DTLHS's first; DTLHS seems to be an honest/legitimate/valid/reliable user; DTLHS has done some of the same sorts of things that Nadando did, such as analyses of database dumps; and DTLHS has not behaved in a way that would suggest a goal of getting Nadando desysopped by this mechanism — but I don't think it's possible to independently verify them. —RuakhTALK 18:27, 23 September 2012 (UTC)[reply]
I am under the impression that he/she randomised the password on that account and thus doesn't know it. There is a way to reset the password, though, if you have the e-mail address; I also randomised without knowing about this, and later used it. Equinox 20:15, 23 September 2012 (UTC)[reply]
Isn't having a registered e-mail address a prerequisite for adminship? If so, Nadando ought to be able to have his password e-mailed to him. —Angr 20:35, 23 September 2012 (UTC)[reply]
I changed my email to a temporary address before changing the password, so no, I can't have it sent to me. DTLHS (talk) 20:50, 23 September 2012 (UTC)[reply]
Um, isn't the fact that DTLHS is operating NadandoBot kind of the duh factor? And that desysopping Nadando will happen anyway due to account inactivity? --Μετάknowledgediscuss/deeds 21:09, 23 September 2012 (UTC)[reply]
Ah, good call. Thanks. —RuakhTALK 12:15, 24 September 2012 (UTC)[reply]
I have removed his admin status and removed his name from the list of administrators. This can always be reversed without a vote. SemperBlotto (talk) 21:13, 23 September 2012 (UTC)[reply]
Perhaps it would be worthwhile adding him to Wiktionary:Administrators/Former. — This unsigned comment was added by 2.136.134.91 (talk) at 08:56, 24 September 2012.
Done. SemperBlotto (talk) 09:03, 24 September 2012 (UTC)[reply]

The format of kaninķenis

Recently, CodeCat mentioned on my talk page that the format of Latvian kaninķenis was not in agreement with the standards here. It is an obsolete form, a borrowing from Germanic (cf. German Kaninchen) later replaced by another borrowing, trusis. Now, in the definition line, I used {{obsolete name of}} to link kaninķenis to trusis, but CodeCat tells me this is not the standard way of doing so. I had thought that this was the reason why {{obsolete name of}} exists, but maybe I'm wrong? (CodeCat also pointed out that this template is apparently used only by me for Latvian words in situations similar to kaninķenis', even though I didn't create it).

I thought a little about this issue. I had also mentioned kaninķenis in the trusis entry, under ===Alternative forms of===, which again, stricto sensu, is probably not correct. So I did the following: I created a new template {{obsolete synonym of}}, I moved obsolete synonyms from ===Alternate forms of=== to ===Synonyms=== (keeping an "obsolete form" tag to distinguish them from non-obsolete synonyms), and I changed the template in the definition line of such obsolete synonyms from {{obsolete name of}} to {{obsolete synonym of}}. (I did all of this for only one case: šķīvis, with obsolete synonym tallerķis; you can have a look and see what I did by comparing them with trusis and kaninķenis).

So: do you guys think this addresses, or perchance resolves, the issue raised by CodeCat? Or should I do something else instead? --Pereru (talk) 17:55, 24 September 2012 (UTC)[reply]

Re: "I didn't create [{{obsolete name of}}]": Yes, you did; see Template:obsolete_name_of?action=history. —RuakhTALK 20:48, 24 September 2012 (UTC)[reply]
Surprising! I had forgotten that. :-/ --Pereru (talk) 22:18, 24 September 2012 (UTC)[reply]
A word shouldn’t be treated as a form just because it is obsolete and a modern synonym exists. If kaninķenis and trusis are only semantically related, they should have lemma entries (correctly tagged as obsolete, when that is the case) and link to each other by means of a Synonyms heading. — Ungoliant (Falai) 21:13, 24 September 2012 (UTC)[reply]
I agree. I suppose the situation with šķīvis and tallerķis conforms to what you're saying? Also, {{obsolete synonym of}} certainly does not imply that something is "a form of" something else; but it does keep the relation, which is more than semantic: one term replaced the other. --Pereru (talk) 22:18, 24 September 2012 (UTC)[reply]
In that case, adding something like supplanted by FOO or nowadays mostly replaced by BAR (etc.) AFTER the definition should be good enough. Or a usage note when there is more to be known. — Ungoliant (Falai) 22:37, 24 September 2012 (UTC)[reply]

Vote to unite the languages?

Could I please get a link to the vote/main discussion about unifying Serbo-Croatian on Wiktionary. It's for the twitter page. I wrote this short, neutral summary about Serbo-Croatian in general here, and hope it is accurate. There's still more I could tweet about Serbo-Croatian in Wiktionary, as it sparked a huge debate (which I DO NOT want to flare up again), but would like the tweets to be factual, neutral, and interesting. Thanks in advance. --Wikt Twitterer (talk) 09:34, 25 September 2012 (UTC)[reply]

Category talk:Bosnian language is one, I don't have the patience to dig up more than that right now. Mglovesfun (talk) 10:49, 25 September 2012 (UTC)[reply]
Umm, Wiktionary:Votes/pl-2009-06/Unified Serbo-Croatian comes to my mind. -- Liliana 20:22, 25 September 2012 (UTC)[reply]
The vote turned into a bit of a farce, where most of the 50 people opposing weren't Wiktionary editors, but from other Wikimedia projects. We've now updated our policies on who can vote - even that was controversial as a lot of non Wiktionary editors again got involved to try and defeat it. Now I seem to think we require anyone who votes to have 50 edits in all namespaces, discounting user page edits. Mglovesfun (talk) 20:53, 25 September 2012 (UTC)[reply]

Main Page

Now that we have the FWOTD, we have kind of a problem. On large screens, the left side is much shorter than the right one, so there's a big chunk of empty space that looks just ugly. I tried moving the index in there, but it doesn't look good at all. Anyone have an idea how to fill this gap? -- Liliana 20:20, 25 September 2012 (UTC)[reply]

Expand "Behind the scenes" to fill the full width of the screen (like "Index") and put a titch more space between the WOTD and the FWOTD to fill the resulting smaller whitespace on the right side? - -sche (discuss) 20:45, 25 September 2012 (UTC)[reply]

signing posts

How tough would it be to render signing each post by clicking a button null? — This unsigned comment was added by 81.9.217.33 (talk).

What's most wrong with this edit?

Not to mention this. You may well view and review this question from the oriental, neutral, or non-Abrahamic perspective. --KYPark (talk) 08:12, 26 September 2012 (UTC)[reply]

I think it's a readability issue. We don't want really long entries; Wikipedia does that, people come to us for relatively short, concise definitions. Mglovesfun (talk) 08:19, 26 September 2012 (UTC)[reply]
I'm not asking the old but new, Ruakh's edit, whose 'wrong' may sound impossible to you. --KYPark (talk) 08:48, 26 September 2012 (UTC)[reply]
Just to clarify, you're asking us to critique this version without references to the previous versions, and answer the question "what's most wrong with it?" Mglovesfun (talk) 08:59, 26 September 2012 (UTC)[reply]
I meant this. I wonder why I made reference to another as above. --KYPark (talk) 09:07, 26 September 2012 (UTC)[reply]
Well my question above stands, as that's what I started replying to and you told me I'd misunderstood. Mglovesfun (talk) 09:08, 26 September 2012 (UTC)[reply]

I refer everybody to:

NOT to your:

And I never told you that you "'d misunderstood." Please be precise so as not to mislead people. --KYPark (talk) 09:23, 26 September 2012 (UTC)[reply]

Well, since you asked "What's most wrong with this edit?", I'd say the most wrong thing is creating a red link for "don't do unto others what you wouldn't have them do unto you", even though that isn't a saying in English and is unlikely ever to be an entry. —Angr 09:45, 26 September 2012 (UTC)[reply]
That red-link was already there; I merely failed to remove it. —RuakhTALK 00:30, 27 September 2012 (UTC)[reply]
I agree (though I don't know if that's the question being asked, but if it is, I agree). Mglovesfun (talk) 09:51, 26 September 2012 (UTC)[reply]
Still you misunderstand me. So I say again "what's most wrong with [Ruakh's counter] edit" upsetting mine. Did you read the process at all, even not terribly carefully, before you say something responsibly?
Should it be most wrong to create a red link for "don't do unto others what you wouldn't have them do unto you," then it would be enough for it to be erased from Alternative forms, instead of reducing the content from 2.5 to 1.0 kbytes. Is your reason still reasonable?
I wouldn't continue this talk because strange edit conflicts continue. --KYPark (talk) 10:14, 26 September 2012 (UTC)[reply]
Examples of usage are supposed to be in the language of the entry, and they must use the exact word/phrase that the entry is about. Thus, only examples containing 역지사지 can be used, not quotations of English texts. I am not sure why the synonym section was removed. Njardarlogar (talk) 14:09, 26 September 2012 (UTC)[reply]
Re: why the synonym section was removed: The sole "synonym" was marked as "-- Confucius". Since Confucius didn't speak Korean (or even some ancestor of Korean), I figured this must be a mistake. Given that the entry was filled with inappropriate content, I assumed this was just another example of such, so I removed it. —RuakhTALK 02:51, 27 September 2012 (UTC)[reply]
See #caveat lector (Reader, beware!)
I would have made the same revisions, pretty much. For one thing, the English-language citations in the September 14 version are irrelevant for a Korean expression. The September 14 content looks more conceptual (encyclopedic) than linguistic. DCDuring TALK 14:49, 26 September 2012 (UTC)[reply]
caveat lector
  1. I'm quite surprised that no debater understands what's the very question. So I guess that the self-righteous suffer cognitive biases to such an extent.
  2. What if all faults, if any, should be found with KYPark, likely the paganic eccentric Eurasiatic, while nothing would be wrong with Ruakh the powerful admin. Simply this is so unlikely. Why?
  3. Singly relevantly, though marginally, wondered in this session so far was "why the synonym section was removed" by Ruakh, who in turn answered, "Since Confucius didn't speak Korean ... I figured this must be a mistake." In effect, he admitted he was not terribly principled or objective.
  4. By doing so, he looks like watering down the original thick question "What's most wrong with this edit" of his. He disguised his scathing counter-edit as a marginal ("m") "trimming" [2] while making a mere bare bone of my special, strategic thick description!
  5. For many years, my such Eurasiatic strategy has remained too notorious for even the least informed to miss it. So I always feel like my opponents, say, PIE people having conspired to make a joke of my edits and make a scapegoat of myself, by all means. May I ask if I'm hypersensitive? Ironically, yes is what I really wish to hear. But surely they wouldn't convince me as I wish.
  6. Meanwhile, Ruakh's easy reply on synonymy as above is in fact too uneasy for me! Let me explain this way.
  7. Native English speakers would hate me, should I interfere with them using Latin such Category:English proverbs as follows:
  8. Likewise, Koreans in turn would hate Ruakh, who in fact or in effect interferes with them using hanja such as 孔子 and his saying 己所不欲,勿施於人.
  9. That simple is the principle of moral reciprocity that has lasted at least 2.5 millennia, repeating itself one way after another, evolving quite uneasy etymologies and meanings, as evident as follows:
    1. 己所不欲勿施於人 (Confucius)
    2. 易地則皆然 (Mencius)
    3. 易地思之 or 역지사지 (author unknown)
  10. So these entries should be handled very carefully! Outsiders, say Ruakh knowing little Korean, have little to do with, or interfere with, them.
  11. But, I'm afraid and sorry that he in fact has been so disruptive as:
  12. By the way, I wish you all to be motivated by this regretting "ownership of language" as a fatal fallacy.
  13. In this regard, I guess, Ruakh is painfully misguided by, and misguiding, such unreasonable and unseasonable ownership of language in this global joint venture like these wikis in this global age. I wish him respice finem and not to do what may make the East Asian very angry at him!
  14. I wonder if whatever Ruakh does is purely for the benefit of Wiktionary. I fear that he might greatly endanger all the wikis that have kept stressing themselves as founded on NPOV. No global wiki forever without global NPOV, I guess.
  15. Eventually, I wonder why Ruakh dares to take risks of the unnegotiable NPOV. So far I have no idea on the relevant root and reason than the deep-rooted self-righteous Abrahamism coupled with Eurocentrism. No world peace with these, I fear utmost!

I wish the above numbered points of mine to be discussed one by one. And I confess that during this session I was blocked for a day for a dubious reason. --KYPark (talk) 10:11, 28 September 2012 (UTC)[reply]

I assume it's because you're not a native English speaker that you were unable to express what you meant. Having read what you've just written, you're talking rubbish and I won't reply because it might imply that I think your points are legitimate, which I don't. Mglovesfun (talk) 10:16, 28 September 2012 (UTC)[reply]

Automatized entry-generation of some types of words

Again, sorry if this has been discussed already. Synthetic and polysynthetic languages are rich in word formation, and Georgian is one of the synthetic languages. Most nouns in Georgian can produce (19+11)(1+1+several other postpositions)=60+ forms. Clearly this job cannot be done manually, but in the case of Georgian it is too complex to trust bots with (e.g. some Georgian nouns produce forms that are grammatically correct but insensible and have never been used... We do not have a right to create entries like that, do we?). Nevertheless, "insensible forms" might be used someday. Even if we create the insensible entries too, it will take a very long time. I think Georgian is not the only language with this "problem".

Also, I am sure you have discussed the inclusion of numerals somewhere and concluded that numerals above 20 should not be created. That's sensible, since numbers are infinite whilst wikidata is not :D. It would be nice if we could give users translations in all languages, including English (there must be a regularity in languages, so we can automate translating), and alternative forms (e.g. see 20) of numbers above 20.

And, here is the solution! :D

I propose we provide (dunno how, but surely it is possible) users with "guess entries", in other words with a facility to see definitions of terms that have not been (or should not be) started yet. The page would look much like a normal one, but with a warning at the top saying it is just a guess and should not be relied on (perhaps entries on numbers would be accurate enough). We can also add a button which searches in the traditional manner. Maybe we can add it to Preferences>Gadgets as an option that can be switched off.

Look, we have გზა (gza) and we know what this suffix -თვის (-t'vis) does and how it works, but we do not have გზისთვის (gzist'vis), which causes a user to leave. Why should we let a user leave wikt disappointed when we can change something (I think it is mediawiki) a bit and thereby improve a lot? BTW, google translate can handle 'gzist'vis'
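The "guess entry" idea can be illustrated with a toy lookup that strips a known postposition and maps the remainder back to an existing lemma. This Python sketch is built on loud assumptions: the one-word lexicon, the gloss strings, and the rule "remove -თვის, remove the ending -ის, then try a few lemma endings" are inferred only from the გზა / გზისთვის example above and do not cover Georgian noun morphology in general. All names are hypothetical.

```python
# Hypothetical mini-lexicon of existing entries (lemma -> short gloss).
LEXICON = {"გზა": "road, way"}

# Hypothetical table of postpositions that attach to a noun form.
POSTPOSITIONS = {"თვის": "for"}

# Assumed ending of the form that -თვის attaches to (as in გზის + თვის).
BASE_ENDING = "ის"

# Lemma endings to try after removing BASE_ENDING (a simplification).
LEMMA_ENDINGS = ("ა", "ი", "ე", "")


def guess_lemmas(form, lexicon=LEXICON, postpositions=POSTPOSITIONS):
    """Return (lemma, postposition, gloss) guesses for an unknown form."""
    guesses = []
    for suffix, gloss in postpositions.items():
        if not form.endswith(suffix):
            continue
        base = form[: -len(suffix)]
        if not base.endswith(BASE_ENDING):
            continue
        stem = base[: -len(BASE_ENDING)]
        for ending in LEMMA_ENDINGS:
            candidate = stem + ending
            if candidate in lexicon:
                guesses.append((candidate, suffix, gloss))
    return guesses


if __name__ == "__main__":
    for lemma, suffix, gloss in guess_lemmas("გზისთვის"):
        print(f"Guess (not verified): {lemma} + -{suffix} ('{gloss}')")
```

A real gadget would have to mark every result as a guess, as the proposal says, since a purely mechanical rule like this will also produce wrong or never-attested analyses.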

Surely, I would help with pleasure if it got positive responses.

Lastly, of course I am not obsessed with amenity of users :D, they are just things that make this project useful : ).

Regards--Dixtosa-wikified me 18:44, 26 September 2012 (UTC)[reply]

From a user's point of view it is an interesting idea. Maybe the user could be presented with a link to a grammar page next to the guess result, that's something Google Translate can't do (well... yet, but probably never) -Fedso TALK 22:33, 26 September 2012 (UTC)[reply]

Is there any (somewhat) objective way to judge how complete our coverage of a language is?

I am wondering if we could somehow quantify how much work we still need to do, before our coverage of any particular language is considered up to par. That means that important words should not be missing. A simple number would be good, like '60% complete for everyday use'. But what could we use for this? I'm thinking we could use news articles or some other kind of prose that is full of everyday words. Some kind of reasonably lengthy text that is translated into many languages would be good too, as long as it's written in everyday language (not the Bible!). Would anyone be interested in such a thing and would it be useful? And if so, how could we make it work? —CodeCat 23:03, 26 September 2012 (UTC)[reply]
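One way to arrive at a single figure such as "60% complete for everyday use" is to count what share of the running words in a sample text already have an entry. A minimal sketch in Python, assuming a plain-text sample and a set of existing page titles are available; the function name, the naive `\w+` tokenisation and the lower-casing are simplifying assumptions that will not suit every language or script.

```python
import re
from collections import Counter


def coverage(text, entries):
    """Return (share of tokens that have an entry, Counter of missing words).

    text    -- sample prose in the language being measured
    entries -- set of lower-cased page titles that already exist
    """
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0, Counter()
    counts = Counter(tokens)
    covered = sum(n for word, n in counts.items() if word in entries)
    missing = Counter({w: n for w, n in counts.items() if w not in entries})
    return covered / len(tokens), missing


if __name__ == "__main__":
    ratio, missing = coverage("The cat sat on the mat", {"the", "cat", "on"})
    print(f"{ratio:.0%} of tokens covered; most frequent gaps:",
          missing.most_common(5))
```

Because tokens are counted rather than distinct words, frequent everyday words weigh more than rare ones, and the `missing` counter doubles as a to-do list ordered by how often each gap occurs in the sample.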

The problem is conjugated forms and compound words. Even if lemma forms were, let's say, 80% complete, a lot of conjugated and compound forms would be missing. So you need an automatic way of finding the lemma entries of a corpus, which AFAIK is very complicated. On the other hand, doing that manually would mean really a lot of work. Matthias Buchmeier (talk) 23:47, 26 September 2012 (UTC)[reply]
Well, those inflected forms are really part of the whole completion. Although they shouldn't weigh in as much, they should probably still count. —CodeCat 23:49, 26 September 2012 (UTC)[reply]
This sounds a bit tautological, but comparing our coverage against that of a mainstream dictionary (and preferably doing it to the part-of-speech level, not just "is this a word?") would be a good indicator. Equinox 23:50, 26 September 2012 (UTC)[reply]
We could use a Monte Carlo method: pick X random entries out of the Y most common words and rate its lemma from 0 to 10, then calculate the average. — Ungoliant (Falai) 23:58, 26 September 2012 (UTC)[reply]
But how would we find the most common words and rate the lemmas? —CodeCat 23:59, 26 September 2012 (UTC)[reply]
WT:FL. — Ungoliant (Falai) 00:02, 27 September 2012 (UTC)[reply]
Oh, wow! That looks good... But what did you mean by rating them? (also, somehow the word "Jax" is in the 1-1000 list for English... vandalism?) —CodeCat 00:07, 27 September 2012 (UTC)[reply]
For example: 0/10 means no entry; 5/10 means an entry with a couple of definitions and a bunch of related terms; 10/10 means an entry with as many definitions as necessary (as opposed to just one, all-encompassing definition; sadly common in our foreign language entries), where the definitions are well written and wikified, with usexes and glosses where necessary (especially for FL entries), a complete etymology instead of just “ultimately from FOO” (and the terms should have glosses), all major pronunciations and some dialectal pronunciations, entries for inflected forms, many semantic relations (and they should use {{sense}}) and related terms, topical/context categories, citations, etc. Then we calculate the average and voilà. — Ungoliant (Falai) 00:18, 27 September 2012 (UTC)[reply]
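The sampling procedure described here (pick X random words out of the Y most common, rate each from 0 to 10, take the average) can be sketched in Python as follows. The rating itself is left as a caller-supplied function, because the scale above relies on human judgement; the function and parameter names are hypothetical.

```python
import random
from statistics import mean


def estimate_quality(frequency_list, rate, sample_size, top_n, seed=None):
    """Estimate the average entry rating by random sampling.

    frequency_list -- words ordered from most to least common
    rate           -- callable returning a 0-10 rating for a word
                      (for example, a score entered by a human reviewer)
    sample_size    -- X: how many words to rate
    top_n          -- Y: sample only from the Y most common words
    seed           -- optional seed, to make a run reproducible
    """
    pool = frequency_list[:top_n]
    if not pool:
        raise ValueError("frequency list is empty")
    rng = random.Random(seed)
    sample = rng.sample(pool, min(sample_size, len(pool)))
    return mean(rate(word) for word in sample)
```

With a few dozen rated words per language this stays feasible by hand, at the price of a sampling error that shrinks only as the sample grows.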
Not that it wouldn't be desirable, but that would require a lot more manual intervention than would be feasible for a site-wide list. This list has to be automatically generated with a script, because there is no way we could do it any other way. —CodeCat 00:23, 27 September 2012 (UTC)[reply]
To avoid manual work it should be possible to assign a numeric value to each property considered essential for a good entry, like having an Etymology field with a minimum length, having semantic relation... Final score could fluctuate adding new properties or changing their values but could be a good approximation to start from. -Fedso TALK 02:08, 27 September 2012 (UTC)[reply]
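A rough illustration of this property-based scoring in Python: each property that is present adds a fixed number of points. The weights, the minimum etymology length and the heading names checked are all invented for the example, and a length threshold is only a crude proxy for quality.

```python
import re

HEADING = re.compile(r"^=+\s*(.+?)\s*=+\s*$")

# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"etymology": 3, "pronunciation": 2, "semantic_relations": 2}
MIN_ETYMOLOGY_LENGTH = 20
SEMANTIC = ("Synonyms", "Antonyms", "Coordinate terms")


def sections(wikitext):
    """Map each heading title to the text directly under that heading."""
    collected = {}
    current = None
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1)
            collected.setdefault(current, [])
        elif current is not None:
            collected[current].append(line)
    return {title: "\n".join(body).strip() for title, body in collected.items()}


def score_entry(wikitext):
    """Add up the weights of the properties found in one entry."""
    secs = sections(wikitext)
    score = 0
    longest_etymology = max(
        (len(text) for title, text in secs.items()
         if title.startswith("Etymology")),
        default=0,
    )
    if longest_etymology >= MIN_ETYMOLOGY_LENGTH:
        score += WEIGHTS["etymology"]
    if secs.get("Pronunciation"):
        score += WEIGHTS["pronunciation"]
    if any(secs.get(title) for title in SEMANTIC):
        score += WEIGHTS["semantic_relations"]
    return score
```

Changing `WEIGHTS` or adding properties shifts every score, so figures from different versions of the rules should not be compared directly.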
Doubtful. For example, some of the worst etymologies we have are some of the longest. It does require human review. Equinox 02:13, 27 September 2012 (UTC)[reply]
@CodeCat. I have the same thoughts. We have entries/translations of advanced terminology but with basic words full of gaps, e.g. Macedonian or Malay. Frequency lists are a good start. Sometimes English word lists can be used, but basic English words may not be as frequent in a target language, or a different part of speech, multipart word or SoP may be used. I'm not too worried about the inflected forms at the moment. Eventually we may have a bot that generates them, hopefully, but having some kind of competition sounds good. I spend a lot of time on translations from English - that way I manage to cover more vocabulary. Translations, if they are good, can be used as a basis for entries. I would base ratings on coverage of a language by the number of entries, coverage of basic vocabulary. Etymology and pronunciation are bells and whistles, IMHO. Providing standard romanisation and clear definitions is a minimum, which may be sufficient. --Anatoli (обсудить/вклад) 02:36, 27 September 2012 (UTC)[reply]
Perhaps "bells and whistles" to a translator, but ety and pron are a big deal in a single-language dictionary. Equinox 02:39, 27 September 2012 (UTC)[reply]
I think our standards for coverage of English and for coverage of other languages will naturally be different. Etymology and pronunciation are probably "bells and whistles" for most non-English languages on en.wikt. (Anatoli does specify "standard romanisation", which is probably adequate pronunciation information for most languages in non-English scripts, since we usually cheat and incorporate pronunciation info in our romanizations.) —RuakhTALK 02:48, 27 September 2012 (UTC)[reply]


(After edit conflict). Let me clarify. Language dictionaries don't usually provide etymology, just help to understand and use the language as it is. Pronunciation info depends on the language in question. Learners/users tend to handle this first. Familiarity with the script and standard transliteration is what is needed, including info for words that are read irregularly. Having audio files is almost beyond our control, IPA is an add-on, if the transliteration is standard. IPA may be more confusing than transliteration to an average person. I don't deny the above info is important but by language coverage I mean - do we have colour names, pronouns, basic adjectives, words enabling one to communicate and what level of communication. The more advanced the learner, the less important the pronunciation becomes. For example, Japanese, I think our coverage of Japanese is very good. Romaji is all it needs to help a learner to read a word. I don't want to hijack the discussion but I think CodeCat meant "everyday vocabulary" coverage, not the completeness of entries. I personally think we should aim not just for quality but for quantity (with accurate and clear info). Nobody is going to consider Wiktionary a great source for a language if we have a small number of perfect entries but fail to provide a sufficient number of words for basic communication.
If I understand Ruakh correctly, yes, if the romanization exists, we can do without IPA in the Pronunciation section. --Anatoli (обсудить/вклад) 03:02, 27 September 2012 (UTC)[reply]
That's assuming the romanization maps one-to-one onto pronunciation, which isn't necessarily the case. The romanization of Tibetan, for example, is pretty far removed from the pronunciation. —Angr 06:25, 27 September 2012 (UTC)[reply]
Agreed, yes, we have to make concessions or choose alternative romanization(s) that better describe how to pronounce the word (as you do with Burmese). No romanization is perfect, even for languages with simpler phonology, but IPA won't help either, for those who are unfamiliar with the phonology of the language or spelling/pronunciation rules. Some Roman based language entries also have complex reading rules but it's assumed that the user will be familiar with spelling rules or the pronunciation section becomes essential, depends on the language. There is no problem with entries like පාන් (pān), for example, when the translation is straightforward. The problem with Sinhalese contents on Wiktionary is not so much the quality but the quantity - too few of them. --Anatoli (обсудить/вклад) 06:49, 27 September 2012 (UTC)[reply]
I admit I haven't really been following this thread (tl;dr) so I don't know whether anyone is seriously proposing removing etymologies and pronunciations from non-English entries. If so, I would most strenuously object. Not being paper, we are the only dictionary really capable of listing all the dialectal pronunciations of Irish words, and since Irish doesn't have a standard pronunciation, the dialectal pronunciations are the only ones we can list. We therefore offer to Irish learners something no other dictionary offers them. But if no one is proposing removing etymologies and pronunciations, I apologize for attacking a straw man. —Angr 07:07, 27 September 2012 (UTC)[reply]
No, I don't think anyone is aiming at removing anything useful. Please read the beginning of the discussion. I quote: "...if we could somehow quantify how much work we still need to do, before our coverage of any particular language is considered up to par". I like the idea and I expressed my opinion on this. Etymologies and pronunciations are icing on the cake, we still miss basic vocabulary. --Anatoli (обсудить/вклад) 07:26, 27 September 2012 (UTC)[reply]
Wow... a lot of reactions. I'm glad to see there is interest in this idea! I did indeed mean coverage of words, not the quality of entries. On the other hand, I don't see any reason why we can't do both, separately. In other words, we could have one number to quantify the coverage of the vocabulary, and another to quantify the quality of the entries themselves. So an entry could be considered complete when it has an etymology, pronunciation and (in case of English) translations. Note that this is rather lenient... not all entries with those things are necessarily any good, but we do know that entries without them are not so good. So it can give false positives ("positive" being "good") but not as many false negatives. We could improve and add more complicated criteria (such as template usage, genders, inflection tables etc) but I'd rather not do that right away because it's hard enough already. Translations should probably be judged separately, with one number for each language. For each English entry, if there is a translation into a language, it counts as a +1 point for translations in that language, 0 otherwise. The translation quality number for a language is then the percentage of English entries that has a translation into that language. —CodeCat 11:56, 27 September 2012 (UTC)[reply]
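The per-language translation number described above (one point per English entry that has a translation into the language, expressed as a percentage of all English entries) could be computed along these lines. This is only a sketch: the input format, a mapping from English headword to the set of language codes with at least one translation, and the function name `translation_coverage` are assumptions made for the example.

```python
from typing import Dict, Set


def translation_coverage(entries: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return, per language code, the percentage of English entries
    that have at least one translation into that language.

    `entries` maps an English headword to the set of language codes
    for which the entry lists a translation (hypothetical input format).
    """
    total = len(entries)
    counts: Dict[str, int] = {}
    for langs in entries.values():
        for lang in langs:
            # +1 point for this language; entries without it contribute 0.
            counts[lang] = counts.get(lang, 0) + 1
    return {lang: 100.0 * n / total for lang, n in counts.items()}
```

A language that never appears in any entry is simply absent from the result, which corresponds to 0% coverage.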
If you need "draft" statistics I'll be happy to provide them. The parser I'm coding, which runs directly on the dump file, already counts headings, so it will really take 5 minutes to adapt it and make lists of words with ranks. I can also try to add complex criteria; again, the program already examines heading content (work in progress), so checking the presence of a specific template, the content of a template or the number of translations is definitely feasible. -Fedso TALK 14:07, 27 September 2012 (UTC)[reply]
Good idea. --Anatoli (обсудить/вклад) 22:40, 27 September 2012 (UTC)[reply]
A rough-and-ready way to get an idea of our coverage is to find a reasonably long text in a particular language (try the Wikisource "long pages") and wikify it (get your text editor to surround every word in double square brackets). Put it in the sandbox and "show preview". The percentage of blue links rather than red is a good estimate of our coverage. You won't get 100% blue, even for English, because of capitalisation issues. And it won't work for Chinese etc. as they don't use "words". SemperBlotto (talk) 15:16, 27 September 2012 (UTC)[reply]
Capitalisation is not a problem, neither is the Chinese wikification. Chinese DOES have words, even if they may consist of one or multiple hanzi. Such wikification would need to modify words to link to lemma-forms, unless we are checking for inflected forms coverage as well. --Anatoli (обсудить/вклад) 22:40, 27 September 2012 (UTC)[reply]
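The wikify-and-count estimate discussed above can also be approximated offline. The following Python sketch assumes a set of existing page titles is available (for example, extracted from a dump); the names `wikify`, `coverage` and `existing_titles` are hypothetical. It splits on `\w+`, so it inherits the limitations mentioned in this thread: no lemmatisation, exact-case matching, and no word segmentation for scripts written without spaces.

```python
import re

# A "word" here is any run of Unicode word characters; this is a simplification.
WORD_RE = re.compile(r"\w+")


def wikify(text: str) -> str:
    """Surround every word with double square brackets."""
    return WORD_RE.sub(lambda m: "[[" + m.group(0) + "]]", text)


def coverage(text: str, existing_titles: set) -> float:
    """Percentage of words in `text` whose exact form is an existing title
    (the offline equivalent of counting blue links versus red links)."""
    words = WORD_RE.findall(text)
    if not words:
        return 0.0
    found = sum(1 for word in words if word in existing_titles)
    return 100.0 * found / len(words)
```

Lower-casing sentence-initial words or mapping inflected forms to lemmas before the lookup would be natural refinements, but they are language-specific and are left out here.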

You might like to read [7] from the BBC news website. SemperBlotto (talk) 15:11, 27 September 2012 (UTC)[reply]

You could also say that this whole language is a Britishism. --WikiTiki89 (talk) 15:28, 27 September 2012 (UTC)[reply]
A few months ago I read an article there complaining about the opposite. Here it is: [8]. More recently they also had an article claiming that English originated in Turkey (because PIE originated there... go figure!): [9]. — Ungoliant (Falai) 16:48, 27 September 2012 (UTC)[reply]

How to transcribe tone in Swedish and Norwegian?

Both of these languages have a rudimentary tonal system, related to stød in Danish. The tone can in some cases be phonemic, like in anden (the duck) which is distinguished from anden (the spirit) by tone: File:Sv-anden anden.ogg. How should this difference be transcribed in IPA pronunciations? —CodeCat 11:16, 28 September 2012 (UTC)[reply]

Appendix:Swedish pronunciation#Stress says to use ˈ for accent 1 and ˌ for accent 2. Not quite IPA usage, since accent 2 isn't secondary stress, but I've seen that method used in other sources. On the other hand, the narrow transcription further down the same page uses a grave accent diacritic to stand for something, presumably one of the two tones. —Angr 11:29, 28 September 2012 (UTC)[reply]
Many Norwegian grammars and dictionaries use the acute accent for toneme 1 and the grave accent (sometimes a háček, i.e. ˇ) for toneme 2, both in front of the stressed vowel. I have seen acute and grave in Swedish books, too.
Using the secondary stress (ˌ) for toneme 2 seems a bad idea to me. --MaEr (talk) 12:51, 29 September 2012 (UTC)[reply]
When it comes to the tonal distinction it seems that it's the tone of the word as a whole that changes rather than the tone of just one syllable. In the example of anden to me the difference is much clearer in the second syllable than in the first. The actual realisation of the tones also differs by dialect, the only real distinction that most of them share is the difference between word tone 1 and word tone 2, how they pronounce them is dialectal. So how can we denote tone in IPA, without making things too specific as to the phonetic realisation or syllable placement of the tones? —CodeCat 13:35, 29 September 2012 (UTC)[reply]
You say the difference is clearer in the second syllable than the first, and the way our Appendix recommends transcribing anden "the ghost" is [ˈanˌden], where the mark for tone 2 precedes the second syllable. In general, if we want a transcription that doesn't get too specific about its phonetics, we do have to use something abstract like repurposing the stress marks or putting ´ and ` into the word in question, e.g. [´andɛn] "the duck" vs. [`an`dɛn] "the ghost". The transcriptions used at w:Swedish phonology#Stress and pitch, while impeccable IPA, are probably too dialect-specific for us. —Angr 15:52, 29 September 2012 (UTC)[reply]
Using secondary stress marks is a bad idea like MaEr said, because then how do we distinguish that from actual secondary stress? I'm not sure if using ` and ´ is much better, because again they are tied too much to the phonetic representation of tone, which is dialect-specific. See w:sv:Ordaccent#Accent i svenskan (in Swedish) for an overview, in particular the dialect differences between central, southern and western Swedish. —CodeCat 16:00, 29 September 2012 (UTC)[reply]
Here is a translated version of the table given there:
| word | accent and stress | standard Swedish tone | western Swedish tone | southern Swedish tone |
|---|---|---|---|---|
| paraply | acute, final stress | normal-normal-high | normal-normal-high | normal-normal-high |
| vatten | acute, penultimate stress | high-normal | low-high | falling-low |
| sommarlov | grave, final secondary stress | falling-low-high | falling-low-high | rising-low-low |
| nyfiken | grave, penultimate secondary stress | falling-high-normal | falling-low-high | rising-low-low |
| ordförande | grave, antepenultimate secondary stress | falling-high-normal-normal | falling-low-low-high | rising-low-low-low |
Using ´ and ` over vowel letters is tied to the phonetic representation of tone, but putting them before syllables like stress marks is not. One possibility for the above words is: para´ply, ´vatten, `som`marˌlov, `nyˌfiken, `ordˌförande (except using actual IPA instead of orthography for the vowels and consonants). —Angr 16:27, 29 September 2012 (UTC)[reply]
CodeCat, you say: the only real distinction that most of them share is the difference between word tone 1 and word tone 2, how they pronounce them is dialectal. So how can we denote tone in IPA, without making things too specific as to the phonetic realisation or syllable placement of the tones? — Indeed. I suggest not to denote tones in IPA, but tonemes (without IPA). If Norwegian pronunciation gets too IPA-like (too detailed) it covers only a few dialects, since there is no standard pronunciation in Norway. I would prefer what Angr suggested (but with only one accent sign per word): [´andɛn] "the duck" vs. [`andɛn] "the ghost" (here without the second accent sign). In this case, we would have:
* /ˈ/ for stress in non-toneme languages (English etc)
* /´/ for toneme 1
* /`/ for toneme 2
* /ˌ/ for the secondary stress
--MaEr (talk) 17:49, 29 September 2012 (UTC)[reply]
That seems like a good solution. But Swedish has stress in addition to tone. I'm not quite sure what the relationship of tone to stress is, mostly because words inherited from Old Norse tend to all have initial stress placement. I also don't really know how tone works in compounds; what happens if an acute-accented disyllable is compounded with a grave-accented trisyllable, for example? Does the compound retain both accentuation patterns in the respective parts, or does the new word take on a new accent? —CodeCat 18:39, 29 September 2012 (UTC)[reply]
The last three words in the example table above are compounds, aren't they? I'm fine with only using the ` mark once per word; in fact, I prefer it. I only suggested using it twice because CodeCat seemed to want something on the second syllable even though stress itself falls on the first. —Angr 19:46, 29 September 2012 (UTC)[reply]
They are, but I'm not really sure how or why the tone placement is the way it is. Is ordförande grave-accented as a whole, even though ord itself is acute-accented? Can a word have something like an acute-grave or grave-grave accent or some other combination, or is the tone entirely determined by placement of primary and secondary stress and the tone pattern of the word as a whole (with no combinations)? The examples don't really make that very clear. —CodeCat 21:15, 29 September 2012 (UTC)[reply]
About the relationship between toneme and stress: in Norwegian and Swedish, only the main-stress syllable can have a toneme distinction. Example: bønder ("farmers", toneme 1) vs. bønner ("beans", toneme 2). The only difference is the toneme. If these words, however, have secondary stress, the distinction gets lost, and thus there are no more tonemes. Example: in the second part of småbønder ("small farmers") there is no more toneme. --MaEr (talk) 09:21, 30 September 2012 (UTC)[reply]
Then how come the three compound words in the table above are marked as 'grave' accent when according to you they should not have any tonal class at all? —CodeCat 12:07, 30 September 2012 (UTC)[reply]
It's only the less stressed part of the compound that loses the toneme. Example: in the second part of småbønder the toneme gets lost, but the first part still may have toneme 1 or 2. Other example: ordförande would be /`ordˌförande/ — ord- has the main stress and toneme 2, -förande has a secondary stress on för (but no toneme distinction here). --MaEr (talk) 14:04, 30 September 2012 (UTC)[reply]
Ok, I understand that at least. But why does ord- have tone 2, when the word ord by itself has tone 1? Can single-syllable words even have tone 2, even in compounds? —CodeCat 14:19, 30 September 2012 (UTC)[reply]
As a rule of thumb, Old Norse two-syllabic word forms have toneme 2 in modern Norwegian and Swedish (ON baunir > N bønner) whereas ON one-syllabic word forms have toneme 1 in modern N and S (ON bœndr > N bønder). But there are exceptions. And a toneme is not a feature of a word but of a word form. Norwegian bonde ("farmer") has toneme 2 (Old Norse búandi, bóndi) but the plural has toneme 1. According to Hallfrid Christiansen ("Norske dialekter") we even have a toneme difference between gårdgutt (2) and gårdsgutt (1). I guess that gårdgutt was a two-syllabic word form in Old Norse whereas gårdsgutt was two separate word forms in Old Norse. I think the key is the Old Norse situation: if a word form has two syllables in Old Norse, it is likely to have toneme 2 in modern Norwegian and Swedish. --MaEr (talk) 14:43, 30 September 2012 (UTC)[reply]
So if the compounding process follows the original Germanic/Indo-European method of using the bare stem or a prefix, then it counts as a multisyllable (gårdgutt, föreligga), but if it follows the Old Norse method of using the genitive case, then it counts prosodically as two separate words (gårdsgutt, hälsovård)? Also, what happens to words (mostly loanwords) with noninitial stress? Can they ever have tone 2? —CodeCat 15:10, 30 September 2012 (UTC)[reply]
So are we agreed that tone 1 shall be transcribed with ´, tone 2 with `, and secondary stress with ˌ? Shall we update Appendix:Swedish pronunciation and Appendix:Norwegian pronunciation (the latter of which currently says nothing about tone at all) to say this? —Angr 15:04, 30 September 2012 (UTC)[reply]
Yes I agree, but it should be placed at the beginning of the word so that it applies to the word as a whole. Also, does the tone marker replace the stress marker or is it used in addition? /`skriːva/ or /`ˈskriːva/? —CodeCat 15:10, 30 September 2012 (UTC)[reply]
Well, it should be placed before the stressed syllable. For most words, that's the first syllable, but for the example of paraply above it should be before the ply. And it should be instead of the stress mark: since only (primarily) stressed syllables take tone anyway, the presence of the tone mark already implies the presence of stress on the marked syllable. So marking stress as well would be redundant. —Angr 15:36, 30 September 2012 (UTC)[reply]
That doesn't agree with the anden example though, the second syllable clearly changes there as well, perhaps more so than the first. —CodeCat 16:06, 30 September 2012 (UTC)[reply]
I agree with Angr. I would like to avoid double accent signs. I never have seen double accent signs in Norwegian/Swedish grammars or dictionaries. --MaEr (talk) 16:33, 30 September 2012 (UTC)[reply]
If the second syllable of `anden changes too, that's just part of the phonological realization of the grave accent that "resides" on the first, stressed, syllable. It's like Serbo-Croatian: we mark pitch accent on the stressed syllable, even though the effect of the pitch can be realized on the following syllable as well. I think having just a single mark on `anden etc. is the only way to achieve the goal of having a pan-dialectal transcription that doesn't get too specific about its phonetics. Because the only plausible alternative is to transcribe it [ˈan˧˩dɛn˥˩], which is dialect-specific. I do think, however, that we should use these markings only in broad /phonemic/ transcription. Narrow [phonetic] transcription can certainly be dialect-specific and can show the tones in a precise way. —Angr 17:35, 30 September 2012 (UTC)[reply]
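The convention proposed in this thread (a single tone mark placed before the primary-stressed syllable, replacing the stress mark, with ˌ kept for secondary stress) can be sketched in Python as follows. The function name `mark_tone` and its parameters are hypothetical, and the examples use orthographic syllables rather than real IPA, as in the examples given earlier in the discussion.

```python
# Tone marks as proposed in the thread: acute for toneme 1, grave for toneme 2.
TONE_MARKS = {1: "\u00b4", 2: "`"}   # U+00B4 is the acute accent
SECONDARY_STRESS = "\u02cc"          # U+02CC is the IPA secondary stress mark


def mark_tone(syllables, primary, toneme, secondary=None):
    """Join `syllables`, putting the toneme mark before the primary-stressed
    syllable (index `primary`) and the secondary stress mark before the
    syllable at index `secondary`, if one is given.

    No separate primary stress mark is written: the tone mark implies it.
    """
    if toneme not in TONE_MARKS:
        raise ValueError("toneme must be 1 or 2")
    if not 0 <= primary < len(syllables):
        raise IndexError("primary stress index out of range")
    parts = []
    for i, syllable in enumerate(syllables):
        if i == primary:
            parts.append(TONE_MARKS[toneme] + syllable)
        elif i == secondary:
            parts.append(SECONDARY_STRESS + syllable)
        else:
            parts.append(syllable)
    return "".join(parts)
```

For instance, `mark_tone(["ord", "fö", "ran", "de"], 0, 2, secondary=1)` yields the grave mark before ord and the secondary stress mark before för, matching the transcription of ordförande given above.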

Non-free content criteria

During this RFD discussion, it became apparent that Wiktionary is currently breaking WMF rules by having non-free content without having a policy page detailing criteria for the use of such content. To conform to WMF rules, we could create such a page (an EDP), or we could delete the non-free content. I would prefer that we set up an EDP, because I believe it is, rarely, necessary for us to host non-free content; therefore, I have tweaked w:Wikipedia:Non-free content criteria to make it appropriate to Wiktionary and thus create Wiktionary:Non-free content criteria as a starting-point. Discuss... - -sche (discuss) 02:04, 30 September 2012 (UTC)[reply]

Argh! That policy is a great example of the instruction creep for which Wikipedia is infamous. I'm sure we can craft a nicer one that isn't so verbose and complex, but that WMF will still like.
Also, the draft policy would need to be clearer on the namespaces in which non-free images are permitted.
On the whole, though, I'm still against the use of non-free images here. At the moment, we use them on citations pages, where they are always replaceable by simple textual descriptions, and in project discussions, a use which seems contrary to the vaguely-worded point 3 of foundation:Resolution:Licensing policy. So, quite simply, we can live without non-free images and remain a purely free-content project, while not being worse off for it. This, that and the other (talk) 06:00, 30 September 2012 (UTC)[reply]
I agree, but this is not the place for that discussion. I think we should give -sche's draft a chance to be written and then hold a vote saying "Do we want to implement this EDP and allow nonfree files on Wiktionary, or do we want to prohibit nonfree files here?" The former choice ought to entail allowing nonadmins to upload files locally (since Special:Upload is apparently currently available only to admins); the latter will entail deleting the 4 nonfree files we are currently hosting. —Angr 14:59, 30 September 2012 (UTC)[reply]
I think I disagree. It seems to me that if we first discuss the general principles, such as "Do we want to allow nonfree files here? If so, which/for what purposes?", then we'll end up with an EDP that's closer to what the community wants — or no EDP at all, if we don't want nonfree files — than if we try to create an EDP first and then just hold an up-or-down vote on it. No? —RuakhTALK 23:14, 30 September 2012 (UTC)[reply]
I disagree about the choice being between anyone can upload and everyone can upload. The number of cases where we bypass Commons should be as small as possible, and should be strictly controlled- otherwise we risk becoming Commons' reject-bin. That might mean having a mechanism to approve such images before they become available for use, or it might mean keeping the admin-only set-up we have now, but having a procedure for sending the file to an admin if you want it uploaded. I don't know if either is practical or desirable, but I don't want to have to deal with the potentially quite large increase in copyright and other legal issues that unrestricted uploading would present us with. Chuck Entz (talk) 23:46, 30 September 2012 (UTC)[reply]
I agree with Ruakh that discussion should precede any vote on an EDP. It seemed from the RFDO discussion that some users viewed the issue as too complex for Wiktionary to handle; I provided a starting-point draft to counter that idea, but the draft should be refined to address the kind of non-free content we want and 'why'.
In the RFDO discussion of File:ghoti oeufs sign.png, Angr described three other files as non-free: File:Far Side 1982-05-28 - Thagomizer.png, File:f-word-xxxx.png, and File:khw-superscript.jpg. I see File:ghoti oeufs sign.png as unnecessary, and I have proposed that File:f-word-xxxx.png (which I uploaded) be deleted. File:Far Side 1982-05-28 - Thagomizer.png is necessary, because only the image conveys the meaning of the term. File:khw-superscript.jpg is linked-to from a vote, and I dispute that having an image of a snippet of a book's words is different from having a retyping of a snippet of a book's words as far as any permissibility is concerned (and the image is necessary precisely as an image, from which Unicode codepoints cannot be inferred with certainty, whereas any typing-up of the text would necessarily use Unicode characters...though if someone could find similar IPA in a free book, we could use an image of that, instead). Thus, it is very rarely necessary to host non-free content, but it is necessary to host non-free content, in my view. - -sche (discuss) 00:34, 1 October 2012 (UTC)[reply]
I like free-only (see Angr's essay here), but there's not much point to the files we have that are non-free. khw-superscript is unnecessary now that the vote is history, and I don't really follow your reasoning about thagomizers, Sche. The world contains a good few thagomizers. It follows that Commons must have at least one good, free thagomizer photo (I haven't checked). Heck, if it doesn't, I'll upload an image of a thagomizer myself. When we're that close to such an admirable goal, why not go all the way? --Μετάknowledgediscuss/deeds 04:50, 1 October 2012 (UTC)[reply]
Re thagomizer: the comic is a citation which verifies that the term "thagomizer" means "the arrangement of spikes found on the tails of various stegosaurs", and which is uniquely relevant as the earliest citation: the comic coined the term. If you take away the picture and keep only a retyping of the caption, it is entirely unclear what "thagomizer" means. Terms have failed RFV and been deleted when the only available citations of them were too unclear for a coherent meaning to be worked out, and while "thagomizer" has enough other citations that it is not likely to be in danger of being deleted, the importance of citations of the use, not merely the existence, of words should be clear.
Re khw-superscript: I emphatically oppose the idea that images should be deleted simply because the discussions or votes which they made intelligible are not currently live (even in the unlikely event that the subject of superscripts never comes up again — though it already came up at RFV after the closing of the vote...). Some other images, some of which I listed after another user tagged them {{d}}, have already been kept at RFDO because "the discussion which uses this image is no longer live" has been deemed a poor reason to delete something. - -sche (discuss) 05:30, 1 October 2012 (UTC)[reply]
Don't you find the description at thagomizer#Etymology sufficient? To me, the non-free picture adds nothing that the textual description of the comic doesn't already provide (viz. the context, the humour, and the citation).
As far as the vote picture, I do not really think that was needed in the first place. The textual transcription of the picture was sufficient. What did the image add to the vote in your opinion? (More to the point, we would need a very good reason to allow non-free media in project-space, since most [all?] WMF wikis do not permit such use under any circumstances.) This, that and the other (talk) 07:29, 1 October 2012 (UTC)[reply]
I thought the ghoti sign was on Commons; I certainly recall arguing that it would be okay for Commons at some point. I don't think that File:f-word-xxxx.png could be justified as "fair use", given that we're in direct competition with the commercial selling of the book in question and it's copying an entire entry. File:khw-superscript.jpg is marginal PD-text; as people have said, we use that much for citation. (I'm pretty sure if the WMF got sued over Wiktionary, no lawyer would waste their time arguing that our citations were PD-text when they could argue that they are practically the definition of fair use, a much easier argument.) If necessary, it could be cut down to just the first line, which I would argue is a clear PD-text. We could argue fair use for thagomizer; I don't find it particularly essential, though.--Prosfilaes (talk) 07:51, 1 October 2012 (UTC)[reply]
I screencapped and clipped that Ghoti Oeufs sign image, and never uploaded it to Commons. Though legal and fair use for reasons in its description, I agree that the page can do without it, so I'll go along with the consensus for deleting it. ~ Röbin Liönheart (talk) 16:21, 1 October 2012 (UTC)[reply]

Block 'bad-iw' edits altogether?

I can't really think of any situation where such edits are valid. Should we block them altogether? It would save us a bit of work. —CodeCat 21:41, 30 September 2012 (UTC)[reply]

Pardon my ignorance, but what is 'bad-iw'? --WikiTiki89 (talk) 22:00, 30 September 2012 (UTC)[reply]
We have a filter system that marks specific kinds of edits as problematic so that we can find them later. One of them is 'bad-iw' which means that someone added an interwiki link to a page with a different name from the current one. Most people do that because they are trying to translate the term, which is an error. —CodeCat 22:22, 30 September 2012 (UTC)[reply]
Maybe they could be valid if the headwords rules are different. For example, our Hebrew headwords are generally ktiv male but the he.wikt uses ktiv khaser so while we have ניקוד, they have he:נקוד. I don't know if this should be solved with "bad-iw"s though. --WikiTiki89 (talk) 22:44, 30 September 2012 (UTC)[reply]
Nope, we insist on exact title-equality. Differences in headword-rules can be (and frequently are) addressed by using redirects, but not by adjusting the rules for interwikis. —RuakhTALK 22:57, 30 September 2012 (UTC)[reply]
We would have to limit it to mainspace, since none of the interwikis on the left of this page match the headword, as is typical for the Wiktionary namespace. Chuck Entz (talk) 22:54, 30 September 2012 (UTC)[reply]
bad-iw already is limited to mainspace; that's what the article_namespace==0 part means. —RuakhTALK 22:57, 30 September 2012 (UTC)[reply]
Good, I thought that might be the case. I don't think we should go from tagging to disallowing. Most of the bad iws are due to ignorance, not bad intentions. I think it would be great if such an edit got a response like: "Interwikis should only link to entries in other Wiktionaries with exactly the same spelling. Are you sure?" That way the differences in orthographic conventions, or even scripts (Serbo-Croatian and Kurdish- need I say more...) can be accommodated, while still educating the majority who are mistaken, but editing in good faith. Chuck Entz (talk) 23:13, 30 September 2012 (UTC)[reply]
Re: "the differences in orthographic conventions, or even scripts [] can be accommodated": Well, not really. We can give the illusion of accommodating them, in that we'll happily let people add invalid interwikis, but the interwiki-bots will still just remove them at first opportunity. (That could probably be changed, if there were consensus to change it, but I don't think that there is; and anyway it would need to be a separate discussion, and would probably require consultation with the maintainer(s) of the pywikipediabot framework that most of the interwiki-bots use.) —RuakhTALK 23:18, 30 September 2012 (UTC)[reply]
Curiously I managed to make a valid bad-iw edit by linking to the French Wikisource using the syntax [[:fr:s: instead of [[:s:fr:. They're both identical in terms of where you end up, but only the first one sets off the bad-iw filter. Mglovesfun (talk) 22:29, 4 October 2012 (UTC)[reply]
Are you sure? In the last 500 hits for bad-iw, the only one that's you is one from September 5th where you used the syntax [[fr:s (note the missing colon), thereby genuinely adding a bad interwiki to the entry. —RuakhTALK 23:07, 4 October 2012 (UTC)[reply]
IIRC we can educate and disallow: a first attempt will warn the user and a reattempt to save the bad edit will be blocked. I recommend this if possible.​—msh210 (talk) 03:51, 5 October 2012 (UTC)[reply]
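For readers unfamiliar with the check discussed in this thread, here is a hedged Python sketch of the rule as it is described above: in the main namespace (number 0), an interwiki link must point to a page with exactly the same title as the current page. This is not the actual filter code, which is not quoted here beyond its namespace condition; the function name `bad_interwikis`, the regular expression and the `lang_codes` parameter are assumptions made for illustration.

```python
import re

# Matches [[xx:Title]] with no leading colon. A leading colon, as in
# [[:fr:s:Title]], makes an ordinary inline link and is deliberately not matched.
INTERWIKI_RE = re.compile(r"\[\[([a-z]{2,3}(?:-[a-z]+)*):([^\]|]+)\]\]")


def bad_interwikis(page_title, namespace, wikitext, lang_codes):
    """Return (language code, target) pairs for interwiki links whose target
    differs from `page_title`. Only the main namespace (0) is checked."""
    if namespace != 0:
        return []
    bad = []
    for code, target in INTERWIKI_RE.findall(wikitext):
        if code in lang_codes and target != page_title:
            bad.append((code, target))
    return bad
```

A warn-then-disallow policy of the kind suggested above could call such a check on each save, showing a warning on the first attempt and rejecting a repeated attempt.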