Wiktionary:Beer parlour/2013/November: difference between revisions

(Vowel length needs to be marked in Greek: this probably isn't the best approach to the issue.)
(Vowel length needs to be marked in Greek: I have no idea why this is one of the first examples that came to my mind…)
 
*:::: Why should we? Generating all of the possible outputs in Lua and hiding unwanted ones in Javascript sounds like...bad engineering. --[[User:Ivan Štambuk|Ivan Štambuk]] ([[User talk:Ivan Štambuk|talk]]) 18:25, 16 November 2013 (UTC)
 
 
*:::::I think we're looking at this the wrong way. The crucial decision is what the vast majority of our users are going to see, what the default is. Quite frankly, making a user preference for different Ancient Greek transliterations is probably a waste of resources. Let's focus on a singular decision. -[[User:Atelaes|Atelaes]] <small>[[User talk:Atelaes|λάλει ἐμοί]]</small> 18:35, 16 November 2013 (UTC)
 
*::: I think such an extension, even if possible to implement, would defeat the extensive caching system built around here (template expansion results could no longer be cached, because they depend on the user who is viewing the page). Also, it would be possible to write code like <source lang="lua" container="none">if USERNAME == "Ivan Štambuk" then return "" else return "[[User:Ivan Štambuk]] smells" end</source>. Now tell me, should [[Special:Whatlinkshere/User:Ivan Štambuk]] list a page which invokes such a module? So I would not expect anything like that installed here. [[User:Kephir|Keφr]] 18:37, 16 November 2013 (UTC)
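The caching objection above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ExpansionCache`, `render_for`, and the user names are hypothetical, and nothing here reflects how MediaWiki's real caches are implemented. It shows that a cache keyed by page alone serves one viewer's output to everyone, while a cache keyed by page and viewer is correct but has to render once per viewer.

```python
class ExpansionCache:
    """Toy cache for expanded template output (hypothetical, not MediaWiki's)."""

    def __init__(self):
        self.store = {}
        self.renders = 0

    def get(self, key, render):
        # Render only on a cache miss, and count how often that happens.
        if key not in self.store:
            self.store[key] = render()
            self.renders += 1
        return self.store[key]


def render_for(user):
    # Hypothetical module whose output depends on who is viewing the page.
    return "" if user == "Alice" else "hello, stranger"


page = "Example page"

# Keyed by page only: the second viewer is served the first viewer's output.
by_page = ExpansionCache()
first = by_page.get(page, lambda: render_for("Alice"))
second = by_page.get(page, lambda: render_for("Bob"))

# Keyed by (page, viewer): correct output, but one render per viewer.
by_viewer = ExpansionCache()
for user in ("Alice", "Bob", "Carol"):
    by_viewer.get((page, user), lambda u=user: render_for(u))
```

With the page-only key, `second` is whatever was rendered for the first viewer; with the per-viewer key, the render count grows with the number of viewers, which is the cost the comment is pointing at.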

Revision as of 18:37, 16 November 2013


Worder

Hi, can an admin please remove or suppress information from Worder's user page and talk page? It contains an email address. --~ curtaintoad ~~ talk ~ 20:53, 1 November 2013 (UTC)

I took care of it. For future reference, I don't think the Beer Parlour is the best place to bring this up. Try posting on an active admin's user page, or, if it's really bad, on Wiktionary:Vandalism in progress. --WikiTiki89 21:11, 1 November 2013 (UTC)
Okay, thanks. I also got confused if this violates the policies or not (regarding email addresses), but thanks again for taking good care of it. --~ curtaintoad ~~ talk ~ 21:10, 2 November 2013 (UTC)

Rollback

Hey. Please add me to the rollback group as it will allow me to revert vandalism much better and easier. I believe I have made enough reverts to be granted this permission and to be trusted with it. Thanks, --~ curtaintoad ~~ talk ~ 22:51, 2 November 2013 (UTC)

I think it's a little too early for that, considering you've been here since August and only have 65 edits in the main namespace. You have not even been made an autopatroller yet. Keep it up though, and you'll get there soon. --WikiTiki89 23:05, 2 November 2013 (UTC)

Transliteration of Georgian consonants

What is the rationale behind transliterating the aspirated consonants with an extra «’»? I note that in Georgian: A Reading Grammar by Howard I. Aronson the ejective consonants are marked with a dot below the transliterated letter, while in Beginner's Georgian by Dodona Kiziria, there's a «’» after a transliterated ejective consonant. In IPA representation, ejective consonants are also marked with «’». With the perspective from within the English language, it also seems weird to transliterate consonants that are essentially equivalent to the English ones with an extra symbol, while the consonants that are fairly different are transliterated as if they were the closest equivalents. --Njardarlogar (talk) 10:23, 3 November 2013 (UTC)

Aspirated Georgian consonants are marked with an extra «’» in many transliteration systems, namely ISO 9984 (the one we use), ALA-LC, BGN/PCGN. The tradition probably goes back to the Hübschmann-Meillet transliteration system for Armenian, where aspirates are marked by «ʿ», but I have no proof. Your first source must be using the Caucasiological transliteration scheme employed in serious literature, such as Klimov and Fähnrich. I do not mind switching to it. Your second source must be using the National system; it is unscientific and should not be adopted by us. --Vahag (talk) 11:27, 3 November 2013 (UTC)

Separate articles for inflected forms

Discussion moved from Wiktionary:Grease pit/2013/November#Separate articles for inflected forms.

(Moved from Grease Pit) itsacatfish 14:54, 3 November 2013 (UTC)

I have mentioned this before, but I wasn't convinced. In my mind, articles such as incalzai and salmodiaba do not have any function at all that is not achieved by a redirect and the presence of a conjugation table on the main article - all the information is provided elsewhere, ergo the article is not needed. I feel these articles - which, judging by the "hit-rate" of this type of article when using the Random entry feature, take up at least 50% of all articles on the entire wiki - are just unneeded, and should be replaced with redirects in all cases, except for when there is a justification not to (such as the presence of multiple etymologies of the word or specific idiomatic usage of a particular wordform).

itsacatfish

Perhaps there is a way to prevent soft redirects from showing up at Special:Random? --WikiTiki89 21:14, 1 November 2013 (UTC)
Different words can share one or more inflected forms and the lemma form of one word can be identical to the inflected form of another word. In the Latin script at least, we cannot safely predict which inflected forms will not conflict with other words; and it would give the reader the wrong idea if a word redirected when it shouldn't. In short, redirects are a bad idea.
Ideally, the software would have been designed fundamentally differently so that inflected forms were added automatically from the inflection tables rather than being created manually or by bot. That way, we could also treat pages containing only inflected forms as non-content pages. --Njardarlogar (talk) 21:51, 1 November 2013 (UTC)
Sure, there are some spellings that are known to be unique among all the languages of the world, and there are many that are shared between at least two well-known languages so it's obvious that they can't be redirects. What about all the other cases? No one speaks all of the languages of the world, or even has references for them, so no one person knows if a particular spelling is unique to one language or not. Even if we're able to guess right more often than not- what happens when we're wrong?
The problem with redirects is that they're not redlinked, so someone adding a translation to an English entry may be fooled into thinking there's already an entry for it in their language. If they do realize there isn't an entry, I suspect many casual contributors won't know how to get to the redirect page and convert it to an entry, or may think they're not allowed to. We then end up with the redirect ensuring that there never will be any content on that page- regardless of the merit of the content that might be added.
By the way, this isn't a technical question, so this would have been better addressed at the Beer Parlour. Chuck Entz (talk) 03:55, 2 November 2013 (UTC)
Indeed, and Special:Random isn't really relevant because it has nothing to do with what entries people look for. Also having something like joue redirecting to jouer means you have to visually scan the whole table to find all the instances of joue. And joue has a Dutch section and a noun on it, so you can't redirect it to jouer and lose both the noun and the Dutch section. In fact the Dutch is a form of jouen so you need to simultaneously redirect to two entries, and keep the noun section. Totally impossible and in my opinion not desirable. Mglovesfun (talk) 12:12, 2 November 2013 (UTC)
Ok I take the point of "people may be scared to convert a redirect page into an actual article" argument. That might be true but it's a fairly weak argument since it's still quite possible to do, and the kind of people who wouldn't do it are probably the kind of people who wouldn't make a new article for the word anyway.
All of the other comments completely fail to take into account my final statement that redirects should be made "except for when there is a justification not to" i.e. except when there is an entry in another language or that particular wordform has other idiomatic or unrelated meanings.
As a learner of a highly inflected language - Russian, I know that 99% of the time, I just want to find the definition of a word. In the 1% of cases that I want to find a particular inflected form, I am quite happy to just look for it in a declension table - this is not a difficult task and does not warrant the creation of literally millions of soft redirects. itsacatfish
One of the lesser functions of Wiktionary is lemmatization/stemming. When it becomes technically possible to relegate it to the search box, Wikidata or something else, or perhaps when search of template-generated inflected forms becomes feasible, we could get rid of those "dumb" entries that are also consuming vast amounts of the database dump. Until then, you will have to make that extra mouse click. --Ivan Štambuk (talk) 20:39, 3 November 2013 (UTC)
@itsacatfish the whole point was that we cannot know which pages can be safely redirected without at any time knowing every inflected form of every language written with the relevant script. As for your example, many languages are written with the Cyrillic script. --Njardarlogar (talk) 09:47, 4 November 2013 (UTC)
Another thought: redirects are not informative. The reader may not understand why the page changed: was it because of a spelling or typographic variation? A common error? An inflected form? And if so, which one? An inflected form has other information than just "inflected form of" that can't be found easily in a table (if you even know that the information you are looking for is in a table): pronunciation, other spellings, similar sounding words, similar written words (in the same language) or typographically similar words ({{also}}). Dakdada (talk) 16:16, 4 November 2013 (UTC)

Context templates.

Previous discussion: Wiktionary:Beer parlour/2013/June#Lua-cising Template:context

A few months back, CodeCat migrated the context templates to Lua, and deleted them from the template namespace. The first half of this was indisputably a good thing, but I'm less clear on the second: one of the benefits of Lua is that it would actually make it quite straightforward to have e.g. {{transitive|...}} be exactly equivalent to {{context|transitive|...}}. (At the very least, even if we intended to eventually delete {{transitive}}, it would have made sense to set up that equivalence temporarily, so we could have a longer transition period, rather than deleting each template as soon as it had been bot-orphaned. But that's water under the bridge now.) Personally I'm actually pretty O.K. with always requiring context| — especially since we have so many contexts, and it was always impossible to keep track of which ones had their own templates and which ones didn't — but the discussion from the time does not show much support for the change, so I thought I would check and see how people feel about this now. Are we O.K. with the current behavior? Are there any die-hard fans of being able to write e.g. {{transitive|...}}? —RuakhTALK 17:33, 3 November 2013 (UTC)

I don't mind it, as long as {{cx}} (or another shortcut) remains an option. —Μετάknowledgediscuss/deeds 17:38, 3 November 2013 (UTC)
What he^ said. --WikiTiki89 17:42, 3 November 2013 (UTC)
What they^^ said. --Vahag (talk) 18:38, 3 November 2013 (UTC)
I'm ok with it as it is. Mglovesfun (talk) 18:54, 3 November 2013 (UTC)
I was upset seeing them gone, but eventually grew accustomed to {{cx}}. This is one of the cases when even a single-letter template name (e.g. {{x}}) would be justified IMHO. --Ivan Štambuk (talk) 20:31, 3 November 2013 (UTC)
I personally could live with a temporary restoration limited to whatever context templates have not been properly replaced with something Luaic.
I still wonder about how casual and new users are supposed to learn that {{context}} or {{cx}} is what is supposed to be used. Some insert hard formatting, but some use the old templates, probably because it seems plausible or because it used to work. Are such contributors actually or potentially important enough for us to worry about? I think they are, because the project is both incomplete and needs significant quality improvement and because such contributors add diverse viewpoints and idiolects to the tasks of adding and improving entries. If we had some forms-based input that also taught users how to format using templates, then this approach would not be necessary at all. DCDuring TALK 21:32, 3 November 2013 (UTC)
Well how did they know to use the old templates? Because they saw them being used before. Likewise, they will see our new template and use it instead. So basically the only people using the old templates are the ones that have already been using them and don't know about the change, but they'll catch on soon. --WikiTiki89 21:39, 3 November 2013 (UTC)
@ Wikitiki:The old template system was as if "{{" was "(" and "|" was "," etc. It was a simple leap. It was slightly less typing than hard formatting. And nobody felt the need to require "lang=en" for the default language. It could be a little more complicated if one was trying to get something categorized as well, but that is probably rarely a high priority for end users or casual contributors. DCDuring TALK 02:10, 4 November 2013 (UTC)
Yes, but they wouldn't know they could do that unless they saw it. Since they are no longer seeing it, it will become less of a problem. --WikiTiki89 02:35, 4 November 2013 (UTC)
I suppose. The damage is done. DCDuring TALK 03:08, 4 November 2013 (UTC)
I shall just point out that we also have {{label}}. Recently I have switched to it, because the markup is shorter (language codes being, if I recall correctly, mandatory anyway). Keφr 23:53, 3 November 2013 (UTC)
To make things even easier, we should find a shortcut name for {{label}}. --WikiTiki89 01:15, 4 November 2013 (UTC)
And no, I will not miss standalone context templates. The fewer templates, the better. Keφr 23:56, 3 November 2013 (UTC)

Some proposals at Wiktionary talk:About Arabic

Since I know that it is unlikely that people actually check this page frequently I am letting you guys know that I made a few proposals at Wiktionary talk:About Arabic. Those of you interested in our policy on Arabic, please check it out. --WikiTiki89 01:51, 4 November 2013 (UTC)

I've replied to your three questions there. --Anatoli (обсудить/вклад) 02:04, 4 November 2013 (UTC)

Introducing Beta Features

(Apologies for writing in English. Please translate if necessary)

We would like to let you know about Beta Features, a new program from the Wikimedia Foundation that lets you try out new features before they are released for everyone.

Think of it as a digital laboratory where community members can preview upcoming software and give feedback to help improve them. This special preference page lets designers and engineers experiment with new features on a broad scale, but in a way that's not disruptive.

Beta Features is now ready for testing on MediaWiki.org. It will also be released on Wikimedia Commons and MetaWiki this Thursday, 7 November. Based on test results, the plan is to release it on all wikis worldwide on 21 November, 2013.

Here are the first features you can test this week:

Would you like to try out Beta Features now? After you log in on MediaWiki.org, a small 'Beta' link will appear next to your 'Preferences'. Click on it to see features you can test, check the ones you want, then click 'Save'. Learn more on the Beta Features page.

After you've tested Beta Features, please let the developers know what you think on this discussion page -- or report any bugs here on Bugzilla. You're also welcome to join this IRC office hours chat on Friday, 8 November at 18:30 UTC.

Beta Features was developed by the Wikimedia Foundation's Design, Multimedia and VisualEditor teams. Along with other developers, they will be adding new features to this experimental program every few weeks. They are very grateful to all the community members who helped create this project — and look forward to many more productive collaborations in the future.

Enjoy, and don't forget to let developers know what you think! Keegan (WMF) (talk) 19:48, 5 November 2013 (UTC)

Distributed via Global message delivery (wrong page? Correct it here), 19:48, 5 November 2013 (UTC)
I see that the Typography Update beta on MediaWiki.org has been rescheduled for Nov 14. Michael Z. 2013-11-08 23:20 z
No sign of it yet. Michael Z. 2013-11-14 16:26 z

Bots anyone?

I don't know if this is the right place to ask about that, but... I've been doing a lot of form-of page entering lately, and I'm wondering if one of the bot owners here would like to help me. User:George Animal used to help me upload form-of pages with his User:GanimalBot, but he hasn't been here much lately... is anyone with a bot interested in adding lots of pre-formatted pages on Latvian adjective and verb forms? --Pereru (talk) 20:15, 8 November 2013 (UTC)

For the future, bot requests are normally done at the Grease pit. --WikiTiki89 20:23, 8 November 2013 (UTC)
Yes. Could you describe the workflow? How are the pages to be generated? DTLHS (talk) 20:18, 8 November 2013 (UTC)
Basically, I create the pages using subst: with a template (like User:Pereru/Adjective forms/source code) and then I place it at User:Pereru/Adjective forms. The result is a single file in which the individual form-of pages have the format:
xxxx
PAGENAME
==Latvian==
......
yyyy

xxxx
PAGENAME
==Latvian==
......
yyyy

George Animal would then use that page as an input for a script that splits the content at the xxxx-yyyy border, uses PAGENAME as the name of the page to create, and the rest (from ==Latvian== down to yyyy) as the contents of said page. Is that helpful? (Also, if a page of that name already exists, the bot -- at least George Animal did -- warned me about that so that I could deal with it manually, though there may be better solutions to that.) --Pereru (talk) 00:08, 9 November 2013 (UTC)
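The splitting step described above could be sketched roughly as follows. This is a minimal illustration, not George Animal's actual script: the function names `split_entries` and `plan_uploads` are hypothetical, and the only things taken from the description are the xxxx/yyyy delimiters, the PAGENAME line, and the idea of flagging pages that already exist instead of overwriting them.

```python
import re

# Delimiters taken from the format described above.
START, END = "xxxx", "yyyy"

# One record: START line, a title line, then the body up to the END line.
RECORD = re.compile(
    rf"^{START}\n(?P<title>[^\n]+)\n(?P<body>.*?)\n{END}$",
    re.DOTALL | re.MULTILINE,
)


def split_entries(dump):
    """Map each page name to the wikitext between it and the closing delimiter."""
    entries = {}
    for match in RECORD.finditer(dump):
        title = match.group("title").strip()
        if title in entries:
            raise ValueError(f"duplicate page name in dump: {title}")
        entries[title] = match.group("body").strip()
    return entries


def plan_uploads(entries, existing_titles):
    """Separate pages that can be created from those needing manual review."""
    to_create = {t: b for t, b in entries.items() if t not in existing_titles}
    conflicts = sorted(t for t in entries if t in existing_titles)
    return to_create, conflicts
```

A real bot would obtain `existing_titles` from the wiki and write the pages through its API; both steps are deliberately left out here because the thread does not say how they were done.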
A bot can certainly insert a new language to a page that already exists, but if the page already has that language, it's much more difficult and should probably be done manually. --WikiTiki89 00:17, 9 November 2013 (UTC)
  • Before running a bot to create new inflected forms, the same bot should check for errors in existing inflected form entries, as well as for incompletely created paradigms. --Ivan Štambuk (talk) 08:40, 9 November 2013 (UTC)
Now I'm considering building my own bot, regardless of the language version. Are there any tutorials on the matter that a bot-oriented veteran could send to my talk page? --Lo Ximiendo (talk) 10:22, 9 November 2013 (UTC)
OK. Sorry for starting the discussion here--I didn't think that asking for bot help would be a technical matter, so I didn't place it in the Grease Pit. So, DTLHS, do you think you can help me? (Ivan, I see how the bot could look for incompletely created paradigms, but how could it know a form is wrong? Don't you need human input for that last part?) --Pereru (talk) 10:46, 9 November 2013 (UTC)
1) Extract a linked lemma from the form-of entry 2) generate inflected forms for the lemma 3) isolate those forms equal to the inspected form-of entry 4) compare the existing entry to the generated entry 5) if different, add attention tag. --Ivan Štambuk (talk) 12:01, 9 November 2013 (UTC)
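The five steps above might look something like the sketch below. It assumes step 1 has already been done, so the entry arrives as a form, its linked lemma, and the grammatical slots it claims to fill. `toy_inflect` is a made-up stand-in for a real inflection generator and `needs_attention` is a hypothetical name; neither comes from any existing bot.

```python
def toy_inflect(lemma):
    """Hypothetical inflection generator: maps each paradigm slot to a form."""
    return {
        ("present", "3sg"): lemma + "s",
        ("past",): lemma + "ed",
        ("past", "participle"): lemma + "ed",
    }


def needs_attention(form, lemma, recorded_slots, inflect=toy_inflect):
    """Return True when the existing form-of entry disagrees with the paradigm."""
    # Step 2: generate all inflected forms for the lemma.
    paradigm = inflect(lemma)
    # Step 3: isolate the slots whose generated form equals the inspected form.
    expected = {slot for slot, generated in paradigm.items() if generated == form}
    # Steps 4 and 5: compare with what the entry records; any difference
    # (missing slot, extra slot, or a form the paradigm never produces)
    # means the entry should get an attention tag.
    return expected != set(recorded_slots)
```

For instance, an entry for "walked" that lists only the past tense would be flagged, because the toy paradigm also produces "walked" as the past participle. The comparison only finds disagreements with the generator, so a wrong generator would still need human review.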
I see. Sounds feasible, though beyond my ken. If someone here with a bot can do that to my form-of pages, I would certainly be glad. So -- is anyone available? --Pereru (talk) 22:01, 9 November 2013 (UTC)
Yes, I can do it easily. I have watched User:Pereru/Adjective forms and will do a test run when you update it. DTLHS (talk) 01:25, 10 November 2013 (UTC)
Good. I'm placing the inflected forms of two new adjectives there right now (spējīgs and izdarīgs) -- that's about 120 forms, that should be enough for your test run. I'm looking forward to seeing this work! (By the way, since you haven't created your User page yet, how do I get in touch with you when there are more adjective / participle forms to upload? Or should I just leave forms at User:Pereru/Adjective forms for you without any message or warning?) --Pereru (talk) 12:15, 10 November 2013 (UTC)
Sorry to come to this discussion late. Myasis doesn't appeal to me, so I think I'll pass (see this for details)... ;) Chuck Entz (talk) 01:57, 10 November 2013 (UTC)

The same lame joke I always make...

I will be taking a trip tomorrow, and will be unable to edit for a week. Please try to finish the dictionary by the time I get back. Cheers! bd2412 T 02:10, 9 November 2013 (UTC)

No problem. It will be done. --WikiTiki89 03:03, 9 November 2013 (UTC)
It's all looking pretty much done, except for a certain motley section devoted to the Judeo-Arabic language *cough cough* ;)Μετάknowledgediscuss/deeds 06:24, 9 November 2013 (UTC)
It will probably take about 2 weeks. Haplogy () 07:16, 9 November 2013 (UTC)
We've been trying since about 2003 now. Mglovesfun (talk) 18:34, 9 November 2013 (UTC)

Absolutely no way to add an external link when not logged in?

Well, I must say your policy has become WAY stricter than Wikipedia's meanwhile!! My external link was "automatically deemed harmful" and rejected. I can't believe that! In 8 years, it has never happened in Wikipedia that any of the references I added for clarification was ever rejected in any way! That's no "policy" anymore; that's on the way to be called paranoia! Not amused. This is the entry in question: חן So is there any place where I can read about this policy? I still think this policy of total rejection of external links for new and/or unregistered users is unacceptable. Remember Wikipedia existed when Wiktionary was still in the making, so don't try to be stricter than them! -andy 77.191.199.87 17:41, 9 November 2013 (UTC)

It's not a policy, it's a spam filter. Wikipedia has better spam-reverting bots than we do. Anyway, I have fixed the formatting in your edit, please try to use the correct templates next time. --WikiTiki89 18:03, 9 November 2013 (UTC)
You're joking man? What do you mean by "correct templates"? These poor excuses for templates by chance? There is not even a transliteration possible with them! So with my---as you call them---"wrong" ones I do have at least a transliteration! (what you can perfectly see in the entry's old history) So even though you call them "wrong", they're way better! Now all my transliteration efforts have gone down the drain. Very nice, really. :-/ -andy 77.191.199.87 18:08, 9 November 2013 (UTC)
I don't know what you're talking about, transliterations are supported by virtually all our templates. My advice to you is to stop complaining so much. If you don't understand how things work around here, then ask questions politely and learn. --WikiTiki89 20:07, 9 November 2013 (UTC)
Ironically, Wikipedia would probably frown upon this kind of reference even more than us. See w:WP:SELFPUB. Keφr 18:19, 9 November 2013 (UTC)
Why not create an account? Mglovesfun (talk) 18:33, 9 November 2013 (UTC)
The spam filter blocks any attempt by a user with <2 edits to add an external link. So far today, it has blocked nine spammers from making 14+ edits, one vandalistic but non-spam edit, this questionable edit (which I instated for demonstration purposes and then undid), two possibly-legitimate or possibly SELFPUB / self-published-blog-pushing edits (including yours), and no unambiguously helpful edits. (In days past, it has also blocked quite a few editors who try to link to Wikipedia by pasting URLs rather than using w: notation.) - -sche (discuss) 19:23, 9 November 2013 (UTC)
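The rule as stated (a user with fewer than 2 edits may not add an external link) can be sketched as below. This is only an illustration of that one sentence: the filter's real conditions are not shown in this thread, and `blocked_by_filter`, the `min_edits` parameter, and the URL pattern are all assumptions.

```python
import re

# Assumed definition of an external link: any http(s) URL in the wikitext.
EXTERNAL_LINK = re.compile(r"https?://", re.IGNORECASE)


def blocked_by_filter(edit_count, old_text, new_text, min_edits=2):
    """True when a user below the edit threshold adds an external link."""
    links_before = len(EXTERNAL_LINK.findall(old_text))
    links_after = len(EXTERNAL_LINK.findall(new_text))
    return edit_count < min_edits and links_after > links_before
```

Under this sketch an interwiki link such as [[w:Foo]] passes, since it contains no URL, which matches the remark that pasting Wikipedia URLs rather than using w: notation is what trips the filter.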
At least our sandbox is what Lua is for, thankfully. --Lo Ximiendo (talk) 19:33, 9 November 2013 (UTC)

Ivan's endorsement of Altaic

Ivan recently created Category:Altaic languages, added languages to the family in Module:languages and Module:families, and also created a code for Proto-Altaic. I think this endorsement of a theory that is not widely accepted is very worrying, especially when the category failed RFD for the same reason 4 years ago. —CodeCat 16:38, 10 November 2013 (UTC)

So what if it is controversial? "Failing" RfDO in which no one participated years ago does not mean it cannot be recreated. --Ivan Štambuk (talk) 16:46, 10 November 2013 (UTC)
I see no problem with creating the category as long as we mention on its page that its existence is controversial. --WikiTiki89 17:08, 10 November 2013 (UTC)
This kind of stuff needs to be discussed beforehand. I got rid of everything for now. There are millions of theories. If we have this, what stops one from adding Dene-Caucasian next? -- Liliana 17:25, 10 November 2013 (UTC)
No there are not millions of theories. There are in fact very few theories, and specifically Altaic Studies is an established field of scholarship. Here you have inexplicably removed perfectly valid referenced etymology. But let's return to the matter - why do you think we shouldn't include Proto-Altaic reconstructions? --Ivan Štambuk (talk) 17:35, 10 November 2013 (UTC)
As of yet Altaic has not been accepted by the majority of linguists, and we shouldn't lead readers the wrong way by purporting this theory. Anything that is accepted in the linguistic community is fine, but Altaic isn't so far. -- Liliana 17:41, 10 November 2013 (UTC)
Wikipedia is driven by notability, not acceptance. Controversial topics merit their own articles if they are proven to be notable enough. Similar position could be taken for groupings of languages - theories such as Altaic, Nostratic and so on have decades of scholarship behind them, active community of researchers, and published works (by non-fringe publishers such as Brill), so it's safe to assume that somebody is reading them, and that Wiktionary readers could be interested in them as well. No harm is done if such theories are unambiguously marked as not widely endorsed, similar to what we already do for "unsafe" reconstructions and speculations of prehistorical borrowings. If we could have "all words in all languages", why not also "all reconstructions in all protolanguages" :)
I also think that we should host obsolete etymologies and reconstructions, because they are important for historical reasons. It would be interesting to read how explanations of the origin of a word X evolved through the ages. --Ivan Štambuk (talk) 18:00, 10 November 2013 (UTC)
I think this is exactly why we can't just blindly rely on references. If references assume Altaic is valid, that certainly doesn't mean we should just copy their point of view. Some OR is necessary to separate it. —CodeCat 17:43, 10 November 2013 (UTC)
I have references that prove that the earth is a flat surface. Can I add it to our definition on earth since it's sourced? -- Liliana 17:47, 10 November 2013 (UTC)
Since Wiktionary only cares about meanings of attested words, you could do that only if you have attestations of Earth being used in the meaning "flat surface". Which is when you think about it not that improbable.. --Ivan Štambuk (talk) 18:07, 10 November 2013 (UTC)
But why not copy their POV, if it is clearly marked? Every reconstruction and etymology in general is a POV by the person signing it. --Ivan Štambuk (talk) 18:00, 10 November 2013 (UTC)
We shouldn't sort e.g. Category:Turkic languages into Category:Altaic languages, but I suppose the latter category still has a reason to exist, namely to contain Category:Proto-Altaic language (and Category:Terms derived from Proto-Altaic). I think it's acceptable to include Altaic theories in etymologies as long as they're sourced and qualified by a mention that the existence of Altaic is controversial, e.g. "As part of the controversial Altaic theory, Smith connects this word to Japanese foo." A stronger qualifier than was suggested here, certainly. - -sche (discuss) 17:36, 10 November 2013 (UTC)
Maybe it can be mentioned, but this shouldn't warrant a whole category system, or else we will soon have Category:Dene-Caucasian languages. -- Liliana 17:41, 10 November 2013 (UTC)
I'm not convinced that accepting one thing forces us to accept a second, more controversial thing. (If WT:RFD has taught us anything, it's that Wiktionary is capable of being inconsistent.) And if we do mention any Dene-Caucasian theories, I think it is a good idea to gather them in a category in case we later decide to delete them. That said, I'm not wedded to the categories; I could live with allowing qualified mentions of Altaic theories while denying them a code and a category. - -sche (discuss) 18:01, 10 November 2013 (UTC)
It makes sense for there to be a category. Categories are meant for categorization and it is certainly useful to have a category of terms that have proposed Altaic etymologies or of languages that are proposed to have descended from Altaic. The existence of the categories does not imply that we endorse these theories. --WikiTiki89 19:03, 10 November 2013 (UTC)
I certainly agree with the view aiming to create a category for Altaic languages as far as reliable sources are evident, and I think reliable sources such as ToB are worth using. --Hirabutor (talk) 20:28, 10 November 2013 (UTC)
So you want Dene-Caucasian too? -- Liliana 22:28, 10 November 2013 (UTC)
If anyone wants to add it, I wouldn't mind. --WikiTiki89 22:52, 10 November 2013 (UTC)
This has the potential of going on forever. Here's my $.02:
(a) There are (more or less) consensus theories, and there are (more or less) non-consensus theories;
(b) The traditional role of a dictionary would be to go with the best (= closest to consensus) theories whenever possible; this would make Altaic unacceptable (as it would Amerind, another proposed superfamily that has next to nothing of substance in favor of it -- yet there are references, "cognate" sets, "etymologies", etc.).
(c) We might decide this is not the case for Wiktionary, since wiki-is-not-paper, we-have-plenty-of-space etc. Ivan suggests even older reconstructions should have a place here; in this case, we might even actually accept any proposed reconstructed form, any proposed hypothesis (including Altaic and Amerind), simply adding comments to it ("this hypothesis is rejected by most scholars", etc.; perhaps some handy templates could be created).
(d) However, this would complicate matters enormously for the casual reader -- since in principle Etymology sections could make reference to all these reconstructed forms, obsolete and dernier cri, consensus or far from it. Now, maybe an Etymological Dictionary should be a project in itself, independent from Wiktionary, in which all details of all published theories could be taken into account. But as long as we remain within Wiktionary, it seems we should try to limit etymological information to some extent.
(e) Which is why in the end I prefer to stick to the traditional practice: only things that are as close to consensus as possible. Therefore, no Altaic, no Amerind, no Nostratic, etc.; at least not in the Etymology sections. (One could, of course, create independent Appendix pages for all proposed reconstructions at all levels, perhaps with indexes in the Appendix to facilitate navigation; but only the (near-)consensus forms should appear in Etymology sections.).--Pereru (talk) 22:58, 10 November 2013 (UTC)
Another facet of this is the reference in some Japanese and Korean etymologies to comparison with Turkic and other Altaic languages (for instance, the one added in diff). I get the impression that the Altaic theories are more accepted in references for those languages- probably due to the lack of anything better for inherited terms in language isolates such as these. I'm not sure how easy it would be to even find all of the entries with these, let alone to convert them. Personally, I wouldn't mind reference in etymologies (with proper cautions/qualifiers) to a few of the more linguistically-rigorous minority theories, but we have to be selective. The sheer volume of mutually-contradicting speculative theories for isolates such as Basque and Sumerian, and (even some Indo-European and Afro-Asiatic languages) would make a horrible mess out of quite a few entries. Chuck Entz (talk) 23:59, 10 November 2013 (UTC)
I don't see a problem in individual theories and etymologies being mutually contradicting. There is functionally no difference between 1) the unknown origin of a word in a language belonging to a widely accepted language family, or a reconstruction established within a generally accepted protolanguage, which often have half a dozen proposed explanations ranging from probable to speculative, and 2) the unknown etymology of a word in a language isolate, or a reconstruction established within a minority-accepted protolanguage such as Altaic, which are inherently moderately speculative. Really, why shouldn't someone interested in Basque or Sumerian see a list of all of the proposed far-range etymologies, provided they are all clearly marked as speculative by their respective authors, and not generally endorsed? Perhaps a necessary notability filter should be published work? --Ivan Štambuk (talk) 05:39, 11 November 2013 (UTC)

How about this proposal:

  1. Altaic, Nostratic and other minority-held theories should only be created in the appendix namespace, as reconstructions in their respective protolanguage frameworks.
  2. Instead of the usual {{reconstructed}} template they would have {{reconstructed-minority}} which would clearly indicate that we're not dealing with a widely accepted theory.
  3. Language isolates (i.e. entries in the main namespace) and protolanguages (i.e. entries in the appendix namespace) covered by such theories should in their respective etymology sections link to such appendices only by means of a special template. There would be no lists of (potential) cognates - user would have to click on the link to the appendix page. That template would have a wording reflecting a degree of uncertainty, such as Within the controversial Proto-Altaic theory, derived from *X, where *X would link to the appendix page. Usage of such template would ensure that editors adding such etymologies don't overzealously emphasize genetic relationship.
  4. For such theories, only reconstructions occurring in published sources are allowed and references are mandatory.

Later if the community decided that e.g. listing of cognates in the main namespace would be appropriate, adding them would be a matter of copy/paste. --Ivan Štambuk (talk) 05:39, 11 November 2013 (UTC)

Altaicist theories should be allowed, for example in 두루미, although the actual reconstructions probably shouldn't, due to crudeness. Wyang (talk) 05:58, 11 November 2013 (UTC)

Well for the "crane" word Starostin-Dybo-Mudrak's dictionary reconstructs PA *tùru ( ~ *ti̯ùro). That doesn't seem too crude to me, as opposed to e.g. Nostratic etymologies which are full of cover symbols (V for vowel and similar). If there are multiple incompatible reconstructions, there is no problem in listing them all in the page name. --Ivan Štambuk (talk) 07:04, 11 November 2013 (UTC)
So what was the revert about? Any good reason for that? -- Liliana 07:27, 12 November 2013 (UTC)
It seems that most of the interested people support having Altaic reconstructions in some form. Your removal of tut and tut-pro codes from Module:languages and Module:families has also caused script errors in some instances where they were used through {{term}} and {{etym}}. --Ivan Štambuk (talk) 07:43, 12 November 2013 (UTC)
But what does the existence of Proto-Altaic have to do with making Turkic, Mongolic etc. subcategories of the Category:Altaic languages? If we have this, what stops people from making Category:Indo-European languages a subcategory of Category:Nostratic languages? -- Liliana 08:17, 12 November 2013 (UTC)
Isn't it required for automatic categorization to work properly? I think it's best that the treatment of proposed but not generally accepted families be handled on a case-by-case basis. We can ignore issues connected with Dene-Caucasian, Nostratic etc. until someone sufficiently knowledgeable starts adding their etymologies/protolanguage appendices. Each of those proposed families has a different set of issues. --Ivan Štambuk (talk) 09:33, 12 November 2013 (UTC)
I think it's only needed so categories like Category:Terms derived from Altaic languages get proper subcategories depending on the members of the family. Arguably, if we leave everything as is and only let {{etyl|tut}} directly categorize in there, we can just leave everything as is and have Category:Altaic languages with no language family subcategories. -- Liliana 13:02, 12 November 2013 (UTC)
I agree that Japonic, etc, shouldn't be in Category:Altaic languages. We shouldn't categorize language families into hypothetical superfamilies the existence of which is disputed / rejected by many. Our categorization system is slipping far enough out of sync with modern scholarship as it is (people have already raised issues with our categorization of things as Finno-Ugric vs Uralic). - -sche (discuss) 17:39, 13 November 2013 (UTC)
If it doesn't break anything, feel free to remove it. Also, what is really the problem with having a language categorized into controversial, or multiple and incompatible Stammbaums? It's not like putting a language into a category means "the official position of Wiktionary is that Japanese is an Altaic language". It's just meant to look things up. Slapping a banner that says "This category represents a language family that is not generally endorsed" should suffice IMHO. --Ivan Štambuk (talk) 06:15, 14 November 2013 (UTC)

You can lead a bot to the river, but you can't feed it

Trying to feed the bot recently, I got an error message saying I wasn't allowed to edit another user's pages. Any way to make an exception? --ElisaVan (talk) 11:29, 11 November 2013 (UTC)

The filter in question came up recently in the Grease Pit (quod vide). Perhaps User talk:BuchmeierBot/FeedMe could serve as a staging area, with the understanding that the bot owner or others must review the accuracy of anything posted there before letting the bot create the entries. I see you've already thought of that. :) - -sche (discuss) 06:51, 12 November 2013 (UTC)

Toggle (HTML/CSS/js) question

I wanted to inquire whether there's a way to modify the toggle link text (and its styling, e.g., make it superscript) specifically on en.wikt. Here's what my creation looks like right now {{lv-pron}}, but instead of "expand" (or "collapse") I'd like the link/button (or w/e it should be called) to say "references." I'm using the built in toggle functionality that's shipped with all MediaWiki installations (for example, Wikipedia's toggle templates didn't work on here.) Is there any way to modify the text/style of the toggle link/button? In my tests it seemed to be immune to a parent element (another div wrapped around it) having, e.g., font-size:small. Neitrāls vārds (talk) 21:51, 12 November 2013 (UTC)

See mw:ResourceLoader/Default_modules#jquery.makeCollapsible. You can use the data-collapsetext and data-expandtext attributes to change the labels, or use customtoggles to make the toggle look however you'd like. --Yair rand (talk) 22:35, 12 November 2013 (UTC)
Thanks! Neitrāls vārds (talk) 16:23, 13 November 2013 (UTC)
Don’t we already have about five different ways of including footnotes? Heck, this is putting three different interfaces into a single short bullet point: link for “IPA”, superscripted link in parentheses for “key”, and now add a superscripted expand control with a dingbat character in square brackets for “references” (which is apparently meant to reveal a link decorated with an icon leading directly to a raw audio file?).
The type design is also too busy. Professional designers avoid superscript text containing whole words, punctuation, or extra dingbat symbols for good reason.
This looks like far too much visual clutter and cognitive overhead for a reader. Why can’t we just link something with a link any more? Michael Z. 2013-11-13 16:42 z
Personally, I can't stand sliding and would prefer if it expanded instantly. --WikiTiki89 16:59, 13 November 2013 (UTC)
We need to expand a two-word link at all?
More importantly, how does automatically-generated machine audio constitute a reference for pronunciation of human language? Michael Z. 2013-11-13 17:23 z
By the way, I don't believe there was much consensus for including synthesized speech. --WikiTiki89 17:40, 13 November 2013 (UTC)
Bingo. Just include "(speech synthesis)" in parentheses after the IPA, unhidden, if it is to be included at all. But like Michael, I'm not sure auto-generated speech can function as a reference... - -sche (discuss) 17:29, 13 November 2013 (UTC)
A reference is an authority. If we refer to it, we should formally cite it, not just throw an un-annotated link to it in the body of an entry.
But these Google pronunciations are not suitable references. They are synthesized based on pronunciations spidered from the web. I bet the main sources are Wikipedia and Wiktionary, as well as other dictionaries, but the synthesis is of unknown reliability, and is not an authority. For goodness’ sake, please don’t enter IPA transcriptions based on Google’s synthesized pronunciations!
If we link to these as additional resources, we should do so the same way as we do to other external links. Michael Z. 2013-11-13 23:51 z

Yes, the "references" did look kind of awkward so I changed it to "audio" yesterday. On the subject of adding as any other external reference - the way I want to make entries (well "researched") I'm already kind of pushing it (with how many references I have.) For example in bēbis I went up to 3. 1st ref'ing that it is indeed a borrowing from English, 2nd first attested use ever, in a non-standard form "bebijs," 3rd first attested use of the modern form "bēbis" in the rather authoritative multi-volume konversācijas vārdnīca. That's kind of a lot for a dictionary yet all of them add value to the entry. Labeling the synth. speech "references" might have been misleading I kind of intend it to be just an alternative because after all there are as many IPA styles as there are users, for example I think that AIDS#Latvian should be ['aits] another person could go for ['aids] yet another ['ajds]... So just an alternative. The synthesizer is just using a map of chars to sounds it (unfortunately) is not pulling IPA's from Wikt. (I hope that in future it could.) With "trickier" words it can be off (e.g., not usable on čoms) and should probably be used only by native speakers "with discretion." Not saying everyone should use it. Some people choose to use {{usex}} others choose not to, for example, I don't think that slight variations in layout like this are detrimental to the dictionary.

Personally I am more of a user than an editor on Wikt. (those detailed Latvian etyms just keep me coming back, also, English idioms...) And I think it's convenient from a user's perspective, for example, Hungarian has a rather similar sound inventory to Latvian (there are differences ofc.) But even with IPA I feel overwhelmed by words like agyhártyagyulladás, I mean, I know it would be aģhārķaģulladāš (respelled with Latvian letters) but it's actually quite an amount of mental work to read the IPA and put it together in your head. Stuff like this agyhártyagyulladás would be a boon to lazy people like me. Obv. I'd know it's a robot and that it might not be 100% correct but at that point I couldn't care less, I want to hear it and I want to hear it now, lol. I'm not saying HU editors should be adding that type of link to their entries although if they were I def wouldn't mind either. Neitrāls vārds (talk) 11:54, 14 November 2013 (UTC)

P.S. I agree that the sliding is slightly annoying but I didn't see how to disable it in the toggle documentation, let me know if you know how. Also, the IPA template shouldn't have a separate (key) link at all, imo. IPA should be linking to what (key) is linking to right now. Right now IPA links to a page with the academic names of all of the sounds which might not be the most relevant page to link to with every mention of IPA. Neitrāls vārds (talk) 11:54, 14 November 2013 (UTC)

If you yourself find it convenient then go ahead and enter the word in Google translate and click play. But as far as Wiktionary is concerned, bad pronunciations are worse than no pronunciations. --WikiTiki89 15:22, 14 November 2013 (UTC)

Displaying adjectives in the adverb headers?

Is it a good idea to display adjectives in the adverb headwords? E.g. French heureusement (adjective heureux), Russian сча́стливо or счастли́во (adjective счастли́вый). I'm thinking of changing Russian Module:ru-headword a bit to allow an additional optional parameter, just using a French example to make it clearer what I need. If an adverb already has a comparative form in the header, what should the order be?

Is a display like this acceptable? For example for глубоко́:

глубоко́ (glubokó) (comparative глу́бже, adjective глубо́кий)

OR should it be

глубоко́ (glubokó) (adjective глубо́кий, comparative глу́бже)

Perhaps the 2nd example is more confusing, it may not be clear if comparative form refers to the adverb, to the adjective or both.

--Anatoli (обсудить/вклад) 23:36, 13 November 2013 (UTC)
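As a rough illustration of the optional parameter described above, here is a minimal Python sketch of how a headword line in the first proposed order (comparative, then adjective) might be assembled. This is not the actual Module:ru-headword (which is written in Lua); the function name and the `comparative`/`adjective` parameter names are hypothetical.

```python
def format_adverb_headword(headword, translit, comparative=None, adjective=None):
    """Build a headword line such as 'X (tr) (comparative C, adjective A)'.

    Both extra parameters are optional; the parenthesised list is omitted
    entirely when neither is supplied.
    """
    extras = []
    if comparative:
        extras.append(f"comparative {comparative}")
    if adjective:
        extras.append(f"adjective {adjective}")

    line = f"{headword} ({translit})"
    if extras:
        line += " (" + ", ".join(extras) + ")"
    return line


print(format_adverb_headword("глубоко́", "glubokó",
                             comparative="глу́бже", adjective="глубо́кий"))
```

Swapping the two `if` blocks would give the second proposed order, so the choice between the variants is purely a matter of presentation.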

I think that is a good idea and the first version is better. --WikiTiki89 00:07, 14 November 2013 (UTC)
OK, thanks. I'll wait for more feedback. Will change it later. BTW, I think Russian comparatives should be unified under adverbs, except for a few adjectival comparatives like худший, лучший, больший, меньший, старший, младший (that's all perhaps). The rest of the comparatives are all adverbs and grammatically used differently, no need to duplicate headers like at лучше, хуже, etc, which have both Adjective and Adverb headers. In other words, most comparative forms for most adjectives are adverbials (if they don't use additional words более or менее). --Anatoli (обсудить/вклад) 00:20, 14 November 2013 (UTC)
Yeah, I agree that лучше and хуже are not adjectives. --WikiTiki89 00:45, 14 November 2013 (UTC)
The derived terms header isn't good enough? DCDuring TALK 01:14, 14 November 2013 (UTC)
It's good enough but this is an alternative and a shortcut for adverbs derived from adjectives (usually not the other way around). Cf French [[heureusement]] is derived from [[heureux]] (not the other way around). It's similar in Russian for many adverbs. (Not planning to change French headers as well but it would make sense, IMHO). I highlighted "alternative", one can always do the usual way and the longer way ("====Derived terms====") and not all adverbs are derived from adjectives or may have a corresponding adjective. --Anatoli (обсудить/вклад) 01:32, 14 November 2013 (UTC)
So, then the adjective will appear in the "Etymology", right? And it will appear under "Related terms" as well, yes? So why confuse readers with other parts of speech in the headword line, which is meant to be a shortened form of the inflection section, with transcription information? I don't think that mixing other parts of speech in the headword line is a good idea. --EncycloPetey (talk) 01:44, 14 November 2013 (UTC)
Are you sure this is a good idea? If it's good for de-adjectival adverbs, then why not for diminutives (cf. Dutch nouns), negatives, nicknames, and especially more-frequently-used synonyms, as well as other possible targets for shortcuts that someone more creative than I could dream up?
I know that some view English de-adjectival adverbs ending in -ly as inflections, but this seems a minority view, not widely accepted even among linguists, let alone lexicographers. Is the situation different among Russian lexicographers and linguists? DCDuring TALK 01:49, 14 November 2013 (UTC)
Many Russian adverbs ending in -о /-е are short neuter adjective forms - честный (short form/neuter) -> честно. There is no problem with this view, AFAIK, but I haven't done any research yet. Many Polish noun headwords include diminutives as well, e.g. ryba#Polish. I don't see a problem with that. --Anatoli (обсудить/вклад) 02:12, 14 November 2013 (UTC)
They are not short neuter adjective forms, but independent words formed from an adjectival stem with the suffix -o/e. They should in fact be formatted under different etymologies, because these same-spelled -o/e are two different pairs of suffixes. --Ivan Štambuk (talk) 13:17, 15 November 2013 (UTC)

@EncycloPetey. With this approach the term won't appear in the "Etymology" or "Related terms". As adjectives and adverbs have the same root, the etymology will not be duplicated and may be only in adjectives. Re "mixing other parts of speech" is a stronger argument (or DCDuring's suggestion) against this approach. What about changing the header to e.g. from adjective глубо́кий? --Anatoli (обсудить/вклад) 02:27, 14 November 2013 (UTC)

Numerals redux

Is there any one place where our policy on when to describe something as a "number"/"cardinal number"/"cardinal numeral"/etc is comprehensively documented? A user recently changed quite a few entries from those things to "numeral", but I recall that there was actually a logic behind our use of the different terms. - -sche (discuss) 02:21, 14 November 2013 (UTC)

I stopped following the arguments when I took my extended wikibreak near the end of 2010, but at that time, no community consensus had been reached. However, three years have passed and some decision may now exist that I do not know about. --EncycloPetey (talk) 04:49, 14 November 2013 (UTC)
No policy, there was a vote but it failed. De facto policy seems to be to prefer numeral though. Maybe it's time to list pros and cons of both terms and restart the vote? --Ivan Štambuk (talk) 06:06, 14 November 2013 (UTC)

Vowel length needs to be marked in Greek

Hello. I'm fairly new to Wiktionary but not to English Wikipedia, where I've done a great deal of work editing historical linguistics articles, esp. on Indo-European (IE) languages (Ancient Greek, Proto-Greek, Old English, Gothic, Proto-Germanic, Latin, various Romance languages, Old Irish, Proto-Celtic, various Slavic languages, Proto-Slavic, Proto-Balto-Slavic, Tocharian, Sanskrit, numerous Proto-Indo-European articles, etc.).

In this case, a number of Wiktionary articles on words for "drink" in various IE languages included references to the Greek word pī́nō "I drink", but written with no length mark on the i, either in the original Greek or the English transcription.

Length marks are (correctly) noted in all other languages on Wiktionary AFAIK, including Latin, Old English, Old High German, etc., and need to be there in Greek as well. In this case the length of the i is extremely important in understanding the close cognacy of the Greek word with e.g. the pi- of Slavic piti (stemming from long pī- in Balto-Slavic) and also Albanian and probably modern Indic languages (with pīnā- and the like) but less so with words like Sanskrit pibati, Latin bibō, Old Irish ibid, where the short i in all of these is a reduplication vowel and is unrelated to the long ī of the other forms.

In this case, I corrected the problem in the transcription of this word in the various pages, and they all ended up reverted, with a link to Wiktionary:Ancient Greek romanization and pronunciation, which asserts that e.g.

In Classical polytonic, the length distinction of ᾰ ([a]) and ᾱ ([aː]) is not indicated usually in writing nor in transcription. However, if ᾱ needs to be transcribed, ā suffices.

This appears to represent a Classicist viewpoint, where length is often omitted because the original texts omitted such length marks and the exact form of words is secondary to their meanings and the broader significance of literary texts. This is a fine practice in a Classicist context. However, Wiktionary is not a Classicist work but fundamentally a linguistic work, particularly when discussing etymologies, and from a linguistic standpoint this suggestion not to include length marks is completely, 100% wrong. All historical linguistic works that discuss Ancient Greek, whether by itself or in the context of other Indo-European languages, include length marks consistently on all Greek words cited (likewise on all other words cited in all other languages where phonemic length exists). Note that in Greek this applies only to α ι υ because the other vowels have inherently distinct ways of notating short vs. long vowels (ε vs. η and ει, ο vs. ω and ου).

We need to follow this practice, also. This should not be very controversial; I am at least 99% positive that all linguists will agree with me, because all follow these conventions and understand their importance.

If for some reason or other people object on aesthetic grounds to including length marks, they still need to be included in the transcription.

Please also note that in a linguistic context, transcription is critical and often exceeds in importance the inclusion of the original text. This is contrary to the Classicist viewpoint, as expressed e.g. by Atelaes, who said:

Transliterations are never used here as a substitute for the original script, as they are in many other contexts. They are a pedagogic tool, used to help those who don't understand the original script, which they accompany. So, they are an approximation for the uninformed. A highly precise technical transliteration is unnecessary, and serves only to confuse those whom it is meant to help.

This viewpoint however is wrong from a linguistic standpoint. As a simple demonstration of this, consider the discussion of the etymology of the Old Irish word ibid "he drinks", which either does or should make references to Latin bibō and pōtō, Greek pī́nō, Armenian ǝmpǝm, Sanskrit pibati, Old Church Slavonic piti. (Various of the articles on these words, all meaning "drink", reference various of the other words, but not all articles reference all words.) There are at least four non-Latin scripts here (Greek, Armenian, Devanagari, Cyrillic) if we insist on representing the words in their original scripts. Requiring that all our readers understand all of these scripts and claiming that transcription is of secondary importance and only for "uninformed" readers will make everyone go utterly crazy. It's for this reason that Indo-European historical linguistics books often don't bother to include the original script at all, but only the transcription. An exception is often made for Greek in highly technical works because it's assumed that the highly technical readers of them will know Greek script, but layman introductions (e.g. Benjamin Fortson's "Indo-European Language and Culture: An Introduction", James Clackson's "Indo-European Linguistics: An Introduction", Philip Baldi "An Introduction to the Indo-European Languages" etc.) invariably transcribe Greek and often leave out the original script, as with the others. The intelligent layman reader of these books is the same type of reader paying attention to the etymology entries, and we should follow the same conventions used in these books. I'm not suggesting throwing away the original script (which is also extremely useful, for a slightly different but still important set of readers), but (a) the transcription is absolutely key and must be included whether or not the original-script text is present, and (b) vowel length must always be notated, both in the original and in transcription.

I suggest that the text on Wiktionary:Ancient Greek romanization and pronunciation should instead read

Although in Classical polytonic, the length distinction of ᾰ ([a]) and ᾱ ([aː]) is not normally indicated in writing, Greek words in Wiktionary should indicate vowel length both in writing and transcription, with the long vowel indicated as ᾱ, transcribed as ā.

Similarly for ι and υ.

Benwing (talk) 10:02, 15 November 2013 (UTC)

  • I agree with default scholarly transliteration for Greek, Arabic, Persian (macrons instead of circumflexes), Russian and so on. For Ancient Greek this could easily be remedied by fixing Module:grc-translit. Greek lengths, however, should only be displayed in the headword line, and not in a page name, as is the practice for Latin (and stripped when wikilinking with {{term}} and {{l}}). --Ivan Štambuk (talk) 12:45, 15 November 2013 (UTC)
  • Would these accent marks interfere with other Polytonic accent marks? --WikiTiki89 14:40, 15 November 2013 (UTC)
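To make the "stripped when wikilinking" idea concrete, here is a minimal Python sketch (the real modules are Lua; the function name is hypothetical). It assumes length is written with the combining macron (U+0304) or breve (U+0306), and relies on Unicode normalization so that the length mark can be removed while breathings and accents on the same vowel are left in place.

```python
import unicodedata

# Combining macron and combining breve: the two marks used for vowel length.
LENGTH_MARKS = {"\u0304", "\u0306"}


def strip_length_marks(text):
    """Return text with macrons/breves removed, other diacritics untouched."""
    # NFD splits each letter into its base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in LENGTH_MARKS)
    # NFC recombines what is left into precomposed characters where possible.
    return unicodedata.normalize("NFC", kept)


# Displayed form with a long iota and an acute; the result is the bare page name.
print(strip_length_marks("π\u03b9\u0304\u0301νω"))
```

Because the macron and the acute are separate combining characters after decomposition, removing one does not disturb the other, at least in this simple model.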

I have addressed this issue and some closely related others a number of times, and so I imagine many will tire of reading this. I think that Benwing does well to raise the issue of context, of exactly what type of work we are and/or what we are trying to be. However, my mind produces a different answer than theirs, which might well explain our disagreement. I think that Wiktionary is supposed to be a general reference work. We are trying to give every possible reader every bit of information on a given word or phrase that they might want to know. This is, of course, impossible (to those who still clung to that lofty ideal, I apologize for shattering your hopes and dreams). Different readers have different needs, and to put any one bit of information that one would like runs the risk of confusing or distracting another. That being said, impossibly lofty goals are often worth striving for nonetheless. When Benwing says that we are or should be a linguistic work (I can only assume they mean historical linguistics, based on their other comments) I must disagree. To be clear, I am glad that we have the capacity for more involved etymologies than most comparable reference works. I feel quite proud that "my" dictionary has full-blown entries for hypothesized terms in hypothesized languages. I think all of this is useful and interesting and I absolutely support its inclusion. However, I simply can't believe that this is our primary thrust. If I were forced to come up with a most common use scenario, I would think it would be more along the lines of someone encounters a word or phrase while reading or speaking, and wants to know what it means. Knowing the history of a word can definitely help flesh out the answer to that question, but I feel it must be secondary to the definitions. 
And so I will say, as I have said before many times, that I think our transliterations serve to bridge the gap for someone who does not know the script, and that highly nuanced and technical transliterations do a disservice to the majority of their users. Information of such a nature should be (and often is) covered in the pronunciation section of an entry, where we can document specific dialectal and temporal nuances. In spite of its admitted shortcomings (I think it would be well served to be rewritten in Lua, which I have long-term plans to do), I would hold out {{grc-cite}} as evidence that we can provide accurate and precise phonological information without burdening our transliterations with it. This raises another problem with highly technical transliterations, namely that "Ancient Greek" covers over two millennia. Greek is about as conservative a language as they come, but there were nonetheless a number of important sound changes over that period. The difference between long and short alphas, iotas, and upsilons only exists for the briefest of moments. For the majority of the time there is no such difference. Mind you, even the rough transliterations that we currently have run into that problem, as many of the vowels converge on /i/. But in my opinion, this simply serves to reinforce the need for as basic a transliteration as possible. One possibility which might serve as a compromise would be to have a different transliteration format for etymological contexts vs. others. -Atelaes λάλει ἐμοί 03:10, 16 November 2013 (UTC)

  • Is there a way to generate different transliterations in Lua based on user's preferences? --Ivan Štambuk (talk) 06:57, 16 November 2013 (UTC)
    No: all page contents are the same for everyone. You need javascript/Gadgets to customize content. Dakdada (talk) 12:47, 16 November 2013 (UTC)
    Could we have the transliteration modules/templates output two transliterations, one Classicist and one Scientific, and then use javascript to hide one or the other (per each user's preference)? That would be awfully complicated even if we could do it.
    Personally, my inclination is to indicate length, and to favour scientific transliterations generally; the only thing that gives me pause is Atelaes' point that "the difference between long and short alphas, iotas, and upsilons only exists for the briefest of moments". I don't think users who can't read Greek script are going to be confused by a long vowel mark any more than any of the other accent marks we use... especially given that we do indicate vowel length in Latin (and Old English, etc). - -sche (discuss) 15:07, 16 November 2013 (UTC)
That wouldn’t be too hard. Such a framework would also allow the reader to choose IPA/SAMPA/respelling for pronunciations, and their choice of standards for romanization in other languages.
A template can output two or more romanizations, perhaps in an HTML unordered list. Our default CSS can hide all but the first one. A simple JavaScript widget can introduce a control that toggles CSS visibility for the different list items.
Issues: What would an unobtrusive control look like? This should only be used with automated romanizations – too complicated to deal with missing items, keeping multiple romanizations updated in every etymology where a term appears. We should stick to romanization according to standards, offering readers reference information, not our own unpublishable wikibation. Michael Z. 2013-11-16 17:51 z
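The "several romanizations in one list" idea can be sketched as follows. This is only an illustration in Python of the markup a template or module might emit; the `tr-options` class and the `data-scheme` attribute are invented for the example and are not existing Wiktionary conventions.

```python
from html import escape


def render_romanizations(romanizations):
    """Render (scheme, text) pairs as an HTML unordered list.

    The first pair is the default. A site stylesheet could then hide the
    other items (e.g. with a rule like `.tr-options li + li { display: none }`),
    and a small script could switch which item is visible.
    """
    items = "".join(
        f'<li data-scheme="{escape(scheme)}">{escape(text)}</li>'
        for scheme, text in romanizations
    )
    return f'<ul class="tr-options">{items}</ul>'


print(render_romanizations([("default", "first form"), ("alternative", "second form")]))
```

Note that every variant is still generated for every reader, which is exactly the trade-off questioned elsewhere in this thread.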
  • I think the claim that "the difference between long and short alphas, iotas, and upsilons only exists for the briefest of moments" is a red herring. The difference between long and short vowels did eventually disappear in Greek, but it was present during the Golden Age of Classical Greek literature (the period most people who read Ancient Greek are interested in) and it was present in all older stages of Greek, making it of crucial importance in etymologies. Macrons should be used to mark vowel length in Ancient Greek in all circumstances where they're used for Latin, Old English, etc.—and not just in transliterations, but also in the Greek script directly. Thus for example the headword line of ἄγκυρα should read ἄγκῡρα (but isn't it actually ἄγκῡρᾱ, despite what the pronunciation section says?), and in the etymology section of ibid#Old Irish, the Ancient Greek cognate should be listed as {{term|πῑ́νω|lang=grc}}, and Lua should know to link "πῑ́νω" to πίνω and to transliterate it pīnō. —Aɴɢʀ (talk) 17:03, 16 November 2013 (UTC)
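A minimal Python sketch of the transliteration half of that suggestion, under stated assumptions: the letter table below is a tiny illustrative subset (it is not Module:grc-translit), the function name is hypothetical, and accents and breathings are simply dropped while the macron is carried over to the Latin vowel.

```python
import unicodedata

# Illustrative subset only; a real table would cover the whole alphabet,
# digraphs, breathings, and so on.
BASE_LETTERS = {
    "π": "p", "ν": "n", "κ": "k", "ρ": "r",
    "α": "a", "ι": "i", "υ": "u",
    "η": "ē", "ω": "ō",  # inherently long vowels
}
MACRON = "\u0304"


def transliterate_with_length(word):
    """Romanize a Greek word, keeping macrons and dropping other marks."""
    out = []
    for ch in unicodedata.normalize("NFD", word):
        if ch in BASE_LETTERS:
            out.append(BASE_LETTERS[ch])
        elif unicodedata.combining(ch):
            # Keep only the length mark; accents and breathings are dropped.
            if ch == MACRON:
                out.append(ch)
        else:
            out.append(ch)  # letters outside the table pass through unchanged
    return unicodedata.normalize("NFC", "".join(out))


# Long iota with acute, as in the displayed form discussed above.
print(transliterate_with_length("π\u03b9\u0304\u0301νω"))
```

The same decomposition step used here is what would let the link target be derived from the displayed form, so one input string could serve both purposes.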
    One option is to install a MW extension that would enable Lua/templates to fetch user's name, and then we could have per-user settings in e.g. Module:User:Xxx/conf for transliterations and other things of dispute. --Ivan Štambuk (talk) 18:09, 16 November 2013 (UTC)
    I think that is a very bad idea. We should keep as many preferences in Special:Preferences as possible. If there were an extension that could pull information from preferences, that would be a different story. --WikiTiki89 18:16, 16 November 2013 (UTC)
    Why should we? Generating all of the possible outputs in Lua and hiding unwanted ones in Javascript sounds like...bad engineering. --Ivan Štambuk (talk) 18:25, 16 November 2013 (UTC)
    I think we're looking at this the wrong way. The crucial decision is what the vast majority of our users are going to see, what the default is. Quite frankly, making a user preference for different Ancient Greek transliterations is probably a waste of resources. Let's focus on a singular decision. -Atelaes λάλει ἐμοί 18:35, 16 November 2013 (UTC)
    I think such an extension, even if possible to implement, would defeat the extensive caching system built around here (template expansion results could no longer be cached, because they depend on the user who is viewing the page). Also, it would be possible to write code like
    if USERNAME == "Ivan Štambuk" then return "" else return "[[User:Ivan Štambuk]] smells" end
    Now tell me, should Special:Whatlinkshere/User:Ivan Štambuk list a page which invokes such a module? So I would not expect anything like that installed here. Keφr 18:37, 16 November 2013 (UTC)
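The caching objection can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not a description of how MediaWiki's caches are implemented; the class and function names are hypothetical. The cache is keyed by page only, so output that varies by viewer is served incorrectly to everyone after the first.

```python
class PageCache:
    """Toy cache that stores one rendered result per page title."""

    def __init__(self):
        self._store = {}

    def get(self, page, render, viewer):
        # The viewer is not part of the key, mirroring a cache that assumes
        # page output is identical for all readers.
        if page not in self._store:
            self._store[page] = render(viewer)
        return self._store[page]


def user_dependent_render(viewer):
    """Stand-in for a module whose output depends on who is viewing."""
    return f"transliteration chosen for {viewer}"


cache = PageCache()
first = cache.get("some page", user_dependent_render, "Alice")
second = cache.get("some page", user_dependent_render, "Bob")
print(first)
print(second)  # Bob receives the output that was rendered for Alice
```

Adding the viewer to the key would fix the wrong output, but it would also mean one cached copy per reader, which is the cost being pointed out above.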