This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

February 2009

Wiktionary:Votes/2009-02/Amending ELE Order of Headings

I've just opened a new vote on this. It's a trivial change to ELE: just reorganizing a few headings and clarifying the suggested position of trivia sections (which we already describe). Just alerting folks via the BP. JesseW 20:40, 6 February 2009 (UTC)[reply]

language agnostic?

Shouldn't the context categories be language agnostic, and not favour English, instead having a category:en:contextname or category:English contextname subcategory instead of putting everything into the head category? 76.66.196.229 14:49, 7 February 2009 (UTC)[reply]

No, because this is the English Wiktionary, where we describe all the words in all the world's languages, in English. It is correct and appropriate to have our context categories be in the language we are using to define words in. 70.213.100.180 (really, w:User:JesseW/not logged in) 01:31, 8 February 2009 (UTC)[reply]

I always thought that English Wiktionary meant all explanatory text, etc. was in English. I've noticed that some context categories have "English ..." subcategories. Since this dictionary has multiple languages, it'd be easy to mix other languages into the head category, so that non-English words end up in the English category unless they are specifically categorized. 76.66.196.229 05:09, 8 February 2009 (UTC)[reply]

I raised the same point a couple of months ago (IIRC), with some support. However, the general desire then was to retain the status quo, and have the ‛bot account AutoFormat correct any miscategorisations. TBH, I didn’t find the counterarguments particularly convincing, and I don’t know how much reprogramming has been done to AF since then to support the status quo. I don’t think that it’s seen as much of an issue ATM. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 12:19, 8 February 2009 (UTC)[reply]

I agree with the OP that the fact that this is the English Wiktionary shouldn't entail that English-language entries get special consideration. It's always annoyed me, for example, that where English words are homographs of words in other languages, the English word appears first on the page, instead of in alphabetical order by language. But that's the status quo, and inertia and fear of change will no doubt keep it that way. Angr 13:04, 9 February 2009 (UTC)[reply]

The status quo need not necessarily remain as it is. The last time this sort of “language agnosis” was proposed, it was shot down; however, if there is greater desire for it this time, things could change, and we could begin a vote to achieve those changes. That said, FWIW, I’m in favour of retaining the English-first structure of our entries, because those are the sections that contain proper definitions (rather than just straight translations), extensive translations, &c., and are usually simply more detailed. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 16:08, 9 February 2009 (UTC)[reply]

If English really were the thing, then shouldn't the parts-of-speech categories also make English prime, instead of category:English verbs, it should just be category:Verbs? I think that the context categories should be treated like the POS categories. 76.66.196.229 09:44, 11 February 2009 (UTC)[reply]

Doing so would blur a very important distinction. The topical categories sort meanings according to the definitions. The POS categories sort labels according to their usage. This is a critical distinction, and is the same reason we define dog as "an animal, member of the genus Canis..." and not as "a noun..." or "a word used to represent...". The lexical meaning and the grammatical function are not the same thing. --EncycloPetey 08:27, 14 February 2009 (UTC)[reply]

AWB request

Hi, I am trying to fix the family of German declension templates, of the form Template:de-decl-noun*. This will involve batch-moving the entries of some preëxisting categories, which is work ideally suited for Wiktionary:AutoWikiBrowser. I request that an admin enable my account for use with AWB. Thank you, OldakQuill 13:42, 8 February 2009 (UTC).[reply]

Unknown etymologies

Although it's probably been answered already somewhere on this wikt, I'm going to ask it anyways. Should there be (or, is there) a special template/category for words/phrases whose origins are unknown? If there isn't, I suggest there be one made. This would allow etymologists who work here to have at their disposal a list of all the words featured here which have been designated as having an unknown origin. Also, those wiktionarians who happen to already know the origin of a given word can add it if they see it in the list. The system here is simple yet effective. Here is basically what it would look like:

Etymology
Origin unknown. (This would be represented by the template {{etyu}} or {{etyl-u}} or whatever is best.)

This template would also add the pagename to the category of words with unknown etymologies:

Category:Entries with unknown etymologies

or something to that effect.

Again, if such a scheme already exists, please let me know. I'm too lazy—er, busy—to search all over en.wiktionary.org to find out. So, whaddaya think?—Strabismus 04:40, 9 February 2009 (UTC)[reply]

{{rfe}} - to request an etymology, {{unk.}} for really "unknown" etymologies. Both do the sorting automagically. --Ivan Štambuk 04:43, 9 February 2009 (UTC)[reply]

Ok. I'll check it out. Thanks.—Strabismus 21:30, 10 February 2009 (UTC)[reply]

Wouldn't {{etyl|und}} be appropriate and allow us to have one less template to deal with? (See discussion on Wiktionary talk:Etymology.) Carolina wren 20:33, 3 March 2009 (UTC)[reply]

Completely Obvious Things

Title redacted: Original title was: == Completely Obvious Things that should have been Implemented from the Get-go and how Simple it will be to Add their Functionality if Someone Pulls their Thumb out ==. -- Visviva 12:14, 9 February 2009 (UTC)[reply]

Or, how to piss everyone off with a feature-request, by authors unknown.

Search: It should be possible to search for sub-verbal particles of words and according to language, type or any other templated taxon. Example search query: "{{Finnish adverbs}} -vasti" would return (amongst others) http://en.wiktionary.org/wiki/toivottavasti

There already exists a special page to find words beginning with a certain stem, which one assumes incorporates a custom SQL query. This could be facilely extended to allow for generic word components.

Interlingua: when checking the suitability of a potential word translation, it is vital to be able to easily check the reciprocal definitions in other language wiktionaries. Currently, one has to look up a definition, then click the interlingua link to the desired language, then click to the various offered translations, then click back to the initial language in the interlingua box.

This is 2009 people... Creating a graph of article links is potty-training these days as would be incorporating the reciprocal definitions via a "showbox" (a la word declensions) into the interlingua link.

It would be really nice if someone with the administrative privileges to implement these features would give it a go. If how to do either of these things remains unclear, please feel free to ask for further elaboration or clarification.

217.112.245.49 11:50, 9 February 2009 (UTC)[reply]

These are perfectly reasonable requests, but you're talking about software issues, which none of us here have anything to do with. Please try Bugzilla. -- Visviva 12:14, 9 February 2009 (UTC)[reply]

These are wiktionary issues, not mediawiki issues. Implementation of the features suggested is the onus of wiktionary administrators not the developers of its software base. The mediawiki devs have other things on their minds than improving wiktionary; wiktionary admins, one would like to think, less so. 217.112.245.49 19:50, 9 February 2009 (UTC) (OP)[reply]

Wiktionary is hosted by the same people who host Wikipedia; the server admins are the same people. They concentrate almost solely on Wikipedia (grumble) but they will implement changes for Wiktionary (if someone else programs them, they can be bothered, and they don't think it'll hurt performance too much). The software is not well designed for a dictionary; it is designed for an encyclopedia. This brings issues, but it doesn't seem to be too detrimental (far worse would be software that had been designed for a dictionary badly). Feel free to contribute patches or extensions to MediaWiki - it is then possible that they will enable them for us - but not without an unnecessary amount of red tape. </rant>

Search by category would be a wonderful feature, but it requires adding all category information to the search indices (it's not "not doable", but it would require a lot of thought and clever hackery of the search libraries). Likewise, "searching for substrings" is just not feasible if you have to scan 1.2 million records - you need to add it in some normalized form to the indices (patches welcome :p). [Note that "starting with" or "ending with" are easy, as a prefix search is cheap and you can add the reversed string to the indices.]

I'm not sure what you mean by "reciprocal definitions" but that sounds much easier - you just want the definition of a word in that word's language to appear on the page for that word in the English Wiktionary? We can probably hack a good enough solution to this in Javascript reasonably easily (yes, we could spend much longer writing a PHP solution but it's complicated by the fact that not all Wiktionaries are in the same database clusters), but it'd be nice to know exactly what you'd want.

Some of us have given thought to how to build software that can actually store dictionary information in a structured way, which would allow for much more amazingly cool features, but it's an exceedingly hard design problem as you no-sooner make an assumption about words than some other clever-bugger finds you a couple of thousand places you can't make that assumption. Conrad.Irwin 00:16, 10 February 2009 (UTC)[reply]
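Editorial illustration of the "starting with" / "ending with" remark in the comment above: a minimal Python sketch, assuming the index is simply a sorted in-memory list of page titles. The function names are hypothetical and this is not how the MediaWiki search backend is actually implemented; it only shows why a reversed-string index turns a suffix search into a cheap prefix search.

```python
import bisect


def build_indices(words):
    """Return (forward, backward): sorted words and sorted reversed words."""
    forward = sorted(words)
    backward = sorted(w[::-1] for w in words)
    return forward, backward


def prefix_search(sorted_words, prefix):
    """Return every entry of a sorted list that starts with `prefix`."""
    # All entries sharing a prefix are contiguous in sorted order,
    # so binary-search to the first candidate and walk forward.
    i = bisect.bisect_left(sorted_words, prefix)
    matches = []
    while i < len(sorted_words) and sorted_words[i].startswith(prefix):
        matches.append(sorted_words[i])
        i += 1
    return matches


def suffix_search(backward, suffix):
    """Find words ending in `suffix` via a prefix search on reversed words."""
    return [w[::-1] for w in prefix_search(backward, suffix[::-1])]


if __name__ == "__main__":
    sample = ["toivottavasti", "vasta", "nopeasti", "hitaasti", "toivo"]
    forward, backward = build_indices(sample)
    print(prefix_search(forward, "toivo"))
    print(suffix_search(backward, "vasti"))
```

A search restricted to one category (such as the "-vasti" example in the original request) would additionally need the category data in the index, which is the harder part described above.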

You should also note that Interlingua is a language (albeit an artificial one). So, I had to read your request several times to figure out that you probably weren't asking about that language. --EncycloPetey 08:21, 14 February 2009 (UTC)[reply]

New word

Could the word xzf become a word? We have decided that it means the "sound of a tennis ball machine ejecting its yellow round items of torment"--God'sGirl94 15:02, 9 February 2009 (UTC)[reply]

No, see WT:CFI. --Ivan Štambuk 15:21, 9 February 2009 (UTC)[reply]

Rats. I'll have to find another way. Scrabble players will love it!--God'sGirl94 14:53, 25 February 2009 (UTC)[reply]

Entries with "Shorthand" sections

I came across some entries (modified a few years ago, and, so far as I see, all beginning with "a") where someone added a ===Shorthand=== header and shorthand notation. Some examples: abhorrence, abjuration, abdicated, abated, abhorred, abase, abed, abbreviated, abasedly. Maybe I just haven't been paying enough attention, but do we do this? If this notation is to be added anywhere, shouldn't it be in the "Translations" section? Is any action called for here? -- WikiPedant 17:35, 9 February 2009 (UTC)[reply]

It seems like a good thing to have in a dictionary unless the shorthand of a word is easy and algorithmic to figure out if you know the word. (That does not seem to be the case for Gregg, for example.) Although it's technically allowed for under ELE ("...other trivia and observations may be added, either under the heading "Trivia" or some other suitably explanatory heading...."), it would probably be wise to allow for it explicitly, or vote it out of Wiktionary. On another note: the entries you mention have shorthand represented by letters, whereas I would think that images of actual shorthand would be far better.—msh210℠ 21:26, 9 February 2009 (UTC)[reply]

I guess this is in the same category as alternative scripts and other representations, probably including American Sign Language, Braille, semaphore, Morse code, etc. Shouldn't this be a link to another entry? —Michael Z. 2009-02-10 05:23 z

A side point: Not American Sign Language, which, unlike Braille and Morse code, is a separate language rather than a representation in signs of English. Main point: Since shorthand is not currently in Unicode (but see the thread [1] and specifically the message [2]), we have no way of representing the shorthand representation of a word except s.v. the English entry. Even if shorthand were in Unicode, whether a shorthand representation of a word would deserve an entry would need to be decided. Note that we don't include Braille representations of words, even though we do have individual Braille letters. That's probably as it should be.—msh210℠ 18:21, 12 February 2009 (UTC)[reply]

Leading bullets in `R:` templates

Can we please get rid of all the leading bullets in our various R: templates? Templates with initial bullets break if used in <ref>…</ref> tags. They are easy enough to præface with a bullet, if necessary, in the entries themselves, by the mere addition of *. AutoFormat could probably be programmed to add a bullet before any R: template included in a References section which is not called by <references/>. I would be most grateful to whoever sorts this out. Thanks in advance. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 17:55, 9 February 2009 (UTC)[reply]

Support. —Ruakh_TALK 01:42, 10 February 2009 (UTC)[reply]

Support. This sounds like a good idea to me too. But it is essential to perform both parts of the task described by Doremítzwr. The bullets can't (a) be removed from the templates without also (b) dispatching AutoFormat to insert bullets in the entries. -- WikiPedant 05:12, 10 February 2009 (UTC)[reply]

Not sure AF would be fast enough for this job; removing the bullets would instantly break a large number of entries, since anything with two or more "R:" templates in sequence would be garbled.

I do have one of my trademark baling-wire-and-duct-tape solutions for this, which I implemented for example in {{R:Dictionary.com}}. If by some combination of human and bot activity we first change all instances of "[newline]{{R:Dictionary.com}}" to "[newline]*{{R:Dictionary.com|bullet=}}", we can then remove the "{{{bullet|*}}}" code from the template and AF could then gradually remove the "bullet=" junk DNA from the entries in the normal course of its work. This is very inelegant, but it would allow a smooth transition with no breakage. -- Visviva 06:10, 10 February 2009 (UTC)[reply]

It is much easier than that! If there is both a bullet in the wikitext and in the template, it still renders just fine. (It looks like two lines, one containing just "*" which is elided.) AF has been routinely adding bullets before a number of templates, which can then at some point have the "internal" bullet removed. One doesn't need a "bullet=" parameter. If I add all R: names to the templates that AF adds * before (using a wildcard, anything starting with R:) it will add them fairly rapidly, and then we can just strip the * out of the templates. I've changed AF, just give it a while. (day or two) Robert Ullmann 17:21, 10 February 2009 (UTC)[reply]

Like this, note the references are rendered properly in abigail. It is screening for them, so will find all in the current XML in one pass. Robert Ullmann 17:27, 10 February 2009 (UTC)[reply]

Thanks very much for sorting this out, Robert. I am most pleased to see this problem fixed so quickly. † ﴾^(u):Raifʻhār ^(t):Doremítzwr ﴿ 17:18, 11 February 2009 (UTC)[reply]

Yay! -- Visviva 02:23, 11 February 2009 (UTC)[reply]

Do note that your "baling-wire" method would be fine if we needed to, in this case it was just a bit simpler. AF has now completed a pass with the 27 Jan XML dump, and again with the 10 Feb dump. There are still a few out there because it doesn't pick up in its screening any entry more than once in 35 days, but they will get caught; as will any new uses of R: templates without *'s. What all this means is that it is now "safe" to remove the internal * from R: templates as desired. Robert Ullmann 11:16, 12 February 2009 (UTC)[reply]
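Editorial illustration of the transition discussed in this thread: a minimal Python sketch of the entry-side step, i.e. prefixing a bullet to any line that starts directly with an R: template. It is an assumption about how such a pass could look, not AutoFormat's actual code, and `{{R:Example}}` is a made-up template name used only for the demonstration.

```python
import re

# A line that begins directly with an R: template call, with no bullet.
# Templates used inline (e.g. inside <ref>...</ref>) are not at the start
# of a line and are therefore left alone.
UNBULLETED_R = re.compile(r"^(\{\{R:)", re.MULTILINE)


def add_missing_bullets(wikitext):
    """Prefix '*' to every line that starts with an R: template."""
    return UNBULLETED_R.sub(r"*\1", wikitext)


if __name__ == "__main__":
    before = (
        "===References===\n"
        "{{R:Dictionary.com}}\n"
        "*{{R:Example}}\n"
        "Some text.<ref>{{R:Example}}</ref>"
    )
    print(add_missing_bullets(before))
```

Because already-bulleted lines do not match, the pass can be re-run safely; only once every entry has its own bullet is it safe to strip the internal one from the templates, as noted above.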

Entry layout explained confusion

There is an inconsistency between the order of sections in the index and their order in the Additional Headings and Order of headings sections.

In the index, derived and related terms are shown after translation, in the Additional Headings section they are shown before. In Order of headings, Coordinate terms and Descendants are also shown before Translations.

This is confusing. - dougher 21:16, 9 February 2009 (UTC)[reply]

As the "Order of headings" was established by vote, while the "Additional headings" are present only by inertia, there should be no problem with bringing the second into line with the first. In fact it should not even require an additional vote IMO. I will wait for counterarguments before proceeding. -- Visviva 05:19, 10 February 2009 (UTC)[reply]

ELE clarifications

The WT:ELE currently has a section labeled "Order of Headings" within the "Headings after the definitions" section. It is missing some of the headings described below it, and some that are listed are described in a different order from how they are listed. It is also unclear how widely the "Order of Headings" applies: does it just apply to the L3 headings that appear after the definitions, or does it apply to any instances of the listed headings, wherever they appear, or does it apply in some other fashion? There is also simmering disagreement over the very presence of some of the headings currently described in the ELE.

I don't want to get into the discussion over the presence of any headings currently mentioned in the ELE -- that's a complex argument, and separate from clarifying what currently is there. I hope that re-ordering the heading descriptions to match that of the "Order of Headings" list is uncontroversial. I think that the ELE currently is quite unclear in the "Headings after the definitions" section -- many of the descriptions of headings say their contents should normally not be in those headings, or that the headings should be within the definitions, rather than below them, or in other combinations. JesseW 00:01, 10 February 2009 (UTC)[reply]

I agree that this is/should be non-controversial. But I do think that the ELE should address the issue which led to the unfortunate collapse of the previous vote, namely ontologies (that "Related terms", "Derived terms", and "See also" may be placed at various levels, usually L3-L5, depending on whether they refer to a particular POS/Etymology/Pronunciation or to the language section as a whole, while "Anagrams" is always at L3). This present discussion would be a good place to hash out the wording of that paragraph, which IMO should go either just above the "order of headings" section or somewhere higher in the tree. We could then also update the example (or add another example) accordingly, thus covering all the points of the previous vote and a little more, while (hopefully) staying clear of controversy. Not sure what the best wording would be ... -- Visviva 05:30, 10 February 2009 (UTC)[reply]

Well, here's some proposed wording, to go directly above the Order of Headings:

Many of the headings listed below may be specific to
a particular Part of Speech, Etymology, Pronunciation,
or language, and therefore may appear multiple times.  
This Order of Headings only applies to the order 
within a particular set, i.e. Derived terms should 
always go before Related terms, but Verb-specific
Related terms can go before non-specific Derived
terms.

This doesn't address which headings can be repeated, but it's a partial solution. Visviva, thanks for your response. JesseW 19:50, 10 February 2009 (UTC) (corrected per EP's comment below. JesseW 23:02, 15 February 2009 (UTC))[reply]

But that contains an error. Derived terms should precede Related terms. We agreed on that in a vote setting the order of L4 headings. --EncycloPetey 08:15, 14 February 2009 (UTC)[reply]

Great, thanks; I've corrected the proposed wording. I'm not sure how I missed that on my first reading. JesseW 23:02, 15 February 2009 (UTC)[reply]

What about conjugation/declension? Would it be better if quotations preceded it, since they contribute to the semantic part of the entry, whilst conj./decl. gives only grammatical information? Bogorm 09:53, 17 February 2009 (UTC)[reply]

Formatting of intentional errors

I'm thinking of including in a usage note for an entry I'll be writing soon an example of what not to do (and what might be done by someone who tries a literal translation from English into Catalan of numbers such as fourteen hundred ninety-two). Is there any template or CSS class that would be used to give a consistent format to examples of erroneous text such as catorze-cents noranta-dos? If not, would it be a good idea to have, and what should the format be? (Probably not what I gave in this example.) Carolina wren 23:14, 10 February 2009 (UTC)[reply]

Often times an asterisk is included in running text before such words or phrases. For example, "The past tense of run is ran, not *runed". --Bequw → ¢ • τ 05:04, 11 February 2009 (UTC)[reply]

I thought asterisk was a fairly standard marker to indicate a reconstructed word, or does that convention apply only to Proto Indo-European? Carolina wren 05:25, 11 February 2009 (UTC)[reply]

It's used both to indicate ungrammatical constructions and to indicate reconstructed forms. It's multipurpose. :-) See [[*]]. —Ruakh_TALK 13:45, 11 February 2009 (UTC)[reply]

Is there another generally accepted way of marking "ungrammatical" constructions that would prevent confusion with unattested? I suppose that it may be true that the contexts of use may prevent there being many real opportunities for confusion. Are there contexts where one would need to distinguish? I think so. For example, *"most consanguinal" could be read as an assertion about attestation or ungrammaticality. In a discussion of what should appear on an inflection line, both uses are plausible. Also, an unattested form might be deemed a source of an alleged ungrammaticality. OTOH, the wonderful piece by Arnold Zwicky, Mistakes, seems to have no distinct marking at all for mistakes in its 47 pages. DCDuring TALK 16:24, 11 February 2009 (UTC)[reply]

Some linguists distinguish between syntactical ungrammaticality, which is expressed with an asterisk, and pragmatic or semantic unfelicitousness (we need this word!), which is marked with a hash sign (#). Unfortunately, this convention is not widely honored. Nevertheless, I would plead for something more intuitive and less technical here, though I have no idea what. Why don’t you simply create a template {{disprefered}} or {{ungrammatical}}, give it an initial formatting, and maybe later people will come up with something better? H. (talk) 12:41, 12 February 2009 (UTC)[reply]

{{sic}}? Circeus 21:01, 23 February 2009 (UTC)[reply]

Archiving/cleaning up of talk pages

I am regularly annoyed with the unusable stuff one finds on talk pages. There seems to be some sort of convention not to remove anything from talk pages, because they are supposed to reflect discussion, but sometimes the stuff is really old or no longer usable, or has long been resolved. Can’t we introduce a system whereby, after a certain amount of time, such information can be removed, and maybe even the page deleted? One concrete example I just saw is Talk:gullible. While originally it was a proper discussion, this is just no longer relevant, and there are some stupid remarks in there. What do people think? H. (talk) 12:34, 12 February 2009 (UTC)[reply]

Not having an outlet for such stuff might be worse in its consequences. DCDuring TALK 12:39, 12 February 2009 (UTC)[reply]

I would agree that things like Talk:gullible can be speedily deleted (though I won't delete it now, lest I derail this discussion). "No usable content given", I would call it... On the other hand, IMO any substantive conversations about entry content should be kept indefinitely, even after they have been resolved. -- Visviva 13:56, 12 February 2009 (UTC)[reply]

Semi-relevant: should we delete the talk pages of entries that have been deleted? I think not (assuming they are relevant) - but I know Wikipedia does, so maybe there is some method to their madness. Conrad.Irwin 00:21, 15 February 2009 (UTC)[reply]

I oppose deletion of talk pages no matter how old they are. One can often find useful information, share knowledge or try to give a helping hand as in Talk:reich, albeit belatedly. I advocate either archiving (in case of voluminous content) or just leaving them. Bogorm 09:51, 17 February 2009 (UTC)[reply]

Just delete talk pages with no relevant discussion on sight, no one will miss them (and people are less likely to waste time investigating it). --Ivan Štambuk 09:59, 17 February 2009 (UTC)[reply]

You should have written "some rather indolent people are less likely to investigate"; I am likely to, and am always inquisitive. Why did you delete Talk:gullible? I was not there and do not know what it was about. Bogorm 10:17, 17 February 2009 (UTC)[reply]

Ethical question

Hello there. For twelve years I've been running a web dictionary of British slang, called 'The Septic's Companion'. It's something I've pottered around on in my spare time - the content was all written by me, but it's evolved over the years based on large amounts of feedback from visitors. I think it's reasonably accurate now, but obviously it's not the OED. I recently self-published the thing as a book, which is advertised on the site. This effectively makes the site a commercial one (not one that makes me any money, really, but I guess it's the thought behind it that counts). I'm about to get to the point any minute now...

I noticed that lots of the words in my dictionary were also here, so the other day I added a link to the word on my site as an external link to a few pages here. Once I'd done that I got a very polite message from Ivan Štambuk suggesting that if all I was intending doing was adding these links, then I wasn't really doing anything particularly useful, and may be considered as spamming promotional material. I do see his point, and I'm not going to do that any more. I have an idea of what I might do instead, but I'd be very interested in other points of view as to whether this is ethically appropriate or not. I'm thinking I will go through my own dictionary word by word, looking up each word in here. For the ones where I have more information (or corroboration of information that's already here) I would like to add my web site as a numbered "Reference" link on those items. Of the 700 words in my dictionary, around 400 have British slang meanings which aren't already covered here, so there's quite a lot of new content. What's the general feeling on that? Obviously it involves me blatantly sticking my web site links on Wiktionary, but it's only in places where I'm corroborating or adding content. The best way to do this might be to add my site reference using an automatic URL generator, much like you already have for Webster - Ivan has reservations about this because of the "some guy doing this in his spare time" nature of my site, the fact that Wiktionary usually references published utilisations of words rather than definitions, and the fact that I'm trying to sell a book. He said he'd be willing to float the idea here to see what the general feeling is - I jumped the gun on that but I'll let him chime in if he thinks I'm misrepresenting him.

Anyway, very interested in your thoughts. In support of my case, I noticed today that someone had actually already linked to my site from another word here (pillock), although using an older URL of mine (english2american.com). --Pugwash 22:16, 12 February 2009 (UTC)[reply]

When you are adding your site as a reference, I assume that is because you are adding the missing content from your site to here? If you have improved an entry using the information you have researched for your book, then by all means link to the website (alternatively just cite the book). On the other hand, adding only links just lowers our signal-to-noise ratio; please don't do it. Conrad.Irwin 00:27, 13 February 2009 (UTC)[reply]

Ooh, I notice that you're giving away free copies of your book to websites that link to you. Can I put in an order for 1077 free copies? :p (151,000 would be fairer, but the inactive users aren't going to notice). Conrad.Irwin 00:42, 13 February 2009 (UTC)[reply]

Yep, I'd only be adding a reference if I actually took some info from my site and put it up here. The, umm, higher signal-to-noise ratio version was what I was originally intending doing before Ivan slapped my wrists about it (I've already done a couple of those that I'll revert). I'd rather link to the web site than the book - the web site is a bit more up-to-date and slightly easier for the average user to verify. As for the free copies, I hadn't exactly established what I'd do for sites with community ownership. However, once I've established that by some means, "no" will be the ultimate answer. :) --Pugwash 01:37, 13 February 2009 (UTC)[reply]

Web Services

Any chance someone would provide a Web Service interface to access definitions? I'm willing to do some work there if that helps

Have you seen the mw:API.php? I think that's what you are asking for, no? 75.212.71.60 07:48, 14 February 2009 (UTC)[reply]

We have nothing yet that lets you look at Wiktionary data - though there are some quick heuristics that allow you to parse most entries (you can get the text from the API or the XML dumps). Definition lines always start with a #, translations are in the approximate form *Language name: {{t|language code|translation}} or *Language name: [[translation]]. User:Polyglot has a parser that will get almost all translations from any Wiktionary entry. We've been thinking about making a web API out of it, but it needs a bit of time. What information are you interested in? Conrad.Irwin 12:14, 14 February 2009 (UTC)[reply]
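Editorial illustration of the heuristics described in the comment above: a minimal Python sketch that pulls definition lines and translations out of raw entry wikitext. The function name and regular expressions are hypothetical, and skipping lines that start with "#:" or "#*" (treated here as example and quotation lines) is an assumption beyond what the comment states; real entries will have many more edge cases.

```python
import re

# "*Language name: {{t|language code|translation}}"
T_TEMPLATE = re.compile(r"\{\{t\|([^|}]+)\|([^|}]+)[^}]*\}\}")
# "*Language name: [[translation]]"
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")
TRANSLATION_LINE = re.compile(r"^\*\s*([^:*]+):\s*(.+)$")


def parse_entry(wikitext):
    """Return (definitions, translations) found in one entry's wikitext."""
    definitions = []
    translations = {}
    for line in wikitext.splitlines():
        if line.startswith("#"):
            # Assumption: "#:" and "#*" lines are examples/quotations.
            if not line.startswith(("#:", "#*")):
                definitions.append(line.lstrip("#").strip())
            continue
        match = TRANSLATION_LINE.match(line)
        if not match:
            continue
        language, rest = match.group(1).strip(), match.group(2)
        words = [m.group(2) for m in T_TEMPLATE.finditer(rest)]
        if not words:
            words = [m.group(1) for m in WIKILINK.finditer(rest)]
        if words:
            translations.setdefault(language, []).extend(words)
    return definitions, translations


if __name__ == "__main__":
    sample = (
        "===Noun===\n"
        "# A wild flowering plant.\n"
        "#: ''She picked a daisy.''\n"
        "====Translations====\n"
        "*French: {{t|fr|pâquerette}}\n"
        "*German: [[Gänseblümchen]]\n"
    )
    print(parse_entry(sample))
```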

I'm looking for a simple service as in (REST in this example):

GET en.wiktionary.org /defs?word=daisies ...

200 OK
<definitions plural="daisies" type="noun">
  <definition>A wild flowering plant Bellis perennis of the Asteraceae family, with a yellow head and white petals</definition>
  <definition>Many other flowering plants of various species</definition>
  <definition context="Cockney rhyming slang">Boots or other footwear. From daisy roots.</definition>
  <etymology>old english...</etymology>
  <link href="http://en.wiktionary.org/wiki/daisy"/>
</definitions>
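Editorial illustration of the response format requested above: a minimal Python sketch that serialises already-extracted data into that XML shape using the standard library. The function name and its parameters are hypothetical; the element and attribute names simply mirror the poster's example, and no such Wiktionary service is implied to exist.

```python
import xml.etree.ElementTree as ET


def build_definitions_xml(plural, pos, definitions, etymology, href):
    """Serialise definitions into the XML shape sketched in the request.

    `definitions` is a list of (text, context) pairs; context may be None.
    """
    root = ET.Element("definitions", {"plural": plural, "type": pos})
    for text, context in definitions:
        attrs = {"context": context} if context else {}
        ET.SubElement(root, "definition", attrs).text = text
    ET.SubElement(root, "etymology").text = etymology
    ET.SubElement(root, "link", {"href": href})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_definitions_xml(
        "daisies",
        "noun",
        [
            ("Many other flowering plants of various species", None),
            ("Boots or other footwear. From daisy roots.",
             "Cockney rhyming slang"),
        ],
        "old english...",
        "http://en.wiktionary.org/wiki/daisy",
    ))
```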

Discussion board for etymologies

I was thinking of the idea of Wiktionary establishing a separate discussion board that would discuss etymologies of words (all languages). The Tea Room is generally populated with discussions pertaining to usage and senses, prevalently with English, so the entries discussing exclusively etymologies in there might get missed by the folks that would otherwise show keen interest in them (e.g. me :)

So far the etymological discussions are dispersed in various article/user talkpages, and are held by a minority that usually track each other's talk pages and confabulate on the basis of already established knowledge of who might give valuable info on the subject. This way, such discussions would be centralized to a specific place, so anyone could track them, and possibly provide constructive feedback. This esp. pertains to some irregular contributors (e.g. Strabismus) that add fantastic etymological stuff, but usually on the principle of randomness.

Also, it would be a place to centralize and promote community feedback on heated debates of the conclusiveness of some particular etymological scenarios, as opposed to the current modes of resolving these and which are generally one-on-one debates, thus enabling the establishment of the community consensus on issues which would otherwise have to be resolved by a more empowered debater utilizing his sysop privileges to resolve the issue towards the conclusion he personally thinks to be the most objective (has happened to me, but also to some other admins in the past).

Relevant discussions on the talkpages should thus be moved/copied to such places and continue from there (and prob. copy/pasted back after they get archived, like the TR/BP discussions are). --Ivan Štambuk 00:07, 15 February 2009 (UTC)[reply]

It would make more sense to use Extension:Labeled Section Transclusion. This would keep the master-copy on the talk page, but let the relevant section be posted in the Wiktionary:Etymology chamber. (I think we should run all our word-specific discussions this way). For an example of LST in use, see [3] and [4]. Obviously sections that are related to multiple words could share the same discussion and it would appear on all of their talk pages; discussions that aren't related to specific words can still be done as is normal on WT:TR or WT:BP at the moment. (though sub-pages is another option). Our pages are a pain to archive at the moment, so it would be a little silly to start a new page using the same system :p. Conrad.Irwin 00:19, 15 February 2009 (UTC)[reply]

Interesting new concept. So this "Etymological chamber" would initially include transcluded sections of the relevant word-specific discussion page(s), and the resulting discussion section on the EC (as it goes on) would be automagically transcluded in the relevant individual word's talkpages. Did I get that right? In that case, both the original word-specific talkpage discussion and these relevant individual word's talkpages that are subsequently marked with "EC contains a discussion related to this word.." should better be marked with some notice saying "Please continue this discussion on the EC, and not here", so that someone might not get into temptation to answer at the talkpage itself.

Also, I don't see how it's relevant to page archiving..if all the discussion is to be made (or continued) on some master page, it would certainly grow in size, and old entries would have to be archived sooner or later. To me it appears that the subpages approach would be the best (in case of multiply-relevant words, named after the one most relevant, or after the entry on whose talkpage the original discussion started.) --Ivan Štambuk 00:57, 15 February 2009 (UTC)[reply]

See [5] and [6]. When the section is edited, it's the talk page that actually gets updated. Conrad.Irwin 01:18, 15 February 2009 (UTC)[reply]

Neat! However, I'd very much like to see a set of templates abstracting away this horrid Wiki markup! Something like {{etymology-discussion-begin}}, {{etymology-discussion-end}}, and {{etymology-discussion-transclude}}, taking appropriate parameters where necessary (i.e. the labeled section name).

Also, it would be desirable if such templates supported multiple discussions on the respective word's talkpages (e.g. by taking alternative parameter for the transcluded section name, defaulting to etymology, or whatever the bestest approach be), as the same word, or separate words in different languages sharing the same spelling, or different words in the same language sharing the same spelling but having different etymologies, might get discussed in different discussions on the master page. --Ivan Štambuk 01:32, 15 February 2009 (UTC)[reply]

This is exceedingly cool. I didn't think we were ever going to get LST. Now, if we can just use it for the Tea Room, and RFV, and RFD... :-/ -- Visviva 14:58, 17 February 2009 (UTC)[reply]

I suppose that it's not possible to use NavFrame on the master page which would transclude the individual discussions only if user clicked the "show" link? Otherwise we'd still have to resort to archiving when the amount of transcluded text grows too large.. --Ivan Štambuk 08:49, 17 February 2009 (UTC)[reply]

No, it would still be transcluded, just hidden. I have seen an extension -- I think it was on Meta or the Foundation wiki -- that did approximately what you're describing. I don't know what it was called, but Conrad probably does. :-) But I'm not sure it's necessary -- you would want to archive (or rather, un-transclude) discussions that have run their course anyway, wouldn't you? -- Visviva 14:58, 17 February 2009 (UTC)[reply]

Too bad, I thought that some on-demand AJAX magic could fix it. Anyhow, for archiving purposes there would be simply a list (or lists) containing transcluded discussions but limited by size (fixed size of e.g. 200-300K) of totally transcluded text. Now ideally a bot would, once the amount of transcluded text on the master page surpasses that limit, automatically archive the oldest discussion to some archive (containing just the transclusions, itself organised by the oldness of discussions, and of fixed size), and also, when someone comments on the labeled section talk page that is itself archived (i.e. not on the master page anymore), bring back that discussion to the master page (again at the expense of the oldest discussion currently transcluded on the master page). So archiving would not mean copying the discussion with "this was discussed on this date in the EC.." on the talk page, as the current format is for some other discussion boards, but would rather pertain to keeping focus on what is actively commented on. I hope I didn't express myself too confusingly. --Ivan Štambuk 15:29, 17 February 2009 (UTC)[reply]
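The rotation described above could be modelled roughly like this. It is a sketch under stated assumptions, not an existing bot: the class and method names are hypothetical, "oldest" is taken to mean the discussion that has gone longest without a comment, and sizes are plain byte counts supplied by the caller.

```python
from collections import OrderedDict


class MasterPage:
    """Size-limited list of transcluded discussions (hypothetical model)."""

    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        # Both maps go from discussion title to size in bytes and are kept
        # ordered from least recently to most recently commented on.
        self.active = OrderedDict()
        self.archive = OrderedDict()

    def touch(self, title, size):
        """Record a comment on `title`, whose section is now `size` bytes."""
        # A comment on an archived discussion brings it back to the master page.
        self.archive.pop(title, None)
        self.active.pop(title, None)
        self.active[title] = size
        # Move the oldest discussions to the archive until the page fits again,
        # always keeping at least the discussion that was just commented on.
        while sum(self.active.values()) > self.limit and len(self.active) > 1:
            oldest, oldest_size = self.active.popitem(last=False)
            self.archive[oldest] = oldest_size
```

With a limit of 300, touching "a" (150), "b" (100) and "c" (100) in that order pushes "a" into the archive; a later comment on "a" restores it and displaces "b" instead.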

Oh....this is so awesome. Ƿidsiþ 15:05, 17 February 2009 (UTC)[reply]

"No exact match" and multi-word searches

I noted that a search for "asciibetical oder" did not produce the page asciibetical, which is probably as it should be, even for the mythical "normal" unregistered non-contributing user. However, users might have trouble finding an appropriate entry when searching for idioms or SoP multi-words or as a result of misspelling a word. In all cases facilitating linking-to or searching-for component terms might reduce user frustration by providing an alternative path to determining meaning.

What might be helpful would be text on the "no exact match" page(s) that provided links to the component words of a multiword search term (blue-linked) and/or suggested searches for component words, especially if red-linked, which might lead to related terms (via other red-links or usage examples or citations). DCDuring TALK 17:12, 15 February 2009 (UTC)[reply]

From looking at the new search on Wikipedia [7], the software will soon support this (when they roll out that change to the other database servers - they didn't tell me a specific time-frame, but I gather it's on the "todo" list). Conrad.Irwin 00:14, 16 February 2009 (UTC)[reply]

I don't think I explained myself well. I would like a failed search for "foo bar" to provide as options on the failed search page the links to "foo" and "bar" if they are blue-links and to searches for "foo" and "bar" should they be red-linked. This would help for variant forms of idioms as well as for searches for strictly SoP multi-word search terms. I didn't see the WP search result link you helpfully provided as doing that. DCDuring TALK 00:58, 16 February 2009 (UTC)[reply]
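To make the request concrete, here is a minimal sketch of the behaviour being asked for. It is not how the search software works: the function name is hypothetical, and `existing_pages` stands in for whatever lookup tells the software that an entry exists.

```python
def suggest_components(query, existing_pages):
    """For a failed multi-word search, propose an action for each word.

    Returns (word, "link") when an entry for the word exists (blue link)
    and (word, "search") when it does not (red link).
    """
    suggestions = []
    for word in query.split():
        action = "link" if word in existing_pages else "search"
        suggestions.append((word, action))
    return suggestions
```

For the search "asciibetical oder" this would offer a link to asciibetical and a search for "oder".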

Inflection tables

How desirable is it seen to be able to offer the table option in inflection templates? It's going to complicate things quite a bit for an inflection template I'm working on, especially if I support the full range of plural options I'm supporting for the inline version (one plural form, two plural forms, one masculine and one feminine plural forms, 1m + 2f, 2m + 1f, 2m + 2f), but if it's seen as "strongly preferred", instead of "nice if it's there, but not really needed" I could see putting in the extra effort to do so. Carolina wren 03:20, 18 February 2009 (UTC)[reply]

Assuming we're talking about inflection-line templates, not desirable at all IMO. Mzajac did a good job of explaining what a mess these create -- not sure where the discussion is now, but it wasn't very long ago. It may be difficult to remove the ones we have now, but I don't think we want to be creating any more. -- Visviva 03:27, 18 February 2009 (UTC)[reply]

Names of animals

Should the non-taxonomic names of animals be capitalized (and classed as proper nouns) or uncapitalized (and classified as nouns)? e.g Great Frigatebird or great frigatebird SemperBlotto 12:05, 21 February 2009 (UTC)[reply]

Re capitalization ... Not sure. Agree that we should have a consistent approach. The Chicago Manual of Style recommends lowercase as the default (8.136), but also recommends following "authoritative guides" such as the ICZN. The ICZN ignores vernacular names entirely, but specialized works like Mammal Species of the World capitalize throughout. [8] Wikipedia also generally capitalizes, but I believe that is largely the product of one editor's obsession.

As for proper/common... they sure look like common nouns to me. On the other hand we have generally treated taxonomic names as proper nouns; presumably this is because they refer to a (single, unique) species, rather than an individual member of it. So I guess one question is whether the common name is simply a proxy for the scientific name -- I think this would be the case only for scientists, which may explain scientific works' preference for the capitalized form. Since common names can have perfectly ordinary plurals, and are generally lowercase outside of technical contexts (e.g. in news reports [9]), IMO ===Noun=== should be the norm.

So, based on the above and my own gut feeling, I would vote for great frigatebird as the canonical form: common noun, lowercase. Bonus: although this is a fairly obscure species, of the 44 hits on Google News Archives, 27 use lowercase (~61%). A possible exception would be those "vernacular" names that are not really vernacular, but are used only or primarily by specialists. In those cases the specialists' usage -- appropriately tagged -- should probably win out. Some sort of templated usage note for the {{alternative capitalization of}} entries would be ideal, though I'm not sure exactly what it should say. -- Visviva 15:24, 21 February 2009 (UTC)[reply]

FWIW, a search for "Frigate bird", "Frigatebird", and "Frigate-bird" on coca found 11/5/0 hits, all in lower case. In past bgc searches, I was under the impression that the capitalised forms were more common when they were used as taxonomic names (before 20th century). In fiction and newspaper usage the capitalised forms seemed fairly rare in all periods. DCDuring TALK 16:58, 21 February 2009 (UTC)[reply]

First, just a comment - the policy for capitalising species names on the Wikipedia bird project is consensual and certainly not an individual's obsession. It reflects widespread usage - check out most bird field guides, handbooks and magazines. Capitalising is also the norm on the wikiprojects covering mammals, reptiles and amphibians, though (curiously, I find) not fish.

My argument is that a species name, e.g. Great Frigatebird, is a proper noun referring to a particular taxon. Genera e.g. Fregata, families e.g. Fregatidae, and orders e.g. Pelecaniformes, are capitalised for similar reasons, they refer to particular entities. Another reason is avoiding ambiguity; capitalising, in many though not all cases, adds information to a term. A brown thornbill is not the same as a Brown Thornbill, nor a singing honeyeater a Singing Honeyeater. Why go down a path where information is lost or degraded? At any rate, while it would be helpful to get this sorted out as a guideline to capitalising (or not) the common name of a species, I think there is no excuse for not including capitalised common names for species under 'Derived terms', or 'Alternate spellings' as they are terms in current use throughout ornithological and birding literature. Maias 02:17, 22 February 2009 (UTC)[reply]

It seems that it is fairly obvious that vernacular usage favors the lower cased entities (subject to the caveat that on rarely used non-taxonomic names, specialist usage will predominate), and also appears to be the practice in other English-language dictionaries. However we also have some style guides that favor the use of capitalized forms. Clearly we should have an entry for both the capitalized and non-capitalized variants. Since the more commonly used animals will overwhelmingly have the lower case variant as being the preferred variant for most uses, and will be the more commonly used entries, then to avoid having to make judgment calls as to when a creature crosses the threshold from common to specialist, we should have the lowercase form be the primary entry and have a templated usage note that could be applied to all such nouns. As for the capitalized form, I think the link to it would be better off under =Derived terms=, in an {also} note, or in the aforementioned templated usage note than under =Alternate spellings= as a cat is not always a Cat, especially if it is a jazz artist. Carolina wren 05:25, 22 February 2009 (UTC)[reply]

It would be possible to do something like this, within the "Noun" section:

====Alternative forms====

(feline): Cat

I think that would be preferable to "derived terms", as for many of these, it is not particularly clear which one derived from the other. -- Visviva 07:38, 22 February 2009 (UTC)[reply]

I stand corrected on Wikipedia. I guess the consensus has shifted since I was active there.

It seems to me that a "great frigatebird" is primarily a common noun referring to an individual of the species ("We saw two great frigatebirds"), though it can also be used in reference to the species as a whole, in which case it is essentially a proper noun ("Populations of the great frigatebird have declined"). I believe the same duality applies to the species name -- "A Fregata minor was observed" -- but the scientific name is weighted toward the proper noun, while the vernacular name is weighted towards the common noun. Thus, those who regard the vernacular name as a proxy for the scientific name will tend to capitalize it, while those who use it simply as the name of a particular animal will not.

I believe this is borne out by usage: the more widely a vernacular name is used -- i.e. the lower the proportion of use comes from specialists -- the less likely it is to be capitalized. For example, outside of ornithological and birdwatching publications, how often does one ever see "peregrine falcon" or "bald eagle" capitalized? [10]-- Visviva 07:38, 22 February 2009 (UTC)[reply]

Seems like we should include whatever's attested. In most cases, I imagine that that will be both, and we can list one on the other's =Alternative spellings= list.—msh210℠ 17:35, 23 February 2009 (UTC)[reply]

So - my understanding of the consensual essence of the above discussion is to use non-capitalised forms for the primary entry, include the capitalised form under alternative spellings where the only difference is capitalisation, and treat derived (extended) terms as such. Maias 00:28, 26 February 2009 (UTC)[reply]

In my opinion, capitalized names should not be considered as different spellings. Non-capitalized words may be capitalized in a few cases: beginning of sentences, titles, etc. This does not make them alternative spellings. I think this is such a case: plant or animal names can be capitalized for insisting on the fact that the name is used in a generic way, but this is the same spelling, just like the In beginning my text is the same spelling as in, and does not justify a new page. Lmaltier 08:05, 28 February 2009 (UTC)[reply]

GENDER update

A recent software update defines the 'GENDER' magic word. After updating my Special:Preferences to reflect my gender, someone can, for instance write, "Bequw said that {{GENDER:Bequw|he|she|??}} would ...." and it will output the correct text ('he'). So, if you don't mind others knowing your gender, set the pref and enable this small advance in online communication. --Bequw → ¢ • τ 20:05, 21 February 2009 (UTC)[reply]
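The selection that the magic word performs can be pictured with a small stand-in function. This is only an emulation for illustration, not MediaWiki code: the function name is hypothetical, and the preference values "male" and "female" are assumptions about how the setting is stored.

```python
def gender_text(preference, masculine, feminine, neutral):
    """Pick the text matching a user's stated gender preference.

    `preference` stands in for the value set in Special:Preferences;
    anything other than "male" or "female" yields the neutral text.
    """
    if preference == "male":
        return masculine
    if preference == "female":
        return feminine
    return neutral
```

So a user who has set the preference to male gets "he" from the example above, and a user who has set nothing gets "??".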

Sorry, but I don't see where to set that preference....—msh210℠ 17:36, 23 February 2009 (UTC)[reply]

I see it on the "User profile" tab right under the signature options. --Bequw → ¢ • τ 16:25, 24 February 2009 (UTC)[reply]

Thanks! Don't know how I missed that.—msh210℠ 16:57, 24 February 2009 (UTC)[reply]

Japanese Etymology

I’ve made a start at a policy think tank at:

Wiktionary:About Japanese/Etymology

This is analogous to

Wiktionary:Etymology

…but focused around issues specific to Japanese (of which there are…a few).

Hope this is useful and productive!
Shall we add a link to it at Wiktionary:About Japanese?
The page is an Official Policy, so I’ve not edited it, but I’ve created (and reverted) a

proposed edit

…which just adds a link to the subpage.

Is this ok to add?

—Nils von Barth (nbarth) (talk) 02:55, 22 February 2009 (UTC)[reply]

Seeing no objections, I’ve made the change – if any objections, please revert and advise me so we can discuss; thanks!

—Nils von Barth (nbarth) (talk) 18:24, 28 February 2009 (UTC)[reply]

Etymology sections

preserving ancient cognates in etymologies

Recently User:EncycloPetey deprived the etymology section in two Latin entries of multiple cognates. I would rather skip the discussion about the recent ones, but I cannot conceal mine indignation from the removal of Old Norse (first diff) and Old Norse and Old High German (second one). His retort on my appeal for explanation - if you won't add them I won't remove them, sounds rather inimical. Previously, Ivan had urged me to add ancient cognates instead of modern ones to the etymology of the entries from ancient languages and had explicitly juxtaposed Old Norse, Old Church Slavonic, Old High German, Latin, Sanskrit... How are the two expurgated ancient words going to be restored to where they belong? One supposition which passed my mind was that he based his edit on the difference in centuries when these languages were spoken (Old Norse 7th century AD - 14th century, OHG approximately the same period) but I have not remarked any of his edits to remove Old English or Old Church Slavonic cognates which are entirely coeval. Furthermore, the brevity of his answer does not permit to figure out if that is the actual motivation. Bogorm 20:45, 21 February 2009 (UTC)[reply]
While protection of the article was imposed, the edit summary claims some edit warring in spite of the fact that I have never undone any edit in the literal sense of that word, as is evident from the history. Please someone explain to me wherein the editwarring consists. Bogorm 21:21, 21 February 2009 (UTC)[reply]

I just wish that etymologies weren't overloaded with cognates. I don't think that we need every Romance language to be eligible to have a cognate in every word with a Latin etymon. I would be much happier if descendants were shown on the entry for each etymon. DCDuring TALK 21:49, 21 February 2009 (UTC)[reply]

I did not mean this section as a discussion on modern languages (which is probably what you meant by the expression every Romance language), but as a justification for the right of every ancient term to refer to fellow ancient words. Words in Romance languages may be traced back to Latin, in Slavic to Old Church Slavonic, in North Germanic to Old Norse, Iranian to Old Persian/Avestan and yet all these ancient ones ought to refer to each other, since their predecessors, hypothetical PIE words, are not attested, but artificially created, they are not numerous after all and only in a scarce number of words cognates are attested in all of them simultaneously. Bogorm 22:09, 21 February 2009 (UTC)[reply]

I agree with Bogorm (on several points). One comment in particular:

His [EncycloPetey's] retort on my appeal for explanation - if you won't add them I won't remove them, sounds rather inimical.

just tickles me. Don't worry, Bogorm, I'm not laughing AT you, I'm laughing WITH you! :) The last thing this world needs is another confounded paedarchy! In any case, I think that the etymologies should cause no undue worry. Especially if bandwidth is anything to go by. These etymologies save many a trip to a host of disparate websites featuring scattered etymologies, some of which require $$$ to join. :( So there you have it. "Editwars are for drawhores." whatever THAT means… :/ One idea that occurred to me is to have a hideable/closeable box that contains etymologies, where they run long. That way you need only click the arrow to open/close it. Whaddaya think?—Strabismus 23:46, 21 February 2009 (UTC)[reply]

overloading etymologies with cognates

Recently, User:Bogorm has begun adding irrelevant cognates to Latin entries, without any sound reason for doing so. Specifically, Old High German, Old Norse, Danish, Russian, and "Serbo-Croatian". Ivan, Atelaes, and others have been adding cognates from other branches of Indo-European, but usually just Ancient Greek, Aramaic, Sanskrit, and other suitably old (Classical) languages. (See Ivan's viewpoint expressed to Bogorm here.) None of the Germanic languages nor the descendants of Old Church Slavonic are attested from an early enough date to be comparable to these other branches. Nor do these later languages add to the usefulness of the entry. They can be added to the appropriate Appendix page on the PIE root. --EncycloPetey 03:55, 22 February 2009 (UTC)[reply]

There is no hard and fast rule that anyone's been following on this, but there is a worthwhile discussion here. Generally, I think it's not a good idea to list a dozen cognates. Doing so doesn't really help the article too much, and it clutters things. I typically prefer to limit an etymology to between three and five choice cognates. My process in assigning priorities on which languages to list are well exemplified in ἱδρώς (hidrṓs) and ἰδίω (idíō). ἰδίω (idíō) has my typical listing, which is Latin, Sanskrit, and Old English (Armenian is not there, but that is also a language I like to see on Ancient Greek entries). Latin is given priority because it is, like Ancient Greek, a classical language. Additionally, the two are rather conservative languages and so common descent is often more easy to see than with other languages. I like to have Armenian because of the Graeco-Armenian hypothesis. I like to have Sanskrit because it, like Latin, has early attestation and is a classical language, and because of the Graeco-Aryan hypothesis. I like to have Old English (and English, the only modern language which I like to see in the etymologies of ancient words) because, well, this is the English Wiktionary, and we rightly give English a privileged position. However, I often ignore my general language preference when a few languages have a specifically close reflex, as is seen on ἱδρώς (hidrṓs), which has only one of my normally preferred languages, as only these languages have a word specifically from *swidrōs. I do think it entirely redundant to list multiple cognates from the same family, so listing an Old Norse and an Old High German word in the same etymology should not be done, except for an Old English or Gothic word. Also, I'd like to take a moment to poke fun at EP for listing Aramaic as an Indo-European language. :-P But yes, in short, Bogorm's listings are not appropriate in these cases. -Atelaes λάλει ἐμοί 06:19, 22 February 2009 (UTC)[reply]

I agree with Atelaes: Danish, German, Russian etc. have no place being listed whenever there is etymon or an older cognate of the same IE branch already listed. In case of Germanic, the præferable would be to list Gothic (as the oldest-attested and the most archaic) and Old English/New English (when it's present). In case of the latter, I have no doubts that the German and Danish/Norwegian/Swedish/Icelandic/Faroese wiktionaries would give precedence to OHG and Old Norse, respectively, in lieu of Anglo-Saxon. --Ivan Štambuk 09:12, 22 February 2009 (UTC)[reply]

And what about Old Church Slavonic (completely coeval to Old Norse)? Are you willing to sacrifice it in etymologies in order to please to the Old English-supremacy opinion? Bogorm 10:06, 22 February 2009 (UTC)[reply]

(To Atelaes) According to you, one is allowed to list all cognates in languages which are connected on the basis of some hypotheses which you obviously favour. Well, in that case I feel obliged to respond that you incidentally or not omitted the Gothic-Old Norse hypothesis and since there is no Gothic cognate for specio, how would you defend the light-minded dismissing of the Old Norse word? (especially given the fact that Old Norse is far better attested due to multiple sǫgur, whereas for Gothic the main source is Ulfila's translation) Giving præference only to those hypotheses in whose languages you are interested, does not seem convincing, therefore please respect the interests of other contributors. I surmise that your interest in Old Norse and Gothic is comparable to mine in Ancient Greek and Old English, therefore let us respect each other instead of promoting the own and belittling the collocutor's one, ok? (To Ivan Štambuk) Your reluctance to reaffirm your juxtaposition of Old Norse, Avestan, OHG, Gothic, Latin and Ancient Greek is as disappointing as untenable.
Well, some may say: Latin was spoken before the 5th century, while ON and OHG emerge as written in the 7th-8th centuries (for these two written periods=spoken periods at least for the terminus post quem non; no one created ON literature after 1350, but switched to his native Danish/Icelandic asf, same for OHG->MHG), thence they must not be regarded as æqual. To disprove this inanity, I would like just to remind the venerable etymologists of the fact that Avestan was spoken (spoken! if we are discussing written languages, then Latin until 18th century, ergo æqual to ON, OHG) before the 5th century BC and at the time of Zarathustra it was verging extinction (as spoken), whereas Latin as a spoken language emerged in the 1st century BC (before that - Old Latin, same relation as Ancient Greek - Modern Greek) and if they are as fierce enemies of double standard as I am, they must expurgate all Latin entries from all Avestan cognates and vice versa. Are you willing to go that far or not? Bogorm 09:36, 22 February 2009 (UTC)[reply]

Bogorm don't be inane, it's doubtless that all IE languages descend from the same mother language (which is not so obvious and generally-accepted in case of Gotho-Norse theory): it's the reconstructions themselves that are hypothetical. Please don't confuse those two particular applications of the attribute hypothetical.

Also, the date of archaicity of a language is not equal to the oldness of its attestation. The rate of language change is conditioned by various external socio-cultural factors. Just look at the Lithuanian - not attested until the 16th century, but living dinosaur in phonology. Same is valid for Sanskrit/Avestan, and Old Anatolian (Hittite, Luwian - attested even earlier) - they're not necessarily more archaic than e.g. millennium later attested Latin or Ancient Greek cognate. They have their own other merits tho, by which they are very important..

The point is that nothing is gained by ignoring Old Norse and OHG when we have 2 other Germanic-branch cognates listed (OE and Gothic) in non-Germanic entries' etymologies. If you click on the listed OE and Gothic cognates you can in their own etymology section find out OHG, Old Norse, Old Saxon etc. cognates. --Ivan Štambuk 10:15, 22 February 2009 (UTC)[reply]

Even if I concurred with you that OE is the best Germanic language for cognates in non-Germanic entries (which I categorically refuse to do, since this is not the Old English Wiktionary, but the English one) then how would you explain that EncycloPetey erased the Old Norse cognate from specio, although it was the only Germanic one there? Old Norse incomfortable? Old Norse not enough archaic (to claim which would be an ineptitude, especially since its descendant Icelandic is one of the most conservative Indo-European languages at all, probably more than Lithuanian)? I have never quæstioned the IE appurtenance of all hitherto discussed languages, just that the IE appendices are based on artificially created words for PIE and therefore the best solution would be to link all ancient languages between them, they are not so many after all and even if they are a dozen, as Atelaes' expression was - Latin, AG, OE, Old Norse, OHG, Gothic, OCS, Avestan, Sanskrit, Hittite, Old Irish from the modern ones Armenian (Lithuanian ?) = 12/13, there is virtually only a handful of cases where all 12 are attested, in most cases there are 5-6, of which to expurgate the half seems far-fetched. Bogorm 10:32, 22 February 2009 (UTC)[reply]

As you say, its descendant language Icelandic is conservative, but only conservative to the time of the Old Norse sagas (circa the 10th c.), not to the age of Classical Latin or Old Latin. Old Norse is not so highly conserved during its period. Although the Old Norse language is believed to have been spoken since the 7th/8th century AD, the earliest vernacular writings in that language date from two to three centuries later. It is thus contemporary with Later Latin and Medieval Latin, not with Classical Latin, and is a woefully inappropriate language for comparison with Latin, as it does not at all overlap with the Classical Latin period. By the time it is being written, it is already full of Latin borrowings, which can most easily be demonstrated by citing some of the "Old Norse" personal names found in works like the Landnámabók, which describes the Norse settlement of Iceland (in the 9th/10th c). The names demonstrate that many Biblical and Christian names were already in use: Agata, Agnes, Aron, Ádám, Benedikt, Dávið, Grégóríús, Jakob, Jóhannes, Kristófórus, Matheus, ... and the list continues. This is clearly a language that already had substantial contact with Medieval Latin by the time it was being written. --EncycloPetey 15:44, 23 February 2009 (UTC)[reply]

After I discovered that Atelaes also encouraged unambiguously the addition of Old Norse cognates in Latin entries, as did Ivan (at least in November 2008), I think that EncycloPetey must feel in minority now. I pray some of the two admins to restore the Old Norse cognate to specio 1) because it is the only Germanic cognate at hand in this case (perchance there is OE one, but other editors may research for it, if interested) and 2) because of the juxtaposition of Latin and Old Norse which is unambiguous in the comments of both of them. Bogorm 11:23, 22 February 2009 (UTC)[reply]

As usual, you have seen all the trees and yet missed the forest entirely. The point is this: You have persisted in adding cognates in a manner which has been oft noted to be inappropriate. EncycloPetey had to catch you at it and slap your hand before you acted in the responsible way (the way explained numerous times to you beforehand, the way which you should have sense to do without scolding by now). EP is then so irritated by your foolishness that he simply refuses outright your additions to that entry. So, you now try to make him out to be the villain. I expect such behaviour from small children, but not from dictionary editors. Old Norse cognates are of course welcome in Latin etymologies, but Danish, German, and Serbo-Croatian are not. This discussion is not about one word (discussions on this page quite rarely are); rather, it is about a general practice, a practice which you continue, despite the clear consensus of disagreement from other, more knowledgeable editors. -Atelaes λάλει ἐμοί 02:20, 23 February 2009 (UTC)[reply]

I disagree about the usefulness of Old Norse cognates in Latin entries, for reasons I have described above. In short: the earliest written Old Norse records are contemporary with the end of the Late Latin period and with Medieval Latin, and demonstrate the language is already contaminated by Latin borrowings. However, thank you for summarizing the edit situation. I have been very sick for the past few days, which is why I have not been editing much during that time. --EncycloPetey 15:47, 23 February 2009 (UTC)[reply]

(@Atelaes) You again missed the point. In the preceding section I explicitly declared that I have no intention to discuss modern cognates, but the addition of ancient ones, and yet you again mention that. Bogorm 09:23, 24 February 2009 (UTC)[reply]

Wikimania 2009

English: Wikimania 2009, this year's global event devoted to Wikimedia projects around the globe, is accepting submissions for presentations, workshops, panels, posters, open space discussions, and artistic works related to the Wikimedia projects or free content topics in general. The conference will be held from August 26-28 in Buenos Aires, Argentina. For more information, check the official Call for Participation. Cbrown1023 talk 18:12, 22 February 2009 (UTC)[reply]

English: Please translate this message into your language. Cbrown1023 talk 18:12, 22 February 2009 (UTC)[reply]

Template:Commonwealth

[Discussion copied from WT:TR#vapourise —Michael Z. 2009-02-22 20:55 z]

Is vapourise a misspelling in commonwealth areas? RJFJR 17:46, 12 February 2009 (UTC)[reply]

Yes. Pingku 17:56, 12 February 2009 (UTC)[reply]

Also vapourize. Please don't use the phrase “Commonwealth spelling,” because there is no such thing. Canada has its own preferred and acceptable spellings. —Michael Z.

What would you suggest as an alternative? -- Visviva 04:25, 14 February 2009 (UTC)[reply]

British for terms not used in Canada, British, Canadian for terms which are.

The Commonwealth of Nations is a political organization whose membership crosses linguistic lines. Most of its members inherited their English vocabulary and spelling from Britain. The exception is Canada, whose English is closely related to the other main branch, American English, but also influenced by British English and Canadian French.

There's no need, and no economical way, to specify that aluminium is the primary spelling in the UK, Ireland, Australia, India, South Africa, and all the other countries of the Commonwealth except Canada. It's just British English. Specific regionalisms like Australian tucker should be labelled according to their native geography, of course.

I haven't seen any dictionaries which apply the strictly political labels “UK” or “Commonwealth” for varieties of English. —Michael Z. 2009-02-22 16:52 z

Ireland ... and all the other countries of the Commonwealth. Please be more cautious with this issue - the Republic of Éire has not been a member for 60 years and is a completely independent country. The Republic of South Africa was not a member either from Verwoerd's time until de Klerk came to power. But Éire is definitely not going to rejoin (here all other former members which are not as influential as Éire in matters of regional varieties of English). I personally favour the designation Commonwealth, because in this case we can settle for only 4 templates: UK/British, Ireland, Commonwealth and US (and Canada, although part of the Commonwealth) instead of dozens. Bogorm 17:50, 22 February 2009 (UTC)[reply]

Oops, sorry. I routinely assume that it's a convenient synonym for “former British Empire” or “British colonies”, but of course it's not.

So should we review all occurrences of the template since Nauru became a “member in arrears?” I notice we didn't use “Commonwealth and Pakistan” during 1999–2004 and 2007–08 when that country's membership was suspended, but I guess we shouldn't worry now that it is reinstated. Fiji, Nigeria, and a few other countries' memberships have changed in the last decade. Mozambique is not a British colony at all, and its official language is Portuguese.

By the way, we need way more than four templates to account for members of Category:Regional English, and there's no reason to minimize the number.

Commonwealth membership in good standing has no clearly-defined relationship to the language used in a country, any more than membership in the United Nations, Nato, Nafta, or GATT. Another reason to use geographic labels and avoid purely political ones. —Michael Z. 2009-02-22 20:04 z

[end of copied discussion]

I propose that we stop using the context label Template:Commonwealth. As far as I know, there is no precedent in dictionaries, and the political organization Commonwealth of Nations does not correspond to any linguistic subdivision of English. Cf. w:English in the Commonwealth of Nations, w:List of members of the Commonwealth of Nations.

The template only occurs in 54 articles. —Michael Z. 2009-02-22 20:57 z

I cleaned up a bunch which were incorrect or unclear. It now remains in 33 articles. —Michael Z. 2009-02-23 06:13 z

I'm sorry Michael, but on this you are quite wrong. Your edit to metre for example, changing Commonwealth to UK, Canada is incorrect. In Kenya, part of the Commonwealth, as in the other parts, we use metre, and we are not part of the UK (or Canada, last I looked). If you are going to remove "Commonwealth", you will kindly replace it with the names of all 53 members? (It might be a bit much). Please restore it to the entries from which you have removed it. Robert Ullmann 09:26, 23 February 2009 (UTC)[reply]

Actually, I entered “British” (unfortunately, the text of that label incorrectly displays “UK”, but that's another issue to deal with). Isn't British English, or a close variation used in Kenya? If not, then perhaps we should add Kenyan English. Isn't metre used in Ireland? – then we should also add Republic of Ireland, because that's not part of the Commonwealth. If we follow this logic to its conclusion, then we end up with entries displaying (Commonwealth except Canada, Republic of Ireland, Madagascar, Rwanda, Sudan, Zimbabwe). Does every single one of 53 members use British English (Canada does not)? Algeria, Madagascar, Rwanda, Sudan, and Yemen have applied for Commonwealth membership – should we add their names to the context tag now, or will we have to add them as exceptions if they join? Who is going to track membership in this organization?

There is no such language as “Commonwealth English.” Do any other dictionaries use “Commonwealth” as a regional label? Do any use “UK?” What constitutes the Commonwealth of Nations changes every few years, and neither of these political entities corresponds to a discrete linguistic region. —Michael Z. 2009-02-23 14:28 z

Template:Commonwealth English

I just discovered {{Commonwealth English}} and its redirect {{CE}}. Any objection to redirecting these to {{Commonwealth}}, since they are identical? The title “Commonwealth English” implies that this represents a discrete English dialect or orthography. —Michael Z. 2009-02-23 20:38 z

Done. —Michael Z. 2009-02-26 20:06 z

Preterite tense vs. past tense

I notice at Template talk:conjugation of that either past or preterite may be specified. What is the difference? If there is a difference, that will have consequences at least for the Norwegian entries since in Norwegian, we call past tense preteritum, but as we classify word forms on Wiktionary, past tense is the term we go by. __meco 14:09, 23 February 2009 (UTC)[reply]

In English, there is only a "past" tense, not a tense referred to as "preterite". The term "preterite" tends to be used in English only when discussing a language that has more than one way to construct the past tense. However, the usual "past" terms are "perfect", "imperfect", depending on the point of completion of the action.

In Romance languages, the "preterite" is a "perfect" tense for actions completed in the past, rather than on-going. However, the native term in the various Romance languages varies (French passé simple, Italian passato remoto, Spanish pretérito). The distinction is made in these languages between the "preterite" (English term) and an imperfect past tense that describes repeated or habitual action in the past. English expresses the concept of imperfect verbs with a compound verb (more than one word) rather than a simple tense, so in English we simply call our tense the "past", even though it is "preterite" in meaning.

In German, the tense called "preterite" is an imperfect tense, not comparable to the "preterite" of Romance languages. I don't know about Swedish, but Plattdeutsch still has two past tenses (preterite/imperfect and perfect), and so would need to use a more specific term than "past". If there is only one Swedish past tense, then "past" would be the best choice, but if there are two then I recommend using "perfect" and "imperfect", since "preterite" can be misleading in referring to either one, depending on context. --EncycloPetey 16:01, 23 February 2009 (UTC)[reply]

Swedish has previously used imperfect, but it is now recommended to use preteritum (preterite) instead. There is additionally the perfect tense, corresponding to English have + <past participle>. \Mike 11:15, 24 February 2009 (UTC)[reply]

Swedish and Norwegian appear similar as preteritum (“preterite”) is the term which a few decades ago (when I began school in the early 70s my impression is that this changeover had just been effected) was called imperfektum (“imperfect past”). Then we have the perfektum (“perfect”) (=have gone) and the pluskvamperfektum (“past perfect, pluperfect tense”) (=had gone). __meco 17:00, 24 February 2009 (UTC)[reply]

The Cambridge Grammar of the English Language holds that English has two past tenses: the preterite, and the perfect. The preterite is the "simple past", generally formed by adding -ed. It is a "primary verb form". The perfect, often called an aspect, but in the CGEL a tense, is the periphrastic past tense formed by combining the auxiliary verb have with the past participle. It is a "secondary verb form".--Brett 16:50, 23 February 2009 (UTC)[reply]

So, in order to drag a usable answer out of anyone ;), I intend to start using "preterite" for the Swedish tense. Will anyone object to that? \Mike 13:55, 2 March 2009 (UTC)[reply]

I won't object to it, but I recommend linking something somewhere that explains what "preterite" means in the context of Swedish entries, since it doesn't mean the same thing as the "preterite" in Romance languages. --EncycloPetey 14:33, 6 March 2009 (UTC)[reply]

Regional English context

Further to #Template:Commonwealth above, I surveyed regional English labels in a few dictionaries at hand. Here are the labels I found in use. [Please expand the list.]

CanOD: Austral., Brit., Cdn, Irish, N Amer., NZ, Scot., US
NOAD (Mac OS X): Austral., Austral./NZ, Brit., Canadian, Indian
M–W online[11]: Australian, Australian & New Zealand, British, Irish, Scottish
Dictionary.com[12]: Anglo-Indian, Australian, Australia and New Zealand, British, British Dialect, Canada, Irish English, Scot, Southern U.S.
AHD (Dictionary.com): Australian & New Zealand, British, Irish, Scots
Webster's Revised Unabridged (Dictionary.com): [practically no indication of regional English]

Observations:

Wiktionary covers a wider variety of English dialects than these dictionaries.
Dictionaries have three different kinds of geographic labels:
1. Regional language: these are usually adjectives like “British,” identifying the regional English dialect the term is used in. I don't see the political entities Commonwealth or UK in use to identify variety of English.
2. Geographic context: less common, these are usually in the form “in England”, indicating where the sense applies (e.g., SAS is used worldwide, to refer to the Special Air Service which is in England, or to the Scandinavian Airlines System in Scandinavia). Some dictionaries make this a separately-formatted label, others include it in the prose of a definition. [not included in the list above]
3. Etymological: indicating in which English dialect the term originated. [not included in the list above]

—Michael Z. 2009-02-24 06:44 z

Regional labels in dictionaries are also treated in the following (anyone have access to the papers?):

Usage labels in Merriam–Webster.
Juhani Norri, “Regional Labels in Some British and American Dictionaries,” in International Journal of Lexicography, v 9 (1996), n 1, pp 1-29.
“Forum: Dialect Labeling in Dictionaries,” [several articles] in Dictionaries, v 18 (1997). index
Collation of language names, OED Online: full list, including varieties of English.

Inconsistencies among administrators, necessity for policy

I would rather refrain here from ornate speech, so I would go in res medias by enumerating disturbing facts connected with administrators and asking for a solution from them without trying to impose my own opinion:

first incoherence: here EncycloPetey overtly discourages Old Norse and Old High German cognates in etymology sections of Latin entries. Here Atelaes encourages (or at least allows) the use of Old Norse cognate in Latin etymology. Here: Old Norse cognates are of course welcome in Latin etymologies.
second incoherence: Ivan permits cognates from the following set of languages in the etymology entries of each of them: OHG, Old Norse, Latin, Old Norse, Ancient Greek, Gothic, Sanskrit, Old Church Slavonic, Avestan, OE. Here however, he permits two cognates - Gothic and OE (above section), from the Germanic branch(the præferable would be to list Gothic (as the oldest-attested and the most archaic) and Old English/New English (when it's present), bold by me), to be included in non-Germanic etymologies. When I added the Gothic next to OE according to that, Atelaes reverted and blocked me (imposition of opinion through administrative power) Atelaes refuses to accept more than one Germanic cognate and overtly expresses following præference: if no OE cognate then Gothic, if no Gothic, then ON (Ivan treats Gothic and OE æqually !). Cf. the diff from the first incoherence, where he unconditionally permits ON
third incoherence: Atelaes explains: If the Gothic was not attested, then the Old Norse would be quite welcome on the entry - exactly that is the case in specio, where the ON was the only cognate, but EncycloPetey removes the ON cognate and protects the entry.
fourth incoherence: here Atelaes encourages the addition of Old Norse to Strabismus on his talk page, but on mine, as already quoted, I am permitted to add an Old Norse cognate only if there is no OE and no Gothic one. No such restraint for Strabismus. Additionally, after blocking me without giving any reason, he writes: It simply makes the entry's etymology a bit less focused and trim. If anyone else on this project had made that edit, I probably would not have reverted it, and certainly would not have blocked them. Why does the centuries younger Old Norse not make the etymology less focused, when Strabismus adds it, but the Gothic does, when I add it? Atelaes continues: I have opened up that discussion with you too many times - yes, but he is not the only person who has opened a discussion on my talk page, Ivan did also open such one and said: Old Norse, Avestan, Sanskrit, Latin, Ancient Greek, OCS, OHG, Gothic etc. go in one list. I am as sick and tired as scared as a consequence of this being torn apart between the opinions of the two administrators/between the opinion of the one of them expressed to Strabismus and the opinion expressed on my talk page.
As a consequence of the second glaring inconsistency, when I added the Gothic cognate next to the OE one according to Ivan's advice, I was blocked by Atelaes. I demand an apology for this one-sided and reckless imposition of own opinion through administrative power on a user whose edit was in full compliance with what another admin told him.

Feeling the impeding and perplexing effect these four incoherences between three of the administrators here have on the average ordinary user, I demand from them to make a coherent single solution on all the discrepancies just disclosed here and thereby surmount them. Furthermore I demand explanation why Atelaes would allow from one editor to add Gothic, but from me not? Why is this double standard possible? As a conclusion, I would ask if there is a way for Atelaes to be restrained from imposing anew his one-sided views on me through blocking. I feel intimidated and one more reason for that is the following: In this case, I have to agree with Bogorm instead of I agree with Bogorm - this discloses assuming bad faith - agreeing with Bogorm is always an exception, disagreeing is the rule; the following: But yes, in short, Bogorm's listings are not appropriate in most cases. (I would rather not comment, but for me this is argumentum ad hominem) and the following Rest assured that I have driven away rather more useful contributors than you who were unable to work with consensus. - implicit invitation to retire from wiktionary and implicit claim that I do not want consensus. Solely for the consensus' sake did I open the current section and I expect from the three administrators and other interested a consensus as soon as possible, otherwise I shall be blocked, when complying with the advice of one administrator, by the other, with whose advice his is in dissonance. Bogorm 09:21, 24 February 2009 (UTC)[reply]
As far as Germanic cognates in ancient languages etymologies' sections are concerned, I stand by what I said: Gothic should always be welcome as it is by far the most archaic and earliest attested Germanic language. Anglo-Saxon would be allowed too since it's ancestral to New English. If there is no Gothic/AS cognate (very doubtful IMHO) but there is ON, ON is welcome. With that in mind, I wouldn't really agree with Atelaes' revert above, but I guess he was just pissed off by earlier actions of yours :P You're needlessly hyperbolising this whole issue which started as you deliberately adding modern cognates to ancient languages despite the fact that you've been warned multiple times of not doing so. It's prob. high time to fixate the common practice by policy in order to avoid further misinterpretations and confusions. --Ivan Štambuk 10:00, 24 February 2009 (UTC)[reply]

I repented for the modern ones, but we are not discussing them here, please, not here, lest the section become impossible to follow. None of the two reversions involves modern cognates. If Atelaes had blocked me for adding modern cognates, I would not have shown indignation, but in his block explanation he disclosed (on my talk page) that he blocks me for adding Gothic. What can we endeavour in the etymology of θρῆνος, if you always support Gothic cognate? I feel angst for this entry after what was unleashed and therefore I would rather not edit it henceforth. Bogorm 10:05, 24 February 2009 (UTC)[reply]

OK, enough with this. Bogorm and anyone else, you can comment on the relevant WT:ETY changes here. --Ivan Štambuk 10:55, 24 February 2009 (UTC)[reply]

Sense only in plural

I'm working on an entry (not yet ready to commit the edit as I need to check some translations first to make certain of them) in which some senses are found only in the plural. Should I place the plural-only meanings at the singular lemma with a usage note, or as additional definitions in the plural form after the line where I use {{plural of}} to indicate the lemma? Carolina wren 20:32, 24 February 2009 (UTC)[reply]

pants doesn't mention things on pant. And {{pluralonly}} may be useful. Conrad.Irwin 21:14, 24 February 2009 (UTC)[reply]

Exporting flashcards, and importing open source dictionaries

I'd really like to be able to export Wiktionary entries as either electronic flashcards, or printable flashcards.

For entries that support it, each card could have the keyword, the part of speech, pronunciation, and definition. Ideally, users could select which elements appear on which side of the cards, and in the digital realm, there is no reason not to have three-sided, or even many-sided cards.
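A minimal sketch of the many-sided card idea described above, written in Python. Nothing here is an existing Wiktionary feature: the class name, the field names and the `layout` argument are all assumptions made for illustration, and the sample data is a placeholder rather than a copy of any real entry.

```python
from dataclasses import dataclass


@dataclass
class Flashcard:
    """One card built from a dictionary entry (all field names are hypothetical)."""
    keyword: str
    part_of_speech: str = ""
    pronunciation: str = ""
    definition: str = ""

    def sides(self, layout):
        """Render one string per side.

        `layout` is a list of sides; each side is a list of field names.
        Empty fields are skipped, so an entry without a pronunciation
        still produces a usable card.
        """
        rendered = []
        for side in layout:
            parts = [getattr(self, name) for name in side]
            rendered.append(" | ".join(p for p in parts if p))
        return rendered


# Placeholder data for illustration only.
card = Flashcard(keyword="дэлэнч", part_of_speech="noun", definition="mosquito")

# A three-sided card: the user decides which elements go on which side.
three_sided = card.sides([["keyword"], ["part_of_speech", "pronunciation"], ["definition"]])
```

Keeping the layout separate from the data means the same set of cards could be printed two-sided or shown many-sided on screen without touching the entries themselves.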

Because of the existing structure between Wiktionaries for different languages, this would also make a great foreign language study resource. Below are links to an example set using the Mongolian, English and Turkish words for "mosquito."

Mongolian-English http://en.wiktionary.org/wiki/%D0%B4%D1%8D%D0%BB%D1%8D%D0%BD%D1%87

Turkish-Mongolian http://tr.wiktionary.org/wiki/%D0%B4%D1%8D%D0%BB%D1%8D%D0%BD%D1%87

The hex characters in the URL translate to дэлэнч, the Mongolian word for mosquito. The first URL above is equivalent to http://en.wiktionary.org/wiki/дэлэнч and the second is equivalent to http://tr.wiktionary.org/wiki/дэлэнч. This would make it very easy to create sets of flashcards from sets of words. All that would be needed is a method of exporting the resulting list.
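The correspondence between the readable and the hex forms of the URL can be checked with a few lines of Python. This is only a sketch: `entry_url` is a hypothetical helper name, and it assumes the URL pattern shown in the links above plus the usual MediaWiki habit of writing spaces in titles as underscores.

```python
from urllib.parse import quote, unquote


def entry_url(term, wiki="en"):
    """Build the entry URL for `term` on the given language's Wiktionary.

    `wiki` is the language subdomain ("en", "tr", ...). The term is
    percent-encoded as UTF-8, which is what produces the hex characters.
    """
    title = term.replace(" ", "_")
    return f"http://{wiki}.wiktionary.org/wiki/{quote(title)}"


encoded = entry_url("дэлэнч")             # English Wiktionary entry
turkish = entry_url("дэлэнч", wiki="tr")  # same term, Turkish Wiktionary
decoded = unquote("%D0%B4%D1%8D%D0%BB%D1%8D%D0%BD%D1%87")
```

Looping a word list through a helper like this would give the set of URLs for a deck; only the subdomain changes between language editions.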

The low-tech work-around here is to use an HTML parser, or even a screen scraper, that pulls up the URL for each term on the list in the desired Wiktionary, and copies the source term and its corresponding information out to a file for export. This places a greater load on the organization's resources, so if for that reason alone, it makes sense to build this functionality into Wiktionary.

I'm interested in writing a web app to display flashcards and go through exercises, but I'm not able to get into extending Wiktionary right now. Anyone else interested in helping with this?

On a related note, it would be useful to have support for uploading from open source dictionaries that first looks for a correspondence between keywords and Wikipedia to find ontological references as categories in each new Wiktionary entry. This would allow the flashcard exporter to use classes of words, e.g. "Countries in Africa" <http://en.wikipedia.org/wiki/Countries_in_africa> or "Kinship Terms" <http://en.wikipedia.org/wiki/Kinship_terms#Abbreviations_for_genealogical_relationships> or "Counties in Washington" <http://en.wikipedia.org/wiki/Counties_in_Washington>. This style of exporting could support both lists of source terms for more specific output, and wholesale exports of the whole category.

Again, this really should be built as a feature of the import engine, rather than a crude work-around that loads the system.

I'm very interested in everyone's input here. Thanks for your time, and hopefully, your help.

What is the output format you're looking for? GIFT "matching question" format, or something XML-based?

In terms of having it built into the export engine, I wouldn't hold my breath. :-/

In terms of getting the data, I suppose the first question is whether you want to use data from multiple Wiktionaries (which may be more difficult than it seems), or data from only one Wiktionary (translations and/or entries). In either case, using the API (e.g. [13]) would be somewhat easier on the server and also easier to work with. For EN Wiktionary specifically, we also have daily XML dumps ([14]). -- Visviva 05:45, 28 February 2009 (UTC)[reply]
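As a rough illustration of the API route mentioned above, the following Python sketch only builds the request URL for one entry's wikitext and does not send it. The helper name `api_query_url` is made up for this example, and the parameter set (`action=query`, `prop=revisions`, `rvprop=content`, `rvslots=main`, `format=json`) reflects my understanding of the MediaWiki action API, so it should be checked against the API documentation before being relied on.

```python
from urllib.parse import urlencode


def api_query_url(title, wiki="en"):
    """Return an API URL asking for the current wikitext of `title`.

    Hypothetical helper: `wiki` is the language subdomain, and the
    parameters below are assumptions about the MediaWiki action API.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "format": "json",
        "titles": title,
    }
    return f"https://{wiki}.wiktionary.org/w/api.php?" + urlencode(params)


url = api_query_url("дэлэнч")
```

Fetching this URL returns structured data instead of a rendered page, which is why it should be lighter on the servers than scraping; the wikitext would still need to be parsed to pull out the part of speech, pronunciation and definition.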

February 2009

Literary Chinese

Just a heads-up that I’ve created a new language template, Template:lzh, for the recently added ISO code for Literary Chinese; I’ve also made associated Category:ja:Literary Chinese derivations, Category:cmn:Literary Chinese derivations, Category:Literary Chinese derivations, etc.

While being cognizant of the distinction between Literary Chinese and Classical Chinese (i.e., post-Han to 20th century vs. until fall of Han), as in Late Latin/Classical Latin, this seems a useful language code to use, esp. for Chinese proverbs of classical provenance; this arose in giving a proper etymology to the Japanese proverb 井の中の蛙大海を知らず (a frog in a well cannot conceive of the ocean), from Zhuangzi.

Hope I’ve created this correctly and that it is of use.

—Nils von Barth (nbarth) (talk) 23:07, 21 February 2009 (UTC)[reply]

Just to clarify the way I've been doing it so far, I've been using the terms "literary" and "archaic" more in the western sense. When I put literary beside a definition for a Chinese word, I mean that the word in question is most likely to be encountered in a literary work. By literary work, I mean anything ancient or modern that the average reader would comprehend upon reading, but would not necessarily use when speaking. A simple example would be 明日 and 明天. Both words mean tomorrow. Neither of these words is considered to be Literary Chinese (Classical Chinese), however 明日 is generally only used in writing, whereas 明天 is used in spoken Chinese. Therefore, I would mark 明日 as "literary" (again, not that it is classical Chinese, but that it is a written rather than spoken word). By contrast, when I mark a word as "archaic," I mean that the word is no longer commonly used in modern spoken or written Chinese (an English example would be "thou"). It is important to note that "archaic" does not necessarily equal "classical." Some classical Chinese is preserved in modern spoken Chinese, and is therefore not archaic. This is particularly true in the case of idioms and proverbs. So for example, 太僕 (Minister Coachman) is an "archaic" term. However, 有朋自遠方來，不亦樂乎 ("Is it not delightful to have friends coming from distant quarters?") is a classical Chinese phrase dating back to the Analects of Confucius, but is well known to modern Chinese speakers. Hopefully, that explains my approach. Please let me know if you think it needs adjusting. For example, maybe I should use "written" instead of "literary" to avoid confusion. Thanks. -- A-cai 12:26, 22 February 2009 (UTC)[reply]

P.S. - In the Pinyin Chinese-English Dictionary, they use Chinese characters to make the same distinctions. So, for where I would use "literary," they would use <书>. Where I would use "archaic," they would use <旧>. -- A-cai 12:38, 22 February 2009 (UTC)[reply]

Hi A-cai – thanks for clarifying; that’s very helpful!

As I understand it, you are using {literary} and {archaic} as Usage context labels in the Definitions of Modern Chinese terms, which is fine, and these seem the correct terms.

I was suggesting using {etyl|lzh|cmn} in Etymology sections of terms or phrases that can be dated back to classical times, like 井底之蛙 (jǐngdǐzhīwā) or anything from the Analects.

If anything, perhaps we should change the displayed name of {lzh} from the ISO (but potentially confusing) “Literary Chinese” to “Classical Chinese” – but perhaps this would set us up for trouble in future, if ISO adds {azh} or such for Ancient/Classical Chinese, as distinct from Literary Chinese.

I don’t think you need make any changes to your current approach; if you’d like and find it useful, perhaps you could use {etyl|lzh|cmn} in Etymology sections of relevant entries.

I’ve added your comments about the distinction between the Literary Chinese language and the literary register of Modern Chinese at Literary Chinese#Usage notes, and made a brief start at a discussion of Chinese Etymology at Wiktionary:About Chinese#Etymology; not sure where we’d like to go with Chinese Etymologies.

Thanks again!

—Nils von Barth (nbarth) (talk) 14:41, 22 February 2009 (UTC)[reply]

Thanks for your discussion; however, I do not agree with the explanation above. 明日 was frequently used in Literary Chinese too, e.g. "明日復明日，明日何其多". In reality, Classical Chinese is very similar to Literary Chinese, and they should not have two different ISO codes. However, Old Chinese, which means the pronunciation and grammar of the Chinese language before the Zhou Dynasty, is completely different from Literary/Classical Chinese, thus it remains as a separate ISO code.--Itsmine 05:33, 26 March 2009 (UTC)[reply]

Allow me to rephrase. When I label a Wiktionary definition as "literary," I'm not referring to 文言文 (Literary/Classical Chinese). I'm merely referring to the fact that a word such as 明日 tends to be written, rather than spoken, in Modern Mandarin. However, you are correct in pointing out that 明日 can be found in "literary Chinese" (Classical Chinese). In fact, the vast majority of modern written Chinese words trace their origins to "Literary/Classical" Chinese. Some "literary/Classical" Chinese survives in Modern Chinese, while most has not. The words that have not survived into modern Chinese are labelled "archaic." I recognize that these distinctions can be difficult to pin down at times, and may be a bit subjective. As such, if you feel that I have labelled a specific word incorrectly, please post a note on the talk page for that entry, along with your reasoning. I am happy to revisit any of my entries, if need be. As for the ISO codes, I had never made an argument for or against them. Hope this clears things up. -- A-cai 22:28, 25 April 2009 (UTC)[reply]

Wiktionary:Beer parlour/2009/February

