Wiktionary talk:Translations

Archives

Earlier page: Wiktionary:Language considerations

Archive of discussion copied from Beer parlour - 22-May-2005, then modified.

When a discussion becomes inactive, move it if necessary and link to it below.

Following are forked discussions. When one becomes inactive, remove it from here and move just its link up above.

Wiktionary talk:Translations/Translation into lemma only

from Wiktionary:Beer parlour#Translation into lemma only Rod (A. Smith) 22:13, 3 November 2007 (UTC)

WT:ELE#Translation dos and don'ts says, “Do follow the translations of nouns and adjectives by their grammatical gender, if appropriate, using the templates {{m}}, {{f}}, {{n}} and {{c}} for "masculine", "feminine", "neuter" and "common" respectively.” I suspect some editors have interpreted that sentence as a request to provide more than just the lemma form of the adjective (e.g. here). In the languages I know of whose adjectives have gender (i.e. the Romance languages), only one form is considered the lemma, e.g. the masculine singular form. It only makes sense for the translation section to list the lemma form of such adjectives, except perhaps in rare cases where English actually has a gender-specific adjective (e.g. “blonde”) and that gender does not match the gender of the lemma in the target language.

So, should we change “the translations of nouns and adjectives by their grammatical gender” in WT:ELE#Translation dos and don'ts to “the translations of nouns by their grammatical gender” or perhaps “the translations of nouns and adjectives (to be given in lemma form only) by their grammatical gender”? Or perhaps should we just insert a line above it to clarify that we only want the lemma form of each translation? Rod (A. Smith) 21:07, 12 September 2007 (UTC)

Yes, please. I don't have a specific preference for how it should be clarified, but those giving-all-forms translators are putting us giving-only-lemmata translators to shame, and I won't stand for it! :-P —RuakhTALK 22:08, 12 September 2007 (UTC)
I can't tell if you are joking or not. --Connel MacKenzie 23:29, 12 September 2007 (UTC) Or rather, what part of the above is intended as a joke...all? --Connel MacKenzie 00:36, 13 September 2007 (UTC)
My point was serious, my wording was a joke. It looks wrong to give all forms of some translations and only lemma forms of others, but it's not feasible to give all forms of all translations. It also looks wrong to label the gender of an adjective, unless all of its forms are provided; after all, the adjective itself isn't (for example) masculine, it's just that we use the masculine form to represent it in the table. I'm not necessarily saying we should prohibit adding non-lemmata, but I'd like for ELE to be rephrased in a way that makes it clear that adjectives don't need to be labeled for gender. And if someone started removing all non-lemma forms from translation tables, I wouldn't be the one complaining. —RuakhTALK 05:11, 13 September 2007 (UTC)
I think it is very unwise to prohibit the other inflected forms. Within translation tables, that may be true for many languages, but it is not true for entries themselves. Being inconsistent in the TT instructions may cause unnecessary newbie confusion. That inconsistency could result in the return of contributors ripping out valuable translations of pronouns again (well, that hasn't happened in a long time, but could, with the above wording.) Dictating "lemma form only" is a Bad Thing, anywhere, and probably far too restrictive in the long run. --Connel MacKenzie 23:29, 12 September 2007 (UTC)
Connel, I'm confused by your comment, “Within translation tables, that may be true for many languages, but it is not true for entries themselves.” Above, I am only suggesting to limit translations within translation tables (in English entries) to lemma forms only. I was not suggesting to prohibit entries for the non-lemma forms. What are “the TT instructions” and what does this have to do with translation of pronouns? You really are trying to confuse me, aren't you? Rod (A. Smith) 00:25, 13 September 2007 (UTC)
TT = "translation table". I was saying that we do (as a multilingual dictionary) want entries for all forms of a word. So, on the entry pages themselves, we pretty much require that the full conjugations are spelled out, wikilinked and entered. To prohibit them in the TT seems wrong, or at least inconsistent. The notion that they would be prohibited for nouns would lead to confusion about pronouns (and that is the example, with Ncik, that sprung to mind.) --Connel MacKenzie 00:36, 13 September 2007 (UTC)
Connel, I think you're missing the point of what Rod is saying. I'll provide a specific example based on my understanding. In the translation table for the English word slow, what should be given as the Latin translation? Should it:
  1. just be the lemma form lentus
  2. or have all possible nominative singular forms lentus m, lenta f, lentum n
  3. or have all possible nominative forms lentus m, lenta f, lentum n, lentī m pl, lentae f pl, lenta n pl
  4. or (as you suggest) have all the 36 forms listed in the table below:
    First/second declension.
    Case \ Gender  Singular                         Plural
                   Masculine  Feminine   Neuter     Masculine  Feminine   Neuter
    nominative     lentus     lenta      lentum     lentī      lentae     lenta
    genitive       lentī      lentae     lentī      lentōrum   lentārum   lentōrum
    dative         lentō      lentae     lentō      lentīs     lentīs     lentīs
    accusative     lentum     lentam     lentum     lentōs     lentās     lenta
    ablative       lentō      lentā      lentō      lentīs     lentīs     lentīs
    vocative       lente      lenta      lentum     lentī      lentae     lenta
To date, my understanding has been that when we translate in a translation table, we give the link only to the lemma, and the lemma page of the translation will carry all the necessary grammatical information the reader will need. Such detailed grammatical information does not belong in a translation table, and certainly doesn't need to be duplicated in every translation table where a word appears. --EncycloPetey 16:05, 14 September 2007 (UTC)
I am consistently taking them out wherever I see them. I would greatly support such a clarification. But indeed, it has to be clear that this counts only for translation sections. H. (talk) 13:06, 13 September 2007 (UTC)
I think the gender markings should stay (for the lemmata), since it is a visual key that the word carries gender. --EncycloPetey 16:05, 14 September 2007 (UTC)
When a single English noun has a different translation in the target language when applied to a man and to a woman, both should be allowed in the translation table, because they may be considered as different nouns (this does not apply to adjectives...) Lmaltier 15:51, 15 September 2007 (UTC)
Good point. How about this?
Add these: (The following text was changed per suggestions from Lmaltier and EncycloPetey below. Rod (A. Smith) 23:07, 17 September 2007 (UTC))
  • For nouns, do translate only the English lemma form into the target language lemma form. If the target lemma is specific to one gender and the English noun is used for both genders or for a gender different from that of the target lemma, provide a translation for both genders.
  • For other parts of speech (e.g. adjectives, adverbs, verbs), do translate only the English lemma form into only the lemma form for the target language.
immediately before this existing instruction:
  • Do follow the translations of nouns and adjectives by their grammatical gender, if appropriate, using the templates {{m}}, {{f}}, {{n}} and {{c}} for "masculine", "feminine", "neuter" and "common" respectively.
How does that seem? Rod (A. Smith) 18:45, 15 September 2007 (UTC)
I just would add to the first part: If two different nouns exist in the target language depending on the gender (for the same English word), provide both. Lmaltier 06:51, 16 September 2007 (UTC)
I altered the suggestion above to incorporate your recommendation. Comments? Rod (A. Smith) 23:59, 16 September 2007 (UTC)
I would rearrange the wording as "For each noun, do..." and "For other parts of speech, do..." in order to make it immediately clear that the first sentence deals only with nouns. --EncycloPetey 15:44, 17 September 2007 (UTC)
Done. Rod (A. Smith) 23:07, 17 September 2007 (UTC)

Please help refine the language of Wiktionary:Votes/2007-10/Translation into lemma only. Rod (A. Smith) 02:03, 15 October 2007 (UTC)

Considering the TLDR problem of WT:ELE, and the fact that WT:ELE#Translations already links to Wiktionary:Translations, I have updated the latter, merging in everything from the former, updating the examples to match current practice, and reorganizing it a bit. I did this with the idea that readers may be best served by our trimming WT:ELE#Translations down to a brief paragraph. Readers who are further interested can click through to the detailed document. Comments? Rod (A. Smith) 21:01, 18 October 2007 (UTC)
I have withdrawn Wiktionary:Votes/2007-10/Translation into lemma only as suggested above. A discussion at Wiktionary talk:Votes/2007-10/Translation into lemma only#Wording questioned whether each gender specific version of animate nouns in gender-inflected languages should be considered a lemma (e.g. whether Spanish chica (girl) and chico (boy; child) are both lemmata). In support of the position that the underlying lexeme is gender-neutral, I collected references, which are now at Wiktionary talk:Translations#Animate noun gender inflection for posterity. If anyone would like to discuss that topic, Wiktionary talk:Translations#Animate noun gender inflection seems an appropriate place. My plan for lemma translations is to let Wiktionary:Votes/2007-10/Lemma entries conclude, then to propose trimming WT:ELE#Translations, modifying Wiktionary:Translations, and approving it as policy. Rod (A. Smith) 18:27, 19 October 2007 (UTC)

Wiktionary talk:Votes/2007-10/Translation into lemma only

Wording

"If the target lemma is specific to one gender and the English noun is used...for a gender different from that of the target lemma, provide a translation for both genders. "

So you'd translate girlfriend as novio? DAVilla 00:29, 16 October 2007 (UTC)

The intent is to translate good as just bueno m; friend as amigo m, amiga f; boy as chico m; and girl as chica f. That wasn't clear from the original wording, so I reworded it. Better? Rod (A. Smith) 07:14, 16 October 2007 (UTC)
Ruakh improved the text, and as it stands it covers the first two numbered situations below. Reviewing WT:BP#Translation into lemma only, I notice we're missing the request made by Lmaltier: If two different nouns exist in the target language depending on the gender (for the same English word), provide both. To cover that request, I think the wording needs to be expanded:
  1. For foreign nouns without grammatical gender, those whose grammatical gender is unrelated to the semantic gender, those whose form is gender invariant, and those whose lemma gender matches the only English semantic gender, only give the lemma translation:
  2. For foreign nouns with grammatical gender that varies with semantic gender and whose lemma form is different from the only semantic gender of the English noun, list the translation that varies from the lemma only by gender but link to the lemma entry:
  3. For foreign nouns whose grammatical gender varies with semantic gender and whose lemma form matches one of the two semantic genders of the English noun, only the lemma translation and the equivalent for the other semantic gender(s) should be given:
Is there a clear but brief way to specify the above? Should I delay the vote to allow wording refinement? Rod (A. Smith) 01:25, 17 October 2007 (UTC)
I don't much like the combination of numbers 2 and 3; either of them by itself makes sense, but taken together they seem somewhat contradictory in spirit. Also, by only talking about gender, the section as a whole makes number 1 seem pointless: obviously, if the foreign language has only one gender for a given noun, then that's what we translate to. I think I'd prefer something like this:
  • In translating nouns:
    • In many languages, nouns have different forms depending on number, case, definiteness, and/or other factors. Only one form, called the "lemma form", appears as headwords in dictionaries, and only this form should appear in translation tables in English noun entries. So, for example, the English translations at computer include Spanish computadora ("computer"), but not Spanish computadoras ("computers").
    • Where English has a single gender-neutral noun, many languages will have two nouns (or one noun with two forms, depending how you look at it), one for men/males and one for women/females. For example, English cousin corresponds to both Spanish primo ("[male] cousin") and Spanish prima ("[female] cousin"). In such cases, both the male/masculine lemma form and the female/feminine lemma form should be listed in the translation table.
    • Where English has a gender-specific noun, like boy or girl, the translations should correspond to the gender. In the event that the foreign-language entry is a soft redirect to an entry with a different gender (say, with a definition along the lines of "Feminine counterpart to noun."), then … ??
  • In translating adjectives:
    • In many languages, adjectives have different forms depending on number, gender, case, definiteness, and/or other factors. Only one form, called the "lemma form", appears as headwords in dictionaries, and only this form should appear in translation tables in English adjective entries. So, for example, the English translations at good include Spanish bueno (the masculine singular form), but not Spanish buena (the feminine singular).
I don't know what to do in the last noun case. My preference would be for something like "chica f, see chico", but I'm not sure how other editors would feel about that.
RuakhTALK 03:48, 17 October 2007 (UTC)
My biggest motivation in this is to stop editors from filling adjective translations with the non-lemma genders, so I won't put up a strong fight against such variations as above. My opinion, though, regarding non-lemma entries and "see" is that hyperlinks are a terse way to say "see the entry linked here for details". Assuming that Spanish chica is to be considered a non-lemma form of Spanish chico in the same way as the plural noun form words (ug, I just noticed that "words" has translations), all of the details to be found at "chica#Spanish" and "es:chica#Español" should also be at "chico#Spanish" and "es:chico#Español". It seems, then, that "chica f" is the terse and reader-friendly way to say "chica f, see chico". I guess this makes most sense in the context of the other ELE lemma discussion, i.e. acknowledging in WT:ELE that some words are to be given full lemma entries but others should be soft redirects. Rod (A. Smith) 17:59, 17 October 2007 (UTC)
Personally, I don't really agree that chica is just a non-lemma form of chico, but I'm O.K. with treating it as one for Wiktionary purposes. The problem is, your proposal doesn't do that; #2 in your proposal treats chica as a non-lemma form of chico, while #3 treats chico and chica as separate nouns. —RuakhTALK 18:40, 17 October 2007 (UTC)
I agree that the proposal as it now stands is unfinished. It's an inadequate attempt to incorporate Lmaltier's request to list m/f noun forms. I have delayed the start and end by a week so we can work out the kinks.
As for the lemma/non-lemma status of "chica", consider that a single member of "chicos" is either "chico" or "chica". Thus, all four forms ("-o", "-a", "os", "-as") are inflections of a single word. (Note that English plurals denote some semantics of quantity not denoted by the corresponding singular, but the semantics of quantity do not grant lemma status to plurals. Similarly, "chica" and similar Spanish feminine noun forms denote human gender without granting them lemma status.) Rod (A. Smith) 19:33, 17 October 2007 (UTC)
Chicos can mean "children" (generally) or "boys" (specifically); "chicos y chicas" gets 1.4 million Google hits, which wouldn't make sense if chicas were strictly a subset of chicos. However, like I said, I'm O.K. with taking chica as a form of chico for Wiktionary purposes, as I do see the argument for that approach; my concern is with the inconsistency, firstly because in general I dislike inconsistency, and secondly because I can only imagine that new editors find inconsistencies harder to learn. —RuakhTALK 21:46, 17 October 2007 (UTC)
(Strangely, the query you linked only gives me 289,000 hits. Did Google reindex after your search? Yes, "chicos y chicas" is common, but "chicos" by itself does not specify a gender, except to exclude a group of specifically all females.) I also prefer consistency, so I'll think about this, see whether more feedback comes, and attempt another revision. Rod (A. Smith) 22:24, 17 October 2007 (UTC)
Re: "Strangely, the query you linked only gives me 289,000 hits.": Even more strangely, I now clicked again, and it says there are 1.59 million hits! Dunno what's up with that. Regardless, it gets tons of hits, enough to make my point. :-P   —RuakhTALK 23:04, 17 October 2007 (UTC)

Some slight changes in wording I recommend:

  1. In the ===Lemma entries=== section, say "kinds of words", not types of words. There is an important and often overlooked difference between these words. A type is a theoretical model, not a specific instance (kind). I strongly prefer the word "kind".
  2. "...and words of other inflected languages". I think it would be better to say "and in some cases, words in other inflected languages are handled differently." with the phrase "other inflected languages" linked to Wiktionary:Language considerations.
  3. What does "optinally" mean? I assume this is a typo.

The only additional thing I don't see here is a clear explanation of what "lemma" means. I think rather than simply linking to the WT entry for that term, we need an explanation specific to ELE. --EncycloPetey 23:36, 25 October 2007 (UTC)

The above suggestions refer to Wiktionary:Votes/2007-10/Lemma entries. I have changed that text as recommended, linking to Wiktionary:Glossary#L for clarification of the lemma. I did slightly alter the suggested change. Let me know if the reword conflicts with your intent. Rod (A. Smith) 16:27, 26 October 2007 (UTC)

Animate noun gender inflection

This section moved to Wiktionary talk:Translations#Animate noun gender inflection. Rod (A. Smith) 18:11, 19 October 2007 (UTC)

What's the point?

Sorry, but it is entirely unclear what this addresses. The wording in it seems wrong, in that it prohibits glosses for non-lemma foreign language entries. But that aside, it is giving no new clarification that I can see, only adding verbosity to a policy that is already far too long. Am I missing something here? --Connel MacKenzie 17:07, 26 October 2007 (UTC)

Is the wrong vote linked from WT:VOTE? --Connel MacKenzie 17:08, 26 October 2007 (UTC)

I think the point is to have

==Translations==
*Romanian: produs
*Spanish: hecho

instead of

==Translations==
*Romanian: produs m, produsă f, produşi m pl, produse f pl
*Spanish: hecho m, hecha f, hechos m pl, hechas f pl

which seems to happen relatively frequently in translation tables. That help? — [ ric | opiaterein ] — 17:40, 26 October 2007 (UTC)
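The trimming described above can be illustrated with a short Python sketch. It is only a rough illustration, not an existing Wiktionary tool: the helper name `lemma_only` is hypothetical, and it assumes that the first form listed on each line is the lemma and that the tags to drop are `m`, `f`, `n`, `c` and `pl`, as in the example. It drops the gender tag from the lemma to match the first listing above; whether the tag should stay on the lemma is a separate question raised earlier in this discussion.

```python
# Hypothetical helper: reduce a translation line that lists every inflected
# form to the lemma-only line shown in the first example above.
GENDER_NUMBER_TAGS = {"m", "f", "n", "c", "pl"}  # assumed tag set


def lemma_only(line: str) -> str:
    """Keep only the first listed form of a '*Language: form tag, form tag' line.

    Assumption: the first listed form is the lemma.
    """
    language, _, forms = line.partition(":")
    words = forms.split(",")[0].split()
    # Strip trailing gender/number tags such as "m" or "m pl".
    while len(words) > 1 and words[-1] in GENDER_NUMBER_TAGS:
        words.pop()
    return f"{language}: {' '.join(words)}"


for entry in (
    "*Romanian: produs m, produsă f, produşi m pl, produse f pl",
    "*Spanish: hecho m, hecha f, hechos m pl, hechas f pl",
):
    print(lemma_only(entry))
```

For the two lines in the example, this prints `*Romanian: produs` and `*Spanish: hecho`.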

2010 update

This vote was withdrawn before it started. I'm not sure in such circumstances how honest it is to reopen it. It probably is. How much can we change the wording? Mglovesfun (talk) 11:49, 21 June 2010 (UTC)

Wiktionary talk:Translations/Noting lemma forms in WT:ELE

from WT:BP#Noting lemma forms in WT:ELE Rod (A. Smith) 22:20, 3 November 2007 (UTC)

With the recent update to WT:AJ, we documented our common practice of differentiating the full format used for lemma entries from the abbreviated format used for non-lemma entries. We should do something similar in WT:ELE. I think the following wording reflects how we currently select English lemmata and belongs in the ==Basics== section of WT:ELE:

===Lemma entries===
Each language may have its own traditional choice of lemma forms for various parts of speech. For English entries, the lemma form is usually the “bare” form: the singular form for nouns (e.g. word but not words), the bare infinitive form for verbs (e.g. talk but not talks, talking, or talked), and the positive form for adjectives (e.g. easy but not easier or easiest). With some types of words, an alternate form is the preferred lemma, e.g. the plural entry for a plurale tantum. For other types of words, there may be no distinction between lemma and non-lemma forms, e.g. for pronouns, articles, prepositions, and defective verbs like may (have permission to) that lack an infinitive form. With such terms, all forms are treated as lemmata. When the situation is unclear, editors are advised to use their best judgment on a case by case basis.
Following are guidelines for entries for the lemma form of terms. For non-lemma entries (e.g. for the plurals of most nouns), a more abbreviated format is used instead.

Is the above accurate? Is it too wordy for WT:ELE? Does it allow for enough or too much flexibility? Rod (A. Smith) 07:16, 24 September 2007 (UTC)

Sounds good to me. But ideally ELE would address the format for "form-of" entries as well. -- Visviva 07:56, 24 September 2007 (UTC)
I think it would be much better to use "base form" instead of the term "lemma." The above doesn't read very cleanly. But the very last phrase is concerning. AFAIK, we welcome clarifications on the form-of entries. (That is, tags and a gloss, examples, pronunciation, etc.) The wording you have above suggests those be removed. Again, AFAIK, only the ===Translations=== section is discouraged in the "form-of" entries (and only for noun forms and verb forms.) Is there a better sub-page for this? Entry layout explained is far too long as it is. It obviously fails the TLDR test for most newcomers. --Connel MacKenzie 08:21, 24 September 2007 (UTC)
I'm not sure about "base form", because that seems easily confused with "stem" or "root". "Citation form" or "canonical form" are pretty clear, but I'm not sure either is better than "lemma". Is there a precise layman synonym? Rod (A. Smith) 22:15, 9 October 2007 (UTC)
Good suggestions, Visviva and Connel. We seem to need Wiktionary:Entry in a nutshell for the TLDR problem and the full WT:ELE for the specifics, including the details for the non-lemma formats. My understanding is that non-lemma entries are not supposed to include the following:
  • Etymology (e.g., the entry for speaking should not show the Old English etymons spēcan or sprēcan)
  • Other non-lemma inflections (e.g., the entry for speaking should not show the inflection spoke on the headword line)
  • Detailed definitions (e.g., the entry for speaking should show that it is the present participle of speak, but should not have separate definitions “communicating with one's voice”, “having a conversation”, “communicating by some means other than orally”, “delivering a message to a group”, or “being able to communicate in a language”)
  • Synonyms, antonyms, or other -onyms
  • Translations
Is anyone under the impression that non-lemma entries should contain the above information? Rod (A. Smith) 18:01, 24 September 2007 (UTC)
My understanding is that a non-lemma should include the following:
  • Alternative spellings (or whatever)
  • Pronunciation
  • POS header
  • Simple bold term inflection line (possibly with gender, number, etc. specific to that form but not any other inflectional forms)
  • A gloss/definition that links to the lemma and explains the relationship
  • Example sentences
  • Supporting citations
My understanding also is that a non-lemma should not contain other kinds of information, though I can imagine there may be exceptions in unusual situations, such as when a past tense form has an unusual etymology. Including synonyms & antonyms would get messy; consider that the antonym of whiter is less white, not blacker. Likewise you run into problems if you're going to have synonym listings for all the inflections of Latin nouns, adjectives, verbs, etc. --EncycloPetey 23:25, 24 September 2007 (UTC)
I don't think non-lemma entries usually need alternative spellings, example sentences, or supporting citations; all of those should ordinarily go in the lemma page. Conversely, I'm O.K. with a term derived from a non-lemma appearing at both the lemma entry and the non-lemma entry (say, double-dealing being listed both at deal and at dealing), and I'm O.K. with non-lemmata belonging to relevant lexical categories (like Category:English plurals). —RuakhTALK 20:03, 27 September 2007 (UTC)
I have to disagree strongly about alternative spellings and citations. The non-lemmae need to have separate citations listed because they are different spellings from the lemma! In Latin, for example, the reason we know that a particular word is irregular is through documented citations of the irregularity. It would be silly to burden the lemma page citations with all the dozens of inflected forms, and would be difficult for anyone trying to make use of the data to parse out the appropriate citation or two that supports, say, the irregular dative feminine plural form.
Likewise, an inflected form may have an alternative spelling that the lemma does not have. A non-lemma page should therefore have an alternative spellings section of its own. --EncycloPetey 22:15, 28 September 2007 (UTC)
Re: citations: Are you saying that citations for an inflected form should never appear at the lemma entry, or only that they shouldn't necessarily appear at the lemma entry? If the latter, then I agree completely; but if the former, then I beg to differ. If in the oldest known citation using a noun, it appears in the plural genitive, then I think that citation needs to be included in the entry for the noun, i.e. the entry for the noun's singular nominative form. The entry for the lemma is really the entry for the word as a whole; that's why we define dictionary as "A publication, usually in the form of a book, […]" and not "The singular form of a noun used to refer to a publication, usually in the form of a book, […]" or whatnot.
Re: alternative spellings: How can that be? Either the same word has multiple spellings for one form (in which case the lemma page should note that in the inflection line, inflection table, and/or usage notes), or it has multiple spellings for all forms (in which case the lemma page will have its own alternative-spellings section). Can you give an example of what you mean?
RuakhTALK 00:22, 30 September 2007 (UTC)
I do mean that they shouldn't necessarily appear in the lemma page. I imagine that the citations for a lemma page could include the lemma form or various inflected forms, and even alternative spellings (see the citations for parrot). However, I want to see the inflected forms backed up with at least some citations specific to that form and listed on the non-lemma page.
With regard to alternative spellings, I mean exactly what I say. Sometimes only one or two of the inflected forms have alternative spellings but the lemma does not. For example, a Latin word might have an alternative form for the accusative singular, but not for any of the other forms. For instance, ingēns has two forms of the ablative singular, but only a single form for each of the other inflections. I've set up the appropriate pages for ingēns, ingenti#Latin, and ingente#Latin to show how I would handle this. Putting the alternative spelling information onto only the lemma page would cause the information to be visually lost. In the case of some entries with multiple inflectional parts with multiple forms (such as deus), it can be downright confusing. --EncycloPetey 03:09, 30 September 2007 (UTC)
It seems like your approach would be less effective for deus, because suddenly deī would need to have a full-out explanation of its alternative spellings dīvi, , diī, and dii, (are those last two really different, or is that just a typo?) explaining which alternative spellings exist for which senses. If, instead, all the information is in the declension table at deus (as it is currently), and the entries for inflected forms all point there, then we don't need to worry about all these complicated explanations at the inflected forms; if we want to be explicit about it, we can even have dei#Alternative spellings say to see deus. —RuakhTALK 03:40, 30 September 2007 (UTC)
As I said, it gets really confusing in the case of deus, but that's an exceptional case. There are only a handful of those in the whole of the Latin language. For most alternative spelling issues, there are just one to three forms with alternatives (or all of them), and there's usually just one alternative. --EncycloPetey 04:17, 30 September 2007 (UTC)

I've been talking to Ruakh about including definitions etc. of non-lemma entries, and I disagree quite strongly with what appears to be the current practice of simply stating the grammatical form of a word. If a reader doesn't know these grammatical terms, it isn't helpful for them to see something like "first-person singular pluperfect form of X" instead of an actual definition. For example, I'm going to use the word citisem, the first-person singular pluperfect form of the verb citi. Assuming that the reader does not know what "pluperfect" and "citi" mean, if the entry for citisem only includes "first-person singular pluperfect form of citi" they need to look first at citisem, to see the form, then at citi, to see "to read", and then at pluperfect, to see that it means "pertaining to action completed before or at the same time as another," which honestly might not be the best definition. After all of this, without the definition "I had read," they still might not know what "citisem" means. It is better, I say, to see as much as possible about the word "citisem" in its own entry than to have to hop around Wiktionary looking at definitions for other words. It might take more time to include all this information, but in the end it's more helpful for the readers. Sorry for being so wordy, but I feel strongly about this =) — [ ric | opiaterein ] — 18:25, 30 September 2007 (UTC)

The reason is that we want to avoid unnecessary duplication of content. Consider that the Latin adjective albus has 3 definitions and 35 inflected forms. If we add definitions to the non-lemma pages, that's 105 additional definition lines that have to be added and maintained. Then there are the comparative and superlative forms of all Latin adjectives, which aren't included in the count above. Just for the forms of albus, that's an additional 42 forms with 3 base definitions. Now multiply all that by two, because the comparative of "white" in Latin can mean whiter, but can also mean "rather white", and the superlative can mean "whitest" but can also mean "very white". It makes more sense to explain this behavior of Latin adjectives once in a single location instead of needlessly repeating it on thousands of individual entries.
Likewise, consider that the Spanish verb tener has 7 definitions and 61 additional inflected forms. That makes 427 additional definition lines. And what happens when someone adds a new definition sense to the lemma? That sense has to be added to all 61 non-lemma pages. The same happens if an edit is made. It makes much more sense to give the definitions in a single central location, and provide a grammatical appendix for interpretation. --EncycloPetey 18:51, 30 September 2007 (UTC)
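The arithmetic behind the two counts above is simply senses multiplied by inflected forms. A minimal Python sketch follows, using only the figures quoted in this discussion; the function name `definition_lines` is made up for illustration, and the sketch assumes every non-lemma page repeats every sense of the lemma.

```python
def definition_lines(senses: int, inflected_forms: int) -> int:
    """Definition lines needed if every non-lemma form repeats every sense."""
    return senses * inflected_forms


# (senses, inflected forms) as quoted in the comments above.
examples = {
    "albus (Latin)": (3, 35),
    "tener (Spanish)": (7, 61),
}

for word, (senses, forms) in examples.items():
    print(f"{word}: {definition_lines(senses, forms)} additional definition lines")
```

This gives 105 lines for albus and 427 for tener. Adding one new sense to a lemma costs one further line per inflected form, which is the maintenance concern being raised.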
In such extreme cases as tener, where a word has that many definitions to begin with, with even one example of how the word is translated and possibly a "see tener for further definitions" that wouldn't be a huge problem at all. That, or maybe only give those definitions which are actually used frequently. Appendices are useful, if you know they're there (which even I didn't for a long time). But still, I think that words in a dictionary should be defined.
Still yet, not all languages have to be done the same way. I work mostly on Romanian entries, and being that I'm one of the few people who does so, I'm really not bothered much to go back and change things in multiple articles at once if I change something that affects multiple entries. — [ ric | opiaterein ] — 19:16, 30 September 2007 (UTC)
tener is hardly an extreme case. The example you gave—citi—has eight definitions in the Romanian entry and presumably about a hundred forms. To give non-lemma words full treatment in this dictionary, then, we would eventually need to maintain eight hundred definitions. As this dictionary becomes more complete, such maintenance concerns will become the norm, not the exception. Rod (A. Smith) 19:41, 1 October 2007 (UTC)
One option is to have the “form of” templates include a link to an appendix that explains what the grammar terms mean. Readers could then get a usable sense of the word by reading the appendix and clicking through to the lemma. We would do readers a significant disservice if we pretend that citi has eight distinct senses but citisem can only be used in the sense “I had read”. Rod (A. Smith) 19:52, 1 October 2007 (UTC)
I'm not sure where you got the hundred forms thing from. I'm not talking about a single entry for every form with auxiliary words (am citit, am să citesc). The definitions of citi on the Romanian wiktionary mostly mean "to read," whether it be music, a book or something else. Another says "to learn" or "to study" for which Romanian has other specific words, which are named directly in the definition. "A citi" means "to read" and has numerous definitions with almost the same meaning just as the entry read for English. Looking at the translation table of citi in the Romanian wiktionary entry, you'll see one word per language, with the exception of German. — [ ric | opiaterein ] — 16:39, 2 October 2007 (UTC)
OK, so it seems Romanian has not a hundred verb forms, but merely thirty-five or so. For the eight meanings, though, it doesn't make sense to discount them just because there are other words for those concepts. If citi can mean, "to learn" or "to study", those senses belong in our entry. So that means we would merely need to maintain 280 definitions for the various forms of that word if we give non-lemma entries full treatment. Regardless of the exact number, though, it seems wrong to use the number of forms and senses to decide whether to give full treatment to non-lemma forms. Doing so would seem to imply a rule along the lines of, “if a language has fewer than forty(?) inflected forms for a given word class, all the forms should be given semantic definitions, but if there are forty(?) or more forms, only the lemma should be given semantic definitions while other forms should show only the grammatical relationship with the lemma.” Rod (A. Smith) 16:30, 3 October 2007 (UTC)
Even 35 is a bit high... verbs generally have about 26 forms (with one word, that is without auxiliaries). If I remember correctly, definitions for languages other than English aren't supposed to be the full definitions as they would be given in that language's own Wiktionary. We wouldn't have "to interpret typographic indications of a map or plan and to reconstruct after them the conforms of the terrain" under citi. Words (at least in Romanian) don't generally have meanings that are completely different. Using citi as an example again, the definitions as they would be in English would all be to read, whether it be a map, a book, music or whatever. (Side note: In an entry instead of having separate definitions for each of these senses, one could just point them out in example sentences, which are much more useful anyway.) The problem with the definition of "to study/learn" is that they don't appear to have a word that means to study in the sense of re-reading something you've read before to remember it: they just have "to read". Their word for study is cognate to our word for study, but it means about the same thing as "to learn". If you asked someone "what are you studying" they would answer with what they're learning about in school. If you saw someone studying for a test, or something, and asked what they were doing they would say they were reading. Anyway I've gotten off the main point.
Back on the topic: All that said, as I said before, in non-lemmas, you could simply give the most basic or most widely used definitions and adding "see [whatever the main article is] for more possible definitions". This would make "maintenance", etc, a lot easier. I never meant to give non-lemmas FULL treatment, just enough so that one wouldn't have to run between 3 or 4 articles to know how to translate a certain word. Right now, with non-lemmas, I'm including pronunciation, POS, a definition (and sometimes a see also/synonyms, if there's an alternate form or something), whereas normally I'd include etymology, synonyms, antonyms, related words, etc. I think maybe people have been thinking I meant to go all out with everything included in these entries. — [ ric | opiaterein ] — 18:17, 3 October 2007 (UTC)
Well, regardless of the specific format for non-lemma entries, it seems that everyone agrees we should distinguish between lemma entries and non-lemma entries. The next logical step is to decide what constitutes a lemma entry and which details WT:ELE should explicitly exclude from non-lemma entries. Rod (A. Smith) 19:42, 7 October 2007 (UTC)

Please see Wiktionary:Votes/2007-10/Lemma entries and help me refine it to something we can all support. Rod (A. Smith) 23:32, 17 October 2007 (UTC)

Wiktionary:Votes/2007-10/Lemma entries is now open. Rod (A. Smith) 19:14, 30 October 2007 (UTC)

The vote is clearly headed toward rejection of a distinction between lemma entries and non-lemma entries. Consider the degree of completion we might expect to achieve if contributors work primarily on lemma entries for the next five years. If we instead dilute those contributions across the various inflected forms of words, we can expect to achieve that same degree of completion in fifty years instead. I hope this community can understand how counterproductive it is to our project for us to reject a normative focus on lemma entries. Rod (A. Smith) 22:09, 1 November 2007 (UTC)

I don't think anybody is saying that there should be no distinction. What kind of completion are you talking about? I don't think anybody focuses primarily on non-lemma entries. And supposing that people work primarily on lemma entries for 5 years, you still have to go back and add non-lemmas (even if you're only adding "form of" information). So what it comes down to really is the total amount of effort you're willing to put into the dictionary. Focusing on basic forms of words and for the most part ignoring their forms isn't going to help readers that aren't familiar with grammatical terms. Even then the clarification is nice.
Note: In the above I'm referring mostly to translations under the inflection line for languages other than English. I think providing translations, etymology, synonyms, antonyms, etc to all English entries is basically a waste of time and energy. However, having the absolute minimum amount of information in non-lemma entries, while easier to manage, is just lazy. — [ ric | opiaterein ] — 00:42, 2 November 2007 (UTC)
That's pretty sensible, so long as it's clear to readers that more information can be found in the lemma entry. As this project matures, we will serve readers much better by expanding our foreign entry lemma definitions from crude one-word translations to fuller explanations of the terms' subtle nuances than we will by expanding non-lemma entries and trying to keep them synchronized. Unfortunately, it doesn't seem that a change to the proposal to more strongly encourage brief gloss translations of non-lemma entries would sway anyone who currently has an opposing vote to one of support. Rod (A. Smith) 02:46, 2 November 2007 (UTC)
I've been doing it like I'm suggesting now. It never occurred to me that there would or should be another way of doing it. :-) See prietenului for one example of how I do non-lemmas. For those articles in which it's necessary, I definitely think a note should be made that there is more than one definition. The problem is how to work that in without having another subsection or something... — [ ric | opiaterein ] — 03:39, 2 November 2007 (UTC)
OK. It seems important to minimize confusion now caused by phrases like formal third person trial animate pluperfect negative perfective subjunctive. Since the format requested by you, Robert, et al. places full grammatical description in the headword line, we should have some {{xx-verb}} templates display the headword line with a full technical jargon phrase linked to a language appendix specific to the corresponding inflection in that language. Since grammatical properties cannot be only in definition lines under the requested system, we will also need to repeat a given POS header, e.g.:

(unindenting)

===Verb===
{{es-verb|third person present singular indicative of|ganar}}

# [[win]]s. 
#: ''Él '''gana'''.'' — “He wins.”

===Verb===
{{es-verb|second person singular imperative of|ganar}}

# [[win]].
#: ''¡'''Gana'''!'' — “Win!”

Verb

gana

  1. wins.
    Él gana. — “He wins.”


Verb

gana

  1. win.
    ¡Gana! — “Win!”


A concern I have about this is that some editors have tried to merge sequential "===Verb===" sections, so we'd need agreement not to do so. Does that format seem ideal? Rod (A. Smith) 07:28, 2 November 2007 (UTC)

That's how I've always done it. :-) {{ro-verbform}} is what I use for the headword/inflection line. I suck at template design, so it might not be the best template evar, but it seems to get the job done. mâncaţi (needs example sentences, I know)
I think a {{see lemma}} template should have parentheses and <small></small> tags or something so it doesn't distract too much from the definition itself? And to repeat it every time, I don't know. That's the main thing I was worried about with such a notification. We could always just put it under ====See also==== or ====Usage notes==== with "See the lemma article, x for more detailed information" or "This word (or these words) can be translated in more diverse ways. For other possible usages"... etc.— [ ric | opiaterein ] — 15:09, 2 November 2007 (UTC)
I dislike the above approach intensely. It would be counterproductive for Latin. See the page/section alba#Latin. Using the above format would require six separate Adjective sections under Latin for that word. That would be silly. Providing a translation for each separate "sense" would also be silly. I've been doing Latin inflected adjectives the way alba is done. --EncycloPetey 22:07, 3 November 2007 (UTC)
Hmm. Striking a balance here is challenging, but is it reasonable to show important grammatical properties in each headword/inflection line, gloss-type definitions in each definition line, and detailed translations within just “main” (lemma) entries? For example, in alba, we could just do this:

(unindenting)

alba (nominative feminine singular, nominative neuter plural, accusative neuter plural, vocative feminine singular, vocative neuter plural of albus)

  1. white.

albā (ablative feminine singular of albus)

  1. white.

Does that seem reasonable? I don't think any fidelity is lost or any confusion introduced. Right? Rod (A. Smith) 23:04, 3 November 2007 (UTC)
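For anyone who wants to see how mechanically such a layout could be produced, here is a minimal Python sketch. It is an illustration under stated assumptions: the data structure and the function `render_headword_format` are hypothetical, and only the labels, the lemma and the gloss come from the example above.

```python
# Hypothetical data and renderer, for illustration only.
# Labels, lemma and gloss are taken from the example above.
ENTRIES = {
    "alba": [
        "nominative feminine singular",
        "nominative neuter plural",
        "accusative neuter plural",
        "vocative feminine singular",
        "vocative neuter plural",
    ],
    "albā": ["ablative feminine singular"],
}

def render_headword_format(entries, lemma, gloss):
    """Put all grammatical labels on the headword line, then one gloss line."""
    blocks = []
    for spelling, labels in entries.items():
        headword = f"{spelling} ({', '.join(labels)} of {lemma})"
        blocks.append(f"{headword}\n\n  1. {gloss}.")
    return "\n\n".join(blocks)

print(render_headword_format(ENTRIES, "albus", "white"))
```

Each spelling gets exactly one headword line and one gloss line, however many grammatical roles it covers.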

More than reasonable, actually :-p I still really think the {{see main}} formatting should be <small> or something, though :-D Maybe it should even be included in the form-of templates. But right now, I'm too tired to have ideas. — [ ric | opiaterein ] — 03:32, 4 November 2007 (UTC)
We can create some customization options for {{see main}}, but first I want to know that the overall approach has some chance of success. There is resistance from both the "list each grammatical role as a definition" camp and the "allow translations in all entries" camp, so I'm not yet sure this approach can succeed. Rod (A. Smith) 17:46, 4 November 2007 (UTC)

With 3-5-1 opposing, the current vote is suspended. Wiktionary:Votes/2007-11/Lemma entries 2 is where I will work on the new version. Following are the changes so far:

  • Instead of "lemma" and "non-lemma", it uses "main" and "secondary" to describe the entries. Hopefully that makes it more clear that if any special senses exist for an inflected word, that inflection counts as a "main entry", regardless of whether it's considered "lemma" or "citation" form.
  • The "form of" information is now in the headword line, where it belongs.
  • {{form of}} is dropped. Instead, it recommends a brief semantic definition followed by "see full entry at... for details".
  • An example sentence and quotation are given in the example of the secondary entry.
  • Fewer sections are explicitly prohibited from secondary entries.

It's still not clear to me whether the opposition really believes we should leave translations in secondary entries like talks. I will try to describe clearly why such translations are harmful to this project. Rod (A. Smith) 02:01, 3 November 2007 (UTC)

I think translation tables in secondary entries are a bad idea in most cases. Plurals and stuff yeah, they're pretty harmless and easy to keep under control, but the information is just a repetition of what's in the primary entry which can be easily accessed to begin with. If you want to see the plural that bad, you can go through the primary article to the word you want. As we've seen, such tables for verbs would be a fucking nightmare.
Rod, do you think we should try to keep all our talk about this stuff in one place? So far we have it here at the beer parlor, on the old vote page, its talk page, and the new vote's talk page. Seems as messy as the translation tables lol — [ ric | opiaterein ] — 20:33, 3 November 2007 (UTC)
Agreed. Everything is now accessible through Wiktionary talk:Translations. Better? Rod (A. Smith) 17:46, 4 November 2007 (UTC)

Wiktionary talk:Votes/2007-10/Lemma entries

Listing the lemma on the headword/inflection line

User:Opiaterein makes a good point in saying, I think the "form of" information should be on the main inflection line, as with lemma entries, with a brief definition below. I like the form of entry that produces. It's not how we've been doing things, but perhaps this is the time to change. Does anyone else have any opinions about moving the lemma link and the relationship to the lemma onto the inflection line, presumably within parentheses? Rod (A. Smith) 02:05, 18 October 2007 (UTC)

That's basically exactly what I was saying. It might not work as well in English entries, like
hands (plural of hand)
  1.  ??
but with other languages, it should be fine.
manos f (plural of mano)
  1. hands
Encyclopetey pointed out that with verbs this could get crazy, but we don't need to add every definition, just enough to give an idea of the way the word is generally used, maybe with a note pointing toward the main article for more definitions or something.
By the way, sorry for jumping into the vote half a month early, I was focused so hard on reading the actual content of what the vote was on that I missed the "Green means go" part. :o — [ ric | opiaterein ] — 02:24, 18 October 2007 (UTC)
Glad to get feedback. Looking at hand, I can understand why you had some difficulty determining what to give as the English gloss. It comes out as something like this:
hands (plural, singular hand)
  1. specific body parts or things that resemble them
  2. multiples of three or of four inches
  3. Sides
  4. Powers of performance
  5. (archaic) Actual performances
  6. agents
  7. Handwritings
  8. Personal possessions
  9. things held in a hand at once
  10. Agencies in transmission
  11. (obsolete) Rates.
  12. pointers on analog clocks
  13. (firearms) small parts of a gunstock
  14. bunches of bananas
Or, condensed like the following:
hands (plural, singular hand)
  1. specific body parts or things that resemble them; multiples of three or of four inches; sides; powers of performance; (archaic) Actual performances; agents; handwritings; personal possessions; things held in a hand at once; agencies in transmission; (obsolete) rates; pointers on analog clocks; (firearms) small parts of a gunstock; bunches of bananas
There are currently fewer definitions for mano, so I think it would look like this:
manos (plural, singular mano)
  1. (of a person, game, or clock) hands
  2. (of an animal) front feet
  3. (of paint) coats
Or condensed:
manos (plural, singular mano)
  1. (of a person, game, or clock) hands; (of an animal) front feet; (of paint) coats
Right? After typing all of that in, I'm wary about how easily non-lemma semantic definitions will scale. It seems like we might encounter some problems consistently implementing it, especially for languages that are more highly inflected than English and Spanish. Rod (A. Smith) 04:39, 18 October 2007 (UTC)
See, that's what Petey was getting at. But the thing is, we don't need to include every possible definition of a word in the non-lemma entries. For instance, I've never heard the word hand, plural or singular, used to mean "personal possessions." In the form-of entries, we only really need to give one or two of the most common uses, and if there are others, we can put in some kind of note to see the main article. I'm not sure, something like:
manos f (plural of mano)
  1. hands
See mano for more details
Or something like that; not sure what the best wording for the see x for more part would be. Keeping all the definitions in line may take some fingerwork, but in the end I think it will be better as long as we keep it simple, but not too simple. — [ ric | opiaterein ] — 06:19, 18 October 2007 (UTC)

The proposal above has some promise, but it needs refinement and significant trial use before we can reasonably consider adoption. That belongs as part of a separate proposal, so I am going to allow this vote to open as a simple codification of existing practice. Please feel free to open another discussion (probably on WT:BP) to propose this new format, but not as part of this vote. Rod (A. Smith) 15:44, 25 October 2007 (UTC)

No. It is a bad idea and should be stopped now. This works fine only if there are a small number of similar inflected forms. Latin entries would suffer and be rendered almost unreadable by the above format. See alba#Latin, which would require six separate adjective sections. Reformatting that page to the above "promising" specifications would be a disaster, and would make it much harder for a user to figure out what was going on. --EncycloPetey 22:21, 3 November 2007 (UTC)
Thanks for your input, EncycloPetey. With the shockingly strong failure of Wiktionary:Votes/2007-10/Lemma entries, I obviously need help explaining briefly in Wiktionary:Votes/2007-11/Lemma entries 2 why proliferating translations, etc. is undesirable. Rod (A. Smith) 22:39, 3 November 2007 (UTC)
Adjectives aren't inflected in English, so the definitions would mostly be the same anyway. If a user doesn't know what "vocative feminine singular" means, they won't be confused by this any more than they would by the old formatting. The old formatting wouldn't kill me so much if example sentences were more common. As it is, the reader is less likely to know wtf the word means, and even less likely to know htf to use it. — [ ric | opiaterein ] — 03:23, 4 November 2007 (UTC)
Confusion can arise from content or from formatting. What you are proposing will ensure that the format is confusing, since all the definitions will be identical. It will be difficult for users to determine what the differences among the six adjective entries are (because they may not even fit into a single monitor screen) and it will be harder for editors to determine which, if any, senses are missing. A format that makes it hard on the editor and hard on the user is a very bad idea indeed. The goal should be to improve the situation for at least one of those audiences, if not for both. This proposal asks us to head in the opposite direction, so I am firmly against it. Consider:
alba f (nominative feminine singular of albus)
  1. white
See albus for more details
alba n (nominative neuter plural of albus)
  1. white
See albus for more details
alba n (accusative neuter plural of albus)
  1. white
See albus for more details
alba f (vocative feminine singular of albus)
  1. white
See albus for more details
alba n (vocative neuter plural of albus)
  1. white
See albus for more details
albā f (ablative feminine singular of albus)
  1. white
See albus for more details
How many forms are there above? How many different genders? How many spellings? It's slow and tedious to answer those questions because of the formatting. Now answer those same questions using the format below for the same information:
alba
  1. nominative feminine singular of albus
  2. nominative neuter plural of albus
  3. accusative neuter plural of albus
  4. vocative feminine singular of albus
  5. vocative neuter plural of albus
albā
  1. ablative feminine singular of albus
It's much easier to get an overview of what's happening because the information is presented in a sensible format. Please don't continue asking the community to abandon good format for a difficult-to-read complicated and confusing one that will make entries difficult to edit and difficult to read. --EncycloPetey 14:02, 4 November 2007 (UTC)
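The compact layout above amounts to grouping the form-of labels by spelling. A rough Python sketch of that grouping follows. The tuple layout and the function `render_compact` are assumptions made up for illustration; the data is the alba list from the comment above.

```python
from collections import defaultdict

# Form-of data for "alba" as listed in the comment above: (spelling, gender, label).
FORMS = [
    ("alba", "f", "nominative feminine singular"),
    ("alba", "n", "nominative neuter plural"),
    ("alba", "n", "accusative neuter plural"),
    ("alba", "f", "vocative feminine singular"),
    ("alba", "n", "vocative neuter plural"),
    ("albā", "f", "ablative feminine singular"),
]

def render_compact(forms, lemma):
    """Group labels under each spelling and number them per spelling (hypothetical renderer)."""
    grouped = defaultdict(list)
    for spelling, _gender, label in forms:
        grouped[spelling].append(label)
    lines = []
    for spelling, labels in grouped.items():
        lines.append(spelling)
        for i, label in enumerate(labels, start=1):
            lines.append(f"  {i}. {label} of {lemma}")
    return "\n".join(lines)

print(render_compact(FORMS, "albus"))

# The three questions asked above, answered from the same data.
genders = {gender for _, gender, _ in FORMS}
spellings = {spelling for spelling, _, _ in FORMS}
print(len(FORMS), "forms,", len(genders), "genders,", len(spellings), "spellings")
# 6 forms, 2 genders, 2 spellings
```

The same records could feed either layout, so the disagreement is about presentation, not about what data is stored.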
  1. Why do we need to know how many forms, genders and spellings there are? The dictionary isn't about how many words and forms there are, it's about what the words mean. There is also the table of contents that could tell you close to the same information.
  2. Listing only the form of information makes it necessary to "See x for more details" that really won't even be found in the article. "Ok, I'm here. What does alba mean? Something to do with white. But what does that ablative stuff mean?" It's probably a fact that most users will not be familiar with grammatical terms, although the case may be different with Latin, I'm not sure how it's taught. Even with example sentences, without some kind of gloss or link to something explaining the case, it's next to useless for everyday readers.
  3. It isn't difficult to read or edit, there's just more there so there's more to maintain.
  4. We don't have to format every language the same way. See huā. A few are missing but there do seem to be some definitions there, is it not so? — [ ric | opiaterein ] — 15:58, 4 November 2007 (UTC)
I don't understand half of what you're saying. How will the table of contents help? What happens to users who turn that off as their preference? How will the table of contents allow for comparison between entries? Latin is generally taught with all the cases, anyway. And it is difficult to read and edit because there is so much unnecessarily repeated information.
So... you've said what you don't want to see, so what do you want to see. Show us how you would structure the Latin entry for alba. --EncycloPetey 19:35, 4 November 2007 (UTC)
I think The dictionary isn't about how many words and forms there are, it's about what the words mean is a rather narrow vision of how best to serve our readers. Sure, definitions are a core deliverable for our project, but so are other aspects of lexicography. Minimizing such information would be a disservice to all but the most casual reader. Following is my example from Wiktionary talk:Translations/Noting lemma forms in WT:ELE of how I think the "all definition lines should mainly focus on semantics" camp would want to format alba:

(unindenting)

alba (nominative feminine singular, nominative neuter plural, accusative neuter plural, vocative feminine singular, vocative neuter plural of albus)

  1. white.

albā (ablative feminine singular of albus)

  1. white.

Even better would be if we could automatically link specific "form of" labels to a language-specific inflection appendix. Granted, {{see main}} displays more prominently than Ric would prefer, so there probably should be some reader customization options available, but the structure seems sound. So, grammatical details are an important element of our mission, but this appears to be a reasonable way to provide them outside of definition lines. Rod (A. Smith) 20:32, 4 November 2007 (UTC)

I'd like to make it clear now that I'm already irritated by the world outside of Wiktionary, so if I come off as being unnecessarily unpleasant....oh well. :)
"Show us how you would structure the Latin entry for alba." — Being that I don't speak Latin, I wouldn't dare try to format alba or any other non-lemma Latin entry. To someone who isn't already familiar with Latin grammar, that entry is useless, as it will remain without some kind of indication as to how to use the word, in any of those senses. Sorry.
My main issue with this topic is verbs. Let's take "fusesem (first-person singular, pluperfect indicative form of fi)"
Required prior knowledge in absence of definition: First-person, pluperfect, indicative, fi.
  1. First-person - common enough, most people would probably know it.
  2. pluperfect - I didn't know what this meant until I started studying Romanian. I had studied 3 languages before and never learned what this meant.
  3. Indicative - easy enough to figure out, I suppose.
  4. fi - going to the entry, you'll find that it means "to be". Next
  5. You have to know what the pluperfect "form" of "to be" is in English. The article pluperfect gives examples, but look, now you have to know the past participle of "be", if you don't know what a past participle is.
So instead of looking at fusesem and getting your definition, and fi for more information on the verb, you would have to look at up to 5 or 6 other articles, and even then how could you be sure you were translating correctly? Everyday readers are not linguists or grammarians and will not know what the hell is going on when you tell them that alba is 6 different forms of the word "albus" or that fusesem is the 1st person singular pluperfect form of "fi".
"Granted, {{see main}} displays more prominently than Ric would prefer" — It just sticks out too much for something that's repeated over and over. I also don't want it to be confused with the definition of the word.
"Even better would be if we could automatically link specific "form of" labels to a language-specific inflection appendix." — I was actually going to start doing this if I ever get around to writing up an appendix for Romanian noun declension.
"How many forms are there above? How many different genders? How many spellings?" — Why does it matter? Look at the inflection line. — "It's slow and tedious to answer those questions because of the formatting." — Not half as slow and tedious as figuring out what the hell the word means, in the case of noun and verb forms.
Continue. — [ ric | opiaterein ] — 21:09, 4 November 2007 (UTC)
OR you can create an Appendix:Romanian verbs to explain all of that in one place. Your way, the information has to be repeated (and maintained) in every entry about a pluperfect form. --EncycloPetey 21:44, 4 November 2007 (UTC)
Appendix:Romanian verb conjugation. Even having that appendix and explaining the formation and usage of verb forms doesn't help if you don't know how to form the different tenses etc. of English verbs. — [ ric | opiaterein ] — 21:59, 4 November 2007 (UTC)

Wiktionary talk:Votes/2007-11/Lemma entries 2

Exceptions?

I think words and phrases like "ordinarily" and "though exceptional cases may need to be handled differently" should be sprinkled liberally. If the idea is that this duplication of information is usually detrimental, it might nonetheless be worthwhile in some unusual circumstances. (I actually think such duplication is never worthwhile, but clearly there is disagreement about this, and apparently wikis are supposed to be about compromise or something?) —RuakhTALK 04:24, 3 November 2007 (UTC)

That makes sense, so please sprinkle away. It's a wiki, so I can always just revert you.  ;-) Rod (A. Smith) 04:35, 3 November 2007 (UTC)

Links to main entries

I'm still a little off about the "See full entry x for additional details," at least where it is after the definition. It doesn't look/feel right. It might be better italicized, or in a usage notes section. I think that would probably be the best way, actually. If no other real "additional details" exist (like if a word really only does have one meaning) then those things don't need to be pointed out, but if they do, a usage notes section with a note to check the main entry would be a good little combination. I think it's safe to assume that if someone wants more information on a word (other forms, definitions or whatever) that they will go back to the main article to check it out, whether there's a note to do so or not. — [ ric | opiaterein ] — 15:57, 3 November 2007 (UTC)

Yeah. The link from the secondary entry to the main entry doesn't look quite right to me, either. Currently, it uses the “form of” styles, which readers can customize using WT:PREFS or WT:CUSTOM#“Form of” definitions for various combinations of italics, bold, etc., but there is room for improvement. You also make a good point that it may be more effective for some secondary entries to put it in the usage notes section. If we do that, what wording can we use to ensure that readers know to check the main entry for additional senses, synonyms, and translations? Even if the main entry only has a single sense when the secondary entry is created, other senses are likely to be added to the main entry at some later date. So, where are appropriate places and what is effective wording to tell readers that the main entry may have more senses, synonyms, translations, etc.? Rod (A. Smith) 18:26, 3 November 2007 (UTC)
I agree with most of your comment, but disagree with this part: “I think it's safe to assume that if someone wants more information on a word (other forms, definitions or whatever) that they will go back to the main article to check it out, whether there's a note to do so or not.” On the contrary, I think it's safe to assume that if someone is visiting a non-main entry, it's because they don't realize it's not the main entry or don't know what the main entry is. Some editors here want to be uberhelpful and make non-main entries as useful as possible for such people, but the more we add to such entries, the less obvious it will be that they're not main entries and that the main entry has more information. (Granted, some editors apparently want to eliminate the main—non-main distinction completely, and this would also eliminate that problem; this comment isn't directed at them.) —RuakhTALK 18:33, 3 November 2007 (UTC)

Scope reduction

Trying to confine foreign term definitions (translations into English), any language synonyms, and further semantic relations to “main” entries seems likely to encounter heavy opposition since some see that as an anti-freedom-wiki-joie-de-vivre restriction for problems that they think bots will some day solve. A more conservative approach that only limits translations (from English to foreign lemmata) would presumably encounter less opposition. Some contributors also oppose lengthening WT:ELE because of TLDR. To kill two birds with one stone, then, I propose the following tack:

  1. Update Wiktionary:Translations to explain that full translation tables should usually only be in “main” (lemma) English entries and should only list the foreign language lemma (e.g. fabulor) or, if the community prefers, only the “second language lemma” (e.g. fabulārī).
  2. Vote to upgrade Wiktionary:Translations to a policy page and to trim WT:ELE#Translations to one brief paragraph, directing further interested readers to Wiktionary:Translations.

What say ye? Rod (A. Smith) 05:36, 8 November 2007 (UTC)

Sorry, what's this "foreign language lemma = fabulor" vs. "second language lemma = fabulārī" thing? I don't get it at all. —RuakhTALK 05:48, 8 November 2007 (UTC)
Oh yeah, that would be good to explain. Some foreign language resources traditionally use different grammatical forms from English as the citation/lemma form. For example, Latin dictionaries traditionally use the first person singular present tense indicative form of verbs as the lemma/citation form. In Wiktionary:Languages with more than one grammatical gender, a contributor recommended translating English terms into the “second-language lemma” form, giving ambulare (to walk) as an example, as opposed to the traditional Latin lemma ambulo (I walk). My compact bilingual Latin-English dictionary uses that form as well, so it seems to deserve consideration. Rod (A. Smith) 07:22, 8 November 2007 (UTC)
Oh. I don't see that as an issue for Wiktionary:Translations, but for the relevant language-considerations page. As of right now, we do have lemma and non-lemma entries, and the language-considerations page should decide which word is the lemma. It would be rather sucky if translations tables linked to non-lemma entries; that would just be more work for the reader, with no benefits whatsoever. —RuakhTALK 18:29, 8 November 2007 (UTC)
Unfortunately, some editors ([2], [3]) don't think that we should restrict translations to lemma forms, but would prefer to encourage translations into each of the inflected foreign forms that might be used as a translation of a given English form. It seems important for Wiktionary:Translations or some other language-agnostic project page to restrict translation tables to lemma forms, and to explain somewhere why that restriction is important. Rod (A. Smith) 19:09, 8 November 2007 (UTC)

Other discussion

Including grammatical gender

Hello, everyone. I would like to include a link on this policy to the policy about including gender with language translations. (See Wiktionary:Languages with more than one grammatical gender.) This policy about the inclusion of gender could use some discussion, as well. Anyone who can comment on this policy should please leave a message on its talk page. Thank you.--El aprendelenguas 21:58, 17 April 2006 (UTC)

Managing complex translations

Most good foreign dictionaries will contain explanations of certain words such as "if" or "the" which are difficult to translate into foreign languages (especially those of non-Indo-European origin). However, there is no room to do that in the present Wiktionary style for translations.

I propose having complex translation entries look something like this:

if

where the if/jp page would provide a detailed description of how to translate English "if" into Japanese, in English. Thoughts?

Lots of points to consider

Here are some points that might need to be included in a policy like this:

  • As far as Wiktionary jargon goes, language and dialect are synonymous. The more common languages, such as the Chinese dialects, have standard names to appear in language headings and translations sections.
  • In full dictionary entries, Translingual and English are listed first, in that order, and all other languages are alphabetized.
  • In translations sections, all languages/dialects are listed alphabetically at the same level.
  • Inflections should be provided in the entry for the foreign word, not for the English equivalents. Short, irregular forms that do not share a root are exceptions.
  • Gender labels apply to the grammatical gender of the word, which varies between languages and is defined in the appendix.
  • Transliterations are parenthesized and unlinked, and for each language Wiktionary chooses only one standard for the Translations section. This is generally the most common transliteration in modern use. If there is some ambiguity, those that have the simplest diacritics are preferred.
  • They aren't pronunciations, for goodness sake! Only in rare cases should additional information be given, such as when no gender-neutral term exists. (See cousin.)
  • For foreign entries, each sense of the word or phrase should have a separate definition line, just as it would on the foreign-language Wiktionary.
  • Usually translations match fairly closely. Even so, we need to be sure to indicate which sense of the word is meant if there could be ambiguity. (More often than not, there is.)
  • It's necessary to be more thorough when translations do not match. In some cases there may not be a direct translation of the foreign word. In the opposite case, a single sense of the foreign word may apply to more than one sense of the English word.
  • Any standard transliterations may be entered as entries, not just the ones we prefer.

And for Christ's sake, could we please archive this page! I mean, that this is a multilanguage dictionary is pretty much set in stone right now, unless we were to just throw the better half of it away. Why do we have to draw attention to that aspect by making the page live again? DAVilla 04:01, 14 October 2006 (UTC)
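For illustration, the two ordering rules from the list above (Translingual and English first in full entries, plain alphabetical order in translations sections) could be sketched in Python roughly as follows. This is only a sketch under stated assumptions: the function names are made up for the example and do not belong to any existing Wiktionary tool or bot.

```python
def order_language_sections(names):
    """Order language headings of a full entry.

    Translingual comes first, English second, and every other
    language follows in alphabetical order.
    """
    priority = {"Translingual": 0, "English": 1}
    # Languages without a special rank share rank 2 and are ordered by name.
    return sorted(names, key=lambda name: (priority.get(name, 2), name.casefold()))


def order_translation_lines(names):
    """Order the languages of a translations section.

    All languages and dialects are listed alphabetically at the same level.
    """
    return sorted(names, key=str.casefold)


if __name__ == "__main__":
    print(order_language_sections(["Dutch", "English", "Arabic", "Translingual"]))
    print(order_translation_lines(["Mandarin", "Cantonese", "Dutch"]))
```

Whether case should be ignored when alphabetizing is a choice made for this sketch, not something the list above specifies.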

Some lay-out questions from a beginner

Copied from the Tea Room 20 Oct 2006

Hi all,

I recently discovered Wiktionary and am enthusiastically adding Dutch translations wherever I know them. However, I come across big layout differences between pages. So here are some questions:

  • In disambiguating between different meanings, the translations are listed in several {{top}}, {{mid}}, {{bottom}} groups. Often, the meaning is mentioned above it in bold, sometimes it is given as a parameter to {{top}}. Which is preferable?
  • Sometimes language names are themselves wikified. Is this desirable?
  • In Dutch, a lot of words are both masculine and feminine. Can I make my own {{m/f}} template?

Expect more questions of this sort. Are there more explicit guidelines? henne 15:54, 19 October 2006 (UTC)

The translation sections should be disambiguated by a bold heading at the top of each group. Older entries have numbers, which should be phased out. The new system used in some entries uses a different {{trans-top}}, {{trans-mid}}, {{trans-bottom}} format whereby the heading exists as a parameter to {{trans-top}}. These templates will probably supersede existing layouts. Language names should only be wikified if they are unusual or not generally known – that obviously doesn't apply to Dutch. As for gender, I'm not aware of any m/f template, so I see no reason why you shouldn't create one. There is some more info at WT:ELE and Wiktionary:Translations. Widsith 16:36, 19 October 2006 (UTC)
<EDIT CONFLICT>
  1. The use of a parameter in {{top}} is still considered "experimental" at this point. It would be better to follow the bolding convention until the new format has the various kinks worked out.
  2. Language names that might be construed as "exotic" are wikified. Basically, if the country name doesn't match the language name, it might be considered "exotic" to an English speaker.
  3. Please use {{mf}} for those.
Entry layout explained goes into pretty good detail regarding the current conventions. This is a fine place to ask similar questions. --Connel MacKenzie 16:40, 19 October 2006 (UTC)
Thanks both. {{m|f}} is what I was looking for, as is the template index. How can I know where the discussion about the {{top}} or {{trans-top}} templates is leading? henne 10:38, 20 October 2006 (UTC)
There is also {{c}} for "common gender", which I know is standard in Swedish, and I have heard used in reference to Dutch as well. --EncycloPetey 18:20, 20 October 2006 (UTC)
Well, Dutch dictionaries usually avoid the problem in that they only say ‘de’, i.e., the common article for m and f, and if it is explicitly m or f, then they add (m) or (f) after ‘de’. I am not sure what the right thing to do is. You are probably right that c is more appropriate, as they are referred to by masculine pronouns.

Ok, one more style question that wasn’t answered by the ELE: When subdividing the translations into groups corresponding to different meanings, a description of the meaning is to be given. Is there a style guide for this? Should it be concise, telegram-style, should it contain articles, should it start with a capital letter?

Not yet, but there's active discussion beginning on Wiktionary:Translations to work out basic policy and standards, which could then be used to update/modify the ELE. --EncycloPetey 18:20, 20 October 2006 (UTC)
Is that right? I was under the impression that we have always recommended entering a short gloss, containing enough common words from the definition above to link them unambiguously. --Connel MacKenzie 18:25, 20 October 2006 (UTC)
That's right, but there's still no written style guide. I'm thinking that I may help DAVilla assemble information on the Translation section, including format, style, and content information. She's already assembled a great new page for handling the Translations to be Checked, but we still need more than we currently have on the standard section itself. And what relevant info we have on the ELE is not well organized, in addition to missing some key points. --EncycloPetey 19:17, 20 October 2006 (UTC)

Templates {{t}} and {{trad}}

I have taken liberties with these. They both do almost exactly the same thing, so trad is now a redirect to t.

Note that no policy decision on the use of these has been made. Some people like them, but we may end up subst'ing the lot.

Template t takes three parameters: the word in language, the language code and the gender. The gender must be one of f, m, mf, c, or n. (Note that indicators such as pl are not supposed to be in the tables; the pointer is to the singular lemma form. All other information is at that entry in the en.wikt, and in the other wikt!) Robert Ullmann 12:58, 14 November 2006 (UTC)

strike that, we have lots of plurals, and there isn't any present reason to make them go away; 4th parameter is s or p. Robert Ullmann 15:27, 14 November 2006 (UTC)


Community Portal

Shouldn't the community portal be linking to this page instead of the translation of the week?

Bearingbreaker92 03:43, 21 January 2007 (UTC)

No, it's the project of the TOW that should be linked. However, the link name should be changed to be less confusing. I'll take care of that. --EncycloPetey 03:47, 21 January 2007 (UTC)
Thanks, it looks better. Bearingbreaker92 04:21, 21 January 2007 (UTC)


Improvements to the policy page

I suggest adding the following to Wiktionary:Translations#Basic format for English entries, point 3:

{{trans-top|explanation of translation (to differentiate between multiple meanings)}}
* translation 1
* translation 2
{{trans-mid}}
* translation 3 (second column)
* translation 4, etc
{{trans-bottom}}

This yields the pleasing result of:

Add a new point 4:

4. If not sure about some translations, use:

=====Translations to be checked=====

<!--Remove this section once all of the translations below have been moved into the tables above.-->

{{checktrans}}

Finally, include these links in the see also section:

--Mac 16:25, 4 March 2007 (UTC)

This looks good, and then we can finally do away with Wiktionary:translations.--Williamsayers79 23:59, 16 May 2007 (UTC)

Translations for verb forms?

Hi, I don't remember how I got this information but, whatever, I've been believing that there should be verb translations only on the page of the infinitive form (for says, see say; for is, see be), so I have been removing them, informing "verb translations only on the page for the infinitive". So, what's the established policy here to be followed? Or is there any? -- Frous 11:38, 10 June 2007 (UTC)

Yes, mostly: only the lemma form of the verb should have a translation to English. Other inflected forms should identify the inflection and point to the lemma. However, the lemma is not always the infinitive. In Latin and Ancient Greek, the lemma is the first-person singular present active indicative. Part of the reason for this is that Classical language dictionaries use that form as the verb headword; another reason is that Latin has six different infinitive forms (and the infinitives tend to function grammatically as nouns). In any case, there shouldn't be translations given for non-lemma forms of any verb, noun, adjective, etc. --EncycloPetey 16:34, 10 June 2007 (UTC)
Sorry, I misunderstood your question. You were talking about Translation sections on English pages, yes? In that case you are correct. A translations section should only appear on the page for the English infinitive, since that is the lemma form for English. --EncycloPetey 16:36, 10 June 2007 (UTC)
Yes, I meant the English verbs. So I haven't done that much disaster when deleting translations of e.g. third-person singular present indicative forms, thank God... ;) -- Frous 20:37, 10 June 2007 (UTC)

Rename

I would call this page Wiktionary:foreign word definitions --77.210.55.105 06:57, 6 October 2007 (UTC)

Animate noun gender inflection

The following was moved here from Wiktionary talk:Votes/2007-10/Translation into lemma only#Animate noun gender inflection. Rod (A. Smith) 18:10, 19 October 2007 (UTC)

The following references support the classification of words like chica as non-lemma forms of lexemes with lemma forms like chico. If anyone can find some references that support the opposite position (i.e. that words like chica are distinct, gender-specific lexemes), please add them. Eventually, some of these may become the reference list for Wiktionary:Languages with more than one grammatical gender:

  • 2003, Luis D. Casillas Martínez, Gender Mismatches in Spanish and French N1/A de N2 Affective Constructions: Index agreement vs. Morphosyntactic Concord [4]:
    An important distinction that I must make before delving into these constructions at depth is that between inherent gender classification and gender inflection.
    • An inherently gendered lexeme comes from the lexicon with a fixed gender value. Most inanimate common nouns in Spanish and French are of this kind—but there are exceptions.
    • An inherently ungendered lexeme does not have lexical gender. It may be inflecting, with a form for each gender, or noninflecting, with a unique, gender-unselected form. Many animate nouns are not inherently gendered, and show distinct inflectional forms; e.g. Sp. amigo ‘friend (.M, .F)’. Some ungendered adjectives and nouns don’t inflect (e.g. Fr. imbécile and idiot).
  • 2004, Spanish Concise Dictionary, Harper Collins, Grammar reference page 192:
    As in English, male and female are sometimes differentiated by the use of two quite separate words, e.g.
    mi marido   mi mujer
    my husband   my wife
    un toro  una vaca
    a bull   a cow
    There are, however, some words in Spanish which show this distinction by the form of their ending:
    • Nouns ending in -o change to -a to form the feminine → [1]
    • If the masculine singular form already ends in -a, no further -a is added to the feminine → [2]
    • If the last letter of the masculine singular form is a consonant, an -a is normally added to the feminine* → [3]
  • 2005, Greg Kobele, Agreement Bottlenecks in Italian [5]:
    Nouns in Italian also vary their form depending on their number. Moreover, there are also roughly two classes of nouns, with respect to the forms they take in the singular and the plural. A class I noun is one which inflects like a class I adjective (holding gender constant), and a class II noun inflects like a class II adjective. (stricken as unclear Rod (A. Smith) 23:16, 19 October 2007 (UTC))
  • 2007, Ricardo Bermúdez-Otero, Against nominal class features in Spanish [6]:
    §14 Theme vowels are involved in exponence of gender, but are not predictable from gender:
    e.g. masculine nominals belong to the o-class by default, but the o-class also contains
    • feminine nominals man-o (F) ‘hand’
    • dual-gender nominals el testig-o (M), la testig-o (F) ‘the witness’
    • neuter demonstratives est-o (N) ‘this’, cf. est-e (M)
    feminine nominals belong to the a-class by default, but the a-class also contains
    • masculine nominals dí-a (M) ‘hand’ [sic. Bermúdez-Otero gives “hand” instead of “day”]
    • dual-gender nominals el artist-a (M), la artist-a (F) ‘the artist’

Rod (A. Smith) 01:43, 18 October 2007 (UTC)

The 2003 cite is quite clear, and the 2004 cite almost as clear.
However, I think you're misreading the 2005 cite; I think it's implying that a noun has exactly one gender. This is more clearly implied later on, where it says "In GAgr, each inflectional form of every adjective and noun is treated as a separate lexical item, which means that for every adjectival root there are four lexical items, and for every noun root there are two." (I think it's clear that at this point, the author considers adjectives to have four forms — ms/fs/mpl/fpl — and nouns to have only two, either ms/mpl or fs/fpl.) However, the article later goes on to say, "This grammar allows us to view class I pairs like zio/zia (uncle/aunt) as inflected forms of the same lexeme, zi." In other words, while the author by default considers these to be pairs of nouns, he says that his proposed approach makes it possible to treat them as single nouns. All told, I think this can be considered a reference for both points of view.
Similarly, I don't think the 2007 really goes either way; it uses the phrase "dual-gender nominal" to denote nouns that have a single form regardless of gender, and the phrases "masculine nominal" and "feminine nominal" to denote nouns that have a single form for a specific gender (regardless of whether there's an opposite-gender counterpart). Unless I'm missing something about the word "nominal", the implication seems to be that "artista" is one word regardless of gender, while "chico" and "chica" are separate words; but it's not a terribly strong implication, as, due to the topic of the paper, the author might simply have chosen the interpretation that lent itself best to his purposes (as should we do).
RuakhTALK 18:49, 19 October 2007 (UTC)
Hmm. I was surprised to read your interpretation of the 2005 quotation, but rereading the work, I see that the author isn't exactly clear about “whether grammar should be able to describe such ‘meta-paradigmatic’ relations”, saying such questions are “best resolved by the ability to account for psycholinguistic data”. In the area you and I quoted, he is discussing the general case of nouns (giving gallo and cane as examples, which he later describes as “inherently gendered”) in the context of why GAgr is “in a sense the worst possible case, and thus [only to] be adopted after all other avenues are explored.” He doesn't even mention animate nouns until further on, where he says, “we can represent non-inherently gendered nouns like zi in the manner shown below.” In the grammar he seems to prefer, he views zio and zia as two gender-inflections of *zi. In any event, he is not terribly clear, so I have stricken the example.
In the 2007 example, though, I cannot credibly imagine that "dual-gender nominals" to mean "pairs of lexemes that vary in gender". Rather, it almost certainly means "individual lexemes, each of which has two genders".
Since I struck the 2005 example as ambiguous, I found another couple of references:
  • 1977, William J. Ashby, Clitic Inflection in French: An Historical Perspective (ISBN 9062034691), page 14:
    For inanimate nouns, which are arbitrarily either masculine or feminine, the use of the determiner to mark gender serves only a classificatory purpose. For some animate nouns, however, gender is more meaningful, in that it may reflect the sex distinction. For a number of animate nouns there exist both masculine and feminine forms, which are distinguished by contrastive pairs of suffixes or by the addition of a suffix for one of the forms only. Examples are le fermier (masculine) - la fermière (feminine); le marquis (masculine) - la marquise (feminine).
  • 2002, Mark Harvey, A Grammar of Gaagudju (ISBN 3110172488), page 149:
    As is virtually universal in systems which mark gender, human nouns take concord according to the gender of their referent. Thus, a noun such as biibi ‘MF, MFZ’ will take Class I or Class II concord depending on the gender of the referent.
  • 1999, Francis Cornish, Anaphora, Discourse, and Understanding: Evidence from English and French (ISBN 0198700288), page 129:
    Gender, unlike number, is a lexical category for which most nouns in French are inherently specified. A French NP's gender value derives from that of its head noun. Gender is therefore a category which, although ostensibly arbitrary—at least in the case of inanimate nouns—is related to the sense of the lexeme for which it is marked, and only derivatively to its reference; or, more accurately, to the reference of the entire NP of which the noun in question is the head, since nouns cannot on their own refer, but only denote. The main exceptions to this generalization consist of human-denoting nouns whose gender is not lexically fixed, but is rather determined by the lexeme in question's occurring as head of an NP used to refer to a male or female person: examples are nouns like concierge ‘caretaker’ and secrétaire ‘secretary’.
I also found this reference, which supports the position that in German, the gender-differentiated animate nouns are distinct lexemes:
  • 1997, Marion Kremer, Person Reference and Gender in Translation: A Contrastive Investigation of English and German (ISBN 3823349376), page 86:
    Most human lexemes are gender-differentiable in German, whereby the male-denoting and grammatically masculine lexeme is typically the morphologically simpler member, and the feminine counterpart is a more complex derivative marked by the suffix -in (cf. (Id)).
Anyway, it's obviously not a cut-and-dried area of lexicography (or psycholinguistics for that matter). We should probably just consult existing dictionaries (both monolingual and bilingual) and follow the majority treatment for each given language. Rod (A. Smith) 23:16, 19 October 2007 (UTC)
Re: 'In the 2007 example, though, I cannot credibly imagine that "dual-gender nominals" to mean "pairs of lexemes that vary in gender". Rather, it almost certainly means "individual lexemes, each of which has two genders".': I completely agree with this statement, but I'm afraid I don't see what point it makes? He uses "dual-gender nominals" in reference to things like "artista" (which can be either masculine or feminine), but not — so far as I can tell, anyway — in reference to things like "chico–chica" (where each has one gender). But regardless, to me this seems irrelevant: there's a lot of variation, as different papers use the interpretation that best serves their purpose. Wiktionary, too, should use the interpretation that best serves our purpose. —RuakhTALK 00:19, 20 October 2007 (UTC)

Calques/word-for-word translations?

How should one indicate or notate calques (word for word translations)? For instance, staircase wit is a calque from French l'esprit d'escalier, hence I've listed the etymology as "Calque of French l'esprit d'escalier." Is this exemplary, or is a different form preferred?

If so, we should gloss & template calque, as it is technical.

Nbarth 02:19, 27 January 2008 (UTC)

Saying "calque of/from ... " is just fine. --EncycloPetey 23:24, 29 April 2008 (UTC)

Wikilinking to foreign phrases

In translation sections, sometimes the foreign translation of an English term is a phrase. Sometimes that phrase is a Sum of Parts where an entry shouldn't exist for the whole phrase, just for each word in the phrase. In this case, we should wikilink each word instead of the phrase (See comment WT:RFD#maternal uncle). Should someone just use plain links, or {{t}}? I'd like to mention this general case in the main page, as whole phrases will get picked up by User:Tbot (or clicking newbies) and made into their own articles that shouldn't exist. --Bequw¢τ 23:15, 29 April 2008 (UTC)

Chinese translations

I created two templates recently:

Please consider the suggested methods for Chinese Mandarin:

2 examples of translations with the suggested result:

I suggest using "Chinese" as a default language for Chinese languages/dialects. Everything else should be separated by *:, e.g. *: Cantonese, *: Min Nan, etc. Anatoli 04:08, 17 April 2009 (UTC)

We don't have a language header called ==Chinese== so why would we want to list translations under that word? If the language header is ==Mandarin== then the translations should likewise be listed under * Mandarin. All of the languages/dialects are listed alphabetically, not grouped topically. There are languages much more closely related than Min Nan is to Cantonese that are listed separately. Why does everyone who enters Chinese translations have to think that they're an exception to that? DAVilla 06:26, 17 April 2009 (UTC)
Wiktionary, like any dictionary, works with the written forms. If I need a translation into Chinese, then it is in Chinese characters, identical in 95-100% of cases across the Sinosphere. Once the characters are given, subgroups may follow under the heading Chinese: as :Cantonese, :Min Nan. The characters will be the same in most cases, and the pronunciation can be given if it is known. So 中國 and 中国 are Chinese and apply to all dialects. Mandarin is the default standard pronunciation in China and Taiwan; that's why I suggest omitting the word Mandarin. The large differences in pronunciation across the dialects are of less importance if the written form is the same and they have identical entries.

Currently:

  • Chinese:
    Cantonese: entry in Chinese characters (Cantonese pronunciation)
    Mandarin: entry in Chinese characters (Mandarin pronunciation)
    Min Nan: entry in Chinese characters (Min Nan pronunciation)
    Wu: entry in Chinese characters (Wu pronunciation)

OR less often:

  • Cantonese: entry in Chinese characters (Cantonese pronunciation)
  • Mandarin: entry in Chinese characters (Mandarin pronunciation)
  • Min Nan: entry in Chinese characters (Min Nan pronunciation)
  • Wu: entry in Chinese characters (Wu pronunciation)

...

Suggested:

  • Chinese: entry in Chinese characters (pinyin this is based on Mandarin or Standard Chinese)
    Cantonese: Cantonese pronunciation, add Chinese characters in Cantonese if different (rare)
    Min Nan: Min Nan pronunciation
    Wu: Wu pronunciation

...

This is not dissimilar to Arabic, even if the Arabic dialects usually differ more in writing than the Chinese ones do:

  • Arabic:
  • Egyptian:

Anatoli 06:47, 17 April 2009 (UTC)


Do not use these templates. Use template {{t}} properly.

Do not enter anything on a line starting with "Chinese:". "Chinese" is not a language; it is only used to introduce the group (if used). Use "Mandarin".

Language lines within a group start with ** not *: (and use the full language name, e.g. it would be "Egyptian Arabic" above).

If grouped, it is like this (the same standard as all other languages).

* Chinese:
** Cantonese: {{t|yue|(word)|tr=...}}
** Mandarin: {{t|cmn|(word)|tr=(pinyin)}}
** Min Nan: {{t|nan|(word)|tr=(POJ)}}
** Wu: {{t|wuu|(word)|tr=...}}

Do use {{zh-sim}}: simpl. and {{zh-tra}}: trad. as qualifiers where needed.
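As a rough illustration of the grouped layout shown above, here is a small Python sketch that assembles such lines. It is not an existing script: the helper names t_template and grouped_lines are invented for this example, and only the {{t}} call shape, the tr= parameter and the language codes that appear in the sample above are used.

```python
def t_template(lang_code, word, tr=None):
    """Build a {{t|code|word|tr=...}} call in the shape used in the sample above."""
    parts = ["t", lang_code, word]
    if tr:
        # The transliteration is optional and is passed as tr=...
        parts.append("tr=" + tr)
    return "{{" + "|".join(parts) + "}}"


def grouped_lines(group, entries):
    """Render a group heading followed by '**' lines, sorted by language name.

    entries is a list of (language name, language code, word, transliteration)
    tuples; the transliteration may be None.
    """
    lines = ["* " + group + ":"]
    for name, code, word, tr in sorted(entries, key=lambda entry: entry[0]):
        lines.append("** " + name + ": " + t_template(code, word, tr))
    return "\n".join(lines)


if __name__ == "__main__":
    print(grouped_lines("Chinese", [
        ("Mandarin", "cmn", "中国", "Zhōngguó"),
        ("Cantonese", "yue", "中國", None),
    ]))
```

The group line itself carries no translation, matching the rule that nothing is entered on the "Chinese:" line.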

Translation sections in non English entries (Translingual)

The WT:ELE states that "Variations for languages other than English... the translations section should be omitted."

The Wiktionary:About Translingual says nothing about translation sections.

My parser has found several words with a translation box within the Translingual section, namely: @, Felidae, unununium, Iris, Hymenoptera, Ericaceae, =, Cetacea, Bluetooth, Corvidae, Lycopodiaceae, Equidae, E440, SAAB, titanium oxide, Russula adusta.

So I have the following question: should we leave these "as is", or should these entries be changed? -- Andrew Krizhanovsky 09:25, 30 March 2010 (UTC)

I guess the logic is that Translingual entries are by extension usually English words as well. But since they're translingual, in theory the translations should mostly be the same word! @ is blatantly wrong however, translations moved to at-sign. Mglovesfun (talk) 09:39, 30 March 2010 (UTC)
Thanks for the fast reply. -- Andrew Krizhanovsky 10:04, 30 March 2010 (UTC)

source-targetLanguages (or sclang)

We could use Template:source-targetLanguages for translations in Wiktionary. --Diamondland 10:37, 19 November 2010 (UTC)

Options

I suggest an option in the user's preferences to show mainly the translation(s) into a specific language, the default chosen by the user in the options.--Diamondland 10:39, 19 November 2010 (UTC)

Quotations

I suggest that the user should also be able to translate the quotations related to a specific entry into a target language.--155.54.178.240 07:43, 27 November 2010 (UTC)

Translation subdivisions

With the ongoing discussion at WT:BP#Subdivisions for different stages of a language in translations, I think it would be good to see if we can add a section on this page about subdivisions of languages. It would be good to have common or desired practice in writing after all. Right now, the following languages are commonly split (I may have missed some):

  • Albanian
    • Gheg
    • Tosk
  • Arabic
    • varieties of Arabic
  • Chinese
    • Cantonese
    • Mandarin
    • Min Nan
    • etc.
  • Norwegian
    • Bokmål
    • Nynorsk

I propose to add the following (at least), but we'd eventually want to include all possible cases? Please suggest more!

  • Armenian
    • Old Armenian
  • Dutch
    • Middle Dutch
    • Old Dutch
  • English (older forms) - should this always be the first translation listed, in the same way that we list English first on pages?
    • Middle English
    • Old English
  • French
    • Middle French
    • Old French
    • Anglo-Norman? - I don't think this belongs here... but maybe someone else does?
  • Frisian - Like Chinese, not a language!
    • North Frisian
    • Saterland Frisian
    • West Frisian
    • Old Frisian
  • German
    • Middle High German
    • Old High German
  • Irish
    • Middle Irish
    • Old Irish
    • Primitive Irish
  • Low German
    • Middle Low German
    • Old Saxon
  • Occitan
    • Old Provençal
  • Persian
    • Middle Persian
    • Old Persian
  • Portuguese
    • Old Portuguese
  • Spanish
    • Old Spanish
  • Welsh
    • Middle Welsh
    • Old Welsh

How is this? —CodeCat 16:29, 11 November 2012 (UTC)
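To make the proposal above a bit more concrete, here is a minimal Python sketch of how such subdivisions could be rendered. It is only an illustration: the PARENT mapping is a hypothetical subset of the list above, the function name is made up, and the "**" nesting follows the grouped format described earlier on this page rather than any agreed standard for historical stages.

```python
# Hypothetical subset of the proposed groupings: sub-language -> heading.
PARENT = {
    "Middle Dutch": "Dutch",
    "Old Dutch": "Dutch",
    "Middle French": "French",
    "Old French": "French",
    "Anglo-Norman": "French",
}


def render_translations(translations):
    """Render {language name: translation} as nested list lines.

    Languages listed in PARENT are placed on '**' lines under their
    heading; everything else gets a plain '*' line.
    """
    groups = {}
    for language, text in translations.items():
        heading = PARENT.get(language, language)
        groups.setdefault(heading, {})[language] = text

    lines = []
    for heading in sorted(groups):
        members = groups[heading]
        if heading in members:
            lines.append("* " + heading + ": " + members[heading])
        else:
            # Only subdivisions are present, so the heading line stays empty.
            lines.append("* " + heading + ":")
        for language in sorted(name for name in members if name != heading):
            lines.append("** " + language + ": " + members[language])
    return lines


if __name__ == "__main__":
    for line in render_translations(
        {"Old Dutch": "x", "Dutch": "y", "Middle Dutch": "z", "Finnish": "w"}
    ):
        print(line)
```

Moving a language such as Anglo-Norman in or out of a group is then just a change to the mapping, which is the part still under discussion here.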

I think that Anglo-Norman certainly does belong under French, since it always was and still is frequently referred to as "French". --WikiTiki89 17:08, 11 November 2012 (UTC)
I can understand that, but it might confuse users because it doesn't have French in the name so they won't expect to find it there. When looking for candidates I always wondered to myself "is it likely that someone will look under one of the other words in the name"? So Old French might be considered "French, Old", but with Anglo-Norman that isn't so clear. Then again, maybe Old Provençal shouldn't be listed there either, nor should Old Saxon... —CodeCat 17:21, 11 November 2012 (UTC)
But people who are looking up translations for Anglo-Norman, Old Provençal, and Old Saxon are very likely to know of their connection to French, Occitan, and Low German, respectively, and will likely attempt to look there. Also, if it is not their first time looking for it, they will already know where it is, and finding a subsection is much easier than searching through a list of Old XYZs that are all alphabetized to the same place. Another thing is that most of these languages have a plethora of other names and the only constant is their modern-day equivalent. --WikiTiki89 17:38, 11 November 2012 (UTC)
I'm not so sure. Some editors who did a lot of work here a long time ago seemed to think that Dutch descends from Old Saxon... On the other hand, we could just call Old Provençal Old Occitan, couldn't we? That is the name Wikipedia uses after all. —CodeCat 18:06, 11 November 2012 (UTC)
Isn't Old Provençal just one dialect of Old Occitan, but the best attested one? Then it's like Old Icelandic, which is sometimes considered a synonym of Old Norse, but more precisely is just the best attested dialect of Old Norse. And speaking of Old Norse, where does it go in the list above? My preference would be to alphabetize it as "Norse, Old", but others may disagree. As for older forms of English, I say put them in alphabetical order under E, not at the top. —Angr 22:20, 11 November 2012 (UTC)
I don't exactly know what Old Occitan is, but the Wikipedia article treats them as synonyms. The alphabetization could be changed but I would prefer to discuss that separately as it's a different issue that doesn't directly affect this one (although they both affect where languages are found in the list). —CodeCat 22:30, 11 November 2012 (UTC)

Custom sorting order for certain languages

Following on from the discussion above and in BP, I wonder if it is desirable and/or technically feasible to implement sorting keys for certain languages, so that they are not sorted by the first letter of their name. It could place, for example, Old Norse under N. This idea is separate from the one above, so even if we decide not to make subdivisions we could still decide to sort Old English under E and Middle Dutch under D. —CodeCat 22:36, 11 November 2012 (UTC)

Personally, I would prefer to continue using the alphabetisation we use now, where "Old Norse" is after "Occitan" and before "Portuguese", etc. If we don't group languages, either generally or specifically, I oppose changing their sort orders: so for example, if "Middle High German" were not grouped as a *:-sublisting under "German", I would oppose sorting it as a *-listing as "High German, Middle" (though I also don't think this is being suggested), and I oppose sorting Old Norse as * Norse, Old. Given that sorting "Middle High German" next to German as * German, Middle High is just another way of grouping the Germans, I might not oppose it, but I would prefer *:-type grouping. But a language like Old Norse, which can't be grouped (because it is just as much a predecessor of Icelandic as it is a predecessor of Norwegian, etc), should continue to be sorted by its (full) name, IMO. (I'm aware that Old/Middle English is as much a predecessor of Scots as of English, but we also don't have "English" in translations tables, so OE and ME are special cases anyway.) - -sche (discuss) 03:05, 16 November 2012 (UTC)
I didn't suggest that we change the names of the languages, only their sorting order. So it might look like this: —CodeCat 03:35, 16 November 2012 (UTC)
...
* Galician
* Georgian
* German
* Middle High German
* Old High German
* Greek
* Ancient Greek
* Greenlandic
* Guaraní
...
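
For illustration only, the ordering in the list above could be produced by a sort key that differs from the displayed name. The sketch below is in Python rather than in any actual Wiktionary template or script, and the `SORT_KEY_OVERRIDES` table and `sort_key` function are hypothetical names invented for this example. It assumes a hand-maintained table of exceptions, with every other language sorted by its full name as now.

```python
# Hypothetical table of exceptions: displayed name -> key used for sorting.
# Languages not listed here are sorted by their full displayed name.
SORT_KEY_OVERRIDES = {
    "Middle High German": "German, Middle High",
    "Old High German": "German, Old High",
    "Ancient Greek": "Greek, Ancient",
    "Old Norse": "Norse, Old",
}


def sort_key(name: str) -> str:
    """Return the sorting key for a language name; the displayed name is unchanged."""
    return SORT_KEY_OVERRIDES.get(name, name).casefold()


names = [
    "Greenlandic", "Old High German", "Galician", "Ancient Greek", "German",
    "Guaraní", "Middle High German", "Greek", "Georgian",
]

# Sort by key, but keep showing the original names.
ordered = sorted(names, key=sort_key)
for name in ordered:
    print("*", name)
```

With these assumed overrides, "Middle High German" and "Old High German" land directly after "German", and "Ancient Greek" directly after "Greek", while the names themselves are displayed unchanged.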
I object to that. My personal preference is to list "Middle High German" completely separately from "German", with other M-languages; but I'm O.K. with listing and sorting it as "German, Middle High", or even with listing it as "Middle High German" but nested under "German". (The reason I don't like the latter is that we use "German" to mean "Modern German", so nesting Middle High German under German would imply that Middle High German translations are Modern German, just as nesting Mandarin and Cantonese under Chinese implies that Mandarin and Cantonese translations are Chinese, and nesting Cyrillic under Serbo-Croatian implies that Serbo-Croatian Cyrillic translations are Serbo-Croatian. But despite my dislike, I could accept it.) What I'm not O.K. with is listing it as "Middle High German" but sorting it as "German, Middle High". That is very confusing. Someone looking for Greek should not have to recognize that it comes after Middle High German; they should be able to just scan the first letter of the language-names to find what they're looking for. —RuakhTALK 03:46, 16 November 2012 (UTC)
"Literary Chinese" is already grouped under "Chinese" in many (perhaps all? IDK) entries, such as helium, although Literary Chinese is a predecessor of modern Chinese, not a form of it the way Mandarin (which is also grouped under it) is. I think users looking at any translation, whether nested or not, need to have at least the minimal knowledge of what the name of the lect/variety they're looking at means (or need to look it up upon encountering it). - -sche (discuss) 04:09, 16 November 2012 (UTC)
Re: first sentence: We don't use ==Chinese== as an L2 header. We only use it in translations tables, and only as a group-heading for languages nested inside it. So even though Literary Chinese is qualitatively different from the modern topolects, I don't see a problem with including it under the "Chinese" umbrella. ==German==, by contrast, has a specific meaning here, and we define it as not including Middle High German.   Re: second sentence: Absolutely. "Imply" may not have been the best word-choice; I wasn't expressing concern that someone would be misled, but simply that it would be wrong. No one would be misled by nesting Japanese inside German, but it would be wrong, because Japanese isn't German. Likewise, Middle High German isn't German, at least as we use the term "German". —RuakhTALK 04:22, 16 November 2012 (UTC)
I agree with Ruakh that listing {{gmh}} as "Middle High German" but sorting it as "German, Middle High" would be a needlessly confusing bad idea. - -sche (discuss) 04:09, 16 November 2012 (UTC)