Wiktionary talk:Entry layout explained/POS headers


Part of speech headings

(Copied from a discussion begun in the Beer Parlour Sept 2006.)

In the entry layout explained (from Community portal > Entry layout) you can read:

====The part of speech or other descriptor====
This is basically a level 3 header but may be a level 4 or higher when multiple etymologies or pronunciations are a factor. This header most often shows the part of speech, but is not restricted to "parts of speech" in the traditional sense. Many other descriptors like "Proper noun", "Idiom", "Abbreviation", "Phrasal noun", "Prefix", etc.

I couldn't find a link to further details on POS headings and more specifically to questions such as: which POS headings are accepted, and what do they mean? Connel MacKenzie told me that POS headings were discussed last year and again this year, and that an agreement was reached over these headings. Unfortunately I couldn't find the outcome of these discussions, though I looked for them in the Grease Pit, as Connel suggested.

A few examples:

  1. A traffic light was a Noun, till someone called it a Noun phrase. Then Rodasmith changed it back to Noun, but without explaining why or without referring to any guideline.
  2. Many verb or noun forms seem to have a Verb form or Noun form heading, but apparently these headings are deprecated. Probably only Verb or Noun can be used in these cases, though - as several people have written - the inflection templates are inappropriate for non-lemmata.
  3. What about the heading Plural noun? See http://en.wiktionary.org/wiki/Talk:stadia.
  4. Why is Romanian a Proper noun and Russian a Noun?


There seems to be a need for an accepted and easy-to-find guideline on these POS headings. That could surely avoid discussions like the one on http://en.wiktionary.org/wiki/Talk:traffic_light.

Jan, 16 September 2006

Yes, you are quite right - this is something we do not yet have written policy on. "Russian" should be listed as a proper noun, but presumably whoever created the entry just wrote "Noun" and that has remained.
It can also be argued that the header "Adjective" for "Russian" should be "Proper adjective". We had a discussion some time back about restricting POS headers as they seemed to be proliferating unnecessarily.
Perhaps we should discuss and agree on a fixed set of POS headers. Some points to consider:
  • Ancient or modern? Traditional POS's are noun, adjective, verb, adverb, preposition, pronoun, interjection and article. Some modern dictionaries use terms such as "determiner", and "modifier". For example, "my" is traditionally a pronoun, but some dictionaries describe it as a possessive adjective. Some words do not fit conveniently into the boxes used by traditional grammarians: numerals are an example. "Two" (as in "two people") can be variously described as a numeral, a number, a cardinal number or an adjective, the last of these being the traditional POS. We generally use one of the other terms here, but it is debatable whether these headings are actually parts of speech.
  • Simple or precise? "Running" is an adjective (as in "running water" and "a running sore") but a more descriptive POS is "participial adjective", as "running" is also the present participle of "to run". In the verbal sense, "running" can be described as a "verb", "verb form", "verbal noun", "gerund" or "present participle". Similarly, nouns can be proper, common, abstract or collective, although dictionaries that make any distinction at all do so for the first of these only and label the other kinds as just "noun". "The" is an article, but is also the definite article.
Paul G 15:46, 16 September 2006 (UTC)
Perhaps not a huge issue, but I cannot find any justification for declaring demonyms (e.g. "Russian") to be proper nouns. They do not identify any specific individual, but rather one of a class of individuals, i.e. they are common nouns. English just seems to have been courteous enough to extend them capitalization from their proper noun roots. Rod (A. Smith) 23:55, 16 September 2006 (UTC)
But then Russian is also a language, wouldn't that make it a proper noun? Jonathan Webley 07:57, 19 September 2006 (UTC)

Yes, we need a policy. This should probably go in the policy discussion itself, but after calling things "noun phrase" and so on for a while, I stopped including the word "phrase". I think most people can easily see that it's a phrase, and it just clutters up the list. In general, I think transitive/intransitive should use the templates {{transitive}} and {{intransitive}} in the definition line, so there should be no need for a "Transitive verb" heading. Countable and uncountable fit neatly these days into the inflection templates or the definition line and likewise shouldn't take up heading space. Likewise, I think there should not be a heading "verb form", just "verb", for simplicity and consistency.

I can offer a couple of arguments for having such a standardized list of headers. First, if you hover the mouse cursor over a header like "Noun" you'll get a tooltip that says "part of speech" or some such. If you hover it over a heading that's not on the list, you'll get some notice about "not a standard header". The list of headers for which tooltips exist might provide an excellent starting point for this discussion, and the list should be updated when our policy is agreed upon. Secondly, a standard list of headers will allow better automated access to the data, both for bot cleanup efforts and for exporting. It'll make things look cleaner and more consistent, besides. Also, consistent formatting should help propagate consistent formatting, since anybody copying from another article will be copying the correct thing.

Incidentally, when we do standardize on preferred headings, it will be the perfect task for our team of bot-runners to go help tidy up the inconsistencies. If we settle on "Alternative forms", say, rather than "Alternative form" or "Alternative spellings", the extra variations will be quick and easy to consolidate using bots. Perhaps we could try before then to establish the bot guidelines we want. —Dvortygirl 16:10, 16 September 2006 (UTC)

And in fact this task is precisely what User:ScsHdrRewrBot is intended to do, and -- lookie that! -- it's already been approved. I haven't run it much yet, but you can look at its contributions to see examples of the relatively few header cleanups it's done so far. —scs 12:52, 17 September 2006 (UTC)
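
The kind of header consolidation described above can be sketched in a few lines of Python. This is only an illustration, not the code of User:ScsHdrRewrBot: the `CANONICAL` mapping and the `normalize_headers` function are hypothetical names, and the choice of "Alternative forms" as the preferred header simply follows the example given in the discussion, since no list had been agreed upon.

```python
import re

# Hypothetical mapping from variant headers (lowercased) to a preferred form.
# "Alternative forms" is taken from the example in the discussion; it is an
# assumption here, not an agreed policy.
CANONICAL = {
    "alternative form": "Alternative forms",
    "alternative spellings": "Alternative forms",
}

# A wikitext heading line: the same run of "=" on both sides of the title.
HEADING = re.compile(r"^(={2,6})[ \t]*(.*?)[ \t]*\1[ \t]*$")


def normalize_headers(wikitext: str) -> str:
    """Rewrite known variant headers; leave every other line unchanged."""
    out = []
    for line in wikitext.split("\n"):
        m = HEADING.match(line)
        if m:
            level, title = m.group(1), m.group(2)
            new_title = CANONICAL.get(title.lower())
            if new_title is not None:
                # Keep the heading level, replace only the title.
                line = f"{level}{new_title}{level}"
        out.append(line)
    return "\n".join(out)


entry = "==English==\n===Alternative spellings===\n* colour\n===Noun===\n"
print(normalize_headers(entry))
```

Only headers that appear in the mapping are touched, so a run of this kind would leave any header still under debate exactly as it was.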

The renewed conversations from this year can be found at WT:GP#Normalization of articles / User talk:Connel MacKenzie/Normalization of articles. Comments are still welcome. --Connel MacKenzie 22:09, 16 September 2006 (UTC)

Other classics, for those who like to read a lot: (Yes, I had trouble finding it because it was moved five times): Wiktionary talk:Entry layout explained/archive 2005BP#Uniform headings, and about eight other (more?) relevant sections of that same page. --Connel MacKenzie 22:21, 16 September 2006 (UTC)
On a side note, I think we need a policy regarding archiving. Most of the 2005 archive of talk:ELE is relevant, as is the 2004 archive. Ironically, the majority of those conversations were from this Beer Parlour, but were vandalously moved without leaving links behind. I don't think any of the conversations that were removed from WT:BP were completely finished with (as is evident by the same questions resurfacing one or two years later.) --Connel MacKenzie 22:25, 16 September 2006 (UTC)
On a side note, one thing that would help to (a) make these discussions easier to find and (b) not keep having them over and over again would be if we could all try to (c) centralize them on the talk pages for the relevant policies in the first place and (d) actually update the policy pages once we reach an actual consensus! —scs 13:48, 17 September 2006 (UTC) [Memo to self: wander on by to WT:ELE sometime soon and be bold in altering it to fit reality.]

Miscellaneous notes and opinions:

  • It would be useful to keep in mind why we're tagging words with their part of speech at all. Is it
    1. For the benefit of readers who are learning English or grammar
    2. To separate the definitions for entries that have senses in multiple parts of speech, and/or
    3. To satisfy our deep inner craving to rigorously categorize things?
For my own part, I'd like to focus on 1 and 2 (although I'm the first to admit that I've got the categorization bug, too; it's just one I try to keep in remission). There shouldn't be any shame in saying, for the really weird and hard-to-categorize words, that their part of speech is "other". (Of course, there's a significant logistical difficulty here in that our entries don't say "Part of speech: _____". A hypothetical "Other" as a part-of-speech heading under our current scheme would be confusing and wouldn't really work at all.)
  • I'm probably starting to sound like a broken record on orthogonality, but part of speech is really orthogonal to qualifications like "phrase" and "abbreviation". That is, many phrases, abbreviations, and contractions have meaningful parts of speech (though of course many do not). An interesting example I came across recently is HEPA, which you see on more and more vacuum cleaners and air filters, which stands for High Efficiency Particulate Air, and which is therefore pretty much an adjective. My point here is that, strictly speaking, things like "Phrase", "Initialism", "Abbreviation", "Contraction", and "Idiom" are not parts of speech at all, and a mechanism which specifies or categorizes parts of speech should arguably not be overloaded with trying to capture these distinctions, too.
  • If the distinguishing quality of "Noun phrase" versus "Noun" is "has a space in it", that's a pretty useless distinction, because any reader can see this for themselves. If we maintain a distinction for "noun phrase", it should be for longer, true phrases, like "the weather in London". Things like "lawn mower" are, I believe, pure and simply nouns. (In this case it's easy to prove, given that the spelling "lawnmower" also exists.)
  • Personally, I agree with Dvorty and others that the transitive/intransitive distinction is of secondary interest and should appear (if it appears at all) in tags on the definition lines for individual senses, not prominently in the Verb header. Similarly for countable/uncountable (which we do tend to do that way), and for concrete/abstract nouns (which we don't tend to try to capture, which is probably a good thing, 'cos it ends up being not such a clear-cut distinction after all).
  • Yet another distinction is for proper nouns. Those I don't mind being called out in the p-o-s heading, though I could go either way.
  • A somewhat trickier case is for the several words we've currently got listed using variations on "Adjective and adverb", such as quite. I'm not sure what the best way to handle those is.
  • As came up in the "nouns used as adjectives" thread, it can be argued that parts of speech in English are not nearly as rigid as we think they are, such that their use in a dictionary like ours could profitably be abandoned or drastically reworked, although that's probably too radical a proposal for today. (But the idea, I think, would be that instead of saying "moo: noun: 1. the sound made by a cow. verb: 1. to make a mooing sound", we could instead say "moo: 1. The sound made by a cow. 1a. (noun) an instance of this sound. 1b. (verb) to make this sound.")
  • Yes, I have just completely ignored the suggestion I myself just made to keep long screeds like this one centralized on the relevant policy page's talk page...

scs 14:48, 17 September 2006 (UTC)

English POS

(Copied from a discussion begun in the Beer Parlour Sept 2006.)
  • I am of the opinion that the following are the only things that should be used as part-of-speech headings for English entries: Symbol, Noun, Verb, Adverb, Adjective, Pronoun, Interjection, Article, Conjunction, Abbreviation, Initialism, Acronym and the x phrase derivations (noun phrase, verb phrase etc). This excludes "Verb form" because I think that everything which is labelled as "Verb" is a "verb form" whether it is the infinitive or the second-person plural of the past participle. - TheDaveRoss 18:59, 17 September 2006 (UTC)


Um, for English I would add Preposition, as well as Cardinal number, Ordinal number, Idiom and possibly Phrase (though I haven't seen a clear example of the latter yet that couldn't be classified as something else). While it is true that headings like Noun form and Verb form have little utility in English, they have tremendous utility in highly inflected languages. I use them in Latin and Spanish when I am writing a non-lemma entry so that other editors will have a cue that the information about the word is not on that entry page and should not be added there.
People keep saying that "there has been discussion" but all the links that I can find to such discussion seem not to have reached a conclusion with even a partial list of acceptable POS headers. Could we create an entry layout page (and corresponding talk page) where a list of accepted, debated, and rejected options could accrue? --EncycloPetey 22:06, 17 September 2006 (UTC)
In the Category:Phrasebook there used to be a lot of entries like do you speak English? that used the heading ===Phrase===, as nothing else fit. All entries in that category should use this heading for consistency. --Connel MacKenzie 09:30, 22 September 2006 (UTC)
The link on WT:GP to User talk:Connel MacKenzie/Normalization of articles seems to be missing? Perhaps that's why you haven't seen some of these discussions? --Connel MacKenzie 09:34, 22 September 2006 (UTC) No indeed, that too was inconclusive. My apologies. --Connel MacKenzie 09:40, 22 September 2006 (UTC)
A generic ===Phrase=== might be necessary in rare cases, but let's put the ===X Phrase=== issue to rest. If you think phrase is a necessary modifier, please distinguish the following (from the BP) between nouns and noun phrases:
ice cream, guinea pig, legal tender, milk of magnesia, high spirits, prisoner of war, batchelor's son, palm tree justice, apples and oranges, law of diminishing returns, one of his majesty's bad bargains
DAVilla 06:51, 8 October 2006 (UTC)

The ideas of POS are a confused collection of ad-hoc description and tradition. They should be abandoned. The carefully reasoned system worked out in The Cambridge Grammar of the English Language by Huddleston & Pullum reflects modern thinking and the consensus of the majority of current linguists. It is much more accurate and consistent than the traditional view while being no more difficult; in fact, its consistency makes it simpler. Insisting on maintaining the anachronistic POS model is like ignoring non-Euclidean geometry, quantum physics, evidence-based medicine, or evolutionary biology. The Wikipedia entries for prepositions, determiners, etc. generally reflect the Huddleston & Pullum system, but the best thing to do is get a copy. Wiktionary should adopt this system.--BrettR 15:09, 7 January 2007 (UTC)

While I think our current implementation of "part-of-speech" headings is very poor, I would strongly object to replacing it with such an exotic system, unlikely to be familiar to any of our readers. --Connel MacKenzie 03:41, 8 January 2007 (UTC)
Have you actually had a look at this system? Based on what do you label it "exotic"?--BrettR 12:12, 8 January 2007 (UTC)
I have not picked up a copy of the book, and have now only started on those Wikipedia articles.
I labelled it "exotic" because it seems to follow the trend of describing things with too much detail, for the average English reader. Our target audience is not the linguist. Our target audience is the typical English speaker, trying to find the precise spelling and meaning of a term. Certain information can be conveyed, without making it harder for the reader to navigate an entry...there is no need to make the article navigation more difficult than it already is.
I very much doubt that our average reader could tell the difference between a postposition and a circumposition. Outside of linguistics, such terms are very unfamiliar. [Note: they are also marked as likely misspellings by Firefox's spellchecker.] (When I was in school, those were all lumped together with "prepositions" making no placement distinction.)
That is why for English parts of speech, I have continually campaigned for using only the "big eight" parts of speech: Noun, Verb, Adverb, Adjective, Pronoun, Interjection, Article, Conjunction and a smattering of (obvious) other headings Proper noun, Symbol, Abbreviation, Acronym, Initialism, Phrase, Idiom. Why? Because I am pretty certain I am not the only one who was taught that all words fall into one of those categories. I am inclined to believe that the overwhelming majority of en.wiktionary.org readers have had similar instruction.
That said, the barn door is wide open. Having other level-three headings in use for other languages has paved the way for many more headings in English than we should reasonably have. Should they be eliminated/normalized? I do think so. Should that information be lost from Wiktionary? Absolutely not. Using a tag to identify a {{determiner}} would still convey as much information to the linguist (who happens to be a reader too!) while still making the information accessible to everyone else.
--Connel MacKenzie 04:42, 13 January 2007 (UTC)
Note: You didn't include preposition in your big eight. Traditionally, the articles are taught as adjectives, which is why you missed listing prepositions in the eight. You're forgiven for missing them ;) They were the last of the big eight that I learned, mostly because Saturday morning School House Rock only covered the other seven. --EncycloPetey 05:05, 13 January 2007 (UTC)
I apologize for copying the list TheDave had above, without examining it more closely. --Connel MacKenzie 15:04, 13 January 2007 (UTC)
I suppose a lot of readers did have the same instruction, and the "big eight" will ring a distant bell for a lot of people from their School House Rock days. But I think the majority of readers don't actually remember the difference between a pronoun and a preposition, or between an adverb and an adjective. These people are coming to Wiktionary primarily for the definitions, and they'll infer all the grammatical information they need from the example sentences. So we are targeting the people who actually pay attention to the POS headers, and if you assume that they appreciate the "obvious" difference between Nouns and Proper nouns, for example, they should also be able to handle Determiners vs. Adjectives. (They are already expected to know what Articles are, but then there are only two of them in the whole language: a whole POS heading for 2 words???)
The Articles heading may apply to only two words in English (and three entries), but for highly inflected languages like Greek and Latin, it covers dozens of entries and becomes a very useful header for English speakers by quickly alerting the reader that the entry is equivalent to "a" or "the". It's also significant as a category because many languages have determiners, but lack words that function as articles. In particular, Hungarian and the Slavic languages do not have articles but do have determiners. --EncycloPetey 18:35, 13 January 2007 (UTC)
I was more trying to point out the strangeness of accepting such a numerically insignificant POS as obviously standard and resisting something like Determiner, which would be at least as useful. And I anticipate a debate about the continuing usefulness of "Article", if "Determiner" is adopted, because what is an article except a determiner that contributes only information about definiteness and basic quantification (number, partitivity), in a language that has some syntactic requirement for certain NPs to include a determiner? (Which, by the way, suggests that Classical Latin has no articles, but several sets of demonstrative determiners, and the numeral one. (But maybe you didn't mean Latin as a real example.)) CapnPrep 19:16, 13 January 2007 (UTC)
Ah, no! I had included Latin as an example of an inflected language only, not as an example of a language with Articles (which Classical Latin does not have). Greek happens to be both, but you are correct to clarify that point about Latin not having articles. Which is odd to think on, since its descendant languages (e.g. Italian, French, and Spanish) all have articles derived from the same roots. --EncycloPetey 19:49, 13 January 2007 (UTC)
It's a bit strong to say that the traditional POS categories "should be abandoned". Linguists use these traditional terms all the time, but they strive to provide explicit definitions. And once you do that, it turns out that there are some leftover words that just don't fit anywhere. So we need a few additional ("exotic") categories. But then, why should it come as such a surprise that the tools we were given in 4th grade are not quite sufficient for dealing with real life? CapnPrep 08:56, 13 January 2007 (UTC)
I'm not sure "abandoned" actually is too strong, rather it is misleading. For heading information, the only thing they have in common is "Definitions." I feel stronger with each iteration, that our current system of headings is far too inflexible. Limiting parts of speech to the definition lines (using our current {{cattag}} system or something similar) would regroup information in a useful manner, while allowing the linguistic debates to carry on much more freely. That is, the correction of a tag from noun to determiner wouldn't raise an eyebrow, since the arrangement of the Wiktionary entry would not be changing. But that would only be acceptable if we had a solid mechanism in place for retaining the "POS" information on the definition lines. We have only the start of such a system, currently. --Connel MacKenzie 15:04, 13 January 2007 (UTC)
Abandoned is too strong, I agree. The system in the Cambridge Grammar of the English Language (CGEL) does take this as a starting place. In fact, the actual terms used are not much different from traditional POS. The lexical categories are:
  1. noun
    1. pronoun
  2. verb
  3. adjective
  4. adverb
  5. preposition
  6. determinative
  7. subordinator
  8. coordinator
  9. interjection
The major difference is the careful reasoning that goes into putting words into those categories and in keeping lexical categories separate from phrasal categories and grammatical function, unlike traditional POS which confounds the three. Phrasal categories are:
  1. clause
  2. verb phrase
  3. noun phrase
  4. nominal
  5. adjective phrase
  6. preposition phrase
  7. determinative phrase
Grammatical functions are:
  1. Head
    1. Predicate
  2. Dependent
    1. complement
    2. modifier
    3. determiner
A dictionary would only be concerned with assigning lexical categories. An understanding of the others will be useful in determining which lexical category to assign, but simply because a word functions as, say, a modifier in an NP, does not make it an adjective because it is recognized that a variety of lexical categories can modify nouns. Thus, function is kept distinct from category. Instead, a variety of tests (morphology, semantics, licensed dependents, etc) are used to determine function. For the vast majority of words, this system will result in the same basic headings as traditional POS.--BrettR 13:42, 13 January 2007 (UTC)

Tables

I have added/am adding tables describing all (that I can find ;-) of the current usage.

I've put the "[noun,verb,adjective] form" headers in as standard because they are in widespread use, including for English verb forms, and I do think they should be standard, though I understand this is still being discussed. This is a first draft.

Note that the way the tables are set up, every header used should be listed there somewhere, even if in the last table, so please don't delete any; changes should involve moving them. (unless you eradicate, say, "Cmavo" ;-)

If you know of any others, please add them where you think is appropriate. Robert Ullmann 13:57, 18 September 2006 (UTC)
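
One way to check an entry against such tables is sketched below in Python. It is a rough illustration under stated assumptions: `LISTED_HEADERS` is a small made-up subset standing in for the full tables, and `unlisted_headers` is a hypothetical helper, not an existing tool. Language-level headers (level 2, such as ==English==) are deliberately skipped.

```python
import re

# Illustrative subset only; the actual tables list many more headers.
LISTED_HEADERS = {"Etymology", "Pronunciation", "Noun", "Verb", "Adjective", "Adverb"}

# Level 3 to level 6 headings; level 2 (the language name) does not match.
HEADING = re.compile(r"^(={3,6})[ \t]*(.*?)[ \t]*\1[ \t]*$")


def unlisted_headers(wikitext, listed=LISTED_HEADERS):
    """Return, in order of appearance, header titles absent from the listed set."""
    found = []
    for line in wikitext.splitlines():
        m = HEADING.match(line)
        if m and m.group(2) not in listed:
            found.append(m.group(2))
    return found


entry = "==English==\n===Etymology===\n===Noun===\n===Cmavo===\n====Verb form====\n"
print(unlisted_headers(entry))  # ['Cmavo', 'Verb form']
```

A header reported here is not necessarily wrong; under the rule above it is just one that still needs a row in one of the tables.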

Ordinal number

Several bits of the discussion suggest that "Ordinal number" is used for, e.g. first and tenth. But it isn't! Those use the Adjective header. Searching on "Ordinal" doesn't show any entries using it as a header that I can find. (Not that the search is always that good!) It is a category. Robert Ullmann 10:50, 19 September 2006 (UTC)

It is being used in Latin (eg. quattor), and should be used in English as well, since first functions as both a noun and adjective. --EncycloPetey 17:35, 19 September 2006 (UTC)
In first, there are Adjective and Noun headers. What would you do with "Ordinal number"? ;-) Robert Ullmann 17:53, 19 September 2006 (UTC)
For English, I'm not sure. The entry for ten uses Cardinal number and Noun, but not Adjective. The entry for tenth uses Adjective and Noun, which is inconsistent with what we're doing for ten. For Latin, I would simply use Cardinal number and Ordinal number without Noun or Adjective, because the grammar of these words is different from what is typical for nouns or adjectives and this fact shows up in grammar books that are detailed enough. For English, we don't usually think beyond the 8 standard POS taught in school. Personally, I would put most of what we mean by "first" into Ordinal number, with an Appendix page linked that explains general usage patterns of ordinal numbers in English. --EncycloPetey 18:13, 19 September 2006 (UTC)

I'd like to point out that there doesn't seem to have been, or be, any policy against using Ordinal number in English entries; it just so happens that it isn't used in any extant entry. Robert Ullmann 09:34, 20 September 2006 (UTC)

Is there anyone reading this who has an objection then? I'll wait a bit, and if I hear nothing will begin making the change-over. I need a good straightforward edit project that doesn't require much thought to execute, if only for the change from writing About Latin. --EncycloPetey 22:14, 21 September 2006 (UTC)
  • AFAIK, those entries are supposed to be listed as ===Ordinal number===s. Systemic changes should be discussed as a separate 'bot request in WT:BP. --Connel MacKenzie 22:19, 21 September 2006 (UTC)
But a bot would only replace the headers. From what I've seen, there's also a bit of format tidying, linking, and normalization to be done as well. I've also started a file of quotations from literature that I would start adding. With all that other work, there's no need to bother with a 'bot, since I'd be going through the pages anyway. --EncycloPetey 22:25, 21 September 2006 (UTC)

It strikes me that English grammar recognizes only 8 parts of speech. In Dutch grammar there are 10! Nobody discusses Article or Numeral as POS. It is shocking for me to see ten or tenth treated as an Adjective. And I think this holds for many contributors whose mother tongue isn't English: look at the articles describing the translations for words like first or two and see what POS headings the non-English articles use. Not very consistent?
The problem with Cardinal number and Ordinal number is that they could be split by the heading Noun, if we accept that POS headings should be given in alphabetical order (and that one word could fall under these 3 headings). Perhaps the heading Numeral could solve this problem. It could then be split in {{cardinal}} and {{ordinal}}.--Jan, 22 September 2006 (UTC)
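
The ordering concern can be made concrete with a short Python sketch. It assumes, as the comment above does only hypothetically, that POS headings are sorted alphabetically; `order_headers` is an illustrative name.

```python
def order_headers(headers):
    """Sort POS headers alphabetically, ignoring case."""
    return sorted(headers, key=str.lower)


# With separate Cardinal/Ordinal headers, Noun lands between them.
split = order_headers(["Noun", "Ordinal number", "Cardinal number"])
print(split)   # ['Cardinal number', 'Noun', 'Ordinal number']

# With a single Numeral header, the numeral senses stay together.
merged = order_headers(["Noun", "Numeral"])
print(merged)  # ['Noun', 'Numeral']
```

The split only arises when one page carries all three headers, so how often it matters depends on how many such words exist.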

Erm... that's only a problem if Cardinal number and Ordinal number appear on the same page. As far as I know, they are never the same word in Indo-European languages, so what languages are you worried about? Can you give an example of a word where this will be a problem? --EncycloPetey 15:23, 22 September 2006 (UTC)

Number versus Numeral

I've done some checking. Grammars use number to refer to issues of singular, plural, etc. They use numeral when referring to a part of speech in a language. Should we revise our header names (and categories) to reflect this difference in usage? --EncycloPetey 00:26, 24 September 2006 (UTC)

Seems 50-50 on its own merits, but I'm all for it if it reduces confusion, and probably more so if it's standard. DAVilla 06:36, 8 October 2006 (UTC)

Take a look at our own definitions. A number is "an abstract entity used to describe quantity", but a numeral is "a word or symbol representing a number". By our own definitions, numeral is the correct term to use here. --EncycloPetey 17:51, 19 October 2006 (UTC)

If you are referring to cardinal number and ordinal number, I have never heard of cardinal/ordinal numeral before. To me, seven is a number, and the numeral 7 represents it. The numeral representing the cardinal number seven is 7. —Stephen 04:46, 19 October 2006 (UTC)
Stephen and Connel, these headers are being used on the other Wiktionaries. For example, see pt:first, where ===Numeral Ordinal=== is the header, or it:primo, where they use ===Aggettivo numerale=== (Ordinale). My preference for Numeral is that it's a grammatical term, and English is the only language I have seen regularly use the term "number" in this situation. As a math instructor, I can say that 7 is referred to alternatively as a number or a numeral in math situations, with little distinction to the majority of Americans. The important issue here is the grammar. When you get into inflected languages outside of English, this becomes important because the inflection varies differently from other parts of speech. English doesn't have much inflection, so we don't see it in English.
And Connel, I'm not introducing a new header, since Ordinal number and Cardinal number are already in use (in some form) both on the English Wiktionary as well as those in other languages. Rather, I'm proposing that we standardize to a consistent form within our project that will match across other Wiktionary projects as well. --EncycloPetey 17:39, 19 October 2006 (UTC)
Google counts - "Cardinal number" 221,000 - "Cardinal numeral" 1,020. Sod what they do in other languages. Our average user has obviously voted !--Richardb 05:56, 23 May 2007 (UTC)
As someone trained in teaching statistics, I will note that your argument is fallacious. It is not "one entry, one vote." It is quite possible that one or a few users entered a very large number of entries. It is also possible that one or a few users edited many entries. Your count also does not cover "Number" versus "Numeral", which is a possibility for a header. It also does not consider the fact that an entry with a header of "Cardinal numeral" could be defined as a "cardinal number", and in fact I would do it that way (using Numeral for the part of speech, but number for the definition; just as I might use "Noun" for the POS but would not use "noun" as part of the definition). There is thus no such "obvious" conclusion in the flawed little survey you conducted. --EncycloPetey 15:22, 24 May 2007 (UTC)

Why include Cardinal / Ordinal?

In the voting regarding the choice between using Number or Numeral several people questioned the need to use the terms Cardinal or Ordinal at all in the header. The rationale for keeping them is that in many inflected languages, the Cardinal numerals and Ordinal numerals have different inflections, and so each requires its own inflection line. It becomes awkward when the inflection line always immediately follows the POS header, but two different inflections are required. Granted, many languages have different words for their cardinal and ordinal numerals, but I've turned up some cases where the words are identical. If someone out there can propose a format compatible with the current ELE that accommodates situations where a single entry will function as both cardinal and ordinal numeral and accommodates multiple inflection lines, please make the proposal.

For now, I'm playing around with formats on the assumption that we'll proceed with Cardinal numeral and Ordinal numeral, but I'm not moving too quickly on that. I want to be satisfied I've worked out the bugs before attempting any major editing. To see possibilities, keep an eye out for entries in Category:Cardinal numerals, Category:Ordinal numerals, and Category:it:Cardinal numerals. I'm limiting my edits to a few articles for now until it looks right and works right before expanding my efforts. --EncycloPetey 04:25, 14 November 2006 (UTC)

X form

Adjective form, Noun form, Verb form ... note that there are several separate questions:

  • Should English use Verb form as a header? (English does not decline Nouns or Adjectives)
  • Should other languages use X form as headers?
  • Should the usual categories separate (e.g.) verbs and verb forms?

Note that the present document does not include Verb form in the section describing English, except for a note about it being discussed.

The table (standard POS headings for all languages) in this same revision includes X form lines as they are in widespread use in many other languages, with the corresponding categories, and have strong support from some people.

English has Category:English verbs and Category:Verb forms.

Other languages usually have both if applicable; this of course varies with language. Robert Ullmann 09:31, 20 September 2006 (UTC)

I oppose these forms, I prefer simple "noun" and "verb" type headings, even for declensions and conjugations. - TheDaveRoss 18:37, 21 September 2006 (UTC)
I support using X form for non-lemma entries from inflected languages, but have no strong opinion one way or the other for English and languages inflected to a similar degree. For inflected languages, using X form helps guide users to lemma forms. Consider that using only Verb would mean that the Spanish verb nadar would have 62 additional entries listed in the Verb category just from the simple tenses alone. Separating them puts the lemma in the Verb Category and all the others in the Verb form category. This also assists when editing or visually inspecting an article. There are some languages for which it will not be clear which form is the lemma. Latin is one such case. We seem to be using the infinitive as the lemma form most often, but every Latin dictionary I've ever seen uses the first person singular present active indicative form as the lemma. By using "Verb form", we clue in later editors that the page should not be expanded because the page in question is a non-lemma. --EncycloPetey 22:23, 21 September 2006 (UTC)
I think you missed one of my points above: the categorization is separable from the POS header. Just as we use Noun for both singular and plural forms, but then cat in Category:English nouns and Category:English plurals we can use Verb and cat in (language) verbs, verb forms, and various specific verb forms. As to identifying the non-lemmata more clearly, I have an idea based on something that has been bugging me since I first started seriously working on the wikt. I'll expand on it infra a bit later today ... Robert Ullmann 13:44, 22 September 2006 (UTC)
As I have said elsewhere, but perhaps was not clear on, I too oppose any inclusion of " form" in headings, for English or for other languages. --Connel MacKenzie 22:28, 21 September 2006 (UTC)
In principle I am against the " form" POS headings, in English and other languages. For reason of simplicity! (By the way, we should try not to distinguish between English and other languages, also to not complicate matters.) The form vs form discussion should be linked with the corresponding templates. The lemma and non-lemma could both have the heading Adjective, Noun or Verb, but the distinction should be made by the added template. The lemma must be accompanied by the default template en-adj, en-noun or en-verb. The non-lemma must not be accompanied by one of these templates (for which they do not fit at the moment!), but one of the non-lemma templates like plural of| or past of|. After the creation of the needed templates, this distinction could be used for Latin and similar languages.
That's the theory. But look at a few (difficult) cases: stadia (various Noun headings under various etymologies?) and sheep (what about the singular and the plural under the same heading? and what about the appropriate template (for both?) ?). --Jan, 22 September 2006 (UTC)
Connel, have you taken a look at what will be required to have templates for all the Latin inflectional forms? A partial list includes (names of templates not decided):
* la-adj : (for the positive masculine or positive common or positive all genders)
* la-adj-pos-f : (positive feminine)
* la-adj-pos-n : (positive neuter)
* la-adj-comp-m : (comparative masculine)
* la-adj-comp-f : (comparative feminine)
* la-adj-comp-c : (comparative common)
* la-adj-comp-n : (comparative neuter)
* la-adj-sup-m : (superlative masculine)
*:...and this continues but only gets us the nominative singular forms of adjectives!.
We then still have to note genitive, accusative, dative, vocative (and some locative) for all of the above, just to finish the singular forms. Then everything is doubled to accommodate the plurals. It then proceeds to pos/comp/sup forms of adverbs, nouns in the singular and plural of the nominative, genitive, dative, accusative, vocative, (and locative for some). It includes inflected forms of all the pronouns, and we still haven't mentioned verbs or irregulars. And then there's Greek...
Either the system you propose will generate mountains of work for template writers, or we need to rethink it for simplicity. --EncycloPetey 14:53, 22 September 2006 (UTC)
Parameters, Sir, Parameters! {{la-adj-form|pos|f}} etc. Robert Ullmann 15:22, 22 September 2006 (UTC)
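The parameter idea can be illustrated outside wikitext. The following is only a minimal Python sketch, assuming a hypothetical `la_adj_form` function that stands in for a single parameterized template; the parameter codes (pos/comp/sup and m/f/n/c) are taken from the template names listed above, while the function name and the output wording are invented for illustration and do not describe any existing template.

```python
# Hypothetical sketch: one parameterized "template" in place of one
# separately named template per degree/gender combination.
DEGREES = {"pos": "positive", "comp": "comparative", "sup": "superlative"}
GENDERS = {"m": "masculine", "f": "feminine", "n": "neuter", "c": "common"}


def la_adj_form(degree: str, gender: str) -> str:
    """Return a descriptive label for an inflected Latin adjective form."""
    if degree not in DEGREES:
        raise ValueError(f"unknown degree code: {degree!r}")
    if gender not in GENDERS:
        raise ValueError(f"unknown gender code: {gender!r}")
    return f"{DEGREES[degree]} {GENDERS[gender]}"


# Rough equivalent of {{la-adj-form|pos|f}}
print(la_adj_form("pos", "f"))

# Two small lookup tables cover every degree/gender pairing.
print(len(DEGREES) * len(GENDERS))
```

Under these assumptions, adding case and number would mean adding two more lookup tables rather than multiplying the number of named templates, which is the trade-off under discussion here.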
But that still requires a mountain of work for template writers who have to contend with all of this. Could we see a sample such template done for Spanish verb forms (and perhaps one for Latin nouns), in order to see how this will work? Remember that Latin nouns have both a display form with macrons and an entry form that lacks them. --EncycloPetey 15:26, 22 September 2006 (UTC)
Oh dear. The entry must always be as written. The display must always be as written. If these are different you are in a world of hurt, and will be there until someone cleans up the mess ... If Latin is written with the macrons the entry form is wrong. If it isn't, the display form is wrong (it belongs in pronunciation) Robert Ullmann 12:48, 23 September 2006 (UTC)
No, neither form is wrong. The macrons do not belong in the pronunciation section; they are present because (1) all textbooks and dictionaries use them as standard for entry header forms, but they are not used in the entry page name because (2) the Romans and medieval writers didn't use them. See About Latin, where I've started collecting information about Latin entries for a draft. Yes, the templates get fussy over the macrons, which is why RodASmith has submitted additional code (to wherever such proposals go) in order to make his draft {la-noun} template work. RodASmith understands the issue, as he knows a bit of Latin himself. We will have a similar issue with Ancient Greek, because of similar conventions regarding the differences between ancient texts and modern editions. --EncycloPetey 00:24, 24 September 2006 (UTC)
The discussion to this point on the page has been moved to Wiktionary:About Latin, since it is more relevant there. (Originally by EncycloPetey 17:49, 9 October 2006 (UTC), updated by DAVilla 08:18, 10 October 2006 (UTC))
Way OT here, continued there. Robert Ullmann 17:57, 9 October 2006 (UTC)

Edit: (current revision) Since I have sort of assigned myself the role of editor ... (not of course exclusively or anything!) I should say here as I did on WT:BP, I'm fairly agnostic on X Form (do kind of like it), the document is intended to represent what is being used, what is not controversial, and some intermediate "take" on what is being discussed. (As the discussion progresses, someone has to edit the doc, else it goes on endlessly as it has before.) All that said, I've changed the tables to reflect more of where we seem to be. This is still very much a draft. Robert Ullmann 14:24, 22 September 2006 (UTC)

I'm also against "X form" headers. Ncik 22:29, 4 October 2006 (UTC)

Shortened forms

See the vote on this topic in Wiktionary:Votes/2006-12/POS headers.

The headings Abbreviation, Acronym, Initialism (and Symbol) should not be used as POS headings (or restricted to very few cases). Words such as CD and TV must have a Noun heading. Linguistically there is a fundamental difference between an initialism and a noun: the first has to do with etymology, the latter gives us syntactic (and semantic) information. Priority should be given to real POS headings, because only these can tell us whether a word can replace the dots in e.g. I bought a new .... Many acronyms and initialisms are nouns, but ASAP, BTW, IMHO or LOL are not. Words like CD and TV can have a plural and should be accompanied by the en-noun template. But then the heading cannot be something like Abbreviation or Initialism. And surely not Initialism noun as the Norwegian CD. SMS now has an Initialism heading (which should go under the Etymology heading, perhaps as {{initialism}}), but should have a Noun and/or Verb heading to tell us which function it can have in an English sentence. --Jan, 22 September 2006 (UTC)

It never was a "POS heading" - the last time I looked, ELE was clear that it is a misnomer to call it a POS heading. I would strongly prefer that {{pos_n}} be used at the start of the definition line in the examples you gave above. --Connel MacKenzie 14:24, 22 September 2006 (UTC)
Connel, could you explain the meaning (and the advantage) of pos_n. There is no talk page for that template, and its "discussion" page was empty. --Jan, 23 September 2006 (UTC)
I didn't see this out-of-sequence comment earlier. I assume you followed the conversation, below? Hippitrail's "pos" templates mimic how you'd see the part of speech in a "normal" dictionary, not abbreviated only to save space, but to not distract from the definition given. Hovering over the italicized abbreviation with your mouse gives the expanded form as a "hint text." --Connel MacKenzie 02:47, 4 January 2007 (UTC)
Headers such as Acronym and Initialism say something about how the word is capitalized / spelled, which is important information. While the terms do name a POS, they should be used in parallel fashion to assist English learners with spelling and capitalization. --EncycloPetey 15:35, 22 September 2006 (UTC)
I'm not against acronym and initialism, but they shouldn't be used on the same level as POS headings (like Noun, Verb). What would you say if fagus had a Noun heading and quercus a Nominative heading (on the corresponding line)? Noun tells us that the word can be used as subject or object (and that it is a concrete or abstract entity), whereas Nominative doesn't offer the same kind of information (it could be Noun or Adjective). The same holds for Abbreviation, Acronym and Initialism: these indications offer valid information on etymology and/or spelling and/or pronunciation, but not on syntactic (and semantic) possibilities. How should a grammar checker know that sentences like Send your SMS to ... or SMS your answer to ... or Send me very SMS are grammatical if SMS is described as an Initialism? It can only be done with headings like Noun and Verb, excluding adverbial use. Abbreviation, acronym and initialism are valid indications, but should go under headings like Etymology, Pronunciation (UFO vs. ufo) or Usage notes. --Jan, 23 September 2006 (UTC)
And you're right that the POS should be included in the entry, but I agree with Connel's suggested format, such as:
==English==
===Acronym===
'''CD'''

# {{pos_n}}: [[compact disk]].
...though I do think that the cryptic output of {pos_n} ought to be expanded to (noun).
While most acronyms should link to the expanded form of the word, making it clear what POS is intended, it might be a good idea to specify how to handle the POS of acronyms and the like in the WT:ELE. --EncycloPetey 00:37, 23 September 2006 (UTC)
If SMS links to its expanded form, which seems to be a Noun phrase (most people would opt for Noun), what lets me conclude that SMS can (also) be a Verb? --Jan, 23 September 2006 (UTC)
I'm with Jan on this one. Ncik 22:27, 4 October 2006 (UTC)
I agree. Saying {{pos_n}} {{pos_v}}:... is unwieldy. Perhaps we should have a method similar to {{cattag}} for this type of approach? --Connel MacKenzie 22:34, 4 October 2006 (UTC)
Then how about a revised {pos} template that takes parameters for the name of the language, then the POS (one or more). The template would then display the POS in-line, and add Category tags for the correct POS in the included language.
  • Example: {pos|English|adjective} displays: (adjective) and adds the entry to Category:English adjectives.
  • Example: {pos|Dutch|noun} displays: (noun) and adds the entry to Category:Dutch nouns.
With a little extra finesse, the POS could be given in abbreviated form. --EncycloPetey 23:28, 4 October 2006 (UTC)
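The behaviour proposed for the revised {pos} template can be sketched as follows. This is a hedged Python illustration only, not wikitext and not an existing template: the function name `pos`, its return shape, and the naive pluralisation by appending "s" are all assumptions made for the example. It takes a language name and one or more parts of speech, and returns the in-line label together with the categories the entry would be added to.

```python
# Hypothetical sketch of the proposed {pos} template's behaviour.
def pos(language: str, *parts_of_speech: str) -> tuple[str, list[str]]:
    """Return the in-line label and the category names for an entry."""
    if not parts_of_speech:
        raise ValueError("at least one part of speech is required")
    # In-line display, e.g. "(adjective)" or "(noun, verb)".
    label = "(" + ", ".join(parts_of_speech) + ")"
    # Assumption: the category name is the POS plus "s"; this covers
    # noun, verb and adjective but not every possible heading.
    categories = [f"Category:{language} {p}s" for p in parts_of_speech]
    return label, categories


label, categories = pos("English", "adjective")
print(label)
print(categories)
```

Passing several parts of speech, as in `pos("English", "noun", "verb")`, would yield one combined label and one category per part of speech under the same assumptions.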
This proposal doesn't make it any better. Jan's criticism is much more fundamental. And I agree with him on that as a whole. Ncik 23:37, 5 October 2006 (UTC)
For those of us who don't see the problem, could you please elaborate? What is wrong with {{pos|English|noun|verb}} as the line qualifier? --Connel MacKenzie 21:26, 6 October 2006 (UTC)

First, let me note that no one is saying (e.g.) Initialism is a POS header. It is a standard non-POS level 3 header.

We don't need to go inventing a new tag for definition lines; we have perfectly good standard structure and syntax. If it is more than one POS/whatever it gets more than one POS header, right? Like we always do? (;-) SMS is an excellent example. The initialism, the noun, and the verb are three different things. Note in particular that the initialism refers to the service, while the noun refers to a message on the service. The entry should (like every other entry in the entire wikt!) use the appropriate POS/L3 headers: Initialism for SMS the service, Noun for SMS the message, Verb for the act of SMSing someone. (yes, that is a real everyday form of the verb SMS!) See SMS. Robert Ullmann 18:10, 7 October 2006 (UTC)

But why not put what you call the "initialism" under the "Noun" header as well? "SMS" when meaning "Short Message Service" is a noun, we surely agree on that, don't we? Moreover, "SMS" when meaning "text message" or "sending a text message" are initialisms too (well at least the uninflected forms). Just read the definition on "initialism". What I suggest is mentioning the fact that a word is an initialism in the pronunciation section, because it is the pronunciation that determines whether an abbreviation is an initialism or an acronym. This also removes the trouble with inflected forms that aren't initialism any more, because they have their own pages and hence their own pronunciation sections. Ncik 01:21, 12 October 2006 (UTC)
Because the 'thing' being described isn't the referent noun, it is the initialism itself. --Connel MacKenzie 06:28, 19 October 2006 (UTC)
But why should this justify a separate heading? It's not done like this in any other case, such as inflected forms for instance. Ncik 00:20, 22 October 2006 (UTC)
We don't? --Connel MacKenzie 20:43, 16 November 2006 (UTC)
Thanks! I don't know about everyone else, but for me that makes the issue and potential solutions make a great deal more sense. --EncycloPetey 17:41, 9 October 2006 (UTC)

Substantive adjective

(Please note that this is as much a policy question as policy discussion.) I have on several occasions used the header "Substantive adjective;" I was wondering if there is a better way to represent this or conversely if this ought to be added to acceptable parts of speech. These words serve basically as noun, but would not be labelled as such in dictionaries. For example: iuvencus is an adjective meaning "young," while the substantive adjective iuvenca commonly means a "heifer." It is not a noun, but functions as one and therefore seems to me that it ought not be present in the "Latin nouns" category. Medellia 20:41, 16 November 2006 (UTC)

I've been trying to decide how I would handle this issue as well. My leading inclination is to have an in-line {substantive} tag for the beginning of each substantive definition. The only other option I can think of uses subheaders with their own definitions, and so involves major restructuring of how we group definitions, and I don't like where that line of reasoning leads. --EncycloPetey 22:13, 16 November 2006 (UTC)
I suggest continuing this discussion on the talk page of Wiktionary:About Latin. Ncik 12:27, 17 November 2006 (UTC)
My concern is that it applies not only to Latin. While admittedly, substantive adjectives are much more common in Latin and Greek than any modern language I've encountered, they do occur in English. (The Good, the bad, and the ugly; the meek shall inherit the earth; &c.) Medellia 17:17, 17 November 2006 (UTC)
They also happen in Spanish, and I'd be surprised if they weren't common to most Indo-European languages, at the least. --EncycloPetey 23:49, 27 November 2006 (UTC)
The downtrodden, the short and the tall, the yellow, the red, the eccentric, the palatable, the absurd... The list goes on. This is not a particular type of adjective, but a feature of all (I suspect) English adjectives; there is no point in making any mention of it in a dictionary. I suspect the same is true of Latin and the other languages mentioned, but don't really know enough about them to say.--BrettR 19:18, 7 January 2007 (UTC)
Not quite all (e.g. "the afloat" doesn't quite work), but nearly. Those adjectives which may only be used as predicable adjectives don't function as substantives. --EncycloPetey 03:27, 8 January 2007 (UTC)
No, other non-attributive adjectives do work: the drunk, the unwell. But other adjectives that are formed with the historical preposition an do not seem to: ablaze, afloat, afoot, afraid, etc.--BrettR 12:08, 8 January 2007 (UTC)
I think you've misunderstood my point. The drunk, the unwell, etc. work because the adjectives are not required to be used as predicable adjectives. You can say "the drunk man" and "the unwell cat". However, you can't say "the afoot plot", so afoot is a predicable-only adjective. It is these adjectives that cannot be used as substantives. Now, many of these are indeed formed in the way you described, so the two descriptions may be largely of the same group of terms; I just want to be sure you understood that I meant predicable-only adjectives. --EncycloPetey 18:39, 8 January 2007 (UTC)
The USN and others use afloat and ashore as ordinary (not predicable-only) adjectives all the time. (ashore certification, the afloat prepositioning force, etc, etc) OTOH there is "land of the afraid". Robert Ullmann 18:59, 8 January 2007 (UTC)
Yes, I understand what predicable-only means, but in my flavour of English drunk and unwell are in that category: A drunken man, not a drunk one; A sick cat, not an unwell one. Check out, for example, the Longman dictionary of contemporary English which lists both as "not before the noun".--BrettR 00:14, 9 January 2007 (UTC)
OK; it's a difference of personal usage then. --EncycloPetey 00:55, 9 January 2007 (UTC)
Probably for that reason precisely, they are derived from a prepositional phrase. (Acting C.J. Watanabe, Hilo Appeals court, 2005 No. 26741: "And those are indicative that something was afoot here and the afoot is that you acknowledged that you had been drinking alcohol." ... um, very rare sort of usage ;-) Robert Ullmann 13:22, 8 January 2007 (UTC)

Hebrew roots

At Wiktionary talk:About Hebrew, there seems to be some agreement to give Hebrew roots (sequences of letters that don't have their own pronunciations and that aren't themselves words, but that give rise to various related words, especially verbs) their own entries, in which case we'd use a non-standard "Root" POS header, and some sort of non-standard "Forms" subsection in which we'd list the various words that the root gives rise to. If y'all have input on the subject, please reply here or, better yet, at Wiktionary talk:About Hebrew. :-) —RuakhTALK 05:37, 8 February 2007 (UTC)

Noun form, Verb form, etc.

The project page says "Use of Verb form has been routinely changed to 'Verb' but is being discussed. Use of Noun form and Adjective form for other languages (they are inapplicable to English) is also being discussed", but then includes all three in the "Non-standard, deprecated headers" table. What's the story here? Is one of them out of date, or is one of them jumping the gun, or are they somehow compatible in a way I don't see? —RuakhTALK 05:54, 8 February 2007 (UTC)

We've been using Noun, Verb and Adjective without the "form" as the standard since the time I started editing here last year, possibly longer.. --Versageek 06:06, 8 February 2007 (UTC)
Some people were using X form, (and some people still are ;-). Should probably be settled. We've left the "form" out of the table in this doc for a while now. Probably time to edit the text. Robert Ullmann 06:10, 8 February 2007 (UTC)
This was originally set up as a place to find out what we were doing (descriptive), and thereby have a basis for drafting a policy (hence the wiggle language). This document has never been formalized as policy, but may be ready for that now. Robert is right; we should probably just go ahead and change the language since "X form" is not used by the regular community anymore and is routinely changed when it's found. Personally, I still like the idea of using X form in highly inflected languages, so that users can immediately tell that the page they're seeing is not a lemma form. --EncycloPetey 18:43, 8 February 2007 (UTC)

Policy?

How/when did this page actually become policy?--BrettR 13:18, 21 February 2007 (UTC)

When Connel consolidated all the policy headers, "becoming policy" was an unanticipated side effect. --EncycloPetey 15:26, 21 February 2007 (UTC)

Verb form vs. Verb

Discussion moved here from User talk:Meco

WT:ELE states that the header Verb is correct for English language entries and that Verb form is not to be used. I've had to revert an entry of yours e.g. weaned.--Williamsayers79 12:20, 23 May 2007 (UTC)

Do note that while this is pretty well settled, no-one has walked it through the formal policy vote process. So by all means use Verb rather than Verb form, but don't worry about it too much. If and when the policy vote is done, we'll change all of them. But as noted, it is pointless to change Verb to Verb form ;-) Robert Ullmann 12:24, 23 May 2007 (UTC)
I find an injunction not to use the Verb form header distressing, and looking over WT:ELE I cannot find any such prohibition (other than the fact that it isn't listed in the list of applicable headers). If this is contentious I would like to participate in a discussion on whether we should prohibit or sanction its use. __meco 22:51, 23 May 2007 (UTC)
First of all, thank you for the heads-up, meco. As per the (now getting really old) WT:BPA discussions (which predate the WT:VOTE mechanism by years) there was never consensus to allow "Verb form" as a heading; there was only a reluctant agreement not to eliminate it immediately, as there were only a tiny handful of bot operators still learning the ropes. My javascript has automatically corrected them for a very, very long time, but I prefer to manually review all my JS edits, as there usually are several (minor) problems with entries that make such a basic error. I believe the topic resurfaced a year or two later on Talk:ELE with similar results. --Connel MacKenzie 21:51, 2 July 2007 (UTC)

Lexical Categories from the CGEL

A tree diagram showing the lexical categories recognized by the Cambridge Grammar of the English Language

I thought this might be a helpful reference.--Brett 17:21, 31 October 2009 (UTC)


POS headers statistics

See some statistics about parts of speech and polysemy. Information about the whole English Wiktionary and several selected languages: en, fi, ru, uk, fr, de, sr, tt, eo.

P.S. I selected WordNet format for more convenient comparison of Wiktionary with WordNet. -- Andrew Krizhanovsky 06:51, 19 August 2011 (UTC)