User:Visviva/Category reform

From Wiktionary, the free dictionary

Topical categories

The current Wiktionary category structure is dominated by Category:*Topics.

Problems with topical categories
  • Unlike encyclopedia articles, words don't have topics. Almost any word can be used in a discussion of almost any topic, depending on the specific context. A word like religious is not limited to religion, but might equally be found in discussions of music, art or politics. Yet no satisfactory superordinate category exists that can account for this fact, unless we were to categorize all non-specialized terms directly under Category:*Topics.
    • In general, categories work best when they reflect an intrinsic property of the word described (such as part of speech or pragmatic constraints), rather than extrinsic properties (such as appearance in a particular work or wordlist). But there is no intrinsic relationship between a word and any specific topic.
  • Topical categories are often confused with terminological and semantic categories. Thus it is unclear whether Category:Anatomy is meant for technical terminology in the field of anatomy, or for any term referring to a part of the body. Likewise, it may not be clear whether terms used in talking about cats, such as meow, should be in Category:Cats, or whether this category should hold only words for specific types of cat, such as tabby and Siberian.
Advantages of topical categories
  • User value: convenience of searching
  • Editor value: compatibility with other projects
  • Editor value: inertia
Other applications of topical categories in lexical works

Terminology/usage categories

Advantages of terminology and usage categories
  • Editor value: Systematic labeling and treatment of specialized terminology
  • Reuser value: Easing the extraction of specialized glossaries
  • User value: Eases the task of someone trying to familiarize themselves with the terminology of a particular field.
Problems with terminology and usage categories

Semantic categories

Previous applications
  • Princeton WordNet
    • The WordNet system is based on hypernymy rather than separate categories, so it would need to be tailored to Wiktionary's needs.
    • The WordNet system is released under a license that is not strictly compatible with Wiktionary. Thus, if WordNet were used as a model, care would need to be taken that no outright copying takes place.
Advantages of semantic categories
  • User value: Language learners and teachers can use these to develop lexical sets for focused learning.
  • User value: Semantic categories make it much easier to tease out rough synonyms, hypernyms and candidate translations, whose relationships may not yet be fully documented in the respective entries.
  • Editor value: editors can more efficiently focus on groups of words sharing common semantic properties, which therefore call for similar treatment and interlinking. Working through all the nouns, or even all the concrete nouns, in a language is quite tedious, so people seldom bother; working on a focused lexical set can be much more rewarding and effective.
Problems with semantic categories
  • Current policy calls for non-templated categories to be placed at the bottom of the language section. Thus any "bare" semantic categories will be separated from the sense to which they apply, hampering editing and reducing user value.
    • Counterpoint: A compact template could be used, similar to {{context}} but with no visible display text. This could be placed at the end of the sense line.
  • It has been argued that these are more appropriate for Wikisaurus, to which the most intensive treatments of semantic relations are hived off. WordNet is the only major dictionary with such a category structure, and it is at least as much a thesaurus as a dictionary.
    • Counterpoint: Unlike most dictionaries, we do seek to have a fairly comprehensive treatment of semantic relations in the main entry. Further, given that we have existing categories based on a word's etymological, phonological, graphological, grammatical and usage/pragmatic properties, there is no obvious reason why semantic properties should be excluded.
  • Semantic categories could potentially be extended to an undesirable level of detail. Mimicking WordNet would result in a category for every hypernym.
    • Counterpoint: This could apply to topical and terminological categories as well. But the few cases of severe category bloat on Wiktionary have been dealt with satisfactorily. In practice, sanity usually prevails.
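
The concern about "a category for every hypernym" can be made concrete with a rough sketch. The hypernym links below are a toy table invented for illustration (not taken from WordNet), and the function names, the max_depth cap and the generated category names are all hypothetical; nothing here reflects an existing Wiktionary template or policy.

```python
# Toy hypernym links, invented for illustration only (not WordNet data).
HYPERNYM = {
    "tabby": "cat",
    "cat": "feline",
    "feline": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
}


def hypernym_chain(word):
    """Return every hypernym above `word`, nearest first."""
    chain = []
    seen = {word}
    while word in HYPERNYM:
        word = HYPERNYM[word]
        if word in seen:  # guard against accidental cycles in the table
            break
        seen.add(word)
        chain.append(word)
    return chain


def categories_for(word, max_depth=None):
    """One hypothetical category per hypernym, optionally capped in depth."""
    chain = hypernym_chain(word)
    if max_depth is not None:
        chain = chain[:max_depth]
    return ["Category:" + h.capitalize() for h in chain]


# Uncapped: one category for each of the five hypernyms of "tabby".
print(categories_for("tabby"))
# Capped at the nearest hypernym only.
print(categories_for("tabby", max_depth=1))
```

Without a cap, a single word picks up one category per level of the chain, which is the bloat described above; a depth limit of this kind is one possible way of keeping the structure shallow.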

Proposals for synthesis and reform

Proposal 1
Do nothing; continue to use topical categories and discourage semantic categories.
Proposal 2
Eliminate all topical and semantic categories.
Proposal 3
Retain topical categories as a holding and compatibility structure, while moving most content into terminological or semantic subcategories.