Wiktionary talk:Votes/2009-12/Proposed inclusion of words and abbreviations with meanings established by recognized international bodies and formally adopted by multiple national governments

Initial discussion

It's not a terrible idea, but I don't think I'll support it. Still, it hasn't even started yet. Mglovesfun (talk) 23:04, 18 December 2009 (UTC)

Care to give reasons? Conrad.Irwin 23:10, 18 December 2009 (UTC)
I'd like to see this generalized. Perhaps we choose to accept all such technical vocabulary prescribed by standards bodies, and consistently label it as such. On the other hand, maybe we accept none. I'd like to see us articulate the principle for treating this kind of stuff, and then how to treat it should become more obvious.
In either case, and in light of Wiktionary's other policies on inclusion, I see no reason to make a particular exception for the SI. Michael Z. 2009-12-18 23:32 z
The best reason is that otherwise, what we're going to end up with - what we already have, to a degree - is a very messy hodgepodge of those terms/abbreviations that can be attested in accordance with the CFI, and those that cannot. We will have permanent red links in any table covering all the terms, and we will have some gaps, where a larger unit or a smaller unit meets the CFI, but an intermediate unit that would properly be placed between them does not. We will also likely have instances where we have an abbreviation for a term, but not the term itself, or the term itself, but not the abbreviation. If we end up importing definitions from our foreign language counterparts which include all these terms, we will have the terms in other languages, but not the English translations. bd2412 T 03:14, 19 December 2009 (UTC)

I plan to vote yes. If it's prescribed in the SI, then the word is usable and understandable. —Internoob (Disc.Cont.) 01:16, 19 December 2009 (UTC)

I don't really see unused SI words as any different from other theoretical words that "might" be used but aren't. Equinox 01:23, 19 December 2009 (UTC)

Neither do I. Some of these are completely hypothetical and not even allowed by laws of physics. --Ivan Štambuk 01:51, 19 December 2009 (UTC)
What "other" theoretical words do you mean? In any case, these are words that exist as a result of an important international convention. Just because the technologies and inventions that would use them have yet to be developed does not make them theoretical; they are simply not yet in use. There are numerous verb forms in the Spanish and other conjugation paradigms that are not in use, but it would be silly to try to purge those forms that aren’t used. They exist and are available to any author who someday might need them, and we have them as we should have.
If someone can show that certain of these units are not allowed by the laws of physics, those could be put into an index of impossible terms, with an explanation of why they are not allowed. —Stephen 01:56, 19 December 2009 (UTC)
I surmise that by "other theoretical words" Equinox, like me, was referring to the class of words coined by applying any other set of well-defined prefixes to a lexical stem. Even lots of these SI prefixes (e.g. micro-, mega-) are in fact normally applied to a variety of common adjectives and nouns, usually intensifying/weakening their "normal" meanings, in effect coining new words by themselves, which abide by CFI independently of their original stems/senses.
What I was referring to as "hypothetical" was a concern that some of the words coined by all possible combinations of SI prefixes and units are not only unlikely to ever be attested in usage, but also make no physical sense whatsoever, except maybe in some fringe theoretical considerations (like string theory). There are both lower and upper bounds for physical properties of objects and measures in terms of units that we assign to them. You can't get a charge smaller than that of an electron, or a speed higher than that of light. There is also the so-called Planck scale, below which it makes little sense to speak of "time" or "length" in the ordinary senses. However, if some of the terms coined in this way can be attested in actual usage (e.g. in SF writings or some silly post-modern physics), that's completely another matter. But I'd rather see them substantiated with citations than simply automatically generated under the assumption that one day humans might develop superluminal propulsion or alter the fundamental properties of the universe. --Ivan Štambuk 03:02, 19 December 2009 (UTC)
It doesn't matter if, for example, the smallest physically possible object is many times more massive than a zeptogram; if someone were to ask you to quantify the mass of a theoretical object of that specific magnitude, you couldn't just make up a new word for it, because one already exists. The only reason you are able to tell me that nothing can ever be as small as a zeptogram, or as short as a zeptometer, or absorb a zeptogray of radiation, is because these words have specific set meanings that can be contrasted with phenomena at higher scales. bd2412 T 03:23, 19 December 2009 (UTC)
...you couldn't just make up a new word for it, because one already exists - It "exists" in the same sense that words listed in Appendix:Invented phobias "exist". There are countless philias, phobias and similar theoretical constructions that can be coined in a unique way with a strictly-defined meaning, and which, if the necessity for their introduction to the language arises, will likely be coined as such. I see absolutely no difference between these constructs for which CFI is applicable and these hypothesized combinations of SI prefixes and units. --Ivan Štambuk 03:45, 19 December 2009 (UTC)
One very important difference is that those philias and phobias were not agreed upon by an international committee of representatives designated by different national governments, and they were not then formally adopted as correct terms by almost every nation in the world. bd2412 T 03:58, 19 December 2009 (UTC)
No international committee or world government dictates words' meanings in their usage. They might define their meanings in certain formal registers (e.g. legal, military, or physics in this case; purism is in the same category), but it's up to the people whether they would use it or not in the prescribed senses, or whether they used the word at all. The meanings of SI prefixes and units are in no way "less defined" or "less important" than either of those equally unlikely phobias or philias. If the word not only has no CFI-passing attestation, but is also unlikely to ever be used, I see no point in including it in a dictionary as it would serve no purpose. --Ivan Štambuk 04:14, 19 December 2009 (UTC)
That is exactly the question at issue here. Does formal international adoption of a meaning for a word matter if the word is not demonstrably used? The words still have correct definitions, and are of a limited set (I can think of very few other examples of words whose meaning has been established in this way), so why should we ask for more? bd2412 T 14:39, 21 December 2009 (UTC)
Does formal international adoption of a meaning for a word matter if the word is not demonstrably used? - As I said, there is no such thing as "formal international adoption of a word meaning". These are some pieces of paper signed by politicians that suggest that certain professional lingo should adopt certain words/meanings as a form of international convention. Whether they'd be actually used or not is not up to them, but up to the language end-users (i.e. speakers/writers) to decide. Wiktionary is principally a descriptive dictionary and adopting this breach of CFI would seriously undermine its basic policy (all words in all languages), because for the first time we'd be adding non-words. I don't see any fundamental difference between this and e.g.
  1. Addition of names of chemical compounds not yet discovered
  2. Additions of various philias/phobias coined for fun
  3. Addition of words appearing in any other kind of prescriptive documents published by some kind of "authority", but which have never gained usage. This esp. concerns never-to-be-used neologisms appearing in some purist dictionaries published by language academies.
I see this as a dangerous precedent with implications that are likely to be abused in the future. --Ivan Štambuk 16:38, 21 December 2009 (UTC)
Countries that have not officially adopted the SI Units are in red.
How about ununennium and unbinilium? They exist as words only because of an international agreement that they are the placeholder names of the as-yet undiscovered (possibly undiscoverable) 119th and 120th elements on the periodic table. I know of no recognized international body that has decreed a list of "coined for fun" philias and/or phobias (perhaps there is such a body that has formally established names for recognized phobias such as claustrophobia and agoraphobia, but we do not question the legitimacy of these words). You are reacting to what is not being proposed here, regarding "purist dictionaries" and "prescriptive documents" by "some kind" of authority. It will generally be very difficult to demonstrate that a word has been coined by a recognized international body and (more importantly) has been formally adopted by national governments (that's governments, in the plural). I'd gladly entertain refinements of the language as needed to clarify what constitutes a recognized international body, or a national government, or to specify some minimum number of adopting governments as necessary to make it clear that words "just made up for fun" are not within our reach. I also happen to think that a word formally recognized as having a particular meaning by even a handful of governments would merit inclusion in the dictionary. However, that is not this case. The SI Units have been formally adopted by well over a hundred countries. bd2412 T 17:52, 21 December 2009 (UTC)
No, ununennium and unbinilium exist as words not because of some kind of international agreement, but because they're used as words. Thousands of internationalisms are shared among the world's languages, with or without some body regulating their meaning, but the only thing that matters with regard to their adoption is actual usage. If ununennium and unbinilium are not used as words, they should also be deleted.
but we do not question the legitimacy of these words - no, but it's a perfectly valid analogy. Both of them are 1) unique coinages 2) not used. There is no difference in whether they're coined by Joe Sixpack or some board of big-shot scientists. None of them decides which term will gain actual usage.
You are reacting to what is not being proposed here, regarding "purist dictionaries" and "prescriptive documents" by "some kind" of authority. - I'm drawing analogies to what you're proposing here. Principally, there is no difference between them. If this passes, someone will sooner or later suggest some absurd idea that we generate tens of thousands of imaginary words following the same logic, using this vote as a precedent.
It will generally be very difficult to demonstrate that a word has been coined by a recognized international body and (more importantly) has been formally adopted by national governments - As I've repeatedly stated, there is no such thing as "formal adoption of words by national governments". They're not the ones deciding on whether the word would be used or not. Hence, such proof cannot exist, and doesn't exist here either. A word cannot "exist" just because some politician put his name on some sheet of paper. Governments are not people, and their signatures and decisions are from a lexicographical perspective worthless. --Ivan Štambuk 19:37, 21 December 2009 (UTC)
How would ununennium and unbinilium come to exist as words at all if not for their coinage by international agreement? They certainly didn't enter the lexicon through popular use, and clearly it did make a difference that those "big-shot scientists" coined the word. As a matter of personal integrity, I reject the slippery slope fallacy offhand, as nothing in this proposal will compel anyone to vote for anything more expansive in the future. bd2412 T 14:11, 22 December 2009 (UTC)
Chemical elements' names are usually coined by scientists who discover them. The act of coining (assigning a name to an object) itself in no way guarantees that the nascent neologisms would actually be adopted at some time in the future by language speakers. And you again abuse terminology: Governments or international officials don't "coin" words, the only thing that those ignorant politicians do is sign some sheets of paper whose actual effect might be adoption of some new words in the language, or it might not. This vote is a major circumvention of our CFI policy because for the first time we're including non-words, sequences of letters which have never been used by anyone, and thus have absolutely no meaning at all. --Ivan Štambuk 00:17, 25 December 2009 (UTC)
I don't think it makes sense to describe a term as "not even allowed by laws of physics". Perhaps there's no particle whose mass would be given in zeptograms, but any particle's mass could be, and we can certainly make sense of a sentence like, "For his invention to be usable, it would have to weigh less than a zeptogram." (In other words, zeptogram is impossible in the same way that when pigs fly is, but not in a way that should have any bearing on whether we include it.) —RuakhTALK 02:27, 19 December 2009 (UTC)

Modify WT:CFI, or supersede it?

BD2412, please clarify: Would this vote modify Wiktionary:Criteria for inclusion, or supersede it? If the former, then what exactly is the proposed textual change? If the latter, then is the idea just that we could point people to this vote page when they try to RFV or RFD an affected term? —RuakhTALK 23:40, 18 December 2009 (UTC)

  • My intention is to supersede - this would simply place SI Units and their abbreviations outside of the CFI altogether. If it's a unit created in accordance with the SI standards, it's an attested word that gets included in the dictionary. bd2412 T 03:07, 19 December 2009 (UTC)
I'd much rather we took a more generic approach (as below) and fixed WT:CFI at the same time. Conrad.Irwin 03:12, 19 December 2009 (UTC)
I agree that this would be better if it modifies CFI rather than supersedes it. --Yair rand 05:16, 22 December 2009 (UTC)
I have no problem with that, but what's the difference between saying that such words are exempt from the CFI, or that such words will simply be deemed to meet the CFI? bd2412 T 05:18, 22 December 2009 (UTC)

Over-specific?

I find this category of words too narrow - it reminds me of the slightly icky "exclude all Modern English possessive forms" criterion (which we should probably overrule so that we can regain our lead on fr.wikt before 2,000,000? :p). I thought about counter-proposing "all words that are generated by a well-followed convention" but that seems too broad, with numbers and chemistry easily generating an infinite number of entries; how about "all words in manageably finite sets generated by a well-followed convention"? This leaves "well-followed convention" and "manageably finite" open to a bit of argument (they should be qualified further before voting, probably by requiring that some % of the items on the list be citable?), but I'd say it covers things like ISO language codes, paper sizes, SI units, airport codes, Unicode symbols and similar things that seem to be implicitly includable anyway. (No doubt someone will find a bazillion counterexamples; I'll just redefine "well-followed convention" and "manageably finite" until they go away.) Conrad.Irwin 03:12, 19 December 2009 (UTC)

In this case, we could go with "words with meanings established by recognized international bodies and formally adopted by multiple national governments". bd2412 T 03:16, 19 December 2009 (UTC)
Yes, that would be much better. --Yair rand 07:14, 21 December 2009 (UTC)
I agree. There will be a fairly small set of words that meet these criteria (other than SI Units and some medical terms, I can't think of any). I have changed the proposal accordingly. bd2412 T 14:44, 21 December 2009 (UTC)
In the case of, say, zΩ, how do we know that it was formally adopted by national governments? —RuakhTALK 14:48, 21 December 2009 (UTC)
It is part of the specific and precisely defined set which has been adopted - z as an abbreviation for zepto- and Ω as an abbreviation for ohm are within the set of terms that have been adopted, so (by the terms of the adopting bodies, at any rate) there is one, and only one, abbreviation for a measure made in units of 10⁻²¹ ohms, and that is the zΩ. In this regard, it is no different than a km or an mmol. bd2412 T 15:02, 21 December 2009 (UTC)
How do we know that? —RuakhTALK 19:03, 21 December 2009 (UTC)
To expand a bit: firstly, I'd like to figure out what our evidence is for the belief that the SI abbreviations have all been formally adopted by national governments, because if we're pinning our CFI to it, then ideally we should be able to cite it to a reliable source; and secondly, I'd like to know what else was covered in the same adoption(s). The English and French names? The Finnish and Hebrew ones? —RuakhTALK 19:08, 21 December 2009 (UTC)

That is a fair question, and one that I think is addressed by the materials hosted by the National Institute of Standards and Technology, which is an agency of the United States government within the Department of Commerce. The NIST has a collection of pages discussing various aspects of the SI Units. They identify the seven SI base units and 22 derived units, and they identify the twenty exponential prefixes with the specific note that:

It is important to note that the kilogram is the only SI unit with a prefix as part of its name and symbol. Because multiple prefixes may not be used, in the case of the kilogram the prefix names of Table 5 are used with the unit name "gram" and the prefix symbols are used with the unit symbol "g." With this exception, any SI prefix may be used with any SI unit, including the degree Celsius and its symbol °C.

They also have a neat checklist of usage rules (which, being in the public domain, we could copy directly into Wiktionary as an appendix), and a discussion of the historical context of the SI, stating:

The International System of Units, universally abbreviated SI (from the French Le Système International d'Unités), is the modern metric system of measurement. The SI was established in 1960 by the 11th General Conference on Weights and Measures (CGPM, Conférence Générale des Poids et Mesures). The CGPM is the international authority that ensures wide dissemination of the SI and modifies the SI as necessary to reflect the latest advances in science and technology.
The CGPM is an intergovernmental treaty organization created by a diplomatic treaty called the Meter Convention (Convention du Mètre, often called the Treaty of the Meter in the United States). The Meter Convention was signed in Paris in 1875 by representatives of seventeen nations, including the United States. There are now 51 Member States of the Meter Convention, including all the major industrialized countries. The Convention, modified slightly in 1921, remains the basis of all international agreement on units of measurement.

So, taking the NIST as a reliable source (although doubtless others could be found), these units are the law of the nations that are party to that treaty, although they have been adopted by almost every country - note that they are used in the sciences in the United States, and by U.S. entrants into international commerce, even though we still measure things in miles and pounds and degrees Fahrenheit. With regard to the matter of translations, note that the abbreviations are truly translingual, used everywhere in the world, including in countries that do not use the Roman alphabet (bearing in mind that the abbreviations themselves contain a µ for micro- and an Ω for ohm). The unabbreviated prefixes and base units are generally translated for different writing systems, and there are some local differences - Americans write meter (along with the Dutch, the Swedish, and Tatars, among others), the British write metre, the French write mètre, the Spanish write metro; the Germans capitalize all the unit names as nouns. Aside from variations due to differences in writing systems, there is much less variation among the prefixes, with some languages substituting a "k" for the English "hard c" in pico- (piko-) and hecto- (hekto-) and the like. I take the U.S. government website as evidence of the accurate prefixes, base units, and derived units in English, and note that there are plentiful resources on the web to find the official translations (as here, the prefixes in French (English and French seem to be the most popular in terms of results), here, in German (with a notation, "Um die Zahlenwerte in einer praktikablen Größenordnung zu halten, hat man Vorsätze zur Bezeichnung dezimaler Vielfache und Teile von Einheiten geschaffen" indicating that the prefixes are attached to the units to designate decimal multiples and parts of units), and two in Mandarin (here is the second)).
Here's a Russian documentation of prefixes and units, but I can't tell whether it's an official government document.

Since the CGPM is headquartered in France and their website reports all of their official proceedings in English and French, we have authority for the terms in at least those two languages being recognized (if not adopted) by the adopting countries. The links to government websites in other countries indicate the local variations, but in every instance where the Roman alphabet is used, at least some of the prefixes and units are identical to the English. Note also that the Chinese government websites use the English prefixes and then translate them into Chinese, but also use only the Roman alphabet abbreviations. Does that suffice to address your concerns? bd2412 T 21:39, 21 December 2009 (UTC)
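
For what it's worth, the combination rule in the NIST note quoted above (any SI prefix with any SI unit, with "gram" standing in for the kilogram) can be worked through mechanically. What follows is only a rough sketch, not anything taken from the CGPM or NIST: the names PREFIXES, UNITS and prefixed_units are made up for illustration, the four units are an arbitrary sample rather than the full set, and the prefix list is the twenty prefixes as commonly tabulated (NIST spells "deca" as "deka").

```python
# Illustrative sketch only: all names here are hypothetical, and UNITS is a
# small sample, not the full list of SI base and derived units.

# (prefix name, prefix symbol, power of ten)
PREFIXES = [
    ("yotta", "Y", 24), ("zetta", "Z", 21), ("exa", "E", 18),
    ("peta", "P", 15), ("tera", "T", 12), ("giga", "G", 9),
    ("mega", "M", 6), ("kilo", "k", 3), ("hecto", "h", 2),
    ("deca", "da", 1),  # spelled "deka" in NIST publications
    ("deci", "d", -1), ("centi", "c", -2), ("milli", "m", -3),
    ("micro", "µ", -6), ("nano", "n", -9), ("pico", "p", -12),
    ("femto", "f", -15), ("atto", "a", -18), ("zepto", "z", -21),
    ("yocto", "y", -24),
]

# (unit name, unit symbol); "gram" is used instead of "kilogram" because
# prefixes attach to "gram", per the NIST note quoted above.
UNITS = [("meter", "m"), ("gram", "g"), ("joule", "J"), ("gray", "Gy")]


def prefixed_units(prefixes=PREFIXES, units=UNITS):
    """Map each prefixed unit name to (symbol, exponent).

    The exponent is the power of ten relative to the unprefixed unit.
    """
    table = {}
    for prefix_name, prefix_symbol, exponent in prefixes:
        for unit_name, unit_symbol in units:
            table[prefix_name + unit_name] = (prefix_symbol + unit_symbol, exponent)
    return table


table = prefixed_units()
print(len(table))           # 80 (20 prefixes x 4 sample units)
print(table["zeptojoule"])  # ('zJ', -21)
```

On the figures NIST gives (twenty prefixes, seven base units and 22 derived units), the same product comes to 20 × 29 = 580 prefixed names, so the set under discussion is finite and fairly small, whatever one thinks of including it.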

Do I move the page now?

Since the proposal has changed, should I move the proposal to a name that reflects the new scope? bd2412 T 17:53, 21 December 2009 (UTC)

Yes. I would also advise removing the words that have been crossed out. (They show up in the table of contents, which looks really weird.) --Yair rand 02:04, 22 December 2009 (UTC)

Specific change

This vote doesn't seem to have the specific text for what will be added/changed in CFI. This vote starts in ... 6 hours. Should the vote be delayed until the specifics are drawn up? --Yair rand 18:43, 24 December 2009 (UTC)

I have no objection to that, although I think the change to be made is just a reorganized statement of what would be included thereby. The CFI would be amended to state that attestation includes "words and abbreviations with meanings which are established by recognized international bodies, and which are then formally adopted by multiple national governments". bd2412 T 20:36, 24 December 2009 (UTC)
But where, specifically, in WT:CFI would that be? Under one of the options in the "Attestation" section? In the paragraph below it? (I know that the location is really trivial, but it's good to figure it out in advance anyway.) --Yair rand 20:44, 24 December 2009 (UTC)
I think that it should be in the "attested" list:
“Attested” means verified through
  1. Clearly widespread use,
  2. Usage in a well-known work,
  3. Appearance in a refereed academic journal,
  4. Usage in permanently recorded media, conveying meaning, in at least three independent instances spanning at least a year, or,
  5. Establishment by recognized international bodies, and adoption by multiple national governments.
Internoob (Disc.Cont.) 21:11, 24 December 2009 (UTC)
That makes sense to me. bd2412 T 21:14, 24 December 2009 (UTC)
That'll work. Should it be displayed in the vote page itself? --Yair rand 22:57, 24 December 2009 (UTC)
Just the last line, I think. I'll do that now. bd2412 T 23:11, 24 December 2009 (UTC)

Why this vote is wrong and dangerous

For the first time we're allowing our descriptive CFI policy to be circumvented by including "words" which

  1. Have never been used by anyone.
  2. Are not likely to ever be used by anyone.
  3. Whose meaning is purely intellectual masturbation and which have no practical meaning at all: All basic and derived SI units have ranges in which they make physical sense; beyond that they're meaningless. We live in a quantum and finite universe, folks; you cannot shrink and magnify everything ad infinitum at your whim.

This vote is especially delusive because of its fancy legalish new wording "words and abbreviations with meanings established by recognized international bodies and formally adopted by multiple national governments". This is misleading on several levels:

  1. Words and meanings cannot get established in any other way but by actual usage. Signing some sheet of paper by some politician in no way actually obliges the people he formally represents to start using new words with new senses. Our current CFI policy proactively forbids neologisms, protologisms and other *isms confined to author's imagination, and even certain cases of actual usage which are not widespread. There is no difference between them and the neologisms this vote aims to include.
  2. Arguing that e.g. zeptojoule is unlikely to ever mean anything else than 10⁻²¹ J is as meaningless as arguing that aurophobia is unlikely to mean anything other than "fear of gold, aversion to gold". But until they're attested in actual usage, we really can't know for sure. Both are equally proper coinages ("established" prefix and a unit, "established" combining forms), but neither of them would really pass CFI. You can create millions of such coinages in any language. But until they gain actual usage, they're not real words.

Another major concern for this is the precedent this vote sets (if it passes): What if some day someone wants to similarly create thousands of such pseudo-words, by combining some "legally established" affixes and lexical stems? From a lexicographical perspective it's all the same whether the word-coining document is officially adopted in only one country, or 200 of them, as the only thing that matters is its influence, i.e. whether these words will actually be used one day or not. --Ivan Štambuk 00:45, 25 December 2009 (UTC)

We actually have a policy point addressing the slippery slope argument: Wiktionary:Criteria for inclusion#Attestation vs. the slippery slope. It states: "There is occasionally concern that adding an entry for a particular term will lead to entries for a large number of similar terms. This is not a problem, as each term is considered on its own based on its usage, not on the usage of terms similar in form." I would also point out that the CGPM states, and the NIST affirms, that the SI prefixes and units may be combined in any way. A zeptojoule is a zeptojoule, irrespective of whether anyone ever puts it in a book or a paper, and we are a less complete dictionary for lacking it. bd2412 T 01:34, 25 December 2009 (UTC)
No, that pertains to individual terms; this vote pertains to entire classes of terms, possibly encompassing millions of new entries being added. The very CFI quote you cite, "This is not a problem, as each term is considered on its own based on its usage, not on the usage of terms similar in form.", by its nature contradicts this vote, because according to this vote no such terms are considered on their own usage, but on the basis of external legality criteria.
"A zeptojoule is a zeptojoule, irrespective of whether anyone ever puts it in a book or a paper," - Well, until someone actually puts it on paper, it's as meaningful as aurophobia or billions of other unused coinages that are formed by the application of elementary well-defined morphological devices.
and we are a less complete dictionary for lacking it - As I said above, by not including words which are not used by anyone we're not losing much. In fact, we're not losing anything, because nobody will ever look up such words. OTOH, Wiktionary might even influence their creation and propagation elsewhere by including them, which would be very problematic. --Ivan Štambuk 01:44, 25 December 2009 (UTC)
A zeptojoule is a standard unit of measure, whether or not it has ever been used. Standard units are clearly words, which means that Wiktionary should include them. --Yair rand 01:48, 25 December 2009 (UTC)
A word is a minimal meaningful unit that is used in spoken or written language. If some sequence of letters/sounds is not used in meaningful speech for communication purposes, then it is not a word, regardless of whether it's in international law or not. If someone someday used a word like zeptojoule it is likely to be in the meaning its straightforward morphological decomposition would indicate, but until it's used it's not a word any more than aurophobia. --Ivan Štambuk 02:17, 25 December 2009 (UTC)
I'd just like to add that Wiktionary is still Wiktionary. If these particular entries are permitted, given their nature they can all be added by an approved bot, with a usage note indicating that they are governmental constructs and have not been used otherwise (to the extent this is the case). They can therefore be put in with minimal human effort. They will certainly not make the dictionary less complete or less accurate. I doubt that anyone will make the effort of hunting down some more questionable class of comparable government constructs and fritter away their time making entries for the same. bd2412 T 02:04, 25 December 2009 (UTC)
Not governmental constructs but our own constructs. No government created these words. Politicians signed some papers served to them by committees who agreed on the points addressed in an attempt to establish an international convention. Which might or might not be successful. Our inclusion criteria are independent of legislative constructs as we are strictly a descriptive dictionary. We must not add imaginary words used by no one, and especially not allow a policy that would allow inclusion of a possibly infinite number of such coinages. --Ivan Štambuk 02:18, 25 December 2009 (UTC)
There is definitely a difference: while an established scientist could happily use yoctometer in a scientific journal without anyone blinking an eye, the same is not true of ablutophobia or w/e. As far as I can see this vote merely codifies current practice: we include any Unicode symbol, regardless of whether we could cite it (chances are there's no way we have the technical ability to cite some characters); we include ISO codes, possibly citable, but very very hard to do so. These terms have a well-defined meaning already; we don't need citations to show it.
Again this descriptivism ideal, I agree with it and see this vote as perfectly fine with that. Providing we can justify the limits of what we are interested in describing, we can include any set of words we like - if we were going to be purely descriptive (as I understand the linguistics term) we would have to abolish {{misspelling of}}, {{neologism}}, and abolish the arbitrary exclusion of the internet as a source for citations. Yes it is ok to be able to describe how words are actually used, yes it is also ok to describe how standards bodies say words should be used.
I will acknowledge one problem with this proposal and that is that for the above to be true we need to maintain a clear distinction between "this is a word, it is used", and "this would be the word, according to ISU" - we do this for ISO codes cr and airport codes DCA, but not really for units zeptometre, zm. Conrad.Irwin 02:07, 25 December 2009 (UTC)[reply]
There is no difference whether the word is used by Einstein or by Joe Sixpack, in a prestigious scientific journal or a local agricultural newspaper. Scientists regularly coin thousands of neologisms, especially in ever-emerging fields such as medicine, by combining well-defined lexical stems, combining forms and affixes. aurophobia might as well one day be discovered as a psychological disorder, and described in some peer-reviewed journal without anyone blinking an eye.
The current practice of including Unicode symbols is a completely different and unrelated issue. These are not words, but symbols. There is a great difference between the inherent word (spelled in whatever script, orthography or encoding) and its representation. This is not comparable to any of our existing practices in that area.
Misspellings redirected by means of {{misspelling of}} are not words (they do not categorize). We might as well completely abolish it one day if the Wiktionary search feature acquires the capability to redirect such misspellings. Neologisms are words, actually used and citable, but only recently coined (or, in another meaning of that term, coined at whatever time for some particular purpose, e.g. puristic, but never really spread). {{neologism}} is a context label describing a word's meaning from a standardological perspective, similar to {{archaic}}, {{obsolete}}, {{rare}} etc.
"it is also ok to describe how standards bodies say words should be used." - That is OK only if the words being described are used. In which case, that becomes a particular register of meaning. Many common terms have specialized, well-defined meanings in law, science etc., and it's perfectly OK to include them if they are used. If they are not actually used, they are empty, meaningless coinages. --Ivan Štambuk 02:37, 25 December 2009 (UTC)
Who are we to decide what is a misspelling? How can we, using purely descriptive analysis, determine whether a word is misspelt or alternatively spelt? I don't think it's possible; however, it would be much less useful to our readers to class all alternatives the same way. Same with neologism: what makes a neologism, and where is the boundary? We can easily decree such a boundary, but this is imposing an artificial definition on neologism.
What is wrong with saying "cr is the ISO 639-1 code for Cree"? - it's a factually true, provably true sentence. It may not be what you would like in a dictionary, in which case vote against this proposal. I do agree with you that these words cannot be assumed to have a meaning, but it certainly can be said that they have been assigned a meaning. Conrad.Irwin 03:06, 25 December 2009 (UTC)
We primarily determine that on the basis of some prescriptive works (e.g. orthography books, as well as other dictionaries). A misspelling is usually a typo, a non-recurring hapax, which can be sampled against other written works of the same writer, or of other writers, in order to determine its spread and whether it qualifies for inclusion. We include misspellings only as an educational device that points users to "proper" spellings (as well as to prevent creation of commonly misspelled terms). This is a relatively complex issue, very dependent on the language in question. I seriously doubt that there are any entries on Wiktionary which are doubtful as misspellings vs. alternative forms, especially in languages with morpho-etymological orthography such as English. In languages that are "maintained" by some language institutions, usually of national character, it's very easy to determine what is a misspelling. In some other instances it would require some research.
What constitutes a neologism - well, that is a vaguely defined term by itself, mutable in time and space. It's undeniable that some coinages are strongly felt as neologisms in certain circles, and that there is some utility to that label (e.g. a user might want to browse a category of terms coined in the last X years).
I cannot really see the connection between the discussion of misspellings and neologisms and this vote. Both misspellings/alternative forms (however you perceive them) are at a big advantage here, because they're actually used as words. The status of a word being a misspelling pertains to its usage regularity and "properness" with respect to some prescriptive work (if it exists for the language in question), and the status of neologism pertains to the date of the term being coined, its spread and its status with respect to other synonymous or nearly related terms. These two categories are completely mutually orthogonal (a word can be both a neologism and a misspelling), and orthogonal to this proposal (e.g. a misspelled neologism could be included on the basis of this policy).
There is really nothing wrong with saying "cr is the ISO 639-1 code for Cree". There is also nothing wrong with saying "zeptojoule is smaller than a joule". But until words such as zeptojoule can be actually verified in that usage, they don't exist. Words don't exist until they're used. I would not vote against the proposal because it would include real words that can be used; I'm voting against it because it would include words that are not used, and are unlikely ever to be used. In the proposed wording, any number of such terms could be added in the future on the basis of external, non-usage criteria, and that is something that I see as very problematic. No one can "assign meanings" to words other than speakers/writers. --Ivan Štambuk 04:24, 25 December 2009 (UTC)
I'm not so sure there's consensus to allow all Unicode characters and ISO codes. See WT:RFV#≡ where a programming-only character probably will fail as it's not used in natural languages. If it fails, the same will probably be true of the whole APL block of Unicode. Just before this vote started, I thought about RFV'ing an ISO language code to see what the community thought. I'm leaning against including symbols and codes not used (as opposed to mentioned) in natural language. While there are definitely facts and meanings associated with those entities, I don't think a dictionary is where those facts should be spelled out. The issue at hand with this vote, however, is slightly different. --Bequw¢τ 09:10, 25 December 2009 (UTC)
I don't think we'd be that "less complete". None of the major English dictionaries (AHD, OED, MW) include any zepto- term except zeptosecond and they all think they're pretty complete. --Bequw¢τ 09:42, 25 December 2009 (UTC)
No, no dictionary thinks it's complete (or I hope so, because it's impossible).
I would simplify this vote by extending it to all words (including set phrases) with an official or legal status (either internationally or in some country, or region, or town...), or proposed or recommended by an official body, even before they are actually used. Even if they are never used, including them is interesting for historical reasons. We want to include all words, and knowing that a word has been proposed and defined officially is something interesting.
An example of a word which could be accepted with such a rule and might be refused with current CFI: the French word vionnet (meaning small path). This is a very rare regional word derived from the Latin word via, and used in vionnet de l'Eglise in a French village, which makes it an official word (vionnet de l'Eglise is an official odonym). Lmaltier 10:18, 25 December 2009 (UTC)
Note that I don't propose that governments should be considered as lexical authorities. Not at all. We must be descriptive. But we have no reason not to describe words with an official status (with all appropriate comments explaining that they are not in actual use, if needed). And I don't propose to remove words once they lose their official status. Once a word is included (with a good reason), it must be kept, if only for historical reasons. Lmaltier 21:16, 28 December 2009 (UTC)

Withdrawal of the proposal.

Perhaps it would be appropriate at this point to withdraw this proposal. Would anyone object? bd2412 T 16:39, 31 December 2009 (UTC)

I wouldn't. Perhaps the original proposal to just have an exception for SI units was better... --Yair rand 20:00, 31 December 2009 (UTC)

And it is too complex (see KISS principle). Why multiple national governments? Is a word more a word when an additional government adopts it? The objective of CFI should not be to select which words are worth inclusion (as we accept all words), but to prevent people from including "words" they have created themselves. Independent attestation is a good enough reason to include a word. And when a word has an official status (anywhere), that is also a good enough, simple, easy-to-verify reason to include it (see above). Lmaltier 09:42, 2 January 2010 (UTC)

Compromise formatting

Since this vote did not pass, I would like to propose an alternate format for these types of terms. As I mentioned on the main page, I would list the terms on the prefix and base unit pages (so they can be searched for), linking each to an existing entry if there is one and otherwise to the Appendix. See peta-#Derived terms for a trial. This could work for full terms and symbols. In cases where the entry already exists for some other sense, a link can be left in a See also section. --Bequw¢τ 22:00, 24 January 2010 (UTC)