User talk:Dan Polansky/2022

Bare links outside of definition lines

Your edit here uses a bare link in the see also section. I thought the standard was to use {{l}} for that kind of thing. Vininn126 (talk) 12:57, 24 August 2022 (UTC)

That's a job for a bot, isn't it? --Dan Polansky (talk) 12:58, 24 August 2022 (UTC)
I suppose it can fix the old ones. Going forward it would be nice to not have to send a bot. Vininn126 (talk) 13:03, 24 August 2022 (UTC)

Senses for particular individuals or specific people

Inclusion of senses for particular individuals is controversial, but widespread. Category:en:Individuals contains many entries with senses for particular individuals, e.g. Aristotle: "An ancient Greek philosopher, logician, and scientist (382–322 B.C.E.), student of Plato and teacher of Alexander the Great." More examples include philosophers (Plato), poets (Keats), politicians (Churchill), writers (Emerson), playwrights (Shakespeare), composers (Chopin), explorers (Cook) and scientists (Darwin). Multi-word examples include Jesus Christ, Alexander the Great, Darwin's Bulldog, Attila the Hun, Genghis Khan, Mary Magdalene, Joan of Arc, Robin Hood, Mother Teresa, Desert Fox, Donald Trumpet, John the Baptist, Korea Fish, Pharma Bro, Mango Mussolini, Queen Anne, Nanny of the Maroons, Saint George, Saint Nicholas, and Son of the Morning Star. Some particular individuals can be supported by WT:LEMMING, e.g. per “Aristotle”, in OneLook Dictionary Search. Past deletion attempts are at Talk:Gogol (2009), Talk:Xenocrates (2010), Talk:Hitler (2010), Talk:Jesus Christ (2010), Talk:Saint George (2012), Talk:George VI (2018), Talk:Cook (2019), Talk:Joan of Arc (2022); a poll is in Wiktionary:Beer parlour/2010/December#Poll: Including individual people. An attempt to delete the category was at Category talk:Individuals. Some individuals are excluded per WT:NSE: "No individual person should be listed as a sense in any entry whose page title includes both a given name or diminutive and a family name or patronymic." We also have some nicknames of specific people: User talk:Dan Polansky/2016#Nicknames of specific_people, e.g. Woz. A successful 2009 deletion is at Talk:Gogol; U.S. presidents were being deleted and then added again per Talk:Washington. In 2009 and 2010, there was the attributive use rule that allowed senses for particular individuals to fail RFV, but that was removed via Wiktionary:Votes/pl-2010-05/Names of specific entities.
Sometimes, the person does not have a separate sense but rather is mentioned under "especially" or "specifically" within a sense not specific to a person, e.g. in Jefferson: "An English surname originating as a patronymic; (US politics) used specifically of Thomas Jefferson (1743–1826), the third president of the United States, principal author of the US Declaration of Independence (1776), and one of the most influential founders of the United States". Practice here is inconsistent: sometimes the person has a separate sense and sometimes they are covered under "especially" or "specifically". --Dan Polansky (talk) 18:21, 30 August 2022 (UTC)

Proper noun sense vs. only figurative sense: Some want to remove proper noun senses and replace them with figurative senses. An example removal is Talk:Joan of Arc, where the sense was removed without consensus, by a plain majority of 4:3 for deletion. A related discussion is at Wiktionary:Beer parlour/2021/August#Fictional characters as proper nouns, but with very little participation. A case in point will be archived at Talk:Mother Teresa. I argue that the figurative noun senses are not needed and that they should be covered in the proper noun sense, on the model of "Specific person so and so, noted for being such and such". It is a usual feature of proper names that they can sometimes be used figuratively, with some characteristics being picked out, and that arguably does not make them common nouns per se. I know of no dictionary that follows Wiktionary's practice of having common noun senses for proper names. “Sherlock Holmes”, in Merriam-Webster Online Dictionary, Springfield, Mass.: Merriam-Webster, 1996–present, does what I disagree with, having only noun senses and no proper noun sense, whereas “Teresa”, in Merriam-Webster Online Dictionary, Springfield, Mass.: Merriam-Webster, 1996–present, has Mother Teresa as a biographical name, having a proper noun sense; “Joan of Arc”, in Merriam-Webster Online Dictionary, Springfield, Mass.: Merriam-Webster, 1996–present, has a biographical name, not a common noun; “Mussolini”, in Merriam-Webster Online Dictionary, Springfield, Mass.: Merriam-Webster, 1996–present, has a biographical name, not a common noun. It follows that both practices have some support in MWO. --Dan Polansky (talk) 08:56, 3 September 2022 (UTC)

Comparison to place names and taxon names: The effort to exclude specific entities was lost when large coverage of place names was approved. Large coverage of taxon names (species, etc.) was never questioned, from what I remember. In terms of numbers, CAT:en:Individuals has 751 entries. By contrast:

--Dan Polansky (talk) 17:10, 7 September 2022 (UTC)

Central location for synonyms: A dedicated sense, separate from "surname", makes it possible to list nicknames as synonyms. Thus, we can have Trump: Cheetolini, Donald Trumpet, Drumpf, God Emperor, Mango Mussolini, Orange Man, Trumperor, Forrest Trump, Twitler; and Putin: Pootin, Pooty-Poot, Putler. Entries for nicknames seem supported even by some of those who oppose senses in surnames; having a dedicated sense in the surname entry makes them discoverable. --Dan Polansky (talk) 15:04, 20 September 2022 (UTC)

Quotation marks around the linked term in reference templates

See Wiktionary:Beer parlour/2022/August § Straw poll: quotation marks around the linked term in reference templates. Dan Polansky (talk) 10:44, 3 September 2022 (UTC)

English -ing forms or gerunds or gerund-participles or present participles

Entries for -ing forms are a subject of debate: should they have a Noun section, and should they have an Adjective section? One discussion will be archived at Talk:bathing. Related pages:

Traditional grammar classifies -ing form uses as present participles vs. gerunds, the former being for adjectival uses, the latter being for noun uses. CGEL proposes to abolish this distinction and use the single category of gerund-participle; see the quotations in the gerund-participle entry. Wiktionary -ing forms currently have Verb sections featuring present-participle definition lines, and often have no section covering the gerund. One change could be to change Verb to Participle and change present-participle to gerund-participle, terminologically siding with CGEL and covering both phenomena on the definition line. The behavior of gerund-participles could then be explained in the glossary, linked from the definition line. It would cover adjectival and noun uses on one definition line.

Arguably, the part of speech of a sense intended to subsume multiple parts of speech should be Participle and not Verb. And since uses of the present participles are arguably grammatically all adjectival, including swimming in she was swimming, even Adjective would be better than Verb as far as the grammatical function is concerned.

Wiktionary:English adjectives contains tests that should distinguish adjectives-beyond-participles from present participles. Especially relevant is section Words ending in -ing. The tests include modification by "more", "most" and "very". It remains unclear why the participle can account for ungradable adjectival behavior but not for specifically gradable adjectival behavior.

Also at User:Dan_Polansky/Notes#Gerund and User talk:Dan Polansky/2014#English -ing form and gerund.

Links:

  • W:Participle
  • “participle”, in Merriam-Webster Online Dictionary, Springfield, Mass.: Merriam-Webster, 1996–present. - has a grammar note, including 'The participles are words that "take part" in two different word classes: that is, they are verb forms that can also act like adjectives ("the spoken word," "a moving experience").' Noun uses are not covered by the note.
  • “gerund”, in Merriam-Webster Online Dictionary, Springfield, Mass.: Merriam-Webster, 1996–present. - covers noun uses of -ing forms
  • Gerunds, owl.purdue.edu - "A gerund is a verbal that ends in -ing and functions as a noun."
  • Participles, owl.purdue.edu - "A participle is a verbal that is used as an adjective and most often ends in -ing or -ed", does not cover noun uses as participles
  • H.W. Fowler, The King’s English, 2nd ed. 1908, section "PARTICIPLE AND GERUND"

Dan Polansky (talk) 14:24, 9 September 2022 (UTC)

Block

@Vininn126 Please reduce my block from one week to one day. I will refrain from posting to the Beer parlour thread from which you have barred me. I would like to continue to work elsewhere on Wiktionary, and I think I can do so without creating anything like what you see as a problem. Dan Polansky (talk) 20:01, 3 September 2022 (UTC)

Have three hours. Cool off. Please take this time to reflect on how discussions and debates are carried out. You are a productive editor but an unproductive discusser. You raise good points but in an absolutely terrible way that is not open to any outside input. Vininn126 (talk) 20:04, 3 September 2022 (UTC)
@Vininn126 Thank you. The blocking tool also makes it possible to block someone from specific pages, e.g. Beer parlour, for a longer time.
I don't understand how my discussion posts are not open to outside input; can you please clarify? I am much more detail- and evidence-oriented than many others, and write much longer posts, and argue things in detail, and often do not give up; many present extremely brief and inarticulate posts. In Wikipedia, discussions feature much longer posts than here. People seem to be afraid of articulation and discussion here, except for some. I had some unpleasant exchanges with one particular editor, who repeatedly assumed bad faith on my part, which disrupted my emotions, which could have negatively affected my behavior, but with others it seems to be not that bad. I think I should learn to ignore the bad posts toward me from that editor and stop responding to that editor. I engage with specific arguments made in a way that many don't, in RFD, on the assumption that RFDs should be discussions and exchanges, not just votes. --Dan Polansky (talk) 20:18, 3 September 2022 (UTC)
I am aware. This was more a metaphorical block. I am also still frustrated by your unwillingness to change your editing habit upon request and also by your behavior, e.g. on Hergilei's talk page. I do not think that these are the reasons why editors are frustrated with you - it's the manner in which arguments are presented. Take for example your argument about someone's claim not making sense in Latin. I apologize, but that seems rather a non sequitur. What I think many of us would appreciate is openness to outside commentary - this does not necessarily mean changing your core values, but rather adopting a somewhat more cooperative approach. It takes a lot of courage and wisdom to have the strength to know you're wrong and, when you're right, the strength to stand your ground. I think the best approach for you would be to wait to respond - let people respond to a message and then read it, think about it, and literally compose a draft of your response in a more organized manner. Your input is of course important but it's going to be disregarded if you respond to everything emotionally. Vininn126 (talk) 20:27, 3 September 2022 (UTC)
@Vininn126 Re: "your argument about someone's claim not making sense in Latin": That was someone else's argument, not mine, and was indeed very bad; please check. Hergilei has entered many errors in Czech entries; I have caught some of them. In some ways, he is an extremely careless editor. He entered erroneous Czech pronunciations without knowing what he was doing, relying on the cs-IPA template doing the work for him, which it does not entirely--he knows no Czech--and if I did not catch that, we would have many errors, and some were found by some random Czech visitors. --Dan Polansky (talk) 20:33, 3 September 2022 (UTC)
No editor is perfect. It's more about how we correct them. Vininn126 (talk) 20:35, 3 September 2022 (UTC)
I have learned that entering errors in volume into Wiktionary, showing objectionable recklessness toward the requirement of accuracy, is perfectly fine here; but to call that out gets one into trouble. This is one of the most infuriating aspects of the English Wiktionary. By my lights, Hergilei should have received a block for carelessness a long time ago, but that will not happen. We had editors who showed a persistent pattern of very bad work, creating a lot of nonsense, and we were not able to do anything about them even via a vote; finally, the editor I have in mind was blocked after a vote to block them failed. Everyone is oh so nice toward all contributors and the accuracy suffers. "No editor is perfect" is not a very good excuse for easily avoidable recklessness: If I don't speak Czech and do not know Czech phonology, I won't be entering Czech cs-IPA pronunciation since I would not catch mistakes: that is not rocket science. One could be checking each cs-IPA output against a source, but that is not what the editor did. --Dan Polansky (talk) 20:42, 3 September 2022 (UTC)
Dan, you're missing the point again. Go cool off. Vininn126 (talk) 20:44, 3 September 2022 (UTC)
Needless to say, I welcome all people to post feedback on my behavior on my talk page. If you don't like something, my talk page is a great place to talk about it, much better than Beer parlour where there is another subject at hand. I love specific quotes of what I said serving as evidence, and diffs. You (who is reading this) can even post anonymously from an IP; that's perfectly fine. --Dan Polansky (talk) 20:26, 3 September 2022 (UTC)
@Vininn126 I did not know that striking out comments that are completely incorrect was inappropriate. To the contrary, I saw representations to the effect that RFDs should be administered based on the strength of arguments. My strike out was undone and all is fine; I don't see why a 3-day block is warranted. --Dan Polansky (talk) 11:44, 9 September 2022 (UTC)
Striking out votes has NEVER been the precedent, and you're still acting in a somewhat harassing way by doing so; it does not show that you have changed much since the previous event. Vininn126 (talk) 11:47, 9 September 2022 (UTC)
It surely was not a common practice, but a change of practice has to start somewhere. I said that I "boldly" strike out, indicating uncertainty of the action. I wondered what was going to happen. It has been undone, and all is fine. --Dan Polansky (talk) 11:50, 9 September 2022 (UTC)

OED treatment of proper nouns

See Wiktionary:Beer parlour/2022/September § OED treatment of proper nouns. Dan Polansky (talk) 06:36, 7 September 2022 (UTC)

Derived-adjective principle

The following seems to be a useful principle to complement, not override, WT:NSE: When an adjective is derived from a proper name and the adjective definition features a specific individual or other specific entity, that entity should also be listed as a sense in the base proper name. Rationale: 1) it makes intuitive sense; 2) the fact that the entity has generated a dedicated adjective is predictive of that entity being picked by the proper name as the default referent out of context, and therefore, that particular entity is as much a part of the meaning of the proper name as anything else.

Some examples:

The practice of other dictionaries supports the principle in part, inconsistently. Charles Darwin is covered in the Darwin entry of M-W, AHD[1], and Collins[2], whereas Dickens is in M-W and Collins; further dictionaries having Darwin as a biographical name covering Charles are vocabulary.com, yourdictionary.com, dictionary.com, and infoplease.com. Dictionaries do not make this an automatic principle. However, dictionaries often do feature specific individuals in the derived adjectives, but would not have to: they could define, say, Dickensian, using the generic "Of or relating to a notable individual named Dickens", and claim that anything else is for an encyclopedia. And yet, “Dickensian”, in OneLook Dictionary Search, shows a definition featuring Charles in oxfordlearnersdictionaries.com, Macmillan, dictionary.cambridge.org, and Collins; M-W has Dickens as a biographical name and Dickensian listed next to it without a definition. “Popperian”, in OneLook Dictionary Search, shows Collins's treatment to be similar to M-W's for Dickens: Collins defines Popper as the philosopher and lists Popperian next to it as an adjective without a definition; M-W:Popperian references the philosopher in the adjective definition; M-W has no entry for Popper, only for popper. OED only defines -ian adjectives in terms of the particular persons and usually does not define surnames at all, not even as surnames; the same is true of Macmillan, Oxford Learner's Dictionaries, and Cambridge Advanced Learner's Dictionary. By doing so, these dictionaries have holes in the derivation network: Dickensian is derived from Dickens and -ian yet they do not cover Dickens at all, not even for etymology. All this suggests that dictionaries that cover surnames at all tend to cover them as referring to notable individuals.
In fact, I found no single dictionary among OneLook dictionaries and no single dictionary entry to cover a surname as a surname and not as a biographic name covering some particular person; the "surname" definition does not seem to be covered by any OneLook dictionary at all.

It seems uncertain that the principle will find consensus support (2/3 is quite a bar to pass), but in RFD for proper name senses covered by WT:NSE, editors need to pick uncodified principles to make the decision, and this principle is one of the candidates. Dan Polansky (talk) 07:04, 7 September 2022 (UTC)

A claim has been made that the derived-adjective principle invokes notability. Maybe so, but not in a bad way. First, it does not rely on any decision procedure for who or what is notable and what is not. More importantly, it is much more exclusionist than the following notability principle: Each Wikipedia-notable person should have a dedicated sense line in the Wiktionary entry for their surname. The requirement that the person has to be so notable as to generate an adjective is so much stronger than that.

The principle does not invoke necessity but rather appropriateness. There is no doubt we can define the adjectives without depending on the specific entity being defined in the base proper noun. Rather, the existence of the adjective is used as a predictor of notoriety of the referent, of the referent being picked out of context by the proper noun as the sole default referent or as one of a few default referents. --Dan Polansky (talk) 13:50, 9 September 2022 (UTC)

There is a broader principle here, a derived-term principle. The supporting derived term need not be an adjective such as Darwinian; it can be a noun such as Darwinism.

Microsoft is perhaps an entry with a stronger rationale for the principle than surname entries. Microsoft currently has a Proper noun section, containing plentiful derived terms including Microsoftie. These are derived from the proper noun, not from the common noun. Placing them in the Noun section would be misleading. To make things minimal, we do not need the common noun section: we may instead describe Microsoft as "X, noted for Y" in the proper noun definition, and expand the proper noun headword line with the plural. This stronger rationale does not apply to surnames: both the surname and the specific sense for an individual are part of the proper noun section. However, one can argue that the individual-specific sense is the basis of derivation, semantically anyway, so it had better be represented in the base entry.

Lexicographical practice 2: Something like the derived term principle seems to be mentioned here:

  • Proper Names in General (Purpose) Dictionaries: Necessity *, researchgate.net - a survey of dictionary practices conducted by interviewing lexicographers
    One participant: “Proper names are included: If they are needed as base words for derivations or compounds (Röntgenstrahlen, Basedowkrankheit, Creutzfeld-Jakob-Krankheit; italienisch, amerikanisch, Pariser, Römer).” (Participant 5)

However, the above may only cover including the entry in some form at all; it does not force a sense for a specific entity. On the other hand, general dictionaries usually do not feature surnames but rather biographical names featuring specific people, so the participant above may include Röntgen in their dictionary as a particular person.

See also #Senses for particular individuals or specific people. --Dan Polansky (talk) 07:36, 20 September 2022 (UTC)

Meaning of proper nouns and proper names

By my analysis, the meaning of proper nouns and proper names is the referents they name. For surnames, "surname" is not a meaning but a function. We define surnames as surnames since this is the only practical thing to do; we cannot list all human individuals they ever referred to in 3 attested instances, and even covering all Wikipedia-notable individuals would be overkill and an unnecessary duplication of Wikipedia. Nor can we list all individuals under their first names. By contrast, we do not define place names merely as "place name": we list specific entities they name as separate senses. And "place name" is an equivalent of "surname". We do not define astronomic names as "astronomic name". We define proper noun Sun by defining the specific entity, and we treat Moon similarly. In a sense, whether the referents are part of meaning or not, we treat specific entities as separate senses for many classes of proper nouns and proper names.

A source seems to disagree by stating that proper names do not have sense: 'Let us now consider the semantics of proper nouns, an issue much discussed from Mill (1867) onwards. They are diachronically motivated, and a meaningful etymon is found in most cases: e.g. family names derive from elements of common vocabulary referring to parentage (son of Richard > Richardson), or occupation (miller > Miller). But they are synchronically opaque; as stated by Lyons (1977: 198), "it is widely, though not universally, accepted that proper names do not have sense".'[3] However, here a distinction can be made between sense and reference (Frege, Sinn und Bedeutung, Wikisource:On Sense and Reference), both species of meaning. Indeed, proper nouns do not have a sense (concept), but they have reference (individual instance), the individual entity that carries the name.

SEP says "Proper names are familiar expressions of natural language, whose semantics remains a contested subject."

Some would answer that the meaning of a proper noun is its etymology. If we reject etymology as the meaning, or proper meaning, of a common noun, I see no reason to accept it for proper nouns. For a surname, the function "male surname" is much closer to a meaning than the etymology.

To support the claim that the meaning of a proper noun is the set of referents, I argue that the meaning of a term needs to tell us whether an individual entity (an instance) is referred to by that term, or comes under that term. Thus, in the first approximation, the meaning of a term is its decision procedure. This is merely approximate since there are meanings and specifications for which it has been proven that there are no algorithmic decision procedures for them, but that is just a minor technical complication. For proper names, the only decision procedure there is is one that relies on an enumeration of referents. As said, it is impractical to list all attested referents for given names and surnames, but enumeration of referents is the only thing there is to the meaning of a proper noun.

Synonymy supports the notion that each referent is a separate meaning or sense of the name. For example, Trump, Donald has synonyms Drumpf, Orange Man and others. These work on the referent level, not on the "surname" level. On the other hand, diminutives and nicknames such as Pete of Peter work as synonyms on the "first name" level. With first names and surnames, one may admit that the reader generally first recognizes the name by its function, as a first name or a surname, and then proceeds to determine the referent given context. This may not always be so: Hitler may be so prominent that the hugely notable referent gets activated sooner than the generic "surname" function, but that is speculative and not obviously testable. I for one knew Aristotle was a philosopher before I knew it was just a Greek name given to many; I should have known better, but I was a child and was that way. --Dan Polansky (talk) 21:14, 21 September 2022 (UTC)

Links:

Dan Polansky (talk) 16:48, 7 September 2022 (UTC)

Expanded. --Dan Polansky (talk) 21:14, 21 September 2022 (UTC)

What is encyclopedic content and dictionary material

Some deletion rationales for proper names say: delete as encyclopedic content. Such a rationale is nearly content-free, as per the following, depending on no surface observable properties that are consistently applied across different classes of proper names in the English Wiktionary.

There is a necessary overlap between a dictionary and an encyclopedia. To begin with, many common noun definitions found in dictionaries are also found in the corresponding encyclopedia article. There is a further overlap in coverage of proper names, including geographic names, names of biological taxa and astronomic names. A dictionary that includes some specific entities in its Newtown entry thereby in part duplicates an encyclopedia. And inclusion of names of taxa duplicates Wikipedia and Wikispecies. The overlap cannot be avoided; the question what to include in a dictionary cannot be resolved by the adjective "encyclopedic" since it means almost nothing and is not a surface observable property of a term.

The question whether something is encyclopedic content should be replaced with the question whether something is dictionary material. How do we know what is dictionary material? An extensional answer is: it is the kind of material that is found in dictionaries. And this is where WT:LEMMING comes into play: if a term is in multiple general monolingual dictionaries, it is dictionary content, by definition.

A related consideration is #Senses for particular individuals or specific people. They are dictionary material since they are included in multiple general dictionaries. The claim that they are encyclopedic because they are already covered in the encyclopedia has no force: common noun definitions are also already covered by the encyclopedia and yet this does not lead to their exclusion; more significantly, biological taxa are covered in Wikipedia and Wikispecies, and yet are included in Wiktionary in large numbers, and yet here it would make perfect scoping sense to exclude such material as covered elsewhere and merely duplicating the content there.

Terms that are included per voted-on specific policies and yet appear to be encyclopedic in some sense include geographic name Auburn Lake Trails, taxon Salamandra salamandra, and astronomical name Small Magellanic Cloud. None of them is being actively questioned in RFDs. These names are of much less lexicographical significance than single-word proper names since the latter feature etymology and pronunciation, whereas the etymology and pronunciation of multi-word proper names are compositional, derivable from the parts. There do not seem to be any characteristics of the notion of encyclopedic that the mentioned terms do not have yet terms like George VI and Mother Teresa have. The exclusion of the latter as "encyclopedic" seems to be based on the notion that names of individual persons are encyclopedic without explaining how geographic names and celestial names are less so. It can be admitted that multi-word names of people are not especially valuable as dictionary material since they are already covered in the encyclopedia, but the same is true of the cases just mentioned. What drives the notion that something is dictionary material rather than purely encyclopedic content is the fact that dictionaries tend to include these things. One might want to have lemming-free principles to include such terms, but none are obvious; what are the lemming-free principles driving inclusion of Auburn Lake Trails, Salamandra salamandra, and Small Magellanic Cloud? The supporters of inclusion of these terms did not provide any lemming-free rationale. The idea that we can follow lemmings in our inclusion is one of inclusionism combined with humility: we do not know what the lemmings' criteria are, but we do not need to and we follow their lead. We can err on the side of inclusion to pull in content that is going to be useful even if we thereby also pull in some content that is less useful.
And the latter surely takes place with multi-word geographic names, taxa and astronomical names.

See also User:Dan Polansky/Name. Dan Polansky (talk) 08:41, 8 September 2022 (UTC)

Lemming test, lemming principle or lemming heuristic

A lemming principle is one that invokes other dictionaries' practice in support of an action to be done in Wiktionary, most often inclusion of an attested term that would otherwise be excluded. Its shortcut is WT:LEMMING, created in 2014.

The principle arrived at Wiktionary:Idioms that survived RFD via diff on 7 September 2007 in the following form: "Terms that have entries in other dictionaries, especially specialized ones", italics mine. The term "lemming test" occurred in a 2007 discussion at Talk:genuine issue of material fact. Further discussions can be found from Special:WhatLinksHere/Wiktionary:LEMMING.

A modification of the principle was proposed in 2014, about general dictionaries, not specialized ones: Wiktionary:Beer parlour/2014/January § Proposal: Use Lemming principle to speed RfDs. I created Wiktionary:Votes/pl-2018-12/Lemming principle into CFI, which resulted in no consensus, 14-16 support-oppose. The vote used the following formulation, in part based on feedback to the vote: "An attested term that appears to be a sum of its parts yet is included in at least two professionally published general monolingual dictionaries should be included. Such dictionaries include but are not restricted to Merriam-Webster, OED, AHD, Cambridge, Collins, Macmillan, Longman, German Duden and Spanish DRAE; dictionaries that do not count include WordNet." In 2018, I changed the wording of the principle in the Idioms page in diff to match the 2014 proposal, from specialized to general monolingual dictionaries.

As formulated, the principle can be used to bolster additional inclusion of sum of parts entries, and not for exclusion. However, a more general version can be used for broader purposes: it says, consider what the professionals are doing. It can be used in discussion of -ing forms and it can be used to bolster inclusion and treatment of proper names, which arguably are not sum of parts and therefore, the principle as formulated does not apply to them, strictly speaking.

The general version of the principle generally supports inclusion of notable individuals in their surname entries as senses, but for that there may be no consensus support. The general version was not applied in Talk:Joan of Arc and Talk:George VI and does not enjoy wide support in a discussion to be archived at Talk:Star Wars. Dan Polansky (talk) 12:13, 9 September 2022 (UTC)

Lemming principle for proper names: A version of the lemming principle can be applied to multi-word proper names even when they are not sum of parts, strictly speaking. It can be very useful for the purpose since palatable inclusion criteria based on simple observational characteristics of the terms can be hard to come by, and no one has any. "Encyclopedic" is not a simple observational criterion that distinguishes Star Wars from United Arab Emirates, or even New York for that matter. Instead, we have referent-based criteria for geographic and astronomical names, and no specific criteria for many other kinds of specific entities. There is more in #Include attested proper names that are lexicographically interesting. The lemming principle would pull in United Arab Emirates, United Kingdom and Small Magellanic Cloud, which are included anyway based on referent-based criteria. It would pull in World War II and Thirty Years' War. It may be opposed for its inclusion of George VI to avoid redundancy with Wikipedia. To make the principle more palatable as a policy, we could 1) make it merely optional for RFD voters, which it currently is anyway and is not forbidden by WT:NSE and 2) explicitly exclude regnal names like George VI from its application. The loss of George VI is not a lexicographical one; it is a loss of the lemming principle. National Aeronautics and Space Administration is not in most of the traditional dictionaries but is in Collins, dictionary.com, vocabulary.com, yourdictionary.com, and thefreedictionary.com, and does not seem to have much going for it given how transparently named it is; it is very notable. Federal Aviation Administration is in M-W, dictionary.cambridge.org, Collins and dictionary.com. Food and Drug Administration is in M-W, Collins, dictionary.com and vocabulary.com. Federal Bureau of Investigation is in M-W, Collins, and dictionary.com, and survived an RFD. National Aeronautics and Space Act is in M-W.
North Atlantic Treaty Organization is in Collins and Dictionary.com. Central Intelligence Agency is in Collins, dictionary.com and Oxford Learner's Dictionaries. United Kingdom of Great Britain and Northern Ireland is in Collins, Macmillan and Dictionary.com. This would support the idea that overridability is preferable. Alternatively, one may accept the apparently arbitrary choice these dictionaries make. --Dan Polansky (talk) 14:34, 13 September 2022 (UTC)Reply

Objections to using lemmings for proper names:

  • O: The principle has us depend on arbitrary whims of others, without knowing which principles they used. R: To describe the decision processes of professional dictionary makers as "whims" is derogatory and unlikely to be accurate: the dictionaries use editorial principles and policies. We do not know their principles, but if we assume they are pretty good, we can rely on results of their application. By requiring at least 2 other dictionaries, we guarantee some independence of assessment. In case of doubt, we could require 3. We could maintain lists of explicitly approved dictionaries. The dictionaries are probably not wholly consistent, but we can extrapolate principles from what they do and use them in a systematic manner to cover terms that they do not include.
  • O: Dictionaries copy from each other so there is no real independence. R: Checking how the covering dictionaries differ between different proper names suggests they are actually independent; different names generally yield different lists of dictionaries. And again, requiring 3 would reduce this problem.
  • O: The arbitrariness of depending on others is worse than the arbitrariness of driving the criteria by classes of referents. R: From a purely lexicographical standpoint, making differences based on classes of referents is inferior and arbitrary; we do not do that for nouns and adjectives. Which of the two sorts of arbitrariness is greater or worse does not seem to be a matter of easily observable fact, more a matter of a subjective opinion. Including all the "X County" terms is non-lexicographical and not what others do; paying attention to others during the design of the inclusion criteria would have been advisable. It is redundant to Wikipedia as well.
  • O: The principle does not remove the fuzziness, it just shifts it further. R: True, but the rationale for lemmings is not to remove fuzziness, but to guarantee an included predictable core, for which the RFD discussion is made easier, and for which entry creators do not need to fear they are wasting their time.
  • O: The principle does not improve the current arbitrariness of RFD process; it replaces it with another sort of arbitrariness, one that can no longer be corrected if the principle is accepted as rigid. S: Our RFD process for proper names often looks like depending on "whims" of the participants. Often, the rationale is "not dictionary material" or "encyclopedic", which says almost nothing. Under these conditions, those who would want to expand the dictionary with, say, names of organizations have zero guarantee that what they create will not be deleted. With lemmings, the creator could be sure that a lemming-covered term's chance of deletion is minimal.
  • O: Wiktionary is not like Wikipedia, which is a tertiary source whereas we are a secondary source. S: Why does it matter? A Wikipedia analogue would be if Wikipedia said that any topic notable enough for a Britannica article is also notable enough for Wikipedia. And this is very likely to be true: Wikipedia is hugely more inclusive than Britannica. Similarly, we tend to be hugely more inclusive than other dictionaries. We could be keeping what others keep as an included core.
  • O: Wikipedia AfD is full of controversy and debate despite its reliance on authoritative sources. S: What is the significance of it for Wiktionary and lemmings? The application of lemmings is pretty straightforward; at worst, we would need to discuss which dictionaries to accept.
  • O: The principle would lead us to include terms that present avoidable redundancy to Wikipedia. S: That is a real downside. But it is nowhere near as bad as duplicating a million entries for biological taxa from Wikispecies in Wiktionary. At worst, we would have some arguably redundant entries, but no incorrect ones.

--Dan Polansky (talk) 13:08, 15 September 2022 (UTC)Reply

What is consensus?[edit]

This question is of some importance since RFDs are specified in RFD header to be closed based on consensus and the project is supposed to operate based on consensus. Where could the meaning come from?

  • It can be a general meaning of the word, documented in dictionaries: “consensus”, in OneLook Dictionary Search. Webster 1913 PD definition is "Agreement; accord; consent." M-W sense 1 a is unanimity and this is Macmillan sense 1, the only sense Macmillan has and the sole meaning Oxford Learner's Dictionaries have and the sole meaning in Britannica Dictionary. M-W sense 1 b is weaker, speaking of agreement by "most concerned". AHD has two definitions, not wholly clear whether referring to M-W 1 a or 1 b; Collins has a single definition, not clear whether M-W 1 a or 1 b; ditto for Cambridge Advanced Learner's Dictionary. Unanimity is not applicable to Wiktionary, so what can be applicable is "agreement by most" for some value of "most". The definitions suggest that consensus does not mean plain majority; the fact that some dictionaries only define consensus as unanimity suggests it could be something like near-unanimity, or perhaps less than that, say, 80%.
  • The term "consensus" can be redefined by the project. This happened on Wikipedia: W:Wikipedia:Consensus: "Consensus is ascertained by the quality of the arguments given on the various sides of an issue, as viewed through the lens of Wikipedia policy." This does not match the general meaning at all, and implies that a minority can define "consensus" as long as they hold the best arguments. Obviously, a lone dissenter does not have their way in Wikipedia, no matter how strong their arguments are. It follows that the Wikipedia redefinition is doubly Orwellian: 1) it grossly violates the common meaning of the word "consensus"; 2) it pretends to define a workable process, which it does not since it does not allocate the authority of assessment for strength of arguments. To make it work, editors engage in creative interpretation of an unworkable specification. A redefinition happened in English Wiktionary as well when it defined "consensus" for the purpose of votes as 2/3 supermajority, which arguably does not match the common meaning of the word "consensus" either, which would require a higher threshold. A higher threshold would make decision making impractical: far fewer votes would pass and the decisions would more often be made by maniocracy, the rule of those who are most maniacal and stubborn about a subject of disagreement. This naturally leads to edit wars and wheel warring, as happened in the English Wiktionary when two particularly stubborn admins wheel warred in many iterations during a single day, both of them having a history of maniocracy. This redefinition is not so bad a deviation: it lowers the threshold to something manageable but does not become the opposite of consensus, the rule of the philosopher-king who has all the best arguments and advisors. And it still involves discussion before voting and during voting.
If the voting threshold were set flexible, to be lower for matters of mere preference, I would say that 2/3 is a fine overridable default.
  • The term may be left undefined but the actual practice and use of the term on the project may differ from the dictionary definitions. And this seems to be the case of the English Wiktionary for RFD discussions. These are decided predominantly by vote tallying, and the threshold is not higher than 2/3 supermajority. Since the project did not codify a redefinition of consensus, the dictionary definitions still have some force; anyone who uses the general word "consensus" without a project specific redefinition can be fairly accused of violating the meaning of the term when that happens. When the RFD header uses the word "consensus", it loads the meaning from the general dictionaries unless something else takes precedence, and there is no codified thing that would take precedence.
  • The redefinition may be implied to be loaded from the sister project, Wikipedia. But since Wikipedia policies are not automatically taken over to Wiktionary, there is no reason to think Wikipedia consensus redefinition is loaded into Wiktionary.

See also Wiktionary:Beer parlour/2022/September § Closing RFD discussions using the strength of the arguments and Wiktionary talk:Requests for deletion/Header.

A key element of a proper consensus exercise is that lone opposes need to be heard, allowed to argue their case at length and not dismissed as a tiny minority. People's being a tiny minority must not be disparaged. The original supermajority can switch to the opposite as examination of arguments and evidence develops. The relevant movie is W:12 Angry Men (1957 film).

Some quotations:

  • What is Consensus Decision Making?, lucidmeetings.com
    "Consensus is a decision-making approach that seeks to secure the support of the whole group for the decision at hand. Many people believe that consensus is the same thing as unanimous agreement, but this is not necessarily the case. Unanimity is when everyone agrees. Consensus is when no one disagrees."
    Note: There is no way this could work in Wiktionary; there is no chance we will argue so long until everyone agrees or at least abstains.
  • What is Consensus?, creativemanitoba.ca
    "Consensus is a decision-making process that works creatively to include all persons making the decision. Instead of simply voting for an item, and having the majority of the group getting their way, the group is committed to finding solutions that everyone can live with. This ensures that everyone’s opinions, ideas, and reservations are taken into account. There is an agreement not to move forward in a direction or take action unless all parties agree."
    Note: Ditto: No chance we could make all opposers abstain. Not realistic.
  • What does it mean when a decision is taken “by consensus”?, ask.un.org
    "When a vote is taken and all Member States vote the same way, the decision is unanimous. When a decision is taken by consensus, no formal vote is taken. A 2005 Legal Opinion distinguishes consensus as follows: consensus “is understood as the absence of objection rather than a particular majority” (UN Juridical Yearbook 2005, page 457). Resolutions and decisions adopted by consensus are considered as “adopted without a vote”, although they are distinct from decisions made under the without-a-vote procedure."
    Note: Ditto: No chance we could make all opposers abstain. Not realistic.
  • What is consensus?, quora.com
    "An outcome that all members of a group can live with." --Flemming Funch
    Note: Ditto: No chance we could make all opposers abstain. Not realistic. Confirms the unanimity interpretation of M-W 1 a.
  • What is consensus?, webopedia.com
    "In blockchain technology, the consensus mechanism formalizes an agreement in which the majority, or at least 51%, of network nodes agree that a particular digital transaction was executed and completed in the blockchain network."
    Note: Very unconventional, but here it is, a plain majority.
  • What Is Consensus?, consensusclassroom.org
    'For the purposes of this site, we have agreed on this definition: consensus is conscious agreement by everyone. Under this definition, a decision is not final until everyone in the group agrees with it. We call the interaction that leads to consensus--or at least attempts to find consensus--the “consensus process.”'
    Note: Ditto: No chance we could make all opposers abstain. Not realistic. Confirms the unanimity interpretation of M-W 1 a.

Dan Polansky (talk) 20:27, 9 September 2022 (UTC)Reply

Expanded. --Dan Polansky (talk) 07:12, 30 September 2022 (UTC)Reply

RFD tallies AKA vote counting in RFD and language of votes in RFD[edit]

Let me put on record some RFDs that contain tallies: Talk:horror movie, Talk:acute-angled triangle, Talk:less-than-stellar, Talk:boon companion, Talk:pissed as a fart, Talk:instant karma, Talk:I'm pregnant, Talk:about that life, Talk:about to, Talk:suꝑficialis, Talk:on TV, Talk:neuntausendneunhundertneunundneunzig, Talk:little boy, Talk:wait for, Talk:bus route, Talk:mackinaw coat, Talk:three hundred, Talk:two hundred, Template talk:bor+, Talk:Merry Christmas and a Happy New Year. I searched for "keeps" to find them. Many of them were by me; I seem to be an outlier in making tallies explicit, but there are some other people as well.

An interesting note from 2009: "I see four deletes (Visviva, DCDuring, Algrif, and myself) and two keeps (SGB and Lmaltier). I don't feel qualified to delete this on such a slim majority of which I'm a member, so I'll leave it and hope someone else deletes it." What an honest admin; there was a 2/3 supermajority.
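The arithmetic behind such a tally can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, abstentions are left out of the count, and 2/3 is taken as the threshold because that is the figure discussed on this page, not because any policy for RFD is being quoted.

```python
from fractions import Fraction


def deletion_reaches_threshold(deletes, keeps, threshold=Fraction(2, 3)):
    """Return True if the share of delete votes reaches the threshold.

    Abstentions are assumed not to count, so only deletes and keeps are passed in.
    Exact fractions are used so that 4 out of 6 compares equal to 2/3.
    """
    cast = deletes + keeps
    if cast == 0:
        return False  # nothing to tally
    return Fraction(deletes, cast) >= threshold


# The 2009 tally quoted above: four deletes, two keeps.
print(deletion_reaches_threshold(4, 2))
# A tally of five deletes and four keeps (one abstention ignored), as in another RFD.
print(deletion_reaches_threshold(5, 4))
```

Four deletes against two keeps is exactly 2/3 and so reaches the threshold, while five against four is 5/9 and does not.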

Some comments from RFD discussions using the language of "vote":

  • "This vote is now almost even: 5 deletes: GoldenRowley, EP, MSH, Angr, and DAVilla; 1 abstain: Ruakh; 4 keeps: Widsith, Stephen, Algrif, DCDuring".
  • "Unless you give a reasonable rationale, I vote keep. —Μετάknowledge"
  • "include-all-place-names party, so I vote keep) — [ R·I·C ] opiaterein"
  • "Used as an idiomatic (and I'd vote "Keep" even if it wasn't) Purplebackpack8"
  • "My vote is now Keep, unless a native German speaker suggests otherwise. Smurrayinchester"
  • "I would vote to weak keep, given Paul's objection, provided that, as suggested by Raifʻhār ..."
  • "I vote keep as well as per Siuenti."
  • "I voted Delete in the last 25 discussions about this, and I still vote the same."
  • "I'm on the fence and will wait for more comments before casting a !vote. - -sche" -- the ! plays the Orwellian game that these votes are non-votes, and yet they are votes.
  • "Keep Bad-faith nomination, no RfD grounds, and I'd vote to keep nonetheless as it's unclear whether gospel refers to religious music"
  • "I feel obliged to not vote as I've never heard of it, and the definition makes very little sense"
  • "I vote keep because it is not just a geographical description, but an identifier"
  • "I have to agree with Shinji here and vote keep."
  • "Because this is arguably the highest authority for modern Latin, I vote to immediately remove the RFD. --Robert.Baruch"
  • "Why didn't I cast a vote on this? It seems completely useless to me. --Per utramque cavernam"
  • "For completeness, I vote keep for the literal sense. Mihia"
  • "I'll vote keep to avoid any future headaches. --Robbie SWE"
  • "Looks compelling. I would vote keep if it counted for anything. DAVilla"
  • "I voted to keep Chinese food, and would vote to keep Mexican food or Indian food,"
  • "Equinox voted to delete this, see his comment in the "Mona Lisa" section. note placed by - -sche"
  • "Having said that, we usually ignore the rules and just have a 'vote', unlike RFV where we rely on evidence. Mglovesfun"
  • "Separately noting that my keep !vote above also extends to this entry, which is now attested. !Votes to "Keep all" by User:Metaknowledge," -- The !vote game again.
  • "By the way, I'll preemptively vote keep for -woman and -person."
  • "I vote to keep it. —Μετάknowledgediscuss/deeds 20:14, 22 March 2020"
  • "Since it's now proven to be a greeting, I'll change my vote to keep. We could also add it to the phrasebook. Thadh"
  • "My vote for now is delete as SOP."
  • "I vote to keep. It is a reasonably common variant. Kiwima"
  • "I'm changing my vote to "keep". We have something similar at shoe-leather. — Cheers, JackLee"
  • "Feel free to re-RFD it in the hope more people will vote this time. - -sche (discuss) 00:35, 15 October 2012"
  • "But that doesn't seem to be meant here, so I vote keep. -- Liliana • 17:26, 27 May 2013"
  • "Abstain for now (I'll probably !vote later). - -sche (discuss) 06:47, 2 July 2020"
  • "Nevertheless I still vote to keep the compounds. Ƿidsiþ"
  • "To sum up, I change my vote to neutral. --Hekaheka 17:41, 2 December 2009"
  • "I once again vote "weak keep." --Connel MacKenzie 19:17, 13 July 2007"
  • "This was enough to convince me to change my vote to keep. Ungoliant MMDCCLXIV 18:38, 31 March 2012"
  • "Which makes my vote on this matter a definite and (to me) obvious keep, by the way. — Kleio"
  • "I've created the entry, so I'll vote Delete Chuck Entz (talk) 15:33, 14 October 2012"
  • "The separate sense is silly and should not support keep votes. Equinox ◑ 07:56, 22 March 2021"
  • "I would vote for deletion. BedfordLibrary"
  • "I'm upgrading my vote to a full keep. - -sche (discuss) 23:11, 26 June 2014"
  • "It beats me how this got a nonzero number of keep votes. Imetsia (talk) 18:25, 25 December 2021"
  • "If an entry to be made and no evidence beyond the current two cites were provided I would vote to delete that. - TheDaveRoss 12:40, 25 July 2022"
  • "I suppose so. I've changed my vote to a weak keep. Andrew Sheedy (talk) 17:14, 8 October 2016"
  • "@Khemehekis: is this a vote to undelete? If so, please indicate that. — SGconlaw (talk)"
  • "Four votes for deletion. No dissents. Deleted. Mihia (talk) 01:04, 30 August 2019"
  • "I'm changing my !vote to abstain. - -sche (discuss) 17:14, 27 August 2020"
  • "Now that standalone usage has been found, see my vote below. - -sche (discuss) 21:54, 7 March 2014"

And so on; the occurrences of "vote" in RFD discussions go on and on, from all sorts of parties and ranges of years. Some of the occurrences are of the Orwellianism or Wikipedianism "!vote", but not too many. Dan Polansky (talk) 17:16, 10 September 2022 (UTC)Reply

Yes, Dan: we can take into account how people vote whilst also taking into account other factors too. You are missing the woods for the trees. Theknightwho (talk) 18:00, 10 September 2022 (UTC)Reply
Where is your evidence? --Dan Polansky (talk) 18:01, 10 September 2022 (UTC)Reply
The very fact that people vote with differing "strengths" demonstrates that we all acknowledge that votes are not simply a matter of quantity - and nor is there an appetite for that to be quantified, either. Rather than obsessing over this, I would spend your time more constructively on contributing to the dictionary. Whatever you think you're doing here, it is clear that there is no consensus to change things. Theknightwho (talk) 18:10, 10 September 2022 (UTC)Reply
What I am doing in this thread is an exercise in collection of evidence. If you have nothing to contribute to the exercise--discovery of interesting evidence, not evidence-free pronouncements--I would prefer you stop responding here. You already made your representations to the above effect multiple times, and never provided any evidence, always just evidence-free pronouncements. This is just my preference and I cannot prevent you from responding, and do not intend to, but my preference is clear. --Dan Polansky (talk) 18:15, 10 September 2022 (UTC)Reply
Sure thing, Dan. Whatever you say. Your own evidence usually suffices to prove my point, in any event. Theknightwho (talk) 18:18, 10 September 2022 (UTC)Reply

As far as more evidence, I found this: "If we look at the actual votes, only two people (Nicodene, Benwing) voted to delete the English entry (perhaps a third if you count PUC) while one (Theknightwho) expressed uncertainty as to whether it should be kept. This is obviously not sufficient consensus to speedy the English entry. [...] it went under the radar of many editors who frequently participate in English RFDs and who I could reasonably imagine voting to keep this (@AG202, BD2412, Binarystep); their potential votes should of course not be ignored. [...] makes it look as though these people voted to also delete them which is not the case and which cannot be tracked once the discussion has been archived to the talk page. — Fytcha〈 T | L | C 〉 12:25, 10 July 2022 (UTC)" Again, the language of "vote", very recent one. And yet, we read elsewhere "You have mistaken discussion pages for votes, but they are not votes — they are discussions." Nonetheless, the evidence is overwhelming. --Dan Polansky (talk) 19:00, 10 September 2022 (UTC)Reply

Please read motivated reasoning again. This is just confirmation bias. Instead of assuming that you are the only person who has spotted something, reflect on why nobody else seems to agree with you. Theknightwho (talk) 19:09, 10 September 2022 (UTC)Reply
If I am only selecting evidence to suit my purpose in a one-sided manner, it should be possible for others to find evidence that points against my purpose. I have found no such evidence. Others can be better motivated, if only they tried. In fact, the above evidence suggests that many people implicitly do agree with me about what the actual practice is. And the "nobody else" claim is refuted by the RFD threshold vote, and by the following: 'Our RFD system is primarily based on votes (a fact that can't be ignored), so its helpful to “solidif[y] the consensus standard” if we count votes at all (which I think we do, even if we ignore some particular votes).' --Dan Polansky (talk) 19:17, 10 September 2022 (UTC)Reply
There's no point, because (a) you never accept other points of view anyway, and (b) the opinion against you is overwhelming. It's not "one side" versus the other - it's you versus everyone. The fact you're still conflating how people believe things are with how people believe things should be from that thread is hopeless. Theknightwho (talk) 19:29, 10 September 2022 (UTC)Reply
Really? It's me vs. everyone despite the vote above and the quote above? As for should vs. is, I see no should in "Our RFD system is primarily based on votes" and "I vote keep". I hope the reader sees the same things I do. --Dan Polansky (talk) 19:36, 10 September 2022 (UTC)Reply
The fact you're still conflating how people believe things are with how people believe things should be from that thread is hopeless. You seem to go into every discussion with the idea that your opinion is self-evidently true. "Primarily" does not demonstrate your view, which is that votes are the only real factor. Also, stop calling every opinion that disagrees with you Orwellian - it reminds me of people that call opposing opinions "gaslighting": it's just a way to ignore cognitive dissonance. Theknightwho (talk) 20:02, 10 September 2022 (UTC)Reply
Am I conflating? How so? I see no should in "Our RFD system is primarily based on votes" and "I vote keep". Why all the talk about "me" and not about the subject matter? Why all the ad hominem all the time? Is the reader going to discard all the evidence only because I am such and such? --Dan Polansky (talk) 20:10, 10 September 2022 (UTC)Reply
And what happened to "Sure thing, Dan"? Or did you post any evidence? --Dan Polansky (talk) 20:11, 10 September 2022 (UTC)Reply
You said The evidence that RFDs are tallies is overwhelming. One person saying that they are "primarily based on votes" does not support that opinion. Cherrypicking evidence that only supports your view is not a reasonable way to investigate something: if you truly wanted to gauge opinion, you'd be collecting that which went against your position, too. Unless, that is, when you said you were "engaging in the collection of evidence" you actually meant "engaging in confirmation bias".
And Dan, maybe I'm talking about you because you keep using language like "Orwellian" or implying that admins are being dishonest, which is itself an ad hominem attack. You have a lot of double standards when it comes to yourself, it seems, and never seem to consider that I might have a point. If you want to talk fallacies, by the way, you also love the fallacy fallacy: saying "that's ad hominem" is not an excuse to ignore something. Theknightwho (talk) 20:23, 10 September 2022 (UTC)Reply
More ad hominem. This was an exercise in evidence collection. It turned out badly. If I only could stop responding. --Dan Polansky (talk) 20:26, 10 September 2022 (UTC)Reply
You very obviously didn't even read it. You aren't as smart as you think you are, Dan. Theknightwho (talk) 20:27, 10 September 2022 (UTC)Reply
Wow. --Dan Polansky (talk) 20:28, 10 September 2022 (UTC)Reply
If you can't understand the basic argument that only collecting evidence in favour of one opinion is not genuine evidence collection, it's difficult to conclude otherwise. Theknightwho (talk) 20:30, 10 September 2022 (UTC)Reply

Include attested proper names that are lexicographically interesting[edit]

The inclusion of proper names is an unresolved topic for many classes of them. Not fully regulated are names of individuals including nicknames, organizations, events, literary works, and so on. Regulated are geographic names, astronomical names, and some two-part names of individuals.

The following principle seems reasonable to me:

(1) An attested proper name that is lexicographically interesting shall be included.

But what does it mean, lexicographically interesting? It means the entry can carry some interesting lexicographical classes of information. One class that cannot count is the meaning since proper names are generally not sum of parts and have meaning in the form of the referent. This leads to a more refined principle:

(2) An attested proper name that has interesting pronunciation, etymology, or inflection shall be included. These classes of information do not count if they can be derived from the components.

The above may be deemed too stringent as not catering for translation. From this we can derive a more specific principle:

(3) An attested proper name that is a single word and is not a capitalization of a common noun shall be included.

The above relates to the slogan: all words in all languages, but not all multi-word terms in all languages. Furthermore, the above seems to be an application of the current CFI by combining the slogan of CFI with WT:NSE's many shall be excluded while some shall be included: unless a specific regulation such as WT:FICTION excludes a specific attested word, it is included via the general slogan and WT:NSE does nothing to give editors discretion to override the slogan.

We may decide to include derivation as a relevant property:

(4) An attested proper name that has produced an includable derived term shall be included.

A related principle is not about inclusion of names per se but rather of senses:

(5) If an attested proper name has produced figurative uses or is to be included as something else than a proper noun, say, a common noun or interjection, the proper noun section should be included as well.

The above is not a direct consequence of (1) but rather based on the notion that it is odd to exclude a form that is predominantly used as a proper noun while including a common noun section for it which sees much less use.

The above criteria stand in sharp contrast to "encyclopedic", which for proper names gives us no observables to work with.

The generalized lemming principle would lead to inclusion of some names not supported by the above:

(6) If an attested proper name is present in multiple general monolingual dictionaries, it shall be included.

Some object to the lemming principle as being based not on observable characteristics of names but rather deferring to an external authority. However, they do not propose comprehensive observational criteria to replace the principle with; "encyclopedic" is no such criterion. In the analysis of examples below, I will not invoke the generalized lemming principle since it is not an elaboration of (1). However, the following analyses of (7) and (8) show how extremely useful the lemming principle is: it is trivially easy to administer and complements our efforts to devise authority-free inclusion criteria. It pulls in some terms listed below under (7) and pulls in United Arab Emirates, explicitly excluded by (8). It pulls in World War II. It pulls in George VI via M-W, AHD and Collins, which is disliked.

One may want to be more lenient toward two-word proper names, and introduce the following vague principle:

(7) The more words a proper name is composed of, the less defensible is its inclusion. Especially names consisting of 3 or more words are liable to be excluded; there may be more lenience toward two-word names.

The above is not particularly principled, but could save the likes of United Nations, European Union, United Kingdom, Great Britain, Atlantic Ocean, Yellow River, Mississippi River, Red Cross, Red Crescent, Orthodox Church, Catholic Church, Democratic Party, Republican Party, and George VI, especially if they are supported by the lemming principle (and all the listed cases are supported by it), while still excluding Albanian Orthodox Church and Japan Socialist Party. As a downside, this would save very many two-word person names, which we very likely do not want, and we probably do not want to have all attested English streets. This principle would therefore have to be combined with another one, like prominence or notability of an organization or with the authority-based lemming principle. Democratic Party and George VI were deleted; the support for the lemming principle for proper names is currently fairly weak, and a vote for the principle resulted in approximately 50:50. Lemmings sometimes save 3-word names, as they do for Small Magellanic Cloud; they do not save United Kingdom of Great Britain and Northern Ireland. In the following, the above will not be invoked, in part for the said reasons, in part since it is not an elaboration of (1).

To augment a possible application of (7), we may introduce the following principle:

(8) A multi-word proper name that names the referent in a transparent manner, and thus is quasi sum of parts, shall be excluded, an example being National Basketball Association.

NBA is in fact a national basketball association. The above excludes United Kingdom of Great Britain and Northern Ireland (in fact a united kingdom of the two entities, not in dictionaries) but not United Kingdom (ambiguous), United States (ambiguous) and Red Cross (not a red cross). While giving us confidence to exclude countless transparently named associations, it seems too weak as a sole exclusion filter: the multi-word nontransparently named organizations may be too numerous and are covered by Wikipedia. And it may have consequences some will dislike: United Arab Emirates are in fact united Arab emirates and would be excluded. The distaste for excluding UAE may be driven by encyclopedic or referent-based intuitions, such as "include all names of states even if they are transparent" or "achieve some consistency from the point of view of classes of referents", rather than by purely lexicographic grounds. Further excluded by (8) would be Federal Republic of Germany but not German Democratic Republic (not democratic) and not Holy Roman Empire (not holy, not Roman, and probably not an empire). One may weaken (8) by saying "are more liable to be excluded" instead of "shall be excluded", to cater for arbitrary inclusion tastes.

A principle probably de facto used is this:

(9) An attested multi-word nickname that is not the official name shall be included.

This is an elaboration of (1) and the encyclopedia article would not necessarily cover such trivia as nicknames. This would cover North Atlantic Terrorist Organization, Orange Man, God Emperor, Pharma Bro, Korea Fish, Vegetable English, and Elongated Muskrat.

One feature of the above principles is that the size or prominence of the referent plays no role. Atlantic Ocean is large and Milky Way and Small Magellanic Cloud more so, but that does not save them; Dubya is just one person, but that is fine.

There is nothing lexicographically principled about the current WT:CFI for geographic names and astronomical names: they list admissible classes of referents as if it were referents that mattered for the dictionary, not their names. And they separate rules for geographic names and astronomical names into two sets of rules as if the differences in the kinds of referents had any lexicographical significance. This lexicographically unprincipled approach naturally leaves other classes of referents such as organizations and literary works without coverage, implying that further referent-class-specific rules enumerating various subclasses of referents will be required. The opposition to names of people that sometimes arises is similarly lexicographically unprincipled: it seems concerned with the smallness and insignificance of the referents rather than with the names. The principles proposed and analyzed here do not suffer from this defect.

Some applications of the above are uncontroversial, including first names, surnames, patronymics and matronymics, as well as single-word names of large geographic entities.

Let us examine some applications to the cases that are more controversial and those that are not but illustrate how the principles would apply to them. Where the result is that the name does not need to be included, it can perhaps still be included by some principle not covered, still following from (1).

  • Great Pyramid of Giza, name of a geographic feature, does not need to be included. It was RFD-kept; it has interesting non-compositional translations.
  • Colossus of Rhodes, name of a geographic feature, does not need to be included. It was RFD-kept.
  • United Kingdom of Great Britain and Northern Ireland, name of a political entity, does not need to be included and is excluded by (8).
  • Czechoslovak Socialist Republic, name of a political entity, does not need to be included and is probably excluded by (8).
  • Federal Republic of Yugoslavia, name of a political entity, does not need to be included and is probably excluded by (8).
  • People's Republic of China, name of a political entity, does not need to be included.
  • Holy Roman Empire, name of a political entity, does not need to be included.
  • United Kingdom, name of a political entity, does not need to be included.
  • Great Britain, name of a political entity, does not need to be included unless one argues that British is derived from it and is thus saved per (4).
  • United States, name of a political entity, shall be included per (4) via United Statian.
  • New York, city name, shall be included per (4) via New Yorker.
  • New Jersey, state name, shall be included per (4) via New Jerseyan.
  • Orange County, name of a county, does not need to be included.
  • Mississippi River, name of a river, does not need to be included.
  • Yellow River, name of a river, does not need to be included.
  • Atlantic Ocean, name of an ocean, does not need to be included.
  • Neuschwanstein, name of a geographic feature, shall be included as a single word.
  • British English, name of a language, does not need to be included.
  • Multicultural London English, name of a sociolect, does not need to be included.
  • Lysistrata, name of a literary work, shall be included as a single word that has etymology and pronunciation and has Lysistratan.
  • Decameron, name of a literary work, shall be included as a single word that has etymology and pronunciation, and has Decameronesque.
  • Clouds, name of a literary work, does not need to be included: it is a capitalization of a common noun.
  • Ali Baba and the Forty Thieves, name of a literary work, does not need to be included, provided we include Ali Baba.
  • Iliad, name of a literary work, shall be included. It is a single word, has etymology and pronunciation.
  • Odyssey, name of a literary work, shall be included. It is a single word, has etymology and pronunciation.
  • King James Bible, name of a literary work, does not need to be included.
  • Blackadder, a British sitcom, does not need to be included since the proper noun is covered on other grounds.
  • Sirius, name of a star, shall be included as a single word with etymology and pronunciation.
  • Small Magellanic Cloud, name of a galaxy, does not need to be included.
  • Triangulum Galaxy, name of a galaxy, does not need to be included.
  • Ku Klux Klan, name of an organization, shall be included. It has etymology and pronunciation.
  • Greenpeace, name of an organization, shall be included as a single word. The etymology and pronunciation seem transparent, though.
  • Transparency International, name of an organization, does not need to be included: it results from combination of two capitalized common nouns.
  • United Nations, name of an organization, does not need to be included as per above. It can be included on other grounds. Its not being the full name of United Nations Organization makes no difference for the criteria proposed.
  • United Nations Organization, name of an organization, does not need to be included and is excluded by (8).
  • United Nations Economic and Social Council, name of an organization, does not need to be included and is excluded by (8).
  • UN Security Council, name of an organization, does not need to be included and is excluded by (8).
  • United Nations General Assembly, name of an organization, does not need to be included and is excluded by (8).
  • United States Marine Corps, name of an organization, does not need to be included and is excluded by (8).
  • British Broadcasting Corporation, name of an organization, does not need to be included and is excluded by (8).
  • National Aeronautics and Space Administration, name of an organization, does not need to be included and is excluded by (8).
  • Orthodox Church, name of an organization, does not need to be included.
  • Albanian Orthodox Church, name of an organization, is even less includable than Orthodox Church and is excluded by (8): it is the Albanian branch of the Orthodox Church.
  • General Motors, name of a company, does not need to be included: it results from combination of two capitalized common nouns.
  • American Airlines, name of a company, shall be included per (5) via poker sense.
  • Verizon, name of a company, shall be included as a single word. It has etymology and pronunciation. Due to the non-phonetic spelling of English, pronunciation is generally unpredictable. A request for undeletion failed; too many people want to exclude nearly all company names.
  • Lufthansa, name of a company, shall be included as a single word.
  • Finnair, name of a company, shall be included as a single word.
  • Aeroflot, name of a company, shall be included as a single word.
  • Air France, name of a company, does not need to be included.
  • Iberia, name of a company, does not need to be included since it is named after a geographic object of the same name that is included.
  • Microsoft, name of a company, shall be included as a single word with etymology, pronunciation and, in inflected languages, inflection, and also per the derived form Microsoftie, derived not from the common noun section but from the proper noun.
  • Strategic Defense Initiative, name of a defense system, does not need to be included.
  • Monty Python, name of a British group, shall probably be included per (4) via Pythonesque, unless one argues that this is covered by Python.
  • London Bridge, name of a landmark, does not need to be included.
  • Lisp, a programming language name, shall be included.
  • Governator, a nickname of a person, shall be included as a single word. Ephemeral. Similar are Zizou and Woz.
  • RPattz, a nickname of a person, shall be included as a single word. Ephemeral. This was deleted and then undeleted.
  • J-Lo, a nickname of a person, is harder to analyze because of the hyphen. But J and Lo alone are not words, so one could argue this is not a transparent two-word combination. It survived RFD.
  • Crouchy, a nickname of a person, does not need to be included if analyzed as a capitalization of adjective crouchy. But since it is Crouch + -y, it may be included as a generic nickname, and that's what we do.
  • Orange Man, a nickname of a person, shall be included per (4) via Orange man bad.
  • Pharma Bro, a nickname of a person, does not need to be included.
  • Captain Swing, name of a person, shall be included per (4) via Swingism.
  • Jesus Christ, name of a person, shall be included per (5): it has a common noun section and an expletive section.
  • Saint Mary, name of a biblical character, does not need to be included.
  • Mary Magdalene, name of a biblical character, does not need to be included.
  • Mother Teresa, name of a person, has produced figurative uses and shall be included per (5).
  • John Lennon, multi-word person name, has produced John Lennon glasses and shall be included per (4). Currently forbidden by CFI.
  • Albert Einstein, multi-word person name, shall be included per (5) via uses like "the Albert Einstein of". Currently forbidden by CFI.
  • George VI, a regnal name, does not need to be included.
  • Star Wars, name of a franchise, has produced figurative uses and shall be included per (4) via Star Warsy.
  • Star Trek, name of a franchise, shall be included per (4) via Star Treky.
  • Ungoliant, name of a fictional character, shall be included as a single word with etymology and pronunciation.
  • Shelob, name of a fictional character, shall be included as a single word with etymology, pronunciation and even translation.
  • Sleeping Beauty, name of a fictional character, shall be included per (5), by producing non-proper noun parts.
  • Pinocchio, name of a fictional character, shall be included per (2), (3) and (5).
  • Pinocchio, name of a fairy tale or story, shall be included per (2), (3) and (5), although including it as a fictional character makes this much less urgent.
  • Eeyore, name of a fictional character, shall be included per (2) and (3).
  • Asterix, name of a fictional character, shall be included per (2) and (3).
  • Obelix, name of a fictional character, shall be included per (2) and (3). Was deleted.
  • Winnie-the-Pooh, name of a fictional character, shall be included per (2): the pronunciation does not seem to be derivable from the components.
  • Little Red Riding Hood, name of a fictional character, does not need to be included.
  • Pippi Longstocking, name of a fictional character, shall be included per (5).
  • Harry Potter, name of a fictional character, shall be included per (4) via Harry Potter glasses.
  • Mickey Mouse, name of a fictional character, shall be included per (4) via Mickey Mouse cap and per (5) via its adjective section.
  • Count Dracula, name of a fictional character, shall be included per (5) via figurative uses.
  • Rin Tin Tin, name of fictional dogs, shall probably be included per (2) via pronunciation.
  • Tom Sawyer, name of a fictional character, shall be included per (5) via its verb section.
  • Indiana Jones, name of a fictional character, shall be included per (4) via Indiana Jonesesque and per (5) via figurative uses.
  • Ali Baba, name of a fictional character, shall be included per (5) via figurative uses.
  • Clifford the Big Red Dog, name of a fictional character, does not need to be included.
  • Sherlock Holmes, name of a fictional character, shall be included per (4) via Sherlock Holmesish.
  • King Kong, name of a fictional character, is less clear: if Kong is taken as a base word, there is no support for inclusion.
  • Pražákova, Czech name of a street, shall be included as a single word with etymology, pronunciation and inflection.
  • Ringstraße, German name of a street, shall be included as a single word.
  • Luční, Czech name of a street, does not need to be included: it is a capitalization of adjective luční.
  • Draconus, name of a computer game, shall be included per (2) and (3) as a single word.
  • Manic Miner, name of a computer game, does not need to be included: it is two capitalized common nouns.
  • Legend of Zelda, name of a computer game, does not need to be included as per above.
  • Zelda, a short name of a computer game, shall be included per (4) via Zeldaesque.
  • Sempron, a name for processors, shall be included as a single word.
  • Athlon, a name for processors, shall be included as a single word.
  • Pentium, a name for processors, shall be included as a single word.
  • Itanium, a name for processors, shall be included as a single word.
  • PowerPC, a name for processors, shall be included as a single word.
  • Haswell, a name for processor architecture, does not need to be included since it is covered by geographic name.
  • Archimedes, a name for computers, does not need to be included since it is covered by person name.
  • Speccy, a nickname for computers, shall be included as a single word.
  • Miggy, a nickname for computers, shall be included as a single word.
  • Amiga, a name for computers, could be includable for pronunciation unless argued to be covered by amiga.
  • Amiga, a Czech name for computers, shall be included as a single word for the inflection; amiga is not a Czech word.
  • Commodore, a Czech name for computers, shall be included as a single word for the inflection and pronunciation; komodor is a Czech word pronounced the same.

Let us consider some objections:

  • O: This will open the floodgates for proper names. R: It will, but the floodgates are already wide open for proper names that are not lexicographically interesting, including two-word species names and multi-word geographic names: geographic name Auburn Lake Trails, taxon Salamandra salamandra, and astronomical name Small Magellanic Cloud. The sets of geographic names and names of species are huge; species are well covered by Wikispecies, including vernacular names in multiple languages. Another class of included, lexicographically rather uninteresting proper names is initialisms for associations such as AAAANZ; they are rather numerous, uninteresting from the standpoint of etymology and pronunciation, and are covered by Wikipedia. They can be argued to be sum of parts as for etymology and pronunciation despite the components not being separated by space; they are not sum of parts as for semantics, but that is the usual case with proper names in general. Per Wikipedia, 1.9 million species have been identified and described: that's what we shall call floodgates wide open. Wikispecies currently has 815,596 content pages.
  • O: This will open the floodgates for commercial promotion for company names. R: Having a name in the dictionary does not promote the company if many single-word company names are included. Wikipedia runs a much more substantial risk of promoting companies by allowing extensive statements about them and their products and services. A simple definition line identifying the line of business does very little promotion work. And the attestation requirement still requires independence, which involves having 3 authors.
  • O: This will open floodgates for names of fictional characters. R: For one thing, as said, the gates are already wide open, and if the material is lexicographically interesting, why not. For another thing, attestation still requires three different authors to use the name. Admittedly, we could seek some compromise, e.g. by requiring various degrees of spread beyond the original works. For instance, if the names appear not only in the original literary works but also in a film adaptation, that could signify prominence. The current criteria for inclusion of fictional characters are very similar to the infamous attributive-use rule and exclude lexicographically interesting material. Like it or not, fictional characters from Tolkien's works have as good a penetration of the common mind as characters from the Bible; there surely are Biblical characters (not Jesus) unknown to many who are familiar with Tolkien characters.
  • O: This will open floodgates for geographic names of minor features, including street names in many languages, including Czech and German. (Less so in English, where the name often contains "street" separated by space.) R: This is a fine objection and the current rules for inclusion of geographic names are lenient enough. On the other hand, street names in Czech are lexicographically interesting at least for inflection: they are often formed as possessive adjectives, e.g. Pražákova. If we are interested in lexicography, we are forced to admit that Pražákova is more lexicographically interesting than Small Magellanic Cloud. Some street names can still be excluded: Luční is a capitalization of adjective luční. The definitions for street names would only say "street name" and no more, or say "street name, most notably referring to X", without listing all the streets as sense lines.
  • O: Inclusion of nicknames of individuals opens the gates for trivia and ephemeral objects. R: It is true, but from a lexicographic standpoint, it does not seem to matter. If one wants to extract Wiktionary data for a spell checker, should not Wiktionary include Governator as a valid and well attested word form? Furthermore, Microsoftie and IBMer are equally ephemeral without being proper names. Something being ephemeral does not seem to be a valid lexicographical criterion.
  • O: Translation is also a lexicographical class of information and should be considered. R: Maybe, but it is hard to assess how much this will open the floodgates even more for multi-word proper names. The proposed principles are controversial enough even without considering translation.
  • O: Including Albert Einstein via "the Albert Einstein of" uses adds no value beyond what one finds in an encyclopedia; there, one learns about the characteristics of the person. R: Fair enough. Explicitly forbidding such names seems fine, even if not particularly principled.

For ease of tracking this subject, there is now Category:RFD result for proper names (failed). Dan Polansky (talk) 12:42, 12 September 2022 (UTC)

Expanded. --Dan Polansky (talk) 17:03, 12 September 2022 (UTC)
Modified. --Dan Polansky (talk) 11:57, 13 September 2022 (UTC)
Expanded. --Dan Polansky (talk) 09:17, 14 September 2022 (UTC)

Further reading:

--Dan Polansky (talk) 11:39, 17 September 2022 (UTC)

Whether names are words, requirement of figurative use and all words in all languages

Single-word names are words. Multi-word names are not words. Single-word names are being uttered, have pronunciation, etymology, inflection, translation, they occupy grammatical positions in sentences and succeed in referring to referents, albeit in a manner different from common nouns. If we are to include all words in all languages, we need to include all attested single-word names, including brand names, names of literary works and names of fictional characters. They too get uttered and take positions in sentences. Dictionaries define "name" as a word or phrase with certain characteristics; that fits.

Single-word names do not need to become "lexicalized" (wordized) because they already are words, even without figurative use. The figurative use requirements in CFI have no linguistic basis; they are an arbitrary limit on inclusion of names as names. The names are supposed to earn their place in the database of words by acting not as names but as common nouns. It has never been explained why this needs to be done; it seems to be a case of "I don't like it" or "real dictionaries don't do it". But real dictionaries do not declare inclusion of all words in all languages as their ultimate aspirational aim.

Traditional dictionaries restrict their coverage of names for practical reasons, not because they are not words. Traditional dictionaries do cover some names, including some geographic names; even the name-averse OED does it to a very limited extent. Names are not generic tools of language, unlike common nouns. Biological taxa are often not words but rather two-word terms and are arguably out of a natural scope of a dictionary, better covered by a taxon database.

Dictionaries usually do not include company names and brand names. Historically, they acted under the practical constraints of being published on paper. Including more words also increases the use of human resources. Wiktionary does not have this limitation. Names are not covered in Wikipedia primarily as names; they are covered as subject headings. Wikipedia may include etymology, but that is not its core business; it is the core business of a dictionary.

Take Lysistrata. Almost no OneLook dictionary has it. But it is non-commercial, has etymology and pronunciation. There is no linguistic basis for excluding it, and not much of a practical one either. OneLook dictionaries nearly all include NASA, which is etymologically boring, unlike Lysistrata. They do not cover EASA.

Our inclusion of names can be so much better. WT:NSE gives us discretion for some kinds of names; we should use it. And we should oppose further figurative-use or attributive-sense rules as unnecessary and arbitrary restrictions on coverage of all words in all languages.

Links:

Dan Polansky (talk) 15:36, 15 September 2022 (UTC)

Please rollback your new categories

Categories such as Category:English initialisms for organizations are just the intersection of a language category and a topic category, and are therefore not needed and just add pointless bureaucracy. Please roll them back.

These kinds of changes should also have been discussed first. Theknightwho (talk) 09:08, 16 September 2022 (UTC)

I responded at Wiktionary:Requests for deletion/Others#Category:English initialisms for organizations. It is not our usual practice to discuss creation of new useful categories first. I am surprised that something so eminently useful is now controversial. --Dan Polansky (talk) 09:47, 16 September 2022 (UTC)
It is our usual practice not to create intersection categories, so this should have been discussed first. Theknightwho (talk) 09:50, 16 September 2022 (UTC)
Once completed, this would not be an intersection category since the items would no longer be in the Organizations category. Utility is key. --Dan Polansky (talk) 09:56, 16 September 2022 (UTC)
Yes it would be, because it’s literally in the name, and you’ve categorised the new category in both. All this means is that anyone looking for organisations who doesn’t know whether something is an initialism or an acronym (or is not expecting the division) has to look in multiple places. Theknightwho (talk) 10:01, 16 September 2022 (UTC)
Let's continue the discussion about the merits in RFDO. --Dan Polansky (talk) 10:10, 16 September 2022 (UTC)
Funny how you only say that when you realise your bullshit won’t fly. Theknightwho (talk) 10:18, 16 September 2022 (UTC)
Ouch. Having one place is good. --Dan Polansky (talk) 10:36, 16 September 2022 (UTC)

Criteria for inclusion of multi-word names of organizations

This is a follow-up to #Include attested proper names that are lexicographically interesting. Wiktionary and lemming practices regarding names of organizations are confusing.

A weak exclusion rule is this:

1) Any multi-word organization name that transparently names the organization shall be excluded.

This is too weak a filter; too many organizations have names from which one cannot determine what they are about, an example being Alavi Foundation, found in Wikipedia:Category:Foundations based in the United States. Names like "X Foundation" for some person name X are too numerous. It excludes "National Basketball Association" and many other associations, but that is far from enough.

A stronger exclusion rule is this:

2) Any multi-word organization name that has an entity type in its name shall be excluded.

This would exclude "X Association", "X Foundation", "X Institute", "X Party", "X Trust", "X Center" and the like, regardless of naming transparency or misnaming. It would exclude "United Nations Organization". It would exclude "Democratic Party", which lemmings tend to keep. It would exclude "European Union": it is a union. It would exclude "Bank of England" despite its being the bank of the whole U.K. It would exclude the kept Royal Navy. It would keep Red Cross and Red Crescent. It is probably still too weak a rule, including "Code for America", "Wheels For Wishes" and who knows how many relatively insignificant organizations that do not say they are organizations in their name. It would exclude Catholic Church, Anglican Church and Church of England.

One of the simplest inclusion and exclusion rules is this:

3) A multi-word name of an organization can only be included if it has a lexicographical saving grace, which the meaning/reference isn't, while non-compositional pronunciation or etymology are.

This fairly exclusionist rule would keep Ku Klux Klan, but not United Nations, European Union, European Central Bank, and International Monetary Fund. Greenpeace is out of scope of this rule and of this thread. This is not a bad rule, but rather exclusionist, and not in line with what we do for geographic names, for which we keep the full name of the U.K. and all those "X County" entries. It is perhaps a good idea to do "better" for organization names than what we do for geographic names and not try very hard to be aligned with geographic names. One could amend rule 3) by allowing very important organizations as exceptions, which would allow United Nations, European Union, European Central Bank but also Democratic Party and Republican Party as the major parties governing one of the most powerful countries on Earth. It would include the full NATO name: NATO is as important as countries are. This exception rule is not very principled and depends on an unspecified threshold for "very important"; unmodified rule 3) is principled. A trivially administrable rule for allowed exceptions to rule 3) is the controversial lemming principle. International organizations are in Wikipedia:List of intergovernmental organizations; the list is perhaps not too long to accept. The list says "For a more complete listing, see the Yearbook of International Organizations, which includes 25,000 international non-governmental organizations (INGOs), excluding for-profit enterprises, about 5,000 IGOs, and lists dormant and dead organizations as well as those in operation (figures as of the 400th edition, 2012/13)." 25,000 organizations sounds like too many, but it pales in comparison to the million taxa from Wikispecies. The list further says "A 2020 academic dataset on international organizations included 561 intergovernmental organizations between 1815 and 2015; more than one-third of those IGOs ended up defunct"; 561 is not too many.
The challenge would be to set a criterion for choosing a reasonably small authoritative list outside of Wiktionary; an objector could still say that this is as arbitrary as the lemming principle. A search for "most important organizations" turns up small lists like the one at leverageedu.com, including e.g. North Atlantic Treaty Organization and of course United Nations Organization. The phrase "one of the most important organizations" is not terribly well administrable but gives us something to work with: if any organizations are on the list, UN and NATO are. Then there is W:List of largest political parties. If UN and EU are to be included, something like an arbitrary criterion of importance needs to be taken into account. One can include EU by arguing that it is a semi- or quasi-federation and that it should therefore be regulated like a country such as the U.S., not an organization.

Using arbitrary thresholds for importance is what general dictionaries do for geographic names: they include names of large cities but not of every village. Using a threshold of importance for organizations would be similar; the challenge is to find an administrable threshold. Wikipedia succeeds in finding such thresholds for its list pages. Letting lemmings do the threshold-setting work for us is a neat if controversial trick.

A further inclusion criterion to override 3) is this:

4) A multi-word name of an organization that is not the full official name shall be included.

This would include Black International, No Such Agency and North Atlantic Terrorist Organization. It would still exclude Official Monster Raving Loony Party: humorous or not, this is still a full official name. It would still exclude European Union.

Another approach is to require figurative uses of the name, like what is done for fictional characters. That seems very exclusionist, especially if applied to single-word names of organizations as well. --Dan Polansky (talk) 17:50, 16 September 2022 (UTC)

Let me repeat and add distinguishing properties:

  • multi-word organization name, e.g. United Nations, National Basketball Association and Ku Klux Klan but not Greenpeace
  • name with an interesting etymology, e.g. Ku Klux Klan
  • transparent name, e.g. National Basketball Association and European Central Bank but not Red Cross and Democratic Party
    • transparent name of a national association: these are particularly numerous
  • misnomer, e.g. Bank of England and the full name of NSDAP, but not Red Cross
  • name containing an entity type, e.g. United Nations Organization, World Bank, Bank of England, Alavi Foundation, Catholic Church, Anglican Church, but not United Nations, Red Cross and Red Crescent
    • name indicating it is a regional branch of a larger organization, e.g. Albanian Orthodox Church and probably Greek Orthodox Church
  • name that is not the full name, e.g. United Nations vs. United Nations Organization, Warsaw Pact vs. Warsaw Treaty Organization, possibly North Atlantic Alliance vs. North Atlantic Treaty Organization and World Court vs. International Court of Justice, but not European Union
  • name of an important organization, e.g. Democratic Party, NASA, FBI, CIA, Royal Navy, Bank of England, Church of England
    • name of an important international organization, e.g. United Nations, World Health Organization, full NATO name, African Union, Red Cross and Red Crescent, European Central Bank but not Bank of England
    • name on a specific list of important organizations
      • name covered by at least 2 classical lemmings
      • name covered by at least 3 classical lemmings, e.g. Democratic Party and Republican Party
  • name that sees figurative use
  • name from which other includable term is derived, e.g. Church of England
  • name that has non-trivial translation
  • combined: name without entity type that is an important organization, e.g. United Nations, Red Cross, and Red Crescent, but not European Union, WHO, ECB, FBI, CIA and Royal Navy

Of the above properties, figurative use and interesting etymology are very exclusionist. Allowing all non-transparent names and all misnomers would lead to huge inclusion, and so, probably, would allowing all names with non-trivial translation. Among the others, the only one with sufficient filtering power is "important" or "important and international", in part since the filter is not based on ontology or lexicography but rather on an unspecified quantitative threshold. Something like importance must be used by the dictionaries that contain more than a handful of names of organizations, such as AHD, Macmillan, Collins and Cambridge. It seems to follow that unless we want to include very few or very many names of organizations, an element of arbitrariness such as importance or being international is unavoidable.

Including at least some important names would allow us to serve translation purposes to some extent: e.g. the full name of NATO is often translated into Czech as Severoatlantická aliance, which is non-encyclopedic information about the name. One could probably extract that information from Wikipedia and Wikidata, but that does not make it any less lexicographic. In Lexicographic Criteria for Selecting Multiword Units for MT Lexicons[4], Jack Halpern notes: "MT [machine translation] lexicons, on the other hand, should include as many proper nouns as possible. In fact, most MT systems perform poorly in translating proper nouns in general and multiword POIs in particular. To achieve higher translation accuracy, proper noun resources for MT must be greatly expanded". He further notes: "The recognition and accurate translation of proper nouns, many of which are bilingually non-compositional, are a major issue in MT and other NLP applications. This is especially true for Chinese and Japanese, whose scripts present linguistic and algorithmic challenges not found in other languages." Whether we want to serve machine translation is open for debate, but the class of information is lexicographical. As someone noted, Wikidata does not have the data structures and processes for attestation of translations into various languages.

Lemmings: OED, M-W, AHD, Macmillan, Collins, Cambridge.

OED is most conservative about names of organizations. M-W is less so, but is fairly conservative: it does not have ECB or WHO; it has European Union, United Nations and North Atlantic Treaty Organization but not Warsaw Pact.

Some lists of important organizations:

--Dan Polansky (talk) 11:12, 17 September 2022 (UTC)

Translation dictionary of proper names, Wikipedia and Wikidata

The following is a bold exploration far beyond the reaches of what is customarily accepted. It starts with the following thesis:

1) The primary tool to support translation is a dictionary, not an encyclopedia.

Translators do in fact use encyclopedias to translate proper names, for lack of better options. A related observation is the following, supported by some linguistic sources:

2) Machine translation of multi-word proper names is a hard problem, especially between English and CJK languages.

So is in fact human translation. Human translators will use Wikipedia, Google search and other sources to identify translation hypotheses and to verify them. Proper names are not trivially compositional (sum of parts) even for humans: a human may fairly easily identify a superficially plausible translation, but not the most customary one. A supporting observation:

3) Wikipedia and its interwiki links support translation of proper names, but merely by accident.

It is not the primary purpose of Wikipedia to support translation; indeed, the translation statements of the form "English name A B C is best translated into Japanese as X Y Z" are not part of Wikipedia articles themselves. Today, interwiki is captured in Wikidata. A related observation:

4) Wikidata captures translations of proper names into many languages, but without proper tracing to sources and attesting quotations.

This is supported by observing Wikidata:Q7184 for NATO: there are name translations and alternative names to many languages, but with exactly zero tracing to sources. The above brings us to the following bold observation leading to a bold conclusion:

5) If Wikipedia has enough database space for articles for specific entities, and if Wikidata has enough database space for entries for specific entities that include translations, then Wiktionary has enough space for all proper names of Wikipedia-notable and Wikidata-notable specific entities.

The database load of Wiktionary could be reduced by prohibiting inflected-form entries for multi-word names of specific entities; no big loss. Which leads to the following bold proposal that is going to be acceptable to close to no one:

6) Wiktionary shall include all attested names of Wikipedia-notable specific entities, whether the name is a single word or multiple words.

Entries included by the above will often have compositional translations, but rather often also non-trivial non-compositional translations. It is a much easier criterion than trying to figure out for each name whether some of the translations are non-trivial, especially since most RFD participants do not know CJK languages and since this would require participation from many people knowing different languages to work properly. Let us note that this will not duplicate all Wikipedia headwords in Wiktionary since Wikipedia headwords are often not proper names, e.g. "History of England". A related rule is this:

7) Wiktionary shall include all attested names of Wikidata-includable specific entities, whether the name is a single word or multiple words.

However, Wikidata has 99,387,101 items per Wikidata:Wikidata:Statistics. Random item generation reveals many of them are names of scientific articles. Other items are identifiers, such as "UCAC2 6941167" of a star or "Hypothetical protein CELE_Y39A1A.10", unfit for a dictionary. The case of scientific articles alone suggests 7) is too inclusive.

I don't see this proposal flying, mostly because of the following:

8) The discussions about including proper names are lemming-exclusionist and based on prejudice rather than a rational and dispassionate lexicographical utility analysis.

What I mean by lemming-exclusionist is that the exclusionist says, look, real dictionaries don't include that, so we should not either. I am a lemming-inclusionist, so I say, look, multiple professionally edited dictionaries include that, so we should as well. And professionally edited dictionaries introduce arbitrary frequency or importance cutoffs for inclusion of proper names. The objection remains:

9) Via 6), we are going to duplicate Wikipedia and interwikis in Wikidata, and this is not a business of a dictionary.

And the response is, as has been pointed out, Wikidata does not support proper sourcing, and the task is lexicographical in so far as translation is a lexicographical task, as well as synonymy between alternative names. A related objection is this:

10) Via 6), we are going to duplicate what Google Translate and similar translation technologies already do remarkably well.

That is likely, but whether these technologies using various techniques of automated statistical analysis always succeed is unclear; the translations offered by them show no tracing to sources or attesting quotations. A drawback of proposal 6) is this:

11) While the traditional content of Wiktionary can often be supported by dictionaries, this new bold extension can't.

And that is true. The lexicographer of proper names will need to learn to use other sources. But the lexicographer searching for attesting quotations already knows the drill, understands frequency and predominance in sources, can use Google Ngram Viewer to determine relative frequencies, can learn to search official name databases, and so on. This is not an insoluble problem, and professional translators are solving the name-verification-problem all the time. The lexicographer of proper names will probably acquire a slightly different skillset from the one of the lexicographer of other words. Another possible drawback is this:

12) Not enough people will come to Wiktionary to create that content.

That is quite possible, but we do not know that. There are enough people entering data into Wikidata, a relatively obscure project. If people knew policy prevents their content from being deleted, they could come; there could perhaps be even some non-profit funding for creation of that content. And if they don't, our coverage will be no worse than that of all those languages that have very little coverage of common nouns.

Another possible drawback:

13) Wikipedia notability guidelines are not easily administrable, and Wiktionary will have to defer to decisions made by another project.

That is a real drawback, one which could be addressed by developing custom notability guidelines. Temporarily deferring to another project seems preferable to doing nothing, missing the translation service opportunity altogether.

Some statistics:

Various inclusion and notability criteria much stricter than 6) can be developed. 6) is among the boldest and most inclusive criteria that come to mind and yet is not astronomically inclusive: if we include about a million taxa, we can afford to include about 6 million proper names from Wikipedia. By contrast, including all names of scientific articles from Wikidata would probably achieve a very poor cost-yield ratio: most of them have never been translated, so translations would have to be invented, failing the requirement of verification, or be missing.

I don't see the above changing minds, given all the RFD and policy discussions I took part in on Wiktionary in the past. People went as far as to claim that names are not words, a claim backed with no sources and refuted by many, as well as by straightforward linguistic analysis that a twelve-year-old can make. Hardly anyone seems to use their imagination to see what a dictionary of proper names can be, and whether that is realistic. The above suggests there are no technological constraints, no insurmountable policy design constraints, and no verification and substantiation constraints. Wikidata can solve the translation problem but without verification and traceability; we could do so much better. I expect such responses as "this is insane", which shows that fundamental assumptions about what a dictionary can be have been violated. Another response is the non-differentiated "we are not encyclopedia", a statement that anyone of any intelligence can make with zero nuance and giving no thought to the matter whatsoever. It takes zero analytical effort to note what dictionaries and encyclopedias have historically been doing. Whatever one will say, I can't be accused of failing to think "out of the box".

See also #Include attested proper names that are lexicographically interesting, which is much less bold and focuses on single words, while including some multi-word names.

Links:

Dan Polansky (talk) 14:31, 21 September 2022 (UTC)

Including many names of fictional persons and places

This is going to be slightly repetitive to the following:

WT:FICTION is extremely exclusionist/deletionist. As a first approximation, it amounts to "exclude all names of fictional specific entities"; it allows including only a fraction, namely that which is attested in an "attributive sense". We require no such thing of common nouns.

Failure of justification: WT:FICTION was brought about by Wiktionary:Votes/pl-2008-01/Appendices for fictional terms. The vote is unfortunate, showing bad voting culture: the vote has no rationale and the voters provided almost no rationales, making comments that constitute no reasoning and provide no analysis. The linked Beer parlour discussion fares not much better, failing to explain why most names of fictional characters and places shall be excluded.

Comparison to Wikipedia: Wikipedia is very generous with fictional characters and places. It is much more inclusionist than Britannica. If Wikipedia followed the model of Britannica as narrow-mindedly as Wiktionary follows the model of other dictionaries, it would exclude most of it. Some examples from LOTR: Bilbo Baggins, Rivendell, Mordor, Mirkwood, Geography of Middle-earth, Gondor, Frodo Baggins, etc. The articles are extremely generous, as if Wikipedia were a fan website. Consider Gondor alone: there is so much information there about history and geography. Wikipedia still has a notability policy, not allowing any and all fictional characters and places as separate articles.

Comparison to mythological entities: Hardly anyone is serious about requiring Ancient Greek god names to be used "attributively" or "figuratively" and relegating the specific gods to etymologies. (Nor should they lest they awaken the wrath of gods.) The same is true of other mythological entities such as Chimera. More are at Category:en:Greek mythology, currently not threatened by deletionist application of WT:FICTION, but that can change. We have Merlin and King Arthur, and more at Category:en:Arthurian mythology. One may argue that "fictional entity" is not the same as "mythological entity"; perhaps, but there would seem to be a subclass relationship, each mythological entity being fictional. Thus, mythological entities are not protected by CFI, and are allowed a free pass by tradition.

Comparison to surnames: We are very generous with surnames, and we should be if we want to cover all words. Category:English surnames currently has 44,719 entries. By contrast, Category:en:Fictional characters has a measly 223 entries. If it had 40,000, that would not be too many, especially compared to the million biological taxa potentially duplicated from Wikispecies.

Support of adjectives: For a language such as Czech, names used attributively require adjectives that are not proper nouns. Furthermore, possessive adjectives are derived from names. Thus, there is gondorský (of Gondor), rohanský (of Rohan), rauroský (of Rauros), Bilbův (belonging to Bilbo) and Gandalfův (belonging to Gandalf). None of that is excluded by current policy, but the current policy prohibits inclusion of the base names. The same is true of many Slavic languages.

Objections:

  • O: Wiktionary is not an encyclopedia. R: How is that relevant? A dictionary should treat names as words, providing etymology, pronunciation, a short dictionary definition (no encyclopedic exposition), an inflection table and a translation table. If Wikipedia is so generous, why can't we? Why do we have to be so narrow-minded and stingy?
  • O: These names are too numerous. R: From what perspective? There sure is enough space in the Wiktionary database to include attested names of Wikipedia-notable specific fictional entities. Furthermore, if that were the concern, we could tighten the criteria to limit the number, e.g. by requiring that there are translations to 10 languages.
  • O: These are ephemeral and pop culture content. R: We include nicknames of real individuals, and these relate to ephemeral content. Pop culture is no less real than other parts of the world.
  • O: These are not words unless they start being used out of context. R: Makes no sense: names have all signs of wordhood even without being used out of context. Many such names have much higher corpus and mind penetration than rare 3-attested common nouns. They get used, spelled, pronounced, inflected and translated.
  • O: These are fictional words and not real words. R: The entities are fictional, not the words. Once these words see adoption beyond the original author, they are as much words as a scientific term that starts to get used beyond a single author. And fictionality of referent does not impact wordhood: unicorn is a word and so is fairy. (And 1-attested proto-word is not a 0-word either: it is a proto-word, one in the making. It is not binary; it is fuzzy logic, allowing grades of set membership.)
  • O: These fictional word referents are restricted to a single fictional world and do not work across multiple possible worlds. R: How does that detract from wordhood and lexicographical interest? Furthermore, if names of real specific entities can span multiple possible worlds, such as those resulting at least putatively from the worlds forked from this world in the past in non-deterministic events, one can think of e.g. forking Middle-earth worlds as a result of forking history, and the referents therefore succeed in existing between multiple possible worlds. Not that it should really matter.

Candidate criteria:

  • Include all 3-attested names of fictional entities. 3-attested requires 3 authors. Very generous, likely to be opposed. Workable nonetheless. We are not running out of database space and we know how attestation works.
  • Include all 3-attested names of fictional entities that are Wikipedia-notable.
  • Include all 3-attested names of fictional entities from works that have been translated into at least 3 languages.
  • Include all 3-attested names of fictional entities from works that have been translated into at least 10 languages.
  • Include all 3-attested names of fictional entities from works that have been translated into at least 10 languages and only if the works exist both as literature and film media, and the name is not a capitalization of non-names such as nouns or adjectives.

We could come up with further restrictive criteria. Most of it would be better than the current extremely exclusionist policy.

Benefits:

  • Include interesting lexicographical information such as items evidencing name translation strategies used by various translators.
  • Improve coverage of words in alignment with "all words in all languages".

Alternatives:

  • Use Wikipedia and its interwiki for translation. Works for some languages and some names. Provides no inflection tables.
  • Use Wikidata translations: Wikidata is very generous with entities. Provides no inflection tables and no tracing to the sources for the translations.

The alternatives are not lexicographically focused, covering lexicography as an accident, not essence.

Past discussions:

--Dan Polansky (talk) 10:04, 24 September 2022 (UTC)

Using Twitter for attesting quotations

Use of Twitter for attesting quotations is being discussed at Wiktionary:Beer parlour/2022/September § Whether Reddit and Twitter are to be regarded as durably archived sources. This discussion is an implementation of Wiktionary:Votes/pl-2022-01/Handling of citations that do not meet our current definition of permanently archived. That vote was preceded by Wiktionary:Votes/2021-09/New standard for archived quotations, which would allow all Wayback-Machine-archived quotations, which was in essence a rerun of Wiktionary:Votes/pl-2012-08/Citations from WebCite. If the 2022 vote was to do anything, it was to allow Twitter to enable attestation of slang; allowing copy-edited online news has very little lexicographical added value.

OED, M-W and sustained, widespread or accumulated use: OED uses Twitter as a source of attesting quotations. However, OED requires evidence of "sufficiently sustained and widespread use", which 3 independent tweets are not. And M-W requires "accumulated and sustained use in print", per Kory Stamper, again much stronger than 3 independent year-spanning tweets even if disregarding "in print". Wiktionary attestation requirements for print are as lax as they can realistically be: a year is a very short period and the number of 3 is the minimum number achieving anything like independence or extrapolation from multiple data points.

Tweet volume: In 2022 and recent years, the volume of tweets has been about 500,000,000 per day, or above 180,000,000,000 per year. The meaning of quotation count in that volume of non-copy-edited content is wholly different from that in copy-edited print. Hardly anyone can have an idea of what can be 3-attested in that non-copy-edited volume.
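As a quick check of the yearly figure, here is a minimal sketch that assumes the quoted rate of 500,000,000 tweets per day holds on every day of a 365-day year; the constant and function names are made up for illustration.

```python
# Assumption: the daily rate quoted above is constant across the year.
TWEETS_PER_DAY = 500_000_000
DAYS_PER_YEAR = 365


def tweets_per_year(per_day: int = TWEETS_PER_DAY, days: int = DAYS_PER_YEAR) -> int:
    """Return the yearly tweet count implied by a constant daily rate."""
    return per_day * days


if __name__ == "__main__":
    # Thousands separators make the order of magnitude easy to read.
    print(f"{tweets_per_year():,} tweets per year")
```

Under that assumption the yearly total comes out slightly above the 180,000,000,000 stated, consistent with "above".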

Attestation example: Citations:Turkroach gives examples of what can be found on Twitter.

Ease of gaming and word-creation for fun: Creating accounts and tweeting is easy, fast and cheap. Self-publishing in print and getting the result scanned and available online is not. The only real barrier is the one-year spread requirement. It is therefore relatively easy to tweet year-spanning 3-attested would-be words and phrases into existence, just for the fun of it. It would be a project of adding new synonyms for human anatomy into Wiktionary by tweeting, surely enough fun for some teenagers. They could start at Urban Dictionary for inspiration. Even if the intent is not to game, use of invented words and phrases may be enough fun in itself. Twitter moderation is unlikely to change that in any way.

Durably archived sources as a surrogate requirement: The requirement of "permanently recorded media" or "durably archived" is mainly a surrogate for "copy-edited", "edited" or "professionally published". The point of contention is not whether the material is durably archived.

Links:

--Dan Polansky (talk) 08:11, 1 October 2022 (UTC)

Limiting discussions

To avoid getting into trouble, I need to consider adopting a policy:

  • If a person has repeatedly turned uncivil and unproductive in discussions and if past interactions led to high-frequency discussions with little value, limit the number of responses to them on a per-day basis. Like, no more than 3 responses to the person in a given thread on a single day. Wait for others to join the conversation. You are unlikely to convince that person anyway. Think of the audience. Let the other party have the last word, perhaps even a snarky last word.

--Dan Polansky (talk) 11:20, 9 October 2022 (UTC)

strč prst skrz krk

I just felt like saying it. na zdraví. Allahverdi Verdizade (talk) 21:56, 27 October 2022 (UTC)

Also: nazdar! Allahverdi Verdizade (talk) 21:56, 27 October 2022 (UTC)

admin again

Hey. How about becoming an admin now? It would spark major lulz if nothing else GreyishWorm (talk) 14:45, 31 October 2022 (UTC)

Small request

Hey, could you please have a quick look at prigl and Category:Czech Hantec? Thanks. — Fytcha T | L | C 04:07, 1 November 2022 (UTC)

I placed prigl to RFV with an explanatory comment. It is somewhat plausible but it needs to be attested; the variant prýgl is attested via google books:"prýglu". Another variant could be prygl. The category could be fine, referring to a certain Czech dialect/slang; it currently has prigl and šalina. šalina is well attested. --Dan Polansky (talk) 07:29, 1 November 2022 (UTC)

Literalism, textualism and statutory interpretation

Some key terms and links on the subject:

open.edu:

  • The literal rule[5]
  • The golden rule
  • The mischief rule
  • The purposive approach

Wikipedia:

Dictionaries:

Wiktionary:

Encyclopedias:

--Dan Polansky (talk) 10:15, 1 November 2022 (UTC)

Lists of adjectives: do it properly

Instead of giving mostly useless lists of adjectives that you don’t even bother to link or provide any context for, do collocations. They are far more useful for everyone.

And before you try the usual sophistry, I will just take this to the Beer Parlour if you don’t, where I strongly suspect others will agree with me. Theknightwho (talk) 11:58, 4 November 2022 (UTC)

I am using a traditional format used in the English Wiktionary for over a decade. It is more compact and the information conveyed is the same as the space-wasting format for collocations. The lists are as useful as collocations; the difference is that, instead of writing "A X", "B X" and "C X", I write A, B, and C and leave it to the reader to fill in X. This format is used by some collocation dictionaries. The format is very compact, ensuring that even a fairly long list of items takes little screen space.
I yield to verifiable consensus against continuation of such practice. Some may appreciate that someone is willing to do this kind of menial work adding this kind of uniquely valuable information into Wiktionary. My idea is that widespread practices that are not forbidden are in fact allowed and that the Knight is no benevolent dictator of Wiktionary to start prohibiting long-accepted practices. --Dan Polansky (talk) 12:17, 4 November 2022 (UTC)
Given your predictable behaviour of refusing to abide by any requests made of you, I will take this to the Beer Parlour. What you are currently doing is filling pages with low-effort junk, and given you can’t even be bothered to make it slightly useful, it is harmful to the project. The personal attacks at the end just demonstrate your usual hypocrisy and lack of self-awareness. Theknightwho (talk) 12:33, 4 November 2022 (UTC)
It is not "junk" and the effort is more than I would like to admit and more than others are willing to spend. Sure, take it to BP, that's the way to go. --Dan Polansky (talk) 12:37, 4 November 2022 (UTC)
It is junk. Just because you spend a lot of time on junk doesn’t make it valuable. Try spending some of your time learning how to link entries properly, instead of being a meat robot doing something I could get a bot to do in 2 minutes. They’re literally just lists of adjectives and nouns, whereas collocations actually demonstrate use (and word order). Theknightwho (talk) 12:39, 4 November 2022 (UTC)
BP. --Dan Polansky (talk) 12:42, 4 November 2022 (UTC)

Hypocrisy

After today's rather shameful display, I don't think you ever have the right to complain about ad hominem ever again. I suggest that you get off your high horse and learn how to collaborate with other users. Theknightwho (talk) 16:32, 4 November 2022 (UTC)

I am not flawless and I do sometimes engage in the ad hominem fallacy, although far from as often as the Knight, who keeps steering discussions to personal level instead of sticking to substance all the time. Today, I noted elsewhere that it is probably pointless to try to argue with someone who claims that schwarzes Loch is a compound and does not stand corrected even when presented with copious explanation and sourcing. That reasoning seems to have no flaws, but is perhaps ideally avoided. That is harder to do when I was a target of vulgar insults from the Knight without ever receiving anything resembling an apology; instead, editors decided to make him an admin. I am merely a human and when pushed too far, as now happens in continuous interference by the Knight in my editing, I may lose my calm and veer into the personal realm. --Dan Polansky (talk) 16:43, 4 November 2022 (UTC)
Replying to someone while talking about them in the third person is an extremely rude thing to do. I also don't particularly care as to whether schwarzes Loch is a compound at this point, because if I remember correctly you only kept bringing it up in order to avoid admitting that your source for there being no open compounds in German was extremely weak. The discussion was about that; not about a specific example.
The one thing I have consistently noticed with your behaviour is that the only thing you really care about is never feeling wrong, and you don't give a shit about how hypocritical you have to be in the process of that. I don't know whether you think you're fooling anyone other than yourself (you are not), but it is disruptive and needs to stop. You and I both know that I am far from the only person who has been frustrated with your behaviour over the years, so there is no point pretending this is me simply being unreasonable. Theknightwho (talk) 16:57, 4 November 2022 (UTC)
@Theknightwho: I will strongly suggest you stop this pointless quarrel and distance yourself from this talkpage at least for a day or two. If you think Dan is actively harming Wiktionary, bring it up on BP but don't continue the fight here. Thadh (talk) 17:07, 4 November 2022 (UTC)
I am fine with receiving feedback on my talk page; much better than in a discussion that should be about the substance of the matter at hand and not about me. However, it should be fair and substantiated. The claim that I care about being right all the time to the point of never correcting course is falsified by objective verifiable evidence, e.g. when I recently switched from supporting Twitter to opposing Twitter in Beer parlour. There are more recent cases of me admitting I said something incorrect. I make mistakes and I learn from some of them. I may stick to my position more often than ideal, but that is normal human behavior: in Wiktionary discussions, people all too often do not change course. I am sure I made more bad arguments than I like to admit, but Wiktionary discussions contain many very bad arguments from all sorts of parties; that is a fact of life, and we have to live with it. The best procedure is to calmly continue analyzing the arguments and avoid personal level; when that becomes pointless, the best way is probably to disengage. The vast majority of people succeed in doing exactly that; the Knight does not.
About the third person: I don't find it impolite. It is used in the British parliament so it seems unlikely to be truly impolite. It helps me calm down and reminds me that I am talking to the reader, not to the person to whom I am responding. If I have nothing new to add for the reader, I should probably add nothing.
Each case of my editing that someone finds problematic can be discussed on my talk page or in Beer parlour. What is not acceptable is arbitrary rule making imposed on me contrary to common practice.
I maintain that the ad hominem fallacy is ideally avoided, and my failure to avoid it myself at times does not change that position. It is fair to call me out on engaging in the fallacy; it is not fine to ask me to never invoke it again. --Dan Polansky (talk) 17:31, 4 November 2022 (UTC)

Infinite block

I was blocked indefinitely by -sche, with the summary: "persistent, years-long history of disruptive editing & obstructionism (more rationale in BP); in particular, I highlight as w:WP:DE does that D.E. need not be 'intentional.'"

I am not surprised; -sche is one of the main resident Orwellians and demagogues, but he is not alone. Let us have a look:

  • Wikipedia policies are not Wiktionary policies; referring to a Wikipedia policy in a block summary seems inappropriate.
  • Years-long disruptive editing? What would that be?
  • Obstructionism? I bow to two things: policy and verifiable consensus. That, by -sche construction, is obstructionism.
  • In BP, they talk of "rules-lawyering". What they mean by that is that I apply the British "literal rule" or "plain-meaning rule" to policy interpretation since that, to my mind, is the only honest and rational approach. There are different approaches, but it is not clear that the Western world recognizes the plain-meaning rule as illegitimate. My point is, policies should not say what they do not mean, and should mean what they say; when they don't, that should be corrected. Still, policy overrides are a good thing, but should be recognized as such, as overrides, not as "interpretations".
  • In BP, they talk of "filibustering". That is largely nonsense; filibustering is something completely different. In Wiktionary, it is very hard to delay decisions by long posts, if not outright impossible, so the analogy breaks down completely. At worst, long posts make discussions harder to read. Yes, my posts are much longer than those of most, and contain much more substance. My posts are not very long by Wikipedia discussion standards. If people do not want to read my posts, they do not have to, and they do not need to respond either; the less people respond to my posts, the less opportunity there is for iterations of discussion to develop.
  • "POINTy mainspace reformatting edits": what are they? Does evidence count for nothing here?

I am not surprised that I have angered some editors. What I have observed is that some bad administration of RFDs has developed in the English Wiktionary while I was absent. Wiktionary's uncodified conventions have started to be violated, e.g. when RFDs at 5-3 for deletion started to be closed as "deleted" despite not reaching the conventional 2/3 threshold. That is the disadvantage of uncodified conventions: they do not provide reliable stability and a predictable environment. Things should hopefully improve now that Option 3 I drafted for the following vote passed: Wiktionary:Votes/pl-2022-09/Meaning of consensus for discussions other than formal votes created at Wiktionary:Votes. The vote was created in part as a result of my protests against the notion that a 50% plain majority is a consensus; I was the only one to clearly object to that notion in a BP discussion. None of those who now celebrate my block voted in that vote. It is quite possible that they are angered by that vote: they don't like it but they cannot oppose it without looking bad.

Where there is power, there is abuse of power. The abuse does not occur on individual level only; that would not work. The abuse happens on collective level. That's how power works. Interfering with power is risky.

Length and scope of block: Let us assume (I don't) that I am a burden for the Wiktionary namespace. I was recently blocked from that space for 3 days. Why, then, would one need an indefinite block for all namespaces? Why would one not be happy with a 1-week or 1-month block on the Wiktionary namespace? The answer is clear: if I return, I may dial down my contribution to the Wiktionary namespace, still figuring out how to interfere with power that is not based on consensus. The best solution is an indefinite block.

My block was reduced by Equinox to 1 month, still from all namespaces. Why am I blocked from the Thesaurus namespace, where I was the only person recently doing any substantive work? Why am I blocked from the mainspace? Because of Plato? All that it takes to resolve Plato is two editors agreeing and showing their agreement in reverting edits; as a lone voice, I am quite powerless. All my power derives from all the editors who came to my votes (which are many) and supported my proposals. Thank you so much.

As in real life, Wiktionary has many bad editors and many good editors, some good admins and some bad admins. Plato thought that the bad are the majority. Was he right? Our votes give some hope that he was not quite right: many good proposals were passed, and only few bad ones, by my assessment of "good" and "bad". The rule by the majority, as problematic as it is, is the best form of rule of which we have knowledge. It is very far from perfect, and a lot of criticism of it is valid, but the other options are even worse.

--Dan Polansky (talk) 07:38, 12 November 2022 (UTC)Reply

A further discussion is at User_talk:-sche#Dan_block. I see some known suspects, e.g. the master of nonsense Fay Freak. There is also the fellow Orwellian and demagogue Chuck Entz, who nonsensically claimed that my creation of User:Dan Polansky/Inclusion arguments is somehow circular, who likened me to an evil spirit years ago for asking people to avoid adding unattested entries, and who claimed that the productivity of the ex- prefix somehow magically makes its derivations inclusion-unworthy, as if that argument had any force for the non- prefix.

Also, why don't those who have problems with me use my talk page? Is there a cultural prohibition I am unaware of? Why not post "I for one disagree with the following behaviors of yours: X, Y, Z. For example, I find diffs A, B and C problematic." Another one comes and says, "I agree with the above." Then comes another one and says, "I agree as well". Why not? Is not that the most amicable and proper dealing, which requires no admin privilege?

Why is this not discussed on my talk page where I could meaningfully respond? Is there no sense of fair process, especially in relation to an indef block? What has become of the English Wiktionary?

--Dan Polansky (talk) 07:59, 12 November 2022 (UTC)Reply

For the reader, here are my votes that passed: User:Dan Polansky/Votes created. They helped create a more productive and predictable environment. The near-universally opposed attributive-use rule was gone (DCDuring was not accused of "rules-lawyering" for invoking that absurd rule), THUBs were codified, absurd images were restricted at least a little, inactivity-based desysopping was codified to prevent unnecessary desysopping votes on a per-admin basis, a 50%-majority desysopping policy was adopted as a result of my protests against too hard desysopping (not in the vote list), and a new logo I created by modifying another one was adopted. To get an indef after all the good work I have done in an area where few are willing to do it seems outlandish. I still do work that few are willing to do, e.g. by creating the well-referenced Appendix:Compounds, where the opposition replaced referencing with their own opinions and flatly refused to play the referencing game. --Dan Polansky (talk) 08:34, 12 November 2022 (UTC)Reply

go all out and basically take over the project, by Chuck Entz: nonsense? How do you single-handedly "take over the project" governed largely by consensus when you are not an admin and cannot even delete and undelete pages or block anyone? By posting long posts to Beer parlour? And this is from a bureaucrat. I will note that Chuck Entz is largely not interested in creating any new pages. That does not prevent him from sharing his often misguided "wisdom" in various discussions. And when he does create a page, it looks like the original version of Thesaurus:mountain range, almost empty and wrong to boot: landform is not a holonym, "hyponyms" is empty, and "cordillera" is not a synonym. From that and similar occurrences, I make my own judgment of what kind of person I am dealing with. --Dan Polansky (talk) 11:03, 12 November 2022 (UTC)Reply

Was "go all out and basically take over the project" a Freudian slip, a case of projection, pointing not to what I have been doing but rather to what a group of other editors have been doing? Who is more likely to "take over the project", a lone non-admin or a group of deletionist admins who close RFDs below 2/3 as "deleted", protect each other, and hardly ever create any votes? --Dan Polansky (talk) 13:32, 12 November 2022 (UTC)Reply
I cannot edit -sche's talk page, where they are discussing me, so I cannot defend myself. In terms of fair trial, that is another absurdity. -sche is making the absurd claim that the Plato incident somehow triggered the block on mainspace. And yet, Plato only shows that someone other than me reverted after being reverted instead of respecting status quo ante on the page, violating normal editing processes. --Dan Polansky (talk) 13:37, 12 November 2022 (UTC)Reply
More on the themes explored, you know that you are dealing with poorly restrained power when those who hold it engage in brazen lying. This kind of lying is different from ordinary lying, which may involve plausibility and a hope that one will not be discovered. The lying I mean is one that involves claims that on the surface are logically impossible or empirically highly improbable. Any external observer can fairly easily recognize them as such without investigating particulars. When there is a chorus of voices supporting such brazen lies, this becomes a true Orwellian dystopia, where the act of speaking a simple truth is revolutionary. --Dan Polansky (talk) 15:07, 12 November 2022 (UTC)Reply

That's a very minor point, but I'd like to note that Thesaurus:isolating language and Thesaurus:analytic language, created by you, do not seem much richer to me than Thesaurus:mountain range originally was. What was the point of creating those? PUC 23:14, 12 November 2022 (UTC)Reply

Edit: I see we currently do not have main space entries for analytic language and isolating language. I'd be inclined to move the Thesaurus entries to the main space. PUC 23:19, 12 November 2022 (UTC)Reply
I am disinclined to talk about things of trifling importance, and Chuck Entz's avoidance of lexicography as if it was pain to be avoided at any and all costs stands as a curious fact. You can take a stand. You can reduce my block to Wiktionary namespace since that would eliminate the post volume problem (which at least has some plausibility to it, although it is at odds with the strength of the argument spirit): the Plato revert problem is clearly a false pretext. Especially editing the thesaurus would be very useful to me: it is a remarkable tool of thought and writing; it helps me think. Or you can at least say that you disagree with the block but you do not want to interfere with another admin; Equinox did choose to interfere. Instead, you choose to talk about trifles. To my mind, that is not decent conduct. --Dan Polansky (talk) 07:19, 13 November 2022 (UTC)Reply
Thank you for the access to the Thesaurus namespace. About these language entries, if they get moved to the mainspace, they will get deleted as sum of parts. By contrast, the thesaurus does not require headwords to be non-sum of parts, per its long-term tradition. Since hyponymy and instance-of are parts of the thesaurus, these language entries provide semantic nodes to bind items together by these two relationships, and can be expanded by anyone interested. Some editors (who do not edit the thesaurus) recently started to treat instance-of as problematic and allegedly "non-lexical"; it was not so for a decade, e.g. per Thesaurus:planet, and it is not so per WordNet. The self-conceited arrogance of many editors, especially those who would not substantiate a single claim with a reference if their life depended on it, is discouraging. For all its flaws, Wikipedia has a much better culture in some ways. Wiktionary is a paradise for halfwits who think they know everything better but all too often know very little. --Dan Polansky (talk) 07:45, 13 November 2022 (UTC)Reply

One thing I need to be prevented from is productively participating in Wiktionary:Votes/pl-2022-11/Should unidiomatic phrases be included if there is consensus for likely utility to readers?, which could lead to adoption of sound wording of the override rule preventing unrestrained deletionism, codifying an analog of W:Wikipedia:IAR. Such participation would be important in refining the wording of Option 2 to make it pass. A block longer than a month was an urgent measure to take by the deletionists. (Indeed, I do not assume good faith: blatant evidence of repeated dishonesty cannot be ignored. And one cannot assume good faith and at the same time make relevant charges against abuse of power.) --Dan Polansky (talk) 07:19, 13 November 2022 (UTC)Reply

@PUC Can you unblock me from the mainspace as well? Arguably, the greatest problem was in the Wiktionary namespace, where I was discussing in greater volume and in a higher number of iterations than most editors. Above all, I would like to continue adding collocations, which hardly anyone else does, and add links to the thesaurus. I pledge to avoid reverting editors for the next month. If problems develop, I can easily be blocked again. Preventing me from adding collocations tangibly harms the project. --Dan Polansky (talk) 07:28, 14 November 2022 (UTC)Reply

@excarnateSojourner: I am blocked, so I cannot respond to you on the place where you have pinged me. To explain, widely used templates should not be deleted but rather deprecated so that the revision histories do not show redlinks or other strange artifacts and are neatly legible. Even redlinks such as Template:deletedtemplate, which I created via "{{deletedtemplate}}", disturb the comfort of reading old revisions. The deprecation involves placing entries using the template into a tracking category, which has been proven empirically to work reasonably well. What has not been done is create edit filters to prevent use of deprecated templates; it would be an ideal solution. I believe deprecation should be used much more widely, and especially for templates that are used in very many revision histories. I believe comfort of inspecting revision histories is very important. --Dan Polansky (talk) 08:00, 17 November 2022 (UTC)Reply

"Also, why don't those who have problems with me use my talk page? Is there a cultural prohibition I am unaware of?"[edit]

It's because they want to express their opinion without getting into 50 feet of pseudo-legalese. Try to have some empathy. Your friend, Equinox 09:34, 12 November 2022 (UTC)Reply

They would not need to read the "pseudo-legalese". The feedback would surely have some effect. I know that you have been my friend, although we often disagreed on names of specific entities and hyphenated single words. We agree on many things, e.g. that absurd images should be gone. Without you, I wouldn't know someone actually likes some of my writing on my talk page.
I am talking about a cultural problem because I see people relatively rarely try user pages before blocks, and when I do use a user page to tell someone what I think is wrong, I get chastised for that. --Dan Polansky (talk) 09:58, 12 November 2022 (UTC)Reply

Vote about RFD overrides

@Lambiam I cannot edit the vote page, being blocked, so I respond here: please add your preferred option 3 to the vote. Maybe it will pass; it seems fine. Dan Polansky (talk) 18:38, 12 November 2022 (UTC)Reply

Filibuster

I have been accused of "filibuster" by multiple editors, including -sche. It should be obvious that no analog of "filibuster" can ever threaten a wiki like Wiktionary. Apparently, it isn't, so articulation may seem in order. No one should ever be accused of "filibuster" here because it is nonsense.

As per W:Filibuster, "A filibuster is a political procedure in which one or more members of a legislative body prolong debate on proposed legislation so as to delay or entirely prevent decision." What happens in practice is that politicians give long, often irrelevant speeches in deliberative chambers and thereby obstruct democratic processes. There is no such analog on a wiki: if someone posts 10 KB of text to a discussion, someone else may post another piece of text a minute later. There is no delay; no one has to wait until the disruptor finishes speaking. Very long posts would indeed be a problem, but not one similar to filibuster. And very long posts are a problem with simple solutions: 1) place the long post into a collapsible heading as too long; 2) if very long, outright remove the post from the discussion and tell the poster on their talk page to desist.

When a vote in Wiktionary starts, no poster of long texts can ever delay the vote via the posts or prevent other participants from voting or discussing in the vote. If the long texts are indeed disruptive, a simple solution is, again, to just remove them. That hardly ever happens in Wiktionary, and that is good: under normal conditions, people should be able to present their arguments at length. Yet, there is such a thing as "too long"; we cannot tolerate, say, 10 MB of clearly irrelevant text.

I was accused of "filibuster" by -sche years ago in my admin vote. It was nonsense back then as it is now. Back then, -sche was supporting non-consensual mass changes by CodeCat. What -sche means by my "obstructionism" is e.g. my protestation against CodeCat and similar editors who do not care in the least about genuine consensus. What he further means is my requirement that mass changes of well established practices, when opposed by some, should only proceed when genuine evidence of consensus is produced, ideally via a vote but a Beer parlour discussion can also be fine.

I was blocked previously by -sche for trying to prevent a long-term creator of errors in various languages from continuing the errors. Back then, -sche called it "hounding", and like now, he invoked a Wikipedia policy, which again is inappropriate. -sche is the same problematic administrator he was before, not fair, playing his power games without creating votes and figuring out what the real consensus is. A further example of his power games is his Beer parlour discussion designed to sneak in changes to CFI without a vote: Wiktionary:Beer parlour/2022/August#Individuals whose names use matronymics rather than patronymics. The purpose of that discussion was not to fix CFI for "matronymics" (which was not a real problem) but rather to sneak in a vaguely related proposal without a vote, and that is indeed what has become of that discussion. By opposing in that discussion, I was "obstructionist". By applying the "plain-meaning rule" like the British do, I was somehow a bad person. In that discussion, his Comrade demagogue Fay Freak says "No one is allowed to be that literal—rules lawyering. Analogy is intended by any code of rules." Well, that is clearly untrue: analogy is not intended to be used in British statutory interpretation relying on the plain-meaning rule, the golden rule and one more rule.

I have little doubt that this very text is going to be considered to be "filibuster". It is indeed longer than absolutely required. What is absolutely required is to say "accusing someone of filibuster on wiki is nonsense and should never happen". But that is not the articulation game characteristic of examination of arguments, as contrasted to plain dismissive statements. This very text is not terribly long. And anyone who sees a title "filibuster" and does not want to read about it does not, indeed, need to read about it. --Dan Polansky (talk) 12:17, 13 November 2022 (UTC)Reply

Rules lawyering

I have been accused of "rules lawyering". The accusation is wrong, and it confuses objectionable rules lawyering with the "plain-meaning rule" of statutory interpretation. The Wikipedia article is W:Rules lawyer. I don't find the term "rules lawyer" in general dictionaries online.

What I do is quote the wording of policies as is, and point to its plain meaning. I sometimes use the intention mode of interpretation, pointing out that a certain interpretation of the rule cannot be what was intended. I sometimes reject the rule, pointing out that application of the plain meaning of the rule is clearly undesirable, and that therefore, the rule as is should not be followed, and that ideally, the rule should be reworded. F-entities such as -sche, Fay Freak, the Knight or DCDuring don't like that, and some other people do not like that either, even though they themselves refer to specific wording of rules. DCDuring is an expert in applying absurd rules "as is" because he likes them; they are not absurd to him; but at other times, he is happy to accuse others of sticking with the letter of the rule. Survival in a place with too many bad people and demagogues is possible, but it is a challenge. Wiktionary is an amazing project. Dan Polansky (talk) 09:36, 14 November 2022 (UTC)Reply

Substantial policy making needs votes

As per our policy, any substantial change in policy requires a vote: WT:CFI: "Any substantial or contested changes require a VOTE". It is a bad meta-policy. It should read "Any change requires a VOTE", for reasons that I explained in my oppose vote and that are as true now as they were before. First of all, per the current wording, any "substantial" change requires a vote anyway, so no substantial advancement of our policies is sped up. But what the new wording allows is fraud in which someone claims the change they want to make is not "substantial" for some meaning of "substantial". Fraudster -sche even proposed that someone who dares to "contest" changes should be blocked because of that, even though contesting such a change is precisely within the meaning of the meta-rule. And this is what he did, blocking me indefinitely. Simplifying the meta-rule to "Any change requires a VOTE" would prevent any power games of fraudsters, and cause almost no harm: it would only prevent truly insubstantial changes and these, by definition, are not important and can be done without. Obvious? I think so. Not to others. Dan Polansky (talk) 09:58, 14 November 2022 (UTC)Reply

Announcement of support

The amount of abuse you've taken from this community through the years is, frankly, shocking. Allahverdi Verdizade (talk) 23:24, 20 November 2022 (UTC)Reply

Thank you very much for saying so. It makes a big difference, if not in power outcomes then in psychological well-being. It feels much less lonely. --Dan Polansky (talk) 07:24, 22 November 2022 (UTC)Reply

Languages seem to be concrete objects, not abstract

That is because abstract objects are things like numbers. They exist, or, depending on ontology, do not exist, before humans. And languages do not exist before humans. Then, ontologically, languages are something like substances, such as chemical compounds. Even under this analysis, names of languages are not proper nouns; singularity of referent is largely irrelevant as per various examples (names of qualities (names of abstract objects), names of chemical compounds (names of concrete objects)), and what matters for proper-noun-hood is something else. --Dan Polansky (talk) 20:03, 29 December 2022 (UTC)Reply