User talk:Dan Polansky: difference between revisions
Dan Polansky (talk | contribs) →trreq: new section |
Dan Polansky (talk | contribs) →Razorflame: new section |
I am equally okay with dumping {{temp|rfe}}. --[[User:Dan Polansky|Dan Polansky]] ([[User talk:Dan Polansky|talk]]) 07:42, 3 August 2013 (UTC)

== Razorflame ==

{{user|Razorflame}} is incompetent and untrustworthy. Evidence of his lack of dictionary making skill and lack of ability to make and keep promises is available in the archives of [[User talk:Razorflame]]: [[User_talk:Razorflame/Archive_1]], [[User_talk:Razorflame/Archive_2]], [[User_talk:Razorflame/Archive_3]], and more in future. As a consequence of a protracted series of broken promises and mistakes resulting from his contributing in languages in which he has close to no knowledge, he was blocked in July 2010 for one year; see [http://en.wiktionary.org/w/index.php?title=Special:Log/block&page=User%3ARazorflame his block log].

Editors expressing serious misgivings about his editing include EncycloPetey, Equinox, msh210, Ruakh, opiaterein AKA Dick Laurent, Yair rand, Atelaes, and Dan Polansky (me).

Some editors hold hope for his somehow maturing over the years. However, I find it unlikely.

Arbitrarily selected incidents:

* Performing various arbitrary unjustified reverts

* Contributing in a variety of languages not understood by him with unclear error rate; fairly many errors have been identified by the editors versed in the language in question

* Adding copyrighted non-free definitions, November 2009

* Adding Czech translations from a copyrighted dictionary, June 2010

* Creating Kannada entries with extremely numerous etymology sections based on nothing at all; see [[ತಡೆ]] and [[Talk:ತಡೆ]], August 2013

--[[User:Dan Polansky|Dan Polansky]] ([[User talk:Dan Polansky|talk]]) 19:40, 6 August 2013 (UTC)
Revision as of 19:40, 6 August 2013
Usefulness of phrasebook
I find some of the phrasebook entries rather useful, as their translation is not grammatically straightforward.
- I'm hungry
- Czech: mám hlad, as if "I have hunger"
- German: ich habe Hunger, as if "I have hunger"
- I'm thirsty
- Czech: mám žízeň, as if "I have thirst"
- German: ich habe Durst, as if "I have thirst"
- do you speak English
- Czech: mluvíte anglicky?, as if "do you speak in English" or even *"do you speak Englishly"?
- Polish: czy mówisz po angielsku?, which I do not know how to render in an English analogue, maybe "do you speak in an English manner or way"
- I'm cold
- Czech: je mi zima, rather than *"jsem studený"; in English, perhaps "it is cold to me"?
- German: es ist mir kalt, rather than *"ich bin kalt"; in English, perhaps "it is cold to me"?
- how are you?
- Czech: jak se máš, as if "how are you having yourself?"
- German: wie geht es dir?, as if "how does it go with you?" or the like
- have a seat
- Czech: posaďte se, as if "sit down"
- how do you say...in English?
- Czech: jak se řekne...anglicky?, as if "how does ... get said Englishly"?
- German: wie sagt man...auf Englisch, as if "how does one say ... on English"?
- how much does it cost?
- German: was kostet das?, as if "what does it cost", which happens to be idiomatic English
Category:English phrasebook has 357 entries. --Dan Polansky (talk) 18:06, 5 January 2013 (UTC)
- Exactly. If only we could get rid of all the rubbish, the phrasebook would be a useful part of the project. SemperBlotto (talk) 18:09, 5 January 2013 (UTC)
Some more:
- I have a cold
- Czech: jsem nachlazený, as if "I am got cold" or the like
- German: ich bin erkältet, just like Czech
- Russian: ja prostudílsja, as if "I have colded myself through" or the like
- I'm twenty years old
- Czech: je mi dvacet let, as if "it is twenty years to me"
- Polish: mam dwadzieścia lat, as if "I have twenty years"
--Dan Polansky (talk) 11:52, 6 January 2013 (UTC)
some random words
Just a few random words from the beginning of Czech Wikipedia article on Hydrogen. (not a full, sorted list of red links as my software doesn't understand the funny accents over letters) Don't feel under any obligation to add them! SemperBlotto (talk) 16:11, 6 January 2013 (UTC)
Vodík chemická latinsky nejjednodušší tvořící převážnou hmoty vesmíru Má široké praktické redukční činidlo chemické syntéze metalurgii meteorologických pouťových balonů vzducholodí Obsah Základní fyzikálně-chemické vlastnosti Historický Výskyt přírodě Tvorba průmyslová Využití Sloučeniny Anorganické sloučeniny Hydridy Další Organické sloučeniny Izotopy vodíku Odkazy Související články Literatura Základní fyzikálně-chemické vlastnosti Molekula chuti zápachu hoří namodralým plamenem nepodporuje - some capitalisation will be wrong
- Thanks. We have lemma forms or base forms of many of these, although not all of them: vodík, chemický, latinský (adjective rather than the redlinked adverb), jednoduchý, tvořit, převážný, hmota, vesmír, mít, široký, praktický, redukční, činidlo, chemický, syntéza, metalurgie, etc.
- For your method to work for me, I would need to enter inflected forms of Czech words into Wiktionary, which I don't feel like doing. I actually have a list of Czech words to add, working offline on their verification, from time to time. --Dan Polansky (talk) 19:37, 7 January 2013 (UTC)
ttbc
Hi,
When adding ttbc's, please check for qualifiers; they sometimes explain the sense, as in trio#Translations. --Anatoli (обсудить/вклад) 00:47, 9 January 2013 (UTC)
Harry Potter
I don't really understand the difference between Category:Harry Potter and Category:Harry Potter derivations. Where do metloboj or bezjak or smrtožder belong? Zabadu (talk)
Also, I am very worried about anti-Serb bias here. For example there is no Serbia category but there is Croatia category. Why is that?
Can you please help me add flag of Serbia to Category:Serbia?
- No comment. --Dan Polansky (talk) 13:37, 12 January 2013 (UTC)
chargemaster
Please see talk:chargemaster. Please can we discuss this more before removing this material, as it is integral to the definition. Thank you, -- Cirt (talk) 22:06, 9 March 2013 (UTC)
- Thanks very much for your polite response on the talk page, I really appreciate it! :) I've responded there, -- Cirt (talk) 04:47, 10 March 2013 (UTC)
Request about "chargemaster"
Request: Please, Dan Polansky (talk • contribs), I ask of you to read this article:
I think that will give you some clarity about the term chargemaster. Thank you for your time, -- Cirt (talk) 18:18, 10 March 2013 (UTC)
- I responded at WT:RFD. --Dan Polansky (talk) 19:30, 10 March 2013 (UTC)
DONE: Trimmed the definition to that suggested by Dan Polansky (talk • contribs), above, please see DIFF. Hopefully this is now satisfactory to Dan Polansky (talk • contribs). Thank you, -- Cirt (talk) 23:40, 10 March 2013 (UTC)
Please read this article
I strongly recommend you read this article, as a good faith gesture, it would help inform our discussion. Can you please read it? It is most informative. Thank you, -- Cirt (talk) 23:43, 10 March 2013 (UTC)
- Let me note that, in the discussion about the definition of "chargemaster", I am acting in the capacity of a dictionary maker, trying to figure out what is and what is not a part of the definition of "chargemaster". I am not defending whatever despicable practices exist in relation to chargemasters. --Dan Polansky (talk) 19:18, 11 March 2013 (UTC)
- Sure, sure, I agree with you and I don't doubt your good faith intentions. :) I'm just respectfully asking you to read this article, please? -- Cirt (talk) 20:29, 11 March 2013 (UTC)
Re: KYPark
Well said. I owe you one. —Μετάknowledgediscuss/deeds 02:17, 14 March 2013 (UTC)
- I second. --Anatoli (обсудить/вклад) 02:43, 14 March 2013 (UTC)
drug
A note to myself and whoever cares to read: I am dissatisfied with the "drug" entry, currently having four senses. Recent related events:
- Wiktionary:RFC#drug, August 2012, originally at RFV
- A conversation at User_talk:Msh210#drug, 17 March 2013
As a consequence, I have done this:
- Sent the 1st sense to WT:RFD, with the intention of making it more narrow by removing part of the definition.
- Sent the 2nd sense to WT:RFV with the intention of getting it removed.
--Dan Polansky (talk) 15:32, 27 April 2013 (UTC)
Key definition edits to "drug" entry:
- diff, March 2003: 1st def entered of "Substance used to treat an illness, relieve a symptom or modify a chemical process in the body for a specific purpose."
- diff, May 2003: A 2nd def entered: "Addictive substance used to alter the level of consciousness"
- diff, August 2004: 2nd def tweak: "A substance, often addictive, used to alter the level of consciousness"
- diff, July 2005: 2nd def tweak: "A substance, often addictive, which affects the central nervous system"
- diff, March 2006: 3rd def added: "A chemical or substance, not necessarily for medical purposes, that alters the way the mind or body works", with the summary "Added definition(noun) that encomasses non-medicinal drugs)", by an anon
- diff, December 2006: 4th def added: "An illegal drug", by an anon
- diff, March 2007: 4th def tweaked: "A drug, especially illegal, taken for recreational use"
- diff, July 2008: 4th def tweaked: "A substance, especially one which is illegal, ingested for recreational use."
- diff, May 2013: 4th def tweaked: "A psychoactive substance, especially one which is illegal and addictive, ingested for recreational use, such as cocaine"
--Dan Polansky (talk) 12:20, 4 May 2013 (UTC)
Two senses removed by me in diff, failing WT:RFV.
I have reverted this revision by an anon, one of interest:
- (pharmacology) A substance intended for use in the diagnosis, cure, mitigation, treatment, or prevention of disease in man or other animals; a medicine.
- (pharmacology) A substance (other than food) intended to affect the structure or any function of the body of man or other animals.
- A narcotic substance.
- (figuratively) Anything that has the effect of a narcotic substance.
The 1st definition is subject to objections raised in a recent RFD: contraceptives do not meet the definition. This may be dealt with by placing contraceptives under the 2nd definition, but it is not obvious why the 2nd definition should be separate from the 1st one.
The "narcotic substance" definition is unhelpful, IMHO; it relies on narcotic entry featuring three definitions, failing to select the intended definition from "narcotic". Furthermore, the "narcotic substance" definition may be wrong, depending on what the definitions at "narcotic" are intended to cover; the 1st one seems to entail "induces sleep" as a condition necessary, so it does not cover all illicit drugs; the 2nd one entails "numbing", so ditto; the 3rd one "certain illegal drugs" is massively unspecific, failing to tell us which illegal drugs it selects, but if it selects some and not all, then it cannot be covering all illicit drugs.
When checking “drug” in OneLook Dictionary Search, no dictionary equates illicit drugs with narcotics. Collins, for instance, has "chemical substance, esp a narcotic, ...", which makes it clear "drug" and "narcotic" are not synonymous.
The 4th figurative sense is one that we are possibly missing, and was mentioned by msh210 on his talk page. However, it should be added only together with citations supporting it, IMHO. --Dan Polansky (talk) 09:59, 30 June 2013 (UTC)
Hi there. In the UK a "tax office" is a place where you can go (or phone) to discuss your tax affairs. The organization used to be called the "inland revenue" and is now called "HM Revenue and Customs". See [1] as an example of use. SemperBlotto (talk) 10:10, 4 May 2013 (UTC)
- Oops! So it is not like "post office", which can refer both to a particular place and to the organization itself. The tax-collecting organizations have various specific names across the world, as per W:Revenue_service: "HM Revenue and Customs" (U.K.), "Internal Revenue Service (IRS)" (U.S.), "Australian Taxation Office" and "Canada Revenue Agency". Would "revenue service" be the generic term for tax-collecting agency or organization I am looking for? What about tax agency, or tax authority? --Dan Polansky (talk) 10:19, 4 May 2013 (UTC)
- Yes, I think that "revenue/tax service/agency/authority" combinations are used in the UK and elsewhere as a generic term for the organization. In the UK, there are several other "offices" that are organizations rather than places (normally capitalised) - Office for National Statistics is one that springs to mind. SemperBlotto (talk) 10:26, 4 May 2013 (UTC)
- I have fixed the entry. Feel free to edit it further. --Dan Polansky (talk) 10:33, 4 May 2013 (UTC)
Wiktionary popularity among online dictionaries per Alexa rank
I can't even believe the following statistics:
Rank of dictionary web sites per number of visitors per Alexa.com, ordered by global rank:
Web | Alexa Global Rank | Alexa U.S. Rank | Alexa U.K. Rank | Note |
---|---|---|---|---|
wikipedia.org | 6 | 8 | 10 | Listed despite not being a dictionary, as a super successful Mediawiki project |
reference.com | 207 | 77 | 144 | dictionary.reference.com - 54% visitors of the domain go here |
thefreedictionary.com | 265 | 223 | 205 | Multi-lingual; has a definition dictionary for several languages; by Farlex |
wordreference.com | 306 | 1,024 | 325 | |
wiktionary.org | 641 | 1,313 | 867 | This is for all Wiktionaries, not just the English one. en.wiktionary.org - 40% of visitors of the domain go to this subdomain |
urbandictionary.com | 836 | 378 | 429 | Note the U.S. and U.K. ranks |
merriam-webster.com | 867 | 315 | 1,817 | Note the U.S. rank |
yourdictionary.com | 3,440 | 1,775 | 3,294 | |
cambridge.org | 3,509 | 5,781 | 982 | |
oxforddictionaries.com | 5,635 | 7,898 | 1,540 | |
infoplease.com | 7,936 | 2,742 | 7,682 | |
uchicago.edu | 7,983 | 3,084 | 7,711 | Hosts Webster 1913 and Roget 1911, but naturally also many other things; machaut.uchicago.edu - 1.2% of domain visitors go here; this is the subdomain that hosts the dictionaries. |
macmillandictionary.com | 8,232 | 6,731 | 4,798 | |
rhymezone.com | 13,207 | 3,548 | 6,301 | |
collinsdictionary.com | 19,187 | 19,487 | 5,175 | |
wordnik.com | 19,976 | 11,546 | 13,955 | |
onelook.com | 20,022 | 8,228 | 17,626 | |
vocabulary.com | 20,588 | 11,113 | 15,779 | |
dicts.info | 124,942 | 144,782 | 81,619 | |
wordsmyth.net | 147,063 | 64,960 | N/A | |
allwords.com | 166,231 | 77,270 | 172,247 | |
freedictionary.org | 576,552 | N/A | N/A | |
freedict.org | 8,006,370 | N/A | N/A | It may be that most downloaders download the complete dictionary files; I don't know. |
It follows that there are five dictionary web sites significantly competing with Wiktionary in terms of number of visitors: reference.com, wordreference.com, thefreedictionary.com, urbandictionary.com, and merriam-webster.com. All the other dictionaries perform worse than Wiktionary even in access from U.S. and U.K., no matter how good definitions they offer. --Dan Polansky (talk) 10:58, 8 May 2013 (UTC) Updated. --Dan Polansky (talk) 15:16, 8 May 2013 (UTC)
Alexa rank for dictionaries selected with the focus on the Czech Republic aka Czechia:
Web Site | Alexa Rank for CR | Note |
---|---|---|
seznam.cz | 1 | slovnik.seznam.cz - 6% go to this subdomain, so Seznam dictionary is really popular; features data from Lingea and Macmillan Dictionary |
centrum.cz | 10 | slovniky.centrum.cz - 0.56% go to this subdomain |
abz.cz | 240 | slovnik-cizich-slov.abz.cz - 68% go to this subdomain |
slovnik.cz | 423 | Features LangSoft vocabulary + GNU/FDL dictionary |
online-slovnik.cz | 783 | En<-->cs + synonym dictionary; unclear owner and licensing terms |
wiktionary.org | 827 | cs.wiktionary.org - 0.4% go to this subdomain, so chances are the visitors from Czechia actually go somewhere else, like to en.wikt, fr.wikt or de.wikt. |
zcu.cz | 838 | slovnik.zcu.cz - the subdomain is not listed; why? |
slovnik-synonym.cz | 1,072 | Seems to belong to abz.cz |
lingea.cz | 8,016 | slovniky.lingea.cz - 36% go here |
--Dan Polansky (talk) 13:14, 8 May 2013 (UTC)
See also http://www.alexa.com/topsites/category/Top/Reference/Dictionaries, a list of top Alexa sites in the Dictionaries category. There, Wiktionary is 3rd, probably based on the global Alexa rank. --Dan Polansky (talk) 15:16, 8 May 2013 (UTC)
The popularity of Wiktionary can be corroborated from other sources, rather than relying on Alexa only.
According to Google Ad Planner at http://www.google.com/adplanner/static/top1000, Wiktionary had a rank of 574 by the number of unique visitors in July 2011; it had 8,200,000 unique visitors and 26,000,000 page views. As for some other dictionaries, thefreedictionary.com had rank 167 and 21,000,000 unique visitors, while merriam-webster.com had rank 700 and 7,400,000 unique visitors. Again, these are 2011 data. To find other dictionaries there, search for "dictionaries", as there is a "Dictionaries & Encyclopedias" category shown in their table.
Website quantcast.com is another source. By going to http://www.quantcast.com/wiktionary.org, and using the "Compare Site" button, you can compare Wiktionary popularity to other dictionaries, including "merriam-webster.com". The comparison is shown as a time-dependent graph. For March 31 through April 29, 2013, the graph shows around 1.8 million "people" in the United States per month for Wiktionary and around 14 million "people" for merriam-webster.com; for cambridge.org, it shows around 0.2 million "people". Presumably, "people" refers to unique visitors. --Dan Polansky (talk) 18:26, 17 May 2013 (UTC)
Page views of Wiktionary and some other stats per Wikimedia statistics[2]:
Language | Page Views in March 2013 | Very Active Editors in March 2013 | Speakers |
---|---|---|---|
English | 95,761,090 | 80 | 1,500,000,000 |
French | 36,264,524 | 32 | 200,000,000 |
Russian | 21,857,366 | 12 | 278,000,000 |
German | 14,115,657 | 17 | 185,000,000 |
Portuguese | 8,935,120 | 3 | 290,000,000 |
Polish | 8,399,996 | 11 | 43,000,000 |
Greek | 5,636,774 | 5 | 15,000,000 |
Chinese | 5,506,369 | 1 | 1,300,000,000 |
Spanish | 5,385,766 | 7 | 500,000,000 |
Italian | 5,181,025 | 4 | 70,000,000 |
Dutch | 4,243,709 | 6 | 27,000,000 |
Japanese | 3,502,241 | 3 | 132,000,000 |
Swedish | 3,118,274 | 3 | 10,000,000 |
Korean | 3,022,935 | 1 | 78,000,000 |
Vietnamese | 2,663,255 | 0 | 80,000,000 |
Turkish | 2,480,022 | 2 | 70,000,000 |
Finnish | 2,127,353 | 7 | 6,000,000 |
Malagasy | 1,606,076 | 0 | 20,000,000 |
Lithuanian | 1,503,229 | 0 | 3,500,000 |
Czech | 1,491,805 | 8 | 12,000,000 |
--Dan Polansky (talk) 17:39, 19 May 2013 (UTC)
Czech rhymes
I always thought Czech had stress on the first syllable in most words. Am I mistaken, or do words rhyme differently in Czech? —CodeCat 19:03, 14 May 2013 (UTC)
- I don't think consciously of stress in Czech; I just speak it. Whatever the case, is there any impact on the pages that I am creating? Is there anything I have entered that you think in fact does not rhyme? --Dan Polansky (talk) 19:05, 14 May 2013 (UTC)
- If stress is word-initial, I'd expect hrana and obrana to not rhyme. You are probably better at judging what rhymes and what doesn't, but if those words do rhyme, I am curious why that is. —CodeCat 19:11, 14 May 2013 (UTC)
- I can't think of a rhyme with "hrana" and "obrana", but consider this: "Mariana byla panna¶ než vrazila do klokana". There, "panna" has two syllables while "klokana" has three syllables. It can be that the stress shifts to the preposition do before klokana; I do not really know. What I do know is that the words I am entering generally can be paired to create rhymes. --Dan Polansky (talk) 19:18, 14 May 2013 (UTC)
- I think other Slavic languages have similar stress shifts with prepositions. It has something to do with the original Proto-Slavic pitch-based accent I think. —CodeCat 19:24, 14 May 2013 (UTC)
Deprecated Czech templates
There are some Czech entries listed in Category:Pages using deprecated templates. Could you have a look and fix them if possible? —CodeCat 19:23, 19 May 2013 (UTC)
- I have removed the deprecation from {{cs-conj-it}}. After the server catches up, Category:Pages using deprecated templates should get emptied. The template was marked as deprecated in diff, on 3 March 2009. The template seems to produce correct results. I am not really much into Czech inflection templates, so I am unenthusiastic about implementing replacement proposals invented but not executed on by other editors. --Dan Polansky (talk) 20:11, 19 May 2013 (UTC)
Phonosemantic interpretations
Thank you for calling my attention to the new Beer Parlour thread, Dan. I await the community's decision, and will of course be adding no entries for the time being. Lawrence J. Howell (talk) 22:48, 9 June 2013 (UTC)
Your View?
Hello, Dan. My watchlist tells me that user 75.71.64.241 reverted data I uploaded for the character 身, writing Very little evidence to support those claims. As I'm abiding by the community's request to refrain from doing anything until the matter under debate has been settled, I believe it's only fair that the hands-off policy cut both ways. What's your take? Lawrence J. Howell (talk) 08:23, 13 June 2013 (UTC)
- I don't really know. I don't think Wiktionary can keep "Phonosemantic interpretations" backed by a single source. The anon had better wait for the discussion to proceed, though. However, many view such waiting as too bureaucratic and proceed via a fast track. As per fast track, etymological content that is sourced from a single source, having no obvious other sources, and for which no sources are in the process of being added can be removed.
- Links: Wiktionary:Beer_parlour/2013/June#Phonosemantic_interpretation, 75.71.64.241 (talk). --Dan Polansky (talk) 15:42, 13 June 2013 (UTC)
What is a misspelling
What is a misspelling may be a hard question but let us have a look, in a hasty sketch.
A misspelling can be understood as a transmission error, in terms of sending messages over a noisy communication channel. A message--a sequence of letters--sent over a noisy communication channel is subject to random changes to the letters. The intended received message is the one that was sent; the criterion of correctness is identity: the received message has correct spellings if they are identical to the spellings used in the sent message. As a consequence, misspellings resulting from the noise of a low-noise channel tend to be of much lower frequency in the corpus of received messages than "correct" spellings.
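The noisy-channel picture above can be tried out in a small simulation. What follows is only a sketch under assumed parameters: the per-letter noise rate of 0.01, the uniform random replacement of a corrupted letter, and the names `transmit` and `simulate` are illustrative choices, not anything established in this discussion.

```python
import random
import string
from collections import Counter


def transmit(word, noise, rng):
    """Send a word letter by letter over a noisy channel.

    Each letter is independently replaced by a random lowercase letter
    with probability `noise` (an assumed, simplified noise model).
    """
    received = []
    for letter in word:
        if rng.random() < noise:
            received.append(rng.choice(string.ascii_lowercase))
        else:
            received.append(letter)
    return "".join(received)


def simulate(word, noise=0.01, trials=10_000, seed=0):
    """Count how often each received spelling occurs over many transmissions."""
    rng = random.Random(seed)
    return Counter(transmit(word, noise, rng) for _ in range(trials))


counts = simulate("conceive")
sent_count = counts["conceive"]
# Frequency of the single most common corrupted spelling (0 if none occurred).
top_error = max((n for w, n in counts.items() if w != "conceive"), default=0)

print("received as sent:", sent_count)
print("most frequent single misspelling:", top_error)
```

With a low noise rate, the spelling that was sent dominates the received corpus, and each individual corrupted spelling is rare by comparison, which is the sense in which misspellings have "much lower frequency" here.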
What is the noise in the case of man-made misspellings? For one thing, each person makes misspellings in individual written utterances; these tend to have lower frequency in all writings of the person than the "correct" spellings. For another thing, a person can store an uncommon spelling as the standard one in the mind and consistently reproduce the spelling that has low frequency in the corpus of the language community but high frequency in the writing of that single person.
There may be an authority declaring what is and what is not a misspelling, such as a dictionary published by a successful commercial publisher or a dictionary published by a regulatory government-funded organization established in one of the countries in which the language prevails. The decision made by the dictionary may be arbitrary, disregarding current frequency. The point of making an arbitrary decision about "correct" spelling and sticking to it is enabling uniformity of spelling in the corpus, coupled with compactness of spelling patterns if the spelling decision is made according to implied spelling patterns and regularities rather than by individual frequencies.
As a practical frequency criterion, misspellings tend to have vanishingly lower frequency than their "correct" alternatives, whereas alternative spellings have much more favorable frequency ratio to the "correct" or mainstream alternatives. In English, it is worthwhile to have a look at frequency ratios of U.S. vs. British spellings, such as "color" vs "colour". From what I can see in Google Ngram Viewer, their frequency ratio tends to be 2 to 4, meaning the U.S. spelling is twice to four times more common in the whole corpus than the British spelling. By contrast, looking at "conceive" vs. "concieve", the frequency ratio is 1000.
As per the frequency criterion, a misspelling can never have a higher frequency than a "correct" spelling. Nonetheless, there are probably etymology aficionados claiming about one mainstream spelling or another that it is "incorrect". If these are allowed to run authoritative dictionaries, their preferences can end up being codified as "correct". --Dan Polansky (talk)
Policies and would-be policies:
Discussions:
Categories:
- Category:English misspellings - currently 1,477 entries; only for common misspellings
--Dan Polansky (talk) 10:05, 5 July 2013 (UTC)
- Yes, I'll go along with most of that. I had always assumed that spelling mistakes were honest errors (-ie- instead of -ei- etc.), the results of typing too fast (that's where most of mine come from) and simple ignorance (I can never remember how to spell manoeuvre). But when is a spelling mistake "common" (as the ones we include)? Maybe when the "frequency ratio" is greater than hundreds but less than thousands? SemperBlotto (talk) 10:25, 5 July 2013 (UTC)
Re: 'Maybe when the "frequency ratio" is greater than hundreds but less than thousands?' Sounds okay to me as a criterion for "common misspelling"; what has lower frequency ratio is "alternative spelling". However, the lower bound could be even lower, like 20 or 50. In RFV, I have posted a table that gives an impression:
Short Term | Long Term | Ngram | Frequency Ratio in Year 2000 |
---|---|---|---|
referencable | referenceable | Ngram | 8 |
experiencable | experienceable | Ngram | 10 |
influencable | influenceable | Ngram | 16,5 |
sequencable | sequenceable | Ngram | 6 |
servicable | serviceable | Ngram | 156 |
enforcable | enforceable | Ngram | 860 |
replacable | replaceable | Ngram | 190 |
colour | color | Ngram | 3,4 |
behaviour | behavior | Ngram | 2,8 |
rigour | rigor | Ngram | 2 |
concievable | conceivable | Ngram | 3867 |
idiosyncracy | idiosyncrasy | Ngram | 6 |
supercede | supersede | Ngram | 15 |
--Dan Polansky (talk) 10:42, 5 July 2013 (UTC)
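The frequency-ratio criterion discussed above can be written down as a small classifier. This is a sketch only: the bounds `lower=100` and `upper=10_000` are placeholder readings of "greater than hundreds but less than thousands", the label "rare misspelling" for anything above the upper bound is an assumption, and the function name `classify` is hypothetical. The sample ratios are taken from the table above.

```python
def classify(ratio, lower=100.0, upper=10_000.0):
    """Label a spelling by the frequency ratio of the mainstream form to it.

    `lower` and `upper` are assumed thresholds, not agreed values:
    below `lower` the form counts as an alternative spelling, between
    the bounds as a common misspelling, above `upper` as a rare one.
    """
    if ratio < lower:
        return "alternative spelling"
    if ratio <= upper:
        return "common misspelling"
    return "rare misspelling"


# (variant, mainstream form, frequency ratio in year 2000) from the table above
rows = [
    ("colour", "color", 3.4),
    ("supercede", "supersede", 15),
    ("servicable", "serviceable", 156),
    ("concievable", "conceivable", 3867),
]

for variant, base, ratio in rows:
    print(f"{variant} vs {base}: ratio {ratio} -> {classify(ratio)}")
```

Lowering `lower` to 20 or 50, as floated above, leaves these four rows labelled the same way; the choice matters mostly for forms such as "influencable" or "supercede" with ratios in the teens.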
I was not paying attention. You asked when is a spelling mistake common enough to be includable. For this, not only frequency ratio can be considered but also absolute frequency. Let me think some more and have a look. --Dan Polansky (talk) 10:45, 5 July 2013 (UTC)
Currently, Wiktionary is not overflooded with misspellings, having 1477 English misspellings. To decide what misspellings to exclude based on frequency ratio, we would need to choose a fairly arbitrary threshold. I would choose such threshold that prevents overflooding of Wiktionary with misspellings while allowing a fair amount of them. As I cannot determine the number of acceptable misspellings per various frequency ratio thresholds, I have not much of an opinion on that threshold. From the table that follows, I would guess the threshold should be higher than 2000. With the use of the data that Google has published for download at Google Ngram Viewer, the number of misspellings per threshold could be determined, but that would require fairly heavy number crunching, it seems.
One could object that frequency ratio should not be used alone. I don't have much of an opinion on that other than that using it alone seems okay, not too bad.
Term 1 | Term 2 | Ngram | Ratio in Year 2000 |
---|---|---|---|
beleive | believe | Ngram | 3349 |
beleiver | believer | Ngram | 22913 |
aquitted | acquitted | Ngram | 433 |
aquire | acquire | Ngram | 1075 |
arithmatically | arithmetically | Ngram | 441 |
concieve | conceive | Ngram | 1494 |
recieve | receive | Ngram | 1874 |
bibiliography | bibliography | Ngram | 2920 |
assidious | assiduous | Ngram | 1084 |
bizzare | bizarre | Ngram | 396 |
athiest | atheist | Ngram | 561 |
condensor | condenser | Ngram | 99 |
concensus | consensus | Ngram | 341 |
accross | across | Ngram | 5097 |
--Dan Polansky (talk) 12:06, 5 July 2013 (UTC)
To get an idea of how selective the predicate "common misspelling" is as opposed to mere "misspelling", I had a little look at imaginable misspellings of "conceive", and their frequency ratio as per Google Ngram Viewer:
Spelling | Corpus Frequency in Y2000 in % | Freq Ratio to Base Spelling | Ngram |
---|---|---|---|
conceive | 0,0006574282 | 1 | Ngram |
concieve | 0,0000004472 | 1470 | Ngram |
coceive | Not found | N/A | Ngram |
cocneive | Not found | N/A | Ngram |
cnceive | Not found | N/A | Ngram |
concive | 0,0000000197 | 33372 | Ngram |
conceie | Not found | N/A | Ngram |
conceibe | Not found | N/A | Ngram |
conceice | Not found | N/A | Ngram |
Notice that, using Google Ngram Viewer, we are looking at Google books, which is a corpus of copyedited works, as contrasted to world wide web. --Dan Polansky (talk) 09:02, 6 July 2013 (UTC)
To broaden the impression, here comes a comparison of a couple of -ize/-ise forms:
Term 1 | Term 2 | Ngram | Frequency Ratio in Year 2000 |
---|---|---|---|
analyse | analyze | Ngram | 2.6 |
crystalise | crystalize | Ngram | 6.8 |
revitalise | revitalize | Ngram | 6.5 |
popularise | popularize | Ngram | 3.7 |
formalise | formalize | Ngram | 4.4 |
pluralise | pluralize | Ngram | 7.5 |
criticise | criticize | Ngram | 5.1 |
realise | realize | Ngram | 6.7 |
organise | organize | Ngram | 5.9 |
equalise | equalize | Ngram | 7.8 |
neutralise | neutralize | Ngram | 6.8 |
socialise | socialize | Ngram | 9.8 |
--Dan Polansky (talk) 18:29, 9 July 2013 (UTC)
Hypothesis: Copyediting massively impacts frequency ratio. Verification:
Term 1 | Term 2 | Ngram | Ngram Freq Ratio in Year 2000 | Freq Ratio in English Web | Ratio of Ratios | Hits 1 | Hits 2 |
---|---|---|---|---|---|---|---|
beleive | believe | Ngram | 3349 | 127 | 26 | 22900000 | 2900000000 |
beleiver | believer | Ngram | 22913 | 417 | 55 | 220000 | 91700000 |
aquitted | acquitted | Ngram | 433 | 243 | 2 | 188000 | 45600000 |
aquire | acquire | Ngram | 1075 | 72 | 15 | 5080000 | 366000000 |
arithmatically | arithmetically | Ngram | 441 | 50 | 9 | 9640 | 484000 |
concieve | conceive | Ngram | 1494 | 2 | 612 | 25400000 | 62000000 |
recieve | receive | Ngram | 1874 | 40 | 46 | 56000000 | 2260000000 |
bibiliography | bibliography | Ngram | 2920 | 2118 | 1 | 68000 | 144000000 |
assidious | assiduous | Ngram | 1084 | 93 | 12 | 25600 | 2390000 |
bizzare | bizarre | Ngram | 396 | 27 | 15 | 16300000 | 444000000 |
athiest | atheist | Ngram | 561 | 67 | 8 | 1710000 | 115000000 |
condensor | condenser | Ngram | 99 | 9 | 11 | 4130000 | 37200000 |
concensus | consensus | Ngram | 341 | 91 | 4 | 1990000 | 181000000 |
accross | across | Ngram | 5097 | 187 | 27 | 16800000 | 3140000000 |
Anomalies or outliers: acquitted, conceive, bibliography.
--Dan Polansky (talk) 17:19, 12 July 2013 (UTC)
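The derived columns of the table above appear to follow from the hit counts: the web frequency ratio is Hits 2 divided by Hits 1, and the ratio of ratios is the Ngram ratio divided by the unrounded web ratio. The sketch below restates that arithmetic; the function names are hypothetical, and the reading of how the columns were computed is an inference from the numbers, not something stated in the post.

```python
def web_ratio(hits_variant, hits_base):
    """Frequency ratio on the web: hits for the base spelling per hit for the variant."""
    return hits_base / hits_variant


def ratio_of_ratios(ngram_ratio, hits_variant, hits_base):
    """How many times larger the copyedited-corpus ratio is than the web ratio."""
    return ngram_ratio / web_ratio(hits_variant, hits_base)


# (variant, base, Ngram ratio, Hits 1, Hits 2) copied from the table above
rows = [
    ("beleive", "believe", 3349, 22_900_000, 2_900_000_000),
    ("concieve", "conceive", 1494, 25_400_000, 62_000_000),
]

for variant, base, ngram, hits1, hits2 in rows:
    print(
        f"{variant} vs {base}: web ratio {round(web_ratio(hits1, hits2))}, "
        f"ratio of ratios {round(ratio_of_ratios(ngram, hits1, hits2))}"
    )
```

For "beleive" this rounds to a web ratio of 127 and a ratio of ratios of 26, and for "concieve" to 2 and 612, matching the table rows.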
This currently has a chemistry definition. But given that it has a Proto-Slavic origin, it's almost certainly missing senses. Can you help? —CodeCat 16:14, 19 July 2013 (UTC)
Also, are -ný and -ní the same suffix or is there a difference? —CodeCat 16:25, 19 July 2013 (UTC)
- I have added a def to -ný. -ný does not seem to be the same suffix as -ní. --Dan Polansky (talk) 16:47, 22 July 2013 (UTC)
Personal attack
Why did you have to personally attack me on my own user talk page? If anyone is being shoddy, you are by attacking me personally on my own talk page. Don't do it again. Razorflame 19:34, 28 July 2013 (UTC)
- Evidence for the claims I have made on your talk page is in the archives of your talk page, in your editing history and in your block log. If you find any inaccuracy in what I write, let me know. --Dan Polansky (talk) 18:26, 29 July 2013 (UTC)
- It is a personal attack. Don't add it back to my talk page. Razorflame 20:18, 29 July 2013 (UTC)
- @Razorflame: You're bandying about "personal attack" with abandon. Don't. What he wrote is not what I, or I believe most editors, would consider a personal attack.
- @Dan: That said, I quote the following from WT:BLOCK: "[A reasonable cause for blocking is causing] ... our editors distress by directly insulting them or by being continually impolite towards them." I'm not going to block you, but it is true that you are arguably being "continually impolite". Please be civil. —Μετάknowledgediscuss/deeds 21:08, 29 July 2013 (UTC)
- You are misrepresenting WT:BLOCK. The complete WT:BLOCK policy is this: "The block tool should only be used to prevent edits that will, directly or indirectly, hinder or harm the progress of the English Wiktionary. It should not be used unless less drastic means of stopping these edits are, by the assessment of the blocking administrator, highly unlikely to succeed." I am calling Razorflame to account for what he does. If you find a particular sentence that I posted uncivil, be specific about it. For the record, given your history of misrepresentations and poor understanding, I am not at all enthusiastic about seeing you on my talk page or in the talk between me and Razorflame. --Dan Polansky (talk) 21:13, 29 July 2013 (UTC)
- Yes, I know you dislike me as well. I am here because you two are at each other's throats, and I am (apparently ineffectively) trying to make sure that neither does something actually blockworthy. —Μετάknowledgediscuss/deeds 21:20, 29 July 2013 (UTC)
- Be specific. --Dan Polansky (talk) 21:20, 29 July 2013 (UTC)
- About what? If you mean for me to be specific about "something actually blockworthy", I essentially mean harassment. Whether or not harassment has occurred could easily be argued; I think not, but Razorflame certainly feels harassed, judging by his defensive reaction. —Μετάknowledgediscuss/deeds 21:41, 29 July 2013 (UTC)
- Which sentence that I have posted is uncivil, blockworthy or borderline blockworthy? --Dan Polansky (talk) 21:42, 29 July 2013 (UTC)
- Nothing except reposting material that Razorflame read, removed, and (relatively civilly) asked you not to repost. You have a right to notify him of his errors on his talkpage, but reposting reverted material like that is basically edit warring. I don't think you could be reasonably blocked for it, but if you continue, perhaps someone would block you (as I said, I myself would not). —Μετάknowledgediscuss/deeds 21:51, 29 July 2013 (UTC)
- Do you believe users have the right to remove posts to their talk pages that are critical of their editing? I am not notifying Razorflame about his errors; I am notifying other editors of Razorflame's dubious editing by providing direct evidence in the form of diff hyperlinks from which editors can figure things out for themselves, without taking my word for it. My posts on Razorflame's talk page are not for Razorflame, and he knows it very well. This is why he is removing my posts. I would have blocked him for those removals, but I am not an admin. --Dan Polansky (talk) 21:55, 29 July 2013 (UTC)
- Yes, I do believe he has that right. If you truly wished to post them for the community, you should do so in the BP. —Μετάknowledgediscuss/deeds 22:02, 29 July 2013 (UTC)
- A user's talk page is the most natural location for finding out about him. If I posted to Beer parlour, and months later people come to Razorflame's talk page, they would not find much there. But because most of discussions about Razorflame took place on his talk page, any editor with a sincere wish to know about Razorflame can conveniently find out. Furthermore, there is no need to bring his editing to a broad community attention when the user talk page suffices. Thus, I see not a single benefit of posting to Beer parlour, while I see two benefits of posting to his talk page. --Dan Polansky (talk) 22:07, 29 July 2013 (UTC)
trreq
I want {{trreq}} deleted or constrained as much as possible. Other editors disagree.
Discussions:
- Template_talk:trreq#RfD_deletion_debate, October 2012
- Wiktionary:Beer_parlour/2012/October#Translation requests
- Wiktionary:Beer_parlour/2013/August#trreq.27s
I am equally okay with dumping {{rfe}}. --Dan Polansky (talk) 07:42, 3 August 2013 (UTC)
Razorflame
Razorflame (talk • contribs) is incompetent and untrustworthy. Evidence of his lack of dictionary making skill and lack of ability to make and keep promises is available in the archives of User talk:Razorflame: User_talk:Razorflame/Archive_1, User_talk:Razorflame/Archive_2, User_talk:Razorflame/Archive_3, and more in future. As a consequence of a protracted series of broken promises and mistakes resulting from his contributing in languages in which he has close to no knowledge, he was blocked in July 2010 for one year; see his block log.
Editors expressing serious misgivings about his editing include EncycloPetey, Equinox, msh210, Ruakh, opiaterein AKA Dick Laurent, Yair rand, Atelaes, and Dan Polansky (me).
Some editors hold hope for his somehow maturing over the years. However, I find it unlikely.
Arbitrarily selected incidents:
- Performing various arbitrary unjustified reverts
- Contributing in a variety of languages not understood by him with unclear error rate; fairly many errors have been identified by the editors versed in the language in question
- Adding copyrighted non-free definitions, November 2009
- Adding Czech translations from a copyrighted dictionary, June 2010
- Creating Kannada entries with extremely numerous etymology sections based on nothing at all; see ತಡೆ and Talk:ತಡೆ, August 2013