User talk:Dan Polansky

Definition from Wiktionary, the free dictionary


Attested Acanthasitta

I am not sure why you think you can single me out, but if you spent less time attacking me and more time searching for words and their usage you would find things like this. http://newzealandecology.org/nzje/free_issues/NZJEcol34_1_28.pdf or this http://www.boprc.govt.nz/media/33700/Wildland-091118-TaurangaEcolDistPhase1.pdf or this http://www.sahs.uk.net/BoTNHAS%20Trans%20Vol%20V,%20Part%20II.pdf on Page 186. Please before you attack me in the future, do your research with the same dedication that you use to, well, attack me. Sincerely Speednat (talk) 18:24, 8 January 2014 (UTC)

Speednat, don't take it personally. Dan can be a bit assertive, but he has done it to others, including me, before. Pass a Method (talk) 21:27, 12 January 2014 (UTC)

On copyrights and public domain

By the way, just out of curiosity why did you suddenly feel the need to note the change to my userpage on my talkpage? Was there some specific event/page/something which prompted you to do so? Or perhaps general curiosity about my history here, and elsewhere at Wikimedia? TeleComNasSprVen (talk) 08:51, 11 January 2014 (UTC)

My attention is my private concern. It so happened that I noticed your placing of a statement on your user page that could threaten the integrity of Wiktionary: 'I always thought copyrighting a dictionary is pretty silly given that definitions of words contain no original thought.' --Dan Polansky (talk) 11:42, 11 January 2014 (UTC)
Well, there are lots of worse things out there that "could threaten the integrity of Wiktionary", I assure you I am not one of them. TeleComNasSprVen (talk) 20:26, 11 January 2014 (UTC)

German compounds

German closed compounds are traditionally included. For a discussion, see Talk:Zirkusschule (mentioning also "Tanzschule"), Talk:Sportlerherz, Talk:Plastikschwanz, Talk:neuntausendneunhundertneunundneunzig. Tanzschule. --Dan Polansky (talk) 19:53, 17 January 2014 (UTC)

Are you talking to yourself? I think you probably meant to post this somewhere else. --WikiTiki89 20:00, 17 January 2014 (UTC)
I am making public notes, which appear like me talking to myself. I can search my talk pages and find the notes later using the search terms of my choice. --Dan Polansky (talk) 20:07, 17 January 2014 (UTC)

Talkback

Hello, Dan Polansky. You have new messages at Kc kennylau's talk page.
You can remove this notice at any time by removing the {{talkback}} template.

--kc_kennylau (talk) 07:02, 18 January 2014 (UTC)

Response

Since I've had a good look through your history and interpersonal communications with other editors, I'm going to leave this here for future reference. I know you mean well and I do not want to disparage you or anything, but my advice for avoiding further trouble would be to approach other editors in a less confrontational or direct manner when pointing out their mistakes or aberrations. We are all volunteers here, and we have to keep that in mind. Take, for example, Haplology's message on my talkpage (permalink), which was well-mannered and sufficient to convey to me that what I did was wrong, and I responded accordingly with the appropriate measures. Your posts to my talkpage appear to convey a less than friendly tone, and I recognize that that can be difficult sometimes for people whose native language is not English, and so I've grown used to adjusting my level of WT:AGF accordingly (i.e. I have a thick skin, but that only lasts so long). I've taken note of your messages and rest assured I've tried my best to listen as well as respond to them in the way I thought most appropriate, even if perhaps you might think I have not. To avoid any further appearance of antagonism from either of us, and because you've posted to my talkpage in three threads in such a short timespan (which I found slightly disturbing), I am asking you to please refrain from further posting at my talkpage, and instead to respond at the appropriate venues at RFD and RFV. Or if you have a direct issue with my behavior on broad subjects or with what I put on my userpage, as you've noted at the "Copyright" section of my talkpage, please bring it up at the Beer Parlour, where it would warrant further attention.

Final note: I've also noticed you once used the excuse that there had been no prior warning of "blockable" behavior on your talkpage, so that the administrators who blocked your account were considered unjustified. I am thus leaving this here as a... recommendation that you refrain from posting on my talkpage. You can still respond here as appropriate if you have any questions or concerns about this recommendation, and I will respond accordingly, as I have this page watched. TeleComNasSprVen (talk) 08:40, 18 January 2014 (UTC)

It is true that I voice a lot of criticism that many admins do not. For instance, in User talk:Pass a Method, I have asked him to stop certain behaviors again and again, to not much avail. Some admins (I am not an admin) have a different approach: they block the guy for a month without even bothering to leave a note on his talk page. All the blocks made against me were in violation of WT:BLOCK, supplied with excuses so lame that I cannot think it possible that the blocking admins believed them. --Dan Polansky (talk) 08:52, 18 January 2014 (UTC)
No, see, you're missing the point. This is not about your past blocks, and I'm sure the other admins have their reasons for choosing to block or not block. This is about your approach to criticism. Criticism is best received when it is dealt with in a constructive manner. Cf Haplology's response to me on my talkpage. That is the example I would like you to follow when dealing with me as well as other people. I have bolded my central point and what I believe to be worthy of you keeping as your 'public notes' to be looked at in the archives. TeleComNasSprVen (talk) 09:01, 18 January 2014 (UTC)
In the post above, you have started to prepare ground for my blocking, by making a reference to blockable behavior just before you make a "recommendation that you refrain from posting on my talkpage". You remove inconvenient criticism from your talk page as you see fit (diff). I find your general pattern of editing in English Wiktionary such that you should ideally depart as soon as possible, to prevent harm to English Wiktionary and avoid waste of other editor resources. --Dan Polansky (talk) 09:07, 18 January 2014 (UTC)

German adjectives lacking inflection table using AWB

The following guide was posted by User:Kc kennylau, while an original hint of the method is due to User:CodeCat:

  1. Start AWB
  2. Tools -> List comparer
  3. List 1: Category, German adjectives
  4. List 2: What transcludes page, Template:de-decl-adj-table
  5. Unique in List 1
  6. There you go

--Dan Polansky (talk) 16:54, 18 January 2014 (UTC)

Wow, I discovered that you can tag people by linking to other people's user name, for example User:Dan Polansky should give you a notice. --kc_kennylau (talk) 16:56, 18 January 2014 (UTC)

English entries lacking etymology using Python

Finding English entries lacking etymology using Python, applied to a Wiktionary dump (http://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-articles.xml.bz2):

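import re
import sys

# Usage (Python 3): python3 script.py <decompressed dump file>
entryStartFound = False  # True while inside an ==English== section
etymologyFound = False   # True once the section has shown an Etymology heading
title = ""
with open(sys.argv[1], encoding="utf-8") as dump:
  for line in dump:
    line = line.rstrip()
    if "<title>" in line:
      title = re.sub(" *</?title> *", "", line)
    if entryStartFound:
      if "===Etymology" in line:
        etymologyFound = True
      # The section ends at the ---- separator or at the end of the page text
      if "----" in line or "</text>" in line:
        entryStartFound = False
        if not etymologyFound:
          print(title)
        etymologyFound = False
    if "==English==" in line:
      entryStartFound = True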

I wish I knew how to do this using AWB alone. --Dan Polansky (talk) 17:10, 18 January 2014 (UTC)

Updated. --Dan Polansky (talk) 10:00, 19 January 2014 (UTC)
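The dump linked above is bz2-compressed, while the script opens its argument as plain text. A minimal sketch, assuming Python 3 and its standard bz2 module, of a helper that opens either form for line-by-line reading; the name open_dump is hypothetical and is not part of the script above.

```python
import bz2


def open_dump(path):
    """Open a dump for line-by-line text reading, compressed or not.

    Hypothetical helper: it chooses bz2 decompression purely by file extension.
    """
    if path.endswith(".bz2"):
        # Mode "rt" makes bz2.open return decoded text lines instead of bytes
        return bz2.open(path, "rt", encoding="utf-8")
    return open(path, encoding="utf-8")
```

Under that assumption, the loop could iterate over open_dump(sys.argv[1]) instead of the plain open call, which avoids decompressing the dump to disk first.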

Category intersection

The method described at #German adjectives lacking inflection table using AWB leads to a method of determining the intersection of two categories, that is, the list of items present in both categories. Ditto for set difference applied to categories. --Dan Polansky (talk) 17:34, 18 January 2014 (UTC)
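Outside AWB, the same comparisons are plain set operations. A minimal sketch in Python, assuming the two lists have already been exported as lists of page titles; compare_lists is a hypothetical name and the data is made up for illustration.

```python
def compare_lists(list1, list2):
    """Return (unique in list 1, present in both, unique in list 2) as sorted lists."""
    set1, set2 = set(list1), set(list2)
    return sorted(set1 - set2), sorted(set1 & set2), sorted(set2 - set1)


# Toy data, not real category contents
german_adjectives = ["alt", "gut", "neu", "rot"]
with_declension_table = ["gut", "rot", "Haus"]

unique_in_1, in_both, unique_in_2 = compare_lists(german_adjectives, with_declension_table)
print(unique_in_1)  # set difference: adjectives lacking the table
print(in_both)      # intersection: items present in both lists
```

"Unique in List 1" corresponds to the set difference and the intersection to the items common to both lists.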

Slovak nouns not linking to dictionaries

The method described at #German adjectives lacking inflection table using AWB leads to a method of determining which Slovak nouns are not using {{R:SDK}} to link to great online Slovak dictionaries.

  • List 1: Category, Slovak nouns
  • List 2: What transcludes page, Template:R:SDK
  • Result: Unique in List 1

--Dan Polansky (talk) 17:41, 18 January 2014 (UTC)

English entries lacking pronunciation using Python

The method described at #English entries lacking etymology using Python applies, just that "===Etymology" is replaced with "===Pronunciation". --Dan Polansky (talk) 08:03, 19 January 2014 (UTC)
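Since only the heading string changes between the two uses, the scan can take the heading as a parameter. A minimal sketch, assuming Python 3; titles_lacking_heading and the sample lines are hypothetical, and the section-boundary logic mirrors the etymology script.

```python
import re


def titles_lacking_heading(lines, heading, language="English"):
    """Yield titles whose language section has no line containing `heading`."""
    in_section = False
    heading_found = False
    title = ""
    for line in lines:
        line = line.rstrip()
        if "<title>" in line:
            title = re.sub(" *</?title> *", "", line)
        if in_section:
            if heading in line:
                heading_found = True
            # The section ends at the ---- separator or at the end of the page text
            if "----" in line or "</text>" in line:
                in_section = False
                if not heading_found:
                    yield title
                heading_found = False
        if "==" + language + "==" in line:
            in_section = True


# Made-up dump fragment for illustration
sample = [
    "<title>cat</title>", "==English==", "===Pronunciation===", "===Noun===", "</text>",
    "<title>dog</title>", "==English==", "===Etymology===", "===Noun===", "----",
    "==Czech==", "</text>",
]
print(list(titles_lacking_heading(sample, "===Pronunciation")))  # ['dog']
print(list(titles_lacking_heading(sample, "===Etymology")))      # ['cat']
```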

Declension of 08/15

I would like to have the declension of 08/15. --kc_kennylau (talk) 11:01, 19 January 2014 (UTC)

No idea, really. I'd guess it would be uninflected, but I am not a native German speaker; User:Longtrend is a native speaker, and he is even a student of linguistics. --Dan Polansky (talk) 11:17, 19 January 2014 (UTC)

Inflection of for Czech inflected noun forms

I have run an AWB batch that placed {{inflection of}} on the definition lines of Czech inflected noun forms. I used Latin entries as a model. Czech inflected noun form entries mostly used {{form of}} before this was changed by a run of User:MewBot from 9 January 2014; an example edit: diff. I used User:DPMaid to do the job.

I have changed WT:About Czech to indicate the use of {{inflection of}} in diff.

The AWB specification:

  • Work list: entries in Category:Czech noun forms
  • Example edit: diff
  • Number of replacements: 107 (plus 2 under User:Dan Polansky)
  • Replacements used in AWB:
# {{genitive of\|(.*)\|lang=cs}}        # {{inflection of|$1||gen|s|lang=cs}}
# {{dative of\|(.*)\|lang=cs}}  # {{inflection of|$1||dat|s|lang=cs}}
# {{vocative of\|(.*)\|lang=cs}}        # {{inflection of|$1||voc|s|lang=cs}}
# {{locative of\|(.*)\|lang=cs}}        # {{inflection of|$1||loc|s|lang=cs}}
# {{instrumental of\|(.*)\|lang=cs}}    # {{inflection of|$1||ins|s|lang=cs}}
# {{plural of\|(.*)\|lang=cs}}  # {{inflection of|$1||nom|p|lang=cs}}
# {{instrumental plural of\|(.*)\|lang=cs}}     # {{inflection of|$1||ins|p|lang=cs}}
# {{instrumental singular of\|(.*)\|lang=cs}}   # {{inflection of|$1||ins|s|lang=cs}}
# {{accusative singular of\|(.*)\|lang=cs}}     # {{inflection of|$1||acc|s|lang=cs}}
# {{vocative singular of\|(.*)\|lang=cs}}       # {{inflection of|$1||voc|s|lang=cs}}
# {{genitive singular of\|(.*)\|lang=cs}}       # {{inflection of|$1||gen|s|lang=cs}}
# {{locative singular of\|(.*)\|lang=cs}}       # {{inflection of|$1||loc|s|lang=cs}}
# {{dative singular of\|(.*)\|lang=cs}} # {{inflection of|$1||dat|s|lang=cs}}
# {{accusative of\|(.*)\|lang=cs}}      # {{inflection of|$1||acc|s|lang=cs}}

--Dan Polansky (talk) 09:09, 2 March 2014 (UTC)
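For trying the replacements outside AWB, they can be approximated with Python's re module. This is only a sketch: AWB writes the captured group as $1 where Python uses \1, only a subset of the rules is shown, and the RULES table, the convert function and the sample lines are hypothetical.

```python
import re

# (old template name, case code, number code), a subset of the list above
RULES = [
    ("genitive of", "gen", "s"),
    ("dative of", "dat", "s"),
    ("plural of", "nom", "p"),
    ("instrumental plural of", "ins", "p"),
    ("accusative singular of", "acc", "s"),
]


def convert(line):
    """Rewrite one definition line from an old form-of template to {{inflection of}}."""
    for name, case, number in RULES:
        # Braces and pipes are escaped so that they match literally
        pattern = r"# \{\{" + re.escape(name) + r"\|(.*)\|lang=cs\}\}"
        replacement = r"# {{inflection of|\1||" + case + "|" + number + "|lang=cs}}"
        line = re.sub(pattern, replacement, line)
    return line


# Made-up definition lines for illustration
print(convert("# {{genitive of|hrad|lang=cs}}"))
print(convert("# {{instrumental plural of|žena|lang=cs}}"))
```

Because each pattern starts at "# {{" followed directly by the template name, the rule for "plural of" does not fire on "instrumental plural of".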

Czech traffic statistics

What follows are the entry access statistics (entry traffic statistics) of Czech-only entries in English Wiktionary for October 2013, obtained using data available at http://stats.grok.se/. Russian and Latin are shown for comparison, again only for entries that have a single language section. Only the lemma entries were considered, which was especially needed for Latin with its huge number of inflected-form entries; for Czech and Russian, non-lemmas not marked using "{{inflection of" and "{{conjugation of" were included in the analysis, but the two languages have only a small number of inflected-form entries anyway; see also the Python script below.

Language | Month  | Entry Hits in the Month | Lemma Entries* | Lemma Entries w/ 1 Lang Section
Czech    | 201310 | 157,193                 | 21,531         | 18,057
Russian  | 201310 | 495,573                 | 21,915         | 18,588
Latin    | 201310 | 1,244,477               | 33,139         | 27,621

* - The numbers for Czech and Russian actually include some non-lemma entries, since Czech and Russian inflected-form entries do not exclusively use "{{inflection of" and "{{conjugation of", but these non-lemma entries are relatively few.

Filtering the Czech entries with multiple language sections out of the statistics was essential. Otherwise, the result would be completely skewed, as an experiment showed. As one source of skewing, the access statistics would include high access numbers for the significant number of Czech entries that also have an English language section.
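The filtering step amounts to a set difference: terms with a section in the language, minus terms that also appear under any other language. A minimal sketch, assuming rows of (language, term) pairs taken from the first two tab-separated fields of the quasi-dump; single_section_terms is a hypothetical name and the rows are made up.

```python
def single_section_terms(rows, language):
    """Return terms that have a section in `language` and in no other language."""
    in_language = {term for lang, term in rows if lang == language}
    elsewhere = {term for lang, term in rows if lang != language}
    return in_language - elsewhere


# Toy rows: "most" has both a Czech and an English section, so it is dropped
rows = [
    ("Czech", "pes"),
    ("Czech", "most"),
    ("English", "most"),
    ("Russian", "да"),
]
print(sorted(single_section_terms(rows, "Czech")))  # ['pes']
```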

The data was obtained using the following script applied to the relational quasi-dump available from http://toolserver.org/~enwikt/definitions, in particular enwikt-defs-20140206-all.tsv. The script uses the relational quasi-dump to determine the term work list, and then uses the term work list to access http://stats.grok.se/. The fact that the date of the relational quasi-dump lies after the month for which the statistics are collected does no harm. The script output was redirected to a file; the standard error output showed the progress of the script. Running the script for Czech took more than an hour to finish.

import re
import sys
import time
import urllib.parse
import urllib.request

def accessStatsPerLanguage(relationalDumpFile,language,monthString,ignoreNonlemma=True):
  # relationalDumpFile:
  # e.g. enwikt-defs-20140206-all.tsv
  # Available from http://toolserver.org/~enwikt/definitions/
  # monthString: e.g. 201310
  #
  # Limitations: ignoreNonlemma may need to capture other templates than "{{inflection of" and "{{conjugation of"
  startTime = time.time()
  # Collect a set of terms in the language
  langTerms=set()
  with open(relationalDumpFile, encoding="utf-8") as dump:
    for line in dump:
      if not line.startswith(language): continue
      fields=line.split("\t")
      # Skip definition lines that mark the entry as an inflected form
      if ignoreNonlemma and ("{{inflection of" in fields[3] or "{{conjugation of" in fields[3]): continue
      langTerms.add(fields[1])
  print("Terms of the "+language+" language collected.", file=sys.stderr)
  # Collect a set of terms that are in that language but have more than one section
  langMultiSecTerms=set()
  with open(relationalDumpFile, encoding="utf-8") as dump:
    for line in dump:
      if line.startswith(language): continue
      term=line.split("\t")[1]
      if term in langTerms:
        langMultiSecTerms.add(term)
  print("Terms of the "+language+" language having multiple language sections collected.", file=sys.stderr)
  langSingleSecTerms=langTerms-langMultiSecTerms
  # Collect the page hits
  totalHits=0
  termsProcessed=0
  singleSecTermsProcessed=0
  startTime2 = time.time()
  for term in sorted(langTerms):
    hitString="0"
    if term not in langMultiSecTerms:
      # Percent-encode the term so that non-ASCII characters are valid in the URL
      url="http://stats.grok.se/en.d/"+monthString+"/"+urllib.parse.quote(term, safe=":/")
      for rawLine in urllib.request.urlopen(url):
        line = rawLine.decode("utf-8", "replace")
        if "has been viewed " in line:
          line = line.rstrip()
          hitString = re.sub(".*has been viewed ","",line)
          break
      singleSecTermsProcessed+=1
    else:
      hitString="-0" #-0 to indicate that it has multiple language sections
    totalHits += int(hitString)
    # Output columns: term, hits in the month, running total
    print(term + "\t" + hitString + "\t" + str(totalHits))
    termsProcessed+=1
    # The second condition avoids a division by zero before any single-section term is fetched
    if termsProcessed%100==0 and singleSecTermsProcessed>0:
      # Ensure progress is visible
      timeSpent = time.time() - startTime2
      timeToGoSeconds = (timeSpent/singleSecTermsProcessed)*(len(langSingleSecTerms)-singleSecTermsProcessed)
      percentDone = round(100 * singleSecTermsProcessed / len(langSingleSecTerms), 1)
      print(str(percentDone)+"%"+" - "+ str(int(timeToGoSeconds/60)) + " min to go", file=sys.stderr)
  print("Total time elapsed:", int((time.time() - startTime)/60),"min", file=sys.stderr)

if __name__ == '__main__':
  dumpFile = sys.argv[1]
  language="Czech"
  if len(sys.argv)>=3: language=sys.argv[2]
  month="201310"
  if len(sys.argv)>=4: month=sys.argv[3]
  accessStatsPerLanguage(dumpFile,language,month)

--Dan Polansky (talk) 17:12, 2 March 2014 (UTC)

Wikisaurus statistics

Some Wikisaurus statistics for October 2013 follow, based on stats.grok.se, such as "http://stats.grok.se/en.d/201310/Wikisaurus:fatty_acid".

For previous Wikisaurus statistics, see User_talk:Dan_Polansky/2012#Wikisaurus_statistics.

  • The number of Wikisaurus pages: 1309
  • The total number of page hits in Wikisaurus in October 2013: 40,600
  • The total number of page hits in Wikisaurus in October 2013, without top 100 pages: 20,357
  • Median page hits per Wikisaurus page in October 2013: 15
  • Average page hits per Wikisaurus page in October 2013: 31
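The reported average agrees with the other figures: 40,600 / 1309 ≈ 31. A minimal sketch of how such summary figures could be derived from a list of per-page hit counts, assuming Python 3; summarize is a hypothetical name and the sample counts are made up, not the real Wikisaurus data.

```python
import statistics


def summarize(hits, top_n=100):
    """Summarize a list of per-page monthly hit counts."""
    ranked = sorted(hits, reverse=True)
    return {
        "pages": len(ranked),
        "total": sum(ranked),
        # Total once the top_n most visited pages are left out
        "total_without_top": sum(ranked[top_n:]),
        "median": statistics.median(ranked),
        "average": sum(ranked) / len(ranked),
    }


# Toy data: one very popular page dominates the total and the average
summary = summarize([2090, 15, 12, 40, 7], top_n=1)
print(summary)
```

The toy data shows why the median is reported alongside the average: a few heavily visited pages pull the average well above what a typical page receives.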

Top 100 Wikisaurus pages in October 2013, with page hits in the month:

Wikisaurus:penis 2090
Wikisaurus:vulva 1904
Wikisaurus:masturbate 1812
Wikisaurus:breasts 1088
Wikisaurus:sexual intercourse 1024
Wikisaurus:vagina 820
Wikisaurus:testicles 667
Wikisaurus:money 631
Wikisaurus:labia 434
Wikisaurus:penis/translations 426
Wikisaurus:anus 372
Wikisaurus:beautiful woman 358
Wikisaurus:prostitute 358
Wikisaurus:drunk 344
Wikisaurus:clitoris 327
Wikisaurus:semen 277
Wikisaurus:marijuana 266
Wikisaurus:erection 244
Wikisaurus:insane 199
Wikisaurus:promiscuous man 195
Wikisaurus:buttocks 194
Wikisaurus:promiscuous woman 186
Wikisaurus:defecate 144
Wikisaurus:wow 143
Wikisaurus:beer 141
Wikisaurus:ear 136
Wikisaurus:pubic hair 134
Wikisaurus:joke 127
Wikisaurus:marijuana cigarette 124
Wikisaurus:erect penis 119
Wikisaurus:idiot 117
Wikisaurus:nonsense 116
Wikisaurus:oral sex 106
Wikisaurus:obstinate 100
Wikisaurus:fool 99
Wikisaurus:vagina/translations 99
Wikisaurus:male homosexual 97
Wikisaurus:bathroom 94
Wikisaurus:sexual partner 94
Wikisaurus:die 93
Wikisaurus:excellent 93
Wikisaurus:fastidious 90
Wikisaurus:copulate 89
Wikisaurus:arrogant 86
Wikisaurus:index finger 86
Wikisaurus:woman 85
Wikisaurus:libertine 83
Wikisaurus:abode 82
Wikisaurus:cheeky 81
Wikisaurus:obese 81
Wikisaurus:thingy 81
Wikisaurus:destroy 78
Wikisaurus:give head 75
Wikisaurus:villain 74
Wikisaurus:kill 73
Wikisaurus:mad person 73
Wikisaurus:sexual activity 73
Wikisaurus:witty 70
Wikisaurus:disorder 68
Wikisaurus:saying 68
Wikisaurus:condom 67
Wikisaurus:naive 67
Wikisaurus:nothing 67
Wikisaurus:noodle 66
Wikisaurus:characteristic 65
Wikisaurus:death 65
Wikisaurus:abandon 64
Wikisaurus:mock 64
Wikisaurus:nipples 64
Wikisaurus:intelligent 63
Wikisaurus:pasta 63
Wikisaurus:utter 63
Wikisaurus:ejaculate 62
Wikisaurus:male genitalia 61
Wikisaurus:scrawny 61
Wikisaurus:water 61
Wikisaurus:covert 60
Wikisaurus:ejaculation 60
Wikisaurus:apex 59
Wikisaurus:girl 59
Wikisaurus:calm 58
Wikisaurus:child 58
Wikisaurus:delicious 58
Wikisaurus:laugh 58
Wikisaurus:anal sex 57
Wikisaurus:fake 57
Wikisaurus:bad 55
Wikisaurus:ghost 55
Wikisaurus:sexy 55
Wikisaurus:zillion 55
Wikisaurus:humble 54
Wikisaurus:masturbation 54
Wikisaurus:praise 54
Wikisaurus:reprehend 54
Wikisaurus:tiny 54
Wikisaurus:chav 53
Wikisaurus:hinder 52
Wikisaurus:circumcised 51
Wikisaurus:hidden 51
Wikisaurus:obstinacy 51

Method:

  1. Create a text file with the list of members of Category:Wikisaurus, one per line.
  2. Run the following script on the text file.
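import re
import sys
import urllib.parse
import urllib.request

monthString="201310"
# Read the Wikisaurus page names, one per line
WSTerms = []
with open(sys.argv[1], encoding="utf-8") as pageList:
  for line in pageList:
    if line.startswith("Wikisaurus:"):
      WSTerms.append(line.rstrip())
WSTerms.sort()
totalHits=0
WSTermNo=0
for WSEntry in WSTerms:
  WSTermNo+=1
  hitString="0"
  # Percent-encode the title so that spaces and non-ASCII characters are valid in the URL
  url="http://stats.grok.se/en.d/"+monthString+"/"+urllib.parse.quote(WSEntry, safe=":/")
  for rawLine in urllib.request.urlopen(url):
    line = rawLine.decode("utf-8", "replace")
    if "has been viewed " in line:
      line = line.rstrip()
      hitString = re.sub(".*has been viewed ","",line)
      break
  totalHits += int(hitString)
  # Output columns: page, hits in the month, running total
  print(WSEntry + "\t" + hitString + "\t" + str(totalHits))
  print(str(WSTermNo) + " out of "+str(len(WSTerms))+" processed", file=sys.stderr)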

--Dan Polansky (talk) 18:18, 2 March 2014 (UTC)

Interesting to see what kind of words people want synonyms for. --WikiTiki89 19:22, 2 March 2014 (UTC)

English -ing form and gerund

The treatment of English -ing forms that act as nouns, sometimes correctly or incorrectly called "gerund", is an unresolved problem. See also Appendix:English -ing forms and Talk:fucking, Talk:perusing, Talk:ploughing, Talk:dating. And also User:Dan_Polansky/Notes#Gerund. --Dan Polansky (talk) 09:06, 9 March 2014 (UTC)

Re linking in reference templates

Hello Dan Polansky. Re our recent discussion-via-edit-summaries of {{R:L&S}}, my contention is that reference templates should at least link to the relevant cited authority's Wikipedia article (where it exists); that way, an explanation for why the source is being cited as an authority is readily available for the sceptical reader on the other side of the link. Besides that principle, if you want a precedent, {{R:LSJ}} already links to Wikipedia; admittedly, I added that link, but Atelaes has since edited that template without removing the link, so he must, at least, not think that its inclusion is a bad idea. Before I do any more editing pursuant to this issue, I'd like to know: Why do you oppose this linking, other than out of a desire for consistency across referencing templates? — I.S.M.E.T.A. 13:23, 9 March 2014 (UTC)

To be explicit, I do think that the current linking on {{R:LSJ}} is a good idea, for the above-stated reasons, that readers can find out about our source for themselves. However, I also think that the initial round of linking on {{R:L&S}} was a bit excessive. Linking to the Wikipedia article on New York seems utterly superfluous in that situation. -Atelaes λάλει ἐμοί 17:13, 9 March 2014 (UTC)
Less relevant links are distracting, IMHO. A reader can copy the name of the reference work and paste it to Wikipedia article box thereby finding the relevant article, so the presence of the wikilinks in the source name is inessential. Wikipedia's article does not give any authority to a source anyway. Your original edit in diff looked like a bad joke; I have absolutely no idea why anyone might find it a good idea. I like the minimalist linking practice used in so many reference templates; if it is to be changed, there needs to be a consensus to do so. --Dan Polansky (talk) 18:18, 9 March 2014 (UTC)
FWIW, reconsidering it, I agree that "the initial round of linking on {{R:L&S}} was a bit excessive". Re "copy[ing] the name of the reference work and past[ing] it to [the search] box", doesn't the same go for everything? What's the point of linking at all in that case? And re "giv[ing] authority to a source", I didn't mean that just having an article on Wikipedia automatically gives a source authority; however, reading statements like "A Latin Dictionary…is a popular English-language lexicographical work of the Latin language" and "Lewis and Short remains a standard reference work for medievalists, renaissance specialists, and early modernists, as the dictionary covers Late and Medieval Latin, if somewhat inconsistently" indicates the (fairly) high regard in which the source is held. Anyway, in the hope of obtaining consensus for this sort of linking, I have started a policy discussion at Wiktionary:Beer parlour/2014/March#Links to Wikipedia in reference templates; please feel free to pass comment on it there. — I.S.M.E.T.A. 19:49, 9 March 2014 (UTC)

Formal voting and change of voter stance

(My notes on voting are at User:Dan Polansky/Voting.)

Voting via a formal process may make some people change the confidence with which they take their stance. In Wiktionary:Beer_parlour/2014/March#Stop_treating_Nynorsk_and_Bokmal_as_languages_separate_from_Norwegian, Angr, Pengo and Teodor voted boldfaced support and Eiríkr Útlendi voted weak support; in Wiktionary:Votes/pl-2014-03/Unified Norwegian, they all voted "Abstain".

The vote counting in the Beer parlour discussion yields 8.5 for support and 2 for oppose; it yields an overwhelming consensus by any standard ever used in a Wiktionary vote. The vote counting in the vote currently yields 5 for support, 5 for oppose and 5 for abstain. This is a sharp contrast between Beer parlour and the vote. --Dan Polansky (talk) 08:21, 5 April 2014 (UTC)
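The two tallies can be compared as support ratios. A minimal sketch, assuming that abstentions are left out of the ratio and that the figure 8.5 already counts the weak support as half a vote; both are assumptions rather than anything stated above, and support_ratio is a hypothetical name.

```python
def support_ratio(support, oppose):
    """Share of support among support and oppose votes; abstentions are ignored."""
    return support / (support + oppose)


beer_parlour = support_ratio(8.5, 2)  # tally from the Beer parlour discussion
formal_vote = support_ratio(5, 5)     # tally from the formal vote
print(round(beer_parlour, 2), round(formal_vote, 2))
```

Under these assumptions the Beer parlour tally is roughly four fifths in support, while the formal vote is evenly split.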

Latin and first person in lemma and definition of verb

Latin verbs are currently defined in first person ("I swim" rather than "To swim"). There was a Wiktionary discussion on that, especially in Wiktionary:Tea_room:nāscor, linked to below; since the discussion was not properly archived, I could only find it by finding the vote that the discussion spawned.

First person in lemma:

First person in definition:

--Dan Polansky (talk) 08:00, 12 April 2014 (UTC)