Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

Beer parlour archives


February 2015

Wiktionary:Criteria for inclusion

I ask that Kephir's edit to Wiktionary:Criteria for inclusion (diff) be undone. I would do it myself but I cannot. The vote Wiktionary:Votes/2014-11/Entries which do not meet CFI to be deleted even if there is a consensus to keep did not pass, and therefore cannot lead to any edit to WT:CFI; no edit to that page was proposed. Furthermore, the opposers of the vote did not express the wish to ignore CFI, merely to override it in a relatively small number of cases. I am fairly certain that the edit is not based on consensus. --Dan Polansky (talk) 12:42, 1 February 2015 (UTC)

Yeah, it probably should be undone by a sysop. At present, we have no mechanism for deletion of articles. See my comment below. Purplebackpack89 16:21, 1 February 2015 (UTC)

Wiktionary:Votes/2014-11/Entries which do not meet CFI to be deleted even if there is a consensus to keep: What does it mean?

The clock ran out on Wiktionary:Votes/2014-11/Entries which do not meet CFI to be deleted even if there is a consensus to keep, and it was closed as not enacted. So what does this mean? Recently, Wiktionary:CFI was changed to be an obsolete policy. Much as I do not agree with CFI as currently written, it's clear that that was not the correct approach, and creates a host of problems stemming from the lack of a mechanism to delete any article. No, what I interpret the discussion to be is that many participants are unhappy with CFI as written and they believe things other than CFI should be considered in RfD discussions. The upshot of this is not demoting CFI (or at least the part of it that is RfD's purview) to obsolescence, but demoting it to a guideline. Some of you have asked "what's the difference between a policy and a guideline?" A policy is overarching, supported by a wide supermajority of participants, and should be followed all or almost all the time. A guideline can be ignored if there's a consensus to, need not cover everything, and can be enacted with less of a supermajority. The practical implications of this are that articles can still be nominated for deletion under the auspices of CFI, but they won't necessarily be deleted solely on CFI. I see this as being in line with the vote. Note that much of this doesn't apply to the parts of CFI that have to do with RfV. I have seen no evidence of people being upset with the verifiability sections of CFI, which should probably be spun off into a different page that remains policy. Purplebackpack89 16:16, 1 February 2015 (UTC)

Vote Trimming CFI for Wiktionary is not an encyclopedia

FYI, Wiktionary:Votes/pl-2015-02/Trimming CFI for Wiktionary is not an encyclopedia. Let us postpone the vote as long as discussion requires. --Dan Polansky (talk) 16:24, 1 February 2015 (UTC)

Just some food for thought about bots

In the words of C.G.P.Grey, "They don’t need to be perfect, they just need to be better than us [humans]."

How should we deal with bots who make mistakes? --kc_kennylau (talk) 16:42, 1 February 2015 (UTC)

  • The traditional thing is to tell their human owner. That's what people have done with mine (far too often). SemperBlotto (talk) 16:44, 1 February 2015 (UTC)
    • @SemperBlotto: Well, telling the human owner does not directly solve the problem. Sometimes, fixing the error is much harder and more time-consuming than creating it, especially if the error is hidden amongst a group of, let's say, 5000 pages. Personally, I spend 1% of the time writing the programme and 99% of the time debugging. --kc_kennylau (talk) 17:29, 1 February 2015 (UTC)

Request to Add New Subcategory "LWT" Within LDL

This is a request for the Wiktionary Community to consider adding a new subcategory, Languages without a Written Tradition (LWT), under LDL (Less Documented Languages).

What is an "LWT"?

An LWT is a language that has an oral tradition, but has no tradition of writing and no written publications authored by native speakers. LWTs are a subset of LDLs. (Note that documents authored in other languages by outsiders and merely translated by native speakers, such as the Bible and government documents, are not suitable as sources for documenting a language.)

Why not Simply Call LWTs "Unwritten Languages"?

The term "unwritten" can be misleading, because the boundary between languages that are "unwritten" and languages that are "written" is actually quite fuzzy. Presumably we can all agree that a language community that has no writing system, no notion of literacy, and has never had its speech transcribed by outsiders can be considered an "unwritten language."
But when that community is visited by linguists who develop an orthography, and (perhaps imperfectly) transcribe some words and phrases from the spoken language into written form, perhaps publishing the results, what then? Is this language "written," even if no one in the language community is literate, and the published "results" contain the errors of a non-native speaker? Some of you might call such a language "written," and others just as reasonably might say it is "unwritten."
Let us now consider a third example. What about a small indigenous language community in Brazil that is completely unfamiliar with writing, and yet, through a process of increasing contact with the national society, develops an orthography and village schools, where children are taught to read and write in both their indigenous language and Portuguese? Obviously, when nearly every child can write words in their own language, the language cannot be considered an "unwritten language." Yet is it a "written" language? Does it have great literature? Yes, in oral form. Poetry? Absolutely, in oral form. Historical narratives, sacred texts, genealogies, song lyrics, compendiums of botanical and zoological knowledge? Yes, all in oral form. What, then, is written in this language? Aside from basic word lists and literacy primers modeled on Portuguese examples, virtually nothing — yet. Today's young adults are the first literate generation.
This is the case with Wauja, an Arawak language spoken by 400 indigenous people in lowland Amazonia. Although Wauja was "unwritten" a generation ago, today it is "written," in the sense that children are taught basic literacy in their village schools. However, as yet, and this doubtless will change, there is no written tradition in this language, no body of publications authored by native speakers. All their literature is still in oral form.
For the purposes of Wiktionary, the key issue is not whether a missionary or professional linguist has phonetically transcribed snippets from the language, but whether there exists a body of work authored by native speakers that is large enough to provide references for every word in the language. For languages like English, Chinese, and all "major" languages, the answer is yes. These languages have extensive written traditions. For thousands of small and endangered languages, the answer is no. These are languages with rich intellectual and literary traditions — in oral form. Such languages may have some (recently-acquired) knowledge of writing, but they have no tradition of writing. This presence or lack of native-speaker-authored published references is the distinction that matters for the Wiktionary community, at least in reference to inclusion criteria.

Why is the Subcategory LWT Needed?

LWTs, by definition, lack a body of published sources authored by native speakers. As a result, it is not possible to use published sources to attest to Wiktionary entries for LWTs. Nevertheless, LWTs are important members of the family of human languages, with rich literary and intellectual traditions, and they deserve to be included in Wiktionary. In fact, these LWTs are typically endangered languages spoken by language communities that are most in need of the permanent, globally accessible, open source, cultural commons platform that only Wiktionary can provide. Therefore, it is proposed that the Wiktionary community define this limited category of languages (LWTs) and agree upon attestation criteria that are sensible and appropriate for such languages.

Can LWTs Meet Current Attestation Standards?

Current Wiktionary attestation standards (see Criteria for inclusion) call for verification either through widespread use (hard to verify for a language without publications) or "use in permanently recorded media, conveying meaning, in at least three independent instances spanning at least a year (different requirements apply for certain languages)." For spoken languages that are living [but not well documented on the Internet], only one use or mention is adequate, subject to the following requirements:
  • the community of editors for that language should maintain a list of materials deemed appropriate as the only sources for entries based on a single mention,
  • each entry should have its source(s) listed on the entry or citation page, and
  • a box explaining that a low number of citations were used should be included on the entry page (such as by using the LDL template).
Assuming that the first bulleted requirement above refers to a list of materials that are permanently available online, probably most LWTs cannot meet this requirement. For example, in the case of the Wauja language, spoken as a first language by 400 people in the Amazonian rainforest, there are hundreds of audio recordings, and several dozen carefully transcribed traditional stories, but none of them currently are available online. (Though they could be made available to Wiktionary admins upon request.)
Before these stories are posted online, the community must agree that they are correctly transcribed. That's because they were first recorded and transcribed several decades ago by an anthropologist (myself, in this case), at a time before any Wauja were able to read and write. Today, there is a cadre of young university-educated Wauja bilingual schoolteachers who are deeply committed to standardizing their orthography and documenting their language. However, this process takes time, because it is not decided by fiat. Instead, the Wauja, like many communities that speak LWTs, take time to reach decisions through building consensus. It's a chicken-and-egg situation. Without a standard orthography, it's hard to build a dictionary, but without a dictionary, it's hard to standardize the orthography.

Proposed Attestation Standards for LWTs

To allow responsible documentation to proceed within Wiktionary while members of LWT communities increasingly move toward standard orthography, publications by native speakers, and full compliance with Wiktionary LDL attestation standards, the following interim attestation standards for LWTs are proposed:
  • The community of editors for that language should maintain a list of materials deemed appropriate as the only currently existing sources for entries.
  • These sources may include audio or video recordings of native speakers, and transcripts of such recordings.
  • Sources also may include direct quotes from letters and written messages produced by literate native speakers, provided that the quoted material is archived online and annotated as described below.
  • All sources must include mention of the date of the recording or transcription, names of the native speakers recorded, the location of the recording, the name of the person making the recording, and location where the source is archived, if not online.
  • Once the transcript has been authorized by the language community as a faithful transcription, the names of community members involved in verifying the transcription also must be noted, and a copy must be posted to a permanent online location, such as Wikisource.
  • If Wiktionary admins find any reason to doubt the authenticity of the sources cited, they shall be allowed to examine the source material.
The overall goal of attestation standards for LWTs is to ensure responsible and reliable attestation for LWT entries, while making Wiktionary the best platform for documenting the world's many LWTs.
Clarification re: Sources for Attestation (Text vs. Audio and Video)
Based on comments below, it appears that the "Proposed Attestation Standards" listed above need clarification. My intention was to propose uploading all TEXT (written) sources to a permanent online location. This could be Wikisource or another location, such as an endangered language digital archive. However, I cannot propose uploading the actual AUDIO and VIDEO recordings to a Creative Commons site, because some language communities might not want the actual voice and video recordings of their elders in the public domain. For instance, in the case of the Wauja community (an indigenous people of Central Brazil), it would be offensive to publicly play recordings of elders after they have died, particularly since the community would have no say in how often or under what circumstances the recordings would be played. No such restriction is attached to mere text transcriptions, however, which Wauja elders consider to lack the spiritual power of the human voice. Most types of written texts could be posted in a freely accessible permanent online location.
Fortunately, however, the recordings could still be used for attestation. The Criteria for inclusion states:
"Where possible, it is better to cite sources that are likely to remain easily accessible over time, so that someone referring to Wiktionary years from now is likely to be able to find the original source. As Wiktionary is an online dictionary, this naturally favors media such as Usenet groups, which are durably archived by Google. Print media such as books and magazines will also do, particularly if their contents are indexed online. Other recorded media such as audio and video are also acceptable, provided they are of verifiable origin and are durably archived." (emphasis added)
If audio and video recordings used for attestation are deposited at a digital archive for endangered languages (for example: ELAR, the Smithsonian Institution, the Library of Congress), then "someone referring to Wiktionary years from now is likely to be able to find the original source" and, at the same time, the wishes of the language community will be honored regarding the respectful and appropriate use of their recorded material.
In summary, I propose that TEXT source materials (such as PDFs of transcriptions and translations) be posted on Wikisource or another location, such as an endangered language digital archive, but that AUDIO and VIDEO recordings of actual human beings be archived in a publicly-accessible digital archive that is equipped to honor specific intellectual property rights and privacy concerns of the endangered-language community in question. (Examples of suitable archives: ELAR, the Smithsonian Institution, the Library of Congress, and so on.) Emi-Ireland (talk) 22:10, 2 February 2015 (UTC)

Honoring the "No Original Research" Principle

For a language with a written tradition, it is appropriate to refer to published sources written in that language. However, for a language that consists of an exclusively oral tradition, it is appropriate to refer to authoritative oral sources that have been recorded and transcribed. To ensure that the "no original research" principle is honored, transcriptions of traditional stories, historical narratives, public oratory, and sacred incantations performed by elders before an audience can be given priority as sources, since these linguistic sources are particularly authoritative and reliable for LWTs.
Clarification: Faithful transcriptions from audio or video sources are NOT considered original research on Wikipedia
I searched for a Wiktionary policy statement on No Original Research, but have not found it yet. In the meantime, here is the Wikipedia policy statement on transcriptions from audio and video sources:
"Translations and transcriptions: Faithfully translating sourced material into English, or transcribing spoken words from audio or video sources, is not considered original research. For information on how to handle sources that require translation, see Wikipedia:Verifiability#Non-English sources."
https://en.wikipedia.org/wiki/Wikipedia:No_original_research#Translations_and_transcriptions Emi-Ireland (talk) 19:05, 3 February 2015 (UTC)

Proposed Standard for Transitioning from LWTs to LDLs

When a language has a sufficient body of publications (authored by native speakers) so that every word in the language can be referenced to a published work authored by native speakers, that language is no longer an LWT.
In practical terms, there is no hard and fast cut-off point, but perhaps we can say that once an LWT community has achieved a minimum threshold of 3,000 entries in Wiktionary, the community will have become aware of the importance of lexicography and its methods, and it will have benefited greatly from using Wiktionary to document, analyze, and teach literacy in their language. The language community will have had an opportunity to standardize their orthography, properly review transcriptions of older recordings of traditional oral literature, have native speakers produce new publications based on new recordings, and permanently archive online all such transcripts and publications. As a result, this language community will be considered capable of meeting LDL attestation standards going forward.

Emi-Ireland (talk) 19:35, 1 February 2015 (UTC)

Symbol support vote.svg Broadly support. This whole idea needs much more detail yet, but it seems clear that attestation standards for languages that have an existing literature, just one that is not well documented on the Internet, will have to be different from languages that have never had a written tradition.
It is also unclear to me if any considerable community of editors with an LWT as their heritage language (whether as a mother tongue, passive knowledge, or something in-between) even exists on the English Wiktionary yet. Even several larger minority languages out there, with relatively long-running historical traditions, have hardly any editors with more than elementary skills (e.g. Nahuatl, Navajo, Northern Sami, Xhosa). The situation might be different on other Wiktionaries, though, and e.g. I would not be surprised if a hypothetical Wauja wiktionarian community ended up preferring the Portuguese Wiktionary.
Also, as far as I know, "materials deemed appropriate as the only sources for entries based on a single mention" is not a priori limited to material permanently available online. This could well include sources such as linguistic publications, depending on the language in question.
Some other issues to consider:
  • If no consensus orthography exists yet, how are we to title any word entries? In terms of a pronunciation?
  • Would the entries qualify for the main namespace at all, or should a new dedicated namespace such as Unwritten: be established?
  • What about unwritten extinct languages? (I have been preparing a proposal with respect to some extinct languages, but for now I think I will instead watch this discussion unfold.)
--Tropylium (talk) 00:16, 2 February 2015 (UTC)
Re Tropylium's question: "If no consensus orthography exists yet, how are we to title any word entries?"
Tropylium, this is an excellent question. Perhaps we can consider the case of the Wauja, as an example. Currently, the Wauja themselves agree on the spellings for many words, but there is a vowel that missionary linguists spell one way, using a character not found on standard keyboards, and some young Wauja schoolteachers want to spell it another way, using the standard Latin alphabet. The community will have to sort that out, and it may not be decided overnight. (Certainly English spelling was not standardized overnight). In the meantime, the spelling of Wauja words in Wiktionary may occasionally need to be corrected.
A more thorny issue is where to place the breaks between words. This is where various Wauja authors most often differ from one another. Wauja is an agglutinating language. For example, verbs can use multiple suffixes simultaneously. Some authors write them all as one word, and others might break off the last suffix or two and write them as separate words. It is possible that decisions to break up long Wauja words may result from notions that a word looks "too long" when compared to Portuguese words. My view is that both approaches are entirely valid, and that the community will have to decide which it chooses to use as the standard.
In the meantime, this wonderful language is endangered, and so it is essential to continue with the process of documenting it. Documentation is valuable not only for its own sake, but because it sends a strong message to young Wauja that the outside world values their language. In fact, the community is very excited that this summer they will be trained in how to participate in building a digital lexicon on the Wauja-Portuguese site. It is entirely possible, Tropylium, that you are correct, and that the Wauja will see the Wauja-Portuguese site as "their" dictionary. However, the Wauja also see themselves as global citizens, and are delighted and proud that a dictionary is being created that translates their language into English. Currently, some young Wauja learn snatches of English from popular song lyrics they encounter online. A Wauja-English Wiktionary will be welcomed not only by scholars and the general public, but by the Wauja themselves. Emi-Ireland (talk) 01:19, 2 February 2015 (UTC)
  • Oppose. No mechanism of independent verification proposed. The attesting recordings are proposed to be uploaded directly onto Wikimedia servers as per "Proposed Attestation Standards for LWTs" above. The section 'Honoring the "No Original Research" Principle' above seems to be contradictory; this does look like original research, especially in that the attesting material itself is original research (we do original research in that we are figuring out definitions from attesting quotations, but that's a different game, I think). Thus, this seems like something for Wikiversity. --Dan Polansky (talk) 18:48, 2 February 2015 (UTC)
Re: No Original Research rule
Please note that, per Wikipedia Policy, transcribing spoken words from audio or video sources is not considered original research. If Wiktionary has a policy on transcriptions and the No Original Research rule that contradicts this, I have not been able to find it. See: https://en.wikipedia.org/wiki/Wikipedia:No_original_research#Translations_and_transcriptions Emi-Ireland (talk) 19:22, 3 February 2015 (UTC)
Re: Importance of including all human languages
Thank you for your thoughtful comments. Given that you do not support the proposed attestation standards, I earnestly invite you to contribute your own suggestions for attestation standards that you could support. If we put our heads together, surely we can devise attestation standards that do not automatically exclude a large number of human languages, simply because they have an oral tradition, and not a written one.
The thing we must not lose sight of is that languages without a written tradition should not automatically be excluded from Wiktionary. That would be grossly unfair to the speakers of those languages, and it would be a sad day for Wiktionary, as well. We must find a way to include all human languages in Wiktionary, while taking every reasonable measure to ensure that the work is done as it should be.
I am a newcomer to Wiktionary, and so I assume your knowledge of suitable attestation standards is greater than mine. Can we work together to come up with a standard that does not exclude LWTs (languages that do not happen to have a written tradition)? Surely there are many ways we can address this problem. The important thing is to refrain from treating LWTs as if their languages don't belong here.
This is what inspired me to contribute to Wiktionary:
"Wiktionary ... aims to describe all words of all languages using definitions and descriptions in English."
We should live up to that, as well as to our attestation standards. Let's find a way to do both.
Emi-Ireland (talk) 20:10, 2 February 2015 (UTC)

Add category for terms with IPA pronunciation

I propose to add a category for terms with IPA pronunciation, by language. The edit is here; I reverted it one second after making it, in order to demonstrate the method before asking for consensus. --kc_kennylau (talk) 12:34, 3 February 2015 (UTC)

To what end? I think it would be more helpful to have a category for terms without IPA pronunciation, so we know what needs to be added. —Aɴɢʀ (talk) 20:37, 3 February 2015 (UTC)
We already have Category:English terms with audio links though. —CodeCat 20:42, 3 February 2015 (UTC)
@Angr: Well, it is virtually impossible to have a category for terms without IPA pronunciation, and using category scanning tools can actually identify those terms without IPA pronunciation if there is a category for terms with IPA pronunciation. --kc_kennylau (talk) 11:10, 4 February 2015 (UTC)
Sure, why not? We are blessed to be equipped with categories, and they should be exploited. --Type56op9 (talk) 11:15, 4 February 2015 (UTC)
Done. --kc_kennylau (talk) 12:42, 6 February 2015 (UTC)
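
The category-scanning argument made above amounts to a set difference: take the members of a language's lemma category and subtract the members of its "terms with IPA pronunciation" category, and what remains are the terms still lacking IPA. Here is a minimal sketch in Python, assuming both member lists have already been exported by some tool. The function name and the sample titles are hypothetical and do not represent any actual scanning tool.

```python
def terms_missing_ipa(all_terms, terms_with_ipa):
    """Return, in sorted order, the titles in all_terms that are not in terms_with_ipa."""
    # A set difference keeps only the titles absent from the "with IPA" category.
    return sorted(set(all_terms) - set(terms_with_ipa))


# Hypothetical sample data standing in for two exported category listings.
all_english_lemmas = ["beer", "parlour", "tilde", "vote"]
english_terms_with_ipa = ["beer", "vote"]

missing = terms_missing_ipa(all_english_lemmas, english_terms_with_ipa)
print(missing)
```

This is why a positive "with IPA" category is enough on its own: the negative list can be derived on demand and never has to be maintained as a category.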

Languages - are they proper nouns or not?

I have had a long discussion with User:-sche (User talk:-sche#Maori) (can't get it right) about whether languages are proper nouns. In my opinion they are mass nouns instead. It was suggested by -sche that a discussion be started here on the subject. Donnanz (talk) 22:52, 3 February 2015 (UTC)

IFYPFY - -sche (discuss)
Thanks for that! Donnanz (talk) 23:38, 3 February 2015 (UTC)
I see someone already made the point about pluralisation ("various Englishes"); however, our proper-noun template doesn't preclude the possibility of a plural. Some people seem to like to add plurals for given names and surnames. I will admit that I find the common/proper distinction very confusing. Equinox 23:44, 3 February 2015 (UTC)
We should make languages common nouns, also demonyms (nationalities, ethnicities), e.g. German (person), even if they are capitalised in English and some other languages. Nominalised adjectives, like English, Chinese, etc. shouldn't have plural forms in standard English, it's easy to address. --Anatoli T. (обсудить/вклад) 23:47, 3 February 2015 (UTC)
@Atitarev: WTF? Why? This is English Wiktionary. Why in the world should we subjugate our syntactic tradition to those of other languages? That there are lots of secondary uses of proper nouns as common nouns or mass nouns is immaterial.
"Nominalised adjectives, like English, Chinese, etc. shouldn't have plural forms in standard English, it's easy to address." Are you saying that you don't like Englishes and that you disapprove of those who use the word? You should take it up with the authors in these Google Books hits for Englishes. DCDuring TALK 00:27, 4 February 2015 (UTC)
You misunderstand me but I haven't expressed myself well. My point is "English" (noun) and (proper noun) sections should be merged into common noun and a note about "Englishes" should be added, as it is normally uncountable and is pluralised only for some senses. --Anatoli T. (обсудить/вклад) 00:40, 4 February 2015 (UTC)
Funnily enough, the top match (World Englishes: A Resource Book for Students) is a course text at my university. Equinox 00:41, 4 February 2015 (UTC)
Perhaps English is not the best example but "Chineses" or "Vietnameses" sounds pejorative. Anyway, no need to be picky about what I said, let's focus on PoS discussion - common nouns vs proper nouns. --Anatoli T. (обсудить/вклад) 00:48, 4 February 2015 (UTC)
OK. Please provide some reason why you think the merger would be a good idea for Wiktionary users. DCDuring TALK 00:59, 4 February 2015 (UTC)
There was a similar discussion, I think started by User:CodeCat, about eliminating proper nouns; there was some reasoning there. Not sure where that discussion is now. While I think it's a good idea, the first step is perhaps deciding which candidates should be the first to be reduced to common nouns. This will reduce the number of entries, remove duplication in translations, and mean less maintenance. Days of the week (e.g. Saturday) and month names (e.g. November) are also capitalised but they are common nouns. I think language names and demonyms are also common nouns but there are various opinions on this. Let's see what other people think. --Anatoli T. (обсудить/вклад) 01:18, 4 February 2015 (UTC)
The discussion I meant - Wiktionary:Beer_parlour/2014/October#On_proper_nouns --Anatoli T. (обсудить/вклад) 01:24, 4 February 2015 (UTC)
There seems to be a general consensus in that discussion that languages are not proper nouns. However I wouldn't go as far as recommending implementing CodeCat's suggestion that the categories for proper nouns and (common) nouns be merged. Names like English and French are also surnames, so even if languages are treated as common nouns there would still be a need for a proper noun in those cases. Donnanz (talk) 10:36, 4 February 2015 (UTC)
What consensus? Donnanz and Atitarev, non-native speakers of English? It would be like Equinox and I agreeing that Russian entries should not be in Cyrillic. DCDuring TALK 11:16, 4 February 2015 (UTC)
Perhaps you'd like to clarify that, I am a native speaker of English by the way. No decision was reached in that discussion, but reading that thread (even between the lines) I got the impression that there was a general consensus for treatment of languages as common nouns. Donnanz (talk) 11:32, 4 February 2015 (UTC)
In Spanish and French they are treated as common nouns (and are uncountable). --Type56op9 (talk) 11:13, 4 February 2015 (UTC)
In French, the plural may be used in some cases (e.g. Français parlés et français enseignés, a book by Juliette Delahaie, 2010). It's the same as English, except that they are not capitalized. Lmaltier (talk) 21:19, 4 February 2015 (UTC)
In English we dispense with diacritical marks. So let's clean them out of Spanish and French entries. DCDuring TALK 11:20, 4 February 2015 (UTC)
The Scandinavian languages also treat languages as common nouns, and no capital letter is used. Donnanz (talk) 11:32, 4 February 2015 (UTC)
My inclination is to continue treating language names as proper nouns. I'm having a hard time finding authoritative advice on the matter, however. Most of the reference works I've found (via Google Books), both those from a hundred years ago and those from last year, conflate properness-vs-commonness with capitalization. One outright says "Capitalize proper nouns and words derived from them; do not capitalize common nouns", which is obviously inaccurate; tell it to the Marines, the Americans and the Englishmen.
Alfred Marshall Hitchcock's 1910 Junior English Book says North, South, East and West are proper nouns; spring, summer, autumn, fall and winter are common nouns; arithmetic, science, geography and other branches of study are common nouns; and English, French, German, Latin and other names of languages are proper nouns. However, it then goes on to discuss how people don't capitalize the names of familiar animals, but are sometimes tempted to capitalize the names of unfamiliar animals, which makes me question if it, too, is equating properness-vs-commonness with capitalization.
Perhaps most promisingly, International English Usage (2005, ISBN 1134964714) discusses not only "proper nouns (and the names of languages are proper nouns)" and common nouns (its examples are fruit and spider) but also concrete (coin) vs abstract (jealousy) nouns. Maybe someone can find better references — but I'm told CGEL is silent on the matter. - -sche (discuss) 09:04, 5 February 2015 (UTC)
I finally found a source that is explicit on the subject: The Oxford Guide to Practical Lexicography, Atkins and Rundell (2008). In the course of discussing the groups of proper names that a dictionary might include depending on how important the class is to the target market, they have some lists: 'place name', 'personal names', and 'other names'. Under other names are the following subclasses: 'festivals, ceremonies', 'organizations', 'languages', 'trademarks', 'beliefs and religions', and 'miscellaneous'.
To this explicit characterization should be added that all language names are capitalized in English and that they refer to unique things, though those things may be subdivided, especially in technical discussion. The non-existence of a plural of a capitalized noun might be a sufficient condition to indicate that the noun is a proper noun, but the existence of a plural is not sufficient to indicate that it is a common noun. DCDuring TALK 04:59, 8 February 2015 (UTC)
Note that what you explain applies to language names in English. I think no reader will object when finding language names in French described as common nouns, which reflects better how they are considered by French-speaking people. Lmaltier (talk) 18:47, 8 February 2015 (UTC)
They are common nouns in English too, except here of course. Donnanz (talk) 19:15, 8 February 2015 (UTC)
In the absence of a meaningful way to define the alleged distinction between common nouns and proper nouns (and I mean anywhere in the world, not just on Wiktionary), the question is moot. —Aɴɢʀ (talk) 20:38, 8 February 2015 (UTC)
And will remain moot, because there is a general idea of the definition (and this is more or less the same meaning in all languages), but details about how it's applied are ruled by tradition only, and this tradition depends on languages (and is not always clear). Lmaltier (talk) 20:50, 8 February 2015 (UTC)
@Donnanz: Could you produce some evidence from references or something that asserts that language names are common nouns in English? DCDuring TALK 21:47, 8 February 2015 (UTC)
Look for "mass noun" in orange (not the best colour), otherwise you may miss it.
http://www.oxforddictionaries.com/definition/english/Bokm%C3%A5l and http://www.oxforddictionaries.com/definition/english/English. Donnanz (talk) 22:10, 8 February 2015 (UTC)
You assume that a mass noun must necessarily be a common noun. But you have acknowledged that trademarks are proper nouns. Providing specific counterexamples to the assumption is left as an exercise to the reader. DCDuring TALK 23:44, 8 February 2015 (UTC)
Trademarks are a different kettle of fish from languages. They start off as proper nouns, but can gravitate into common nouns and can even become verbs (e.g. google and hoover). See also Marmite, Mercedes and Bentley; I think the Bentley car should be a proper noun, not a common noun. Editors can quite easily get their knickers in a twist over trademarks; Oxford lists Marmite as both a mass noun and a trademark. But there shouldn't be any confusion with languages. Donnanz (talk) 09:57, 9 February 2015 (UTC)
There is no confusion. Languages are proper nouns because they are names of singular entities. That they can be used as plurals, used as mass nouns, and used attributively is barely interesting as other proper nouns can too, though the relative frequency might differ by type of noun. Perhaps it would be easier to swallow if you viewed it as metonymy: "There were two IBMs in a refrigerated room."; "I own too much IBM."; "It's an IBM computer." DCDuring TALK 15:28, 9 February 2015 (UTC)

Anagrams - do they serve a purpose?

While I'm at it, I may as well ask whether the inclusion of anagrams in Wiktionary actually serves a purpose - are they useful, or just a fun thing? I'm not sure whether this has been discussed before. Donnanz (talk) 22:59, 3 February 2015 (UTC)

Yeah, someone will come along and object to the anagrams fairly regularly. Points brought up in the past include (i) they are useful for word games such as Scrabble; (ii) they are a genuine provable "function" of a word, whereas e.g. spelling-bee trivia is not. There also tends to be general interest in words with unusual properties, such as palindromes, very long words, and words with unusual combinations of letters (such as our Q-without-U category); anagrams are that sort of thing. Equinox 23:08, 3 February 2015 (UTC)
I see. I have actually changed one or two where the anagram happened to be a synonym or variant spelling, but this doesn't happen very often. Donnanz (talk) 23:21, 3 February 2015 (UTC)
What did you change? From the point of view of the Scrabble player, SPECTER and SPECTRE might as well be totally different words: the point is which one of them is better strategy (e.g. perhaps you don't want the E on a square that gives more possibilities to your opponent). Anagrams are anagrams; I don't think we should edit them based on semantics. Equinox 23:46, 3 February 2015 (UTC)
I'm afraid I can't remember now, but it was only two at the most. I'll bear that in mind in future. Donnanz (talk) 00:03, 4 February 2015 (UTC)

I give zero phucks about games. They do not add any lexicographical content, so we should not include it. Even paronyms and folk etymologies (I have read somewhere that we do not include them) are more interesting than anagrams. --Dixtosa (talk) 15:07, 5 February 2015 (UTC)

I agree, anagrams take space and are useless. Let's remove them. --Vahag (talk) 15:15, 5 February 2015 (UTC)
Out of interest, do Dixtosa and Vahag also favour removal of the palindrome and Q-without-U categories? Equinox 16:51, 5 February 2015 (UTC)
No, because categories do not take up space on the page. I am concerned about the layout of our entries. The fewer sections we have, the better. --Vahag (talk) 16:54, 5 February 2015 (UTC)
Agreed. BTW, I can argue that QwoU category can have lexicographical value (as I see it, it is gonna contain exceptional words, because every occasion when Q is not followed by U is an exception).--Dixtosa (talk) 19:10, 6 February 2015 (UTC)

I personally find anagrams (and also rhymes) pretty useful for solving and compiling cryptic crosswords. The sections are automatically maintained by bots, so I don't see any real reason to object to them. Smurrayinchester (talk) 15:28, 5 February 2015 (UTC)

  • I don't use the anagrams sections for anything myself, but they do no harm and don't duplicate information available (or even potentially available) in any other Wikimedia project, so I'd be opposed to trashing them. —Aɴɢʀ (talk) 15:41, 5 February 2015 (UTC)
  • Ah, we're getting some mixed reactions. Keep 'em coming. Donnanz (talk) 16:02, 5 February 2015 (UTC)
  • At least anagrams can be automatically generated (e.g. this tool for fr). Rhymes can also be generated automatically as well. Although both depend on the exhaustivity of information. Dakdada (talk) 16:08, 5 February 2015 (UTC)
  • A simple way to reduce the space taken by anagrams is to list them horizontally instead of vertically. DCDuring TALK 18:12, 5 February 2015 (UTC)
Not a bad idea, if there's quite a few of them. Donnanz (talk) 18:56, 6 February 2015 (UTC)
It is not exactly about how many lines they take, but rather the fact that nonlexicographical content does not deserve place in articles' pages. Besides, I am sure no1 is able to prove that listing rearrangements rather than, for example, subsets is more plausible. --Dixtosa (talk) 19:09, 6 February 2015 (UTC)
If we can accommodate such content at low cost and low intrusiveness, it might serve to get a few more contributors. I would expect that word-puzzle and word-game fans constitute a significant share of users and contributors. Anything that makes contributing fun is worth consideration. DCDuring TALK 20:23, 6 February 2015 (UTC)
Looking at earnt, putting anagrams on one line has already been happening. Donnanz (talk) 23:52, 6 February 2015 (UTC)
I've been doing it for a while, but not systematically. DCDuring TALK 02:25, 7 February 2015 (UTC)
Apparently, Conrad.bot, which also inserted anagrams, used to do it. DCDuring TALK 02:29, 7 February 2015 (UTC)

Why stating that they do not add any lexicographical content? Usually, they are not included in dictionary entries, true, but it would be possible to include them, and we do it, this makes them lexicographical content. Some dictionaries are dedicated to anagrams (see w:Anagram dictionary). The important question is their usefulness, and I think they are useful. Lmaltier (talk) 18:55, 8 February 2015 (UTC)

Anagrams are mathematical. They reduce a word to mathematical properties ignoring meaning, pronunciation, etymology. Literally everything apart from what letters the word uses and if any other words use the same letters. 11:56, 7 March 2015 (UTC)
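
The point made above, that an anagram depends only on which letters a word uses, can be sketched in a few lines. This is a minimal illustration, not the method used by any Wiktionary bot: the function names `anagram_key` and `group_anagrams` and the sample word list are assumptions made up for the example. Each word is reduced to its sorted letters, and words sharing that signature are anagrams of one another.

```python
from collections import defaultdict


def anagram_key(word):
    """Return the word's letters, lowercased and sorted, as one string."""
    return "".join(sorted(word.lower()))


def group_anagrams(words):
    """Map each letter signature to the sorted list of words that share it."""
    groups = defaultdict(set)
    for word in words:
        groups[anagram_key(word)].add(word)
    # Keep only signatures shared by at least two distinct words.
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


# Hypothetical sample list; a real run would use the full set of entry titles.
words = ["earnt", "antre", "specter", "spectre", "respect", "beer"]
anagrams = group_anagrams(words)
```

Meaning, pronunciation and etymology play no part in the grouping, so under this sketch variant spellings such as specter and spectre are listed as anagrams of each other like any other pair.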

Admin vote

This is to inform you that I've decided to nominate myself for adminship again. The reason is because I want to work with lots of templates, in order to make lots of cleanup pages. This way I won't have to keep bugging other admins to make changes. BTW, the page is at Wiktionary:Votes/sy-2015-02/User:Type56op9 for admin. It would be fun to hear your opinions. --Type56op9 (talk) 11:33, 4 February 2015 (UTC)

I oughta nominate myself for the mop as well. If this guy's gonna get it, and Kephir still has it, why can't I? Purplebackpack89 15:08, 4 February 2015 (UTC)
Hey, I gotta great idea. Instead of blocking vandals, why don't we make them sysops instead? SemperBlotto (talk) 21:07, 4 February 2015 (UTC)
Yeah! And instead of applying CFI and policy, why not ignore them instead? Oh wait we've done that one. Equinox 13:27, 5 February 2015 (UTC)
We've already made vandals sysops too, and even allowed them to remain sysops after their vandalism has come to light. —Aɴɢʀ (talk) 15:43, 5 February 2015 (UTC)
If I knew what a sysop is, maybe I could have a laugh (?). Donnanz (talk) 15:50, 5 February 2015 (UTC)
Did you consider consulting an online dictionary - hint sysop SemperBlotto (talk) 15:58, 5 February 2015 (UTC)
Er, no, I thought it was Wiktionary jargon. Thanks. Donnanz (talk) 16:04, 5 February 2015 (UTC)

Category for double modals

Would it be useful to have a category for double modals (might can), like we have one for double contractions? We don't even have a category for single modals at the moment. - -sche (discuss) 20:58, 6 February 2015 (UTC)

As an English-specific category, maybe. —CodeCat 21:26, 6 February 2015 (UTC)
Certainly. (For those reading this thread who don't speak other languages: German and many other languages are also able to stack modals, but they're not defective so they're not remarkable.) Do you think it would be useful to also have categories for modal verbs in general? I notice we do already have Category:German modal verbs, but it's empty. - -sche (discuss) 22:15, 6 February 2015 (UTC)
I'm surprised that we don't have something as fundamental as modals categorized. If we had that would we need one for double modals? That is, wouldn't it be clear by inspection of the base category which were double modals. DCDuring TALK 15:21, 8 February 2015 (UTC)
We do have Category:English auxiliary verbs, which does not seem to be complete. Even English modals may be too sparse to be a good category. Perhaps an Appendix? DCDuring TALK 17:37, 8 February 2015 (UTC)

Is Nostratic allowed in etymologies?

When I removed Nostratic material from a PIE etymology section, Ivan Štambuk reverted me; evidently he has some belief in it, but scholarly opinion is strongly opposed to it. I think we ought to avoid having something so far from the linguistic mainstream treated as credible in PIE entries, and if necessary I would create a vote about it. Does the community support its inclusion? —Μετάknowledgediscuss/deeds 07:53, 7 February 2015 (UTC)

What's "Nostratic material" stand for, exactly? A sourced claim to the effect "*wódr̥ may be akin to *wete" would be sensible enough — there are several "Nostratic" comparisons of this sort that are both credible and well-established, and the main dispute is if they involve inheritance or some kind of loaning. But anything along the lines of "from Proto-Nostratic *wede" (privileging a disputed explanation; not to mention that no two Nostraticists agree on a reconstruction) or "also compare XX in Southern Oromo, YY in Old Kannada, ZZ in Evenki" (or other utterly dubious comparisons) should obviously be nuked on sight.
Looking up the appendix in question (*h₁er-), it seems we have some from column A, some from column B here. At least the Semitic root is well-reconstructed and should be OK to mention. I don't know if there's much point in discussing the alleged cognates in other Afrasian branches and Dravidian, if we don't even have the relevant Proto-Chadic or Proto-Dravidian etymology pages up yet. (The former could well be mentioned in the Proto-Semitic entry, of course.) --Tropylium (talk) 09:22, 7 February 2015 (UTC)
(Also, mutatis mutandis, I would suggest the same for claims of Altaic cognates, in case there's any work going on with that.)
A separate appendix of "Nostratic roots" for documenting the various proposals out there would be OK for me, but it should not be cross-linked from the main etymological appendices. --Tropylium (talk) 09:01, 7 February 2015 (UTC)
Why shouldn't it be linked? How else are people going to find out the connections? --Ivan Štambuk (talk) 20:55, 8 February 2015 (UTC)
What I'm against is creating any Nostratic appendices in the same mold as established protolangs, i.e. each page is an entry for a single proto-root, which lists its descendants, and each descendant is linked back specifically as "from Proto-Nostratic *ʔer-". The research just isn't far enough for that to be a sensible approach. There is no coherent consensus reconstruction of "Proto-Nostratic" that could be treated as a language according to Wiktionary's standards.
It seems doable to instead have pages that catalogue the different overlapping proposals in a particular semanto-phonetic area. Let's say we've a PIE root that has been compared to three different Semitic roots by Bomhard, Dolgopolsky and Illich-Svitych respectively; we can neither pool all of those into a single root, nor should we try to enforce an executive decree on which proposal is the closest to being correct. Instead a new kind of an article layout entirely seems to be required.
Moreover, note that this same problem also comes up within several established language families. There is no standard reconstruction of Proto-Afro-Asiatic, Proto-Niger-Congo, Proto-Sino-Tibetan, etc. So if we'd need some less mechanical way of formatting etymology appendices dealing with these families anyway, it stands to reason that the same approach, and not the proto-language approach, should be applied to Nostratic as well.--Tropylium (talk) 15:29, 13 February 2015 (UTC)
What do you mean by "research is not far enough" ? What standards does Wiktionary have when we allow original research in etymologies? I'm all for creating and establishing standards, but they should be applied consistently.
That type of layout should also be used for all protolangs, since they vary widely depending on the author/school, even established ones. The problem is the software which requires one spelling to be the main entry, and others redirecting to it. It's often a political question as well. But it's not that much of a priority IMHO - the priority is to collect information, and the formatting/presentation issue can always be solved later.
I think that there should definitely be a way in a PIE or PS reconstruction to indicate "there is a Nostratic root that has been connected with this reconstruction, and you can find more information about it here". It's absurd to have Nostratic roots listed somewhere without linking back to them. Perhaps some kind of a floating box would suffice? --Ivan Štambuk (talk) 01:12, 26 February 2015 (UTC)
The research is not far enough in that there is no such thing as accepted Nostratic soundlaws, or an accepted perimeter of Nostratic, that could possibly guide our work. Within any relatively young and well-studied group (on the order of Germanic, Slavic, Finnic, etc.) it is usually simple enough to check whether a particular proto-form, even if not explicitly sourced, is what the alleged descendants suggest. Admittedly I have not paid attention to what kind of OR we might have around exactly though; if editors are establishing etymological connections or devising new soundlaws all on their own, and they have some kind of a policy support for this, I'd argue that that's rather worrisome, yes. But my understanding has been that Wiktionary doesn't "allow OR" as much as "has been lenient in tolerating OR".
Your comment that some type of less formulaic entry layout should be used for better-established protolanguages as well is intriguing. It would indeed probably work for many roots in bottom-level languages like PIE and PU on which there remain many open questions. On the other hand, aside from notational fine-tuning, there is also widespread agreement on the reconstruction of words/roots like *gʷṓws or *kala, and chucking the regular entry layout entirely doesn't seem necessary (even if individual sub-headings may require different treatment). And again, closer to historically recorded languages, proto-words like this are probably the majority. --Tropylium (talk) 20:49, 27 February 2015 (UTC)
My opinion on Nostratic is the same as my opinion on Altaic. Someone who digs all the way down to the Appendix page of a PIE root is probably into etymologies, and might be interested in theories that go even deeper, like Nostratic; the key is that they need to be clearly labelled so no-one is misled either to think that the theories are reliable, or that Wiktionary is believing them. "As part of the controversial Nostratic hypothesis, Smith connects this word to foo." And like Altaic, Nostratic should be limited to appendices (linked to from other places using {{etyl}}, the same way we link to any proto-language appendices).
If the wording were strengthened just a bit ("Within the controversial Nostratic framework"; note the added word and added wikilink to more info), the text at Appendix:Proto-Indo-European/h₁er- would be fine, IMO, though like Tropylium I would find it preferable if someone created a page for the Proto-Dravidian root and moved the individual proposed-cognates there. I wouldn't require someone to create pages for roots before mentioning cognates, for pretty much the same reasons as I outline in my comment here that begins "people may feel comfortable noting..."
- -sche (discuss) 23:05, 7 February 2015 (UTC)
You're presuming that there is a Proto-Dravidian root, but no such thing has been established here. Nostraticists are not a reliable source on whether a given word in a language is inherited. It's entirely possible that Dravidian specialists instead etymologize those Telugu and Kannada words by some kind of derivation, loaning, semantic shift, etc. E.g. if we can trust the StarLing people to have correctly encoded Burrow & Emeneau's Dravidian Etymological Dictionary, Kannada ere 'black soil' is not connected to the Telugu words, and indeed completely isolated within Dravidian. I for one would first ask if an etymology from the homonymous ere 'dark color' is possible, before reaching all the way to Nostratic. --Tropylium (talk) 15:57, 13 February 2015 (UTC)
Nostraticists don't make up the reconstructions for protolanguages that they compare. If you look at e.g. the last edition of Bomhard's dictionary, it has thousands of citations throughout, and the list of references alone is 300 pages. Burrow's dictionary has been available at the DSAL website for a decade now so it's easy to double check that: [1] - it's indeed connected. --Ivan Štambuk (talk) 00:57, 26 February 2015 (UTC)
OK, fair enough with the Dravidian words then. Although we may note that the original DED provides no reconstruction, and explicitly states it is an arrangement of data for later etymological study, not a Proto-Dravidian rootlist. I am also somewhat skeptical about using this kind of positivist resource, but in the absence of any clear arguments against the comparison, e.g. if there are no well-vetted reconstructions of PD out there yet, I'll accept it.
And no, I am not saying that Nostraticists mostly pull their reconstructions from up their sleeve! But sometimes they do, typically when attempting to project isolated words backwards (a la taking a word previously reconstructed only for Proto-Indo-Iranian and asserting a PIE root behind it), and we should source our lower-level proto-term reconstructions from "local" specialists in the first place. Whether preferentially or exclusively is a different debate though. --Tropylium (talk) 20:49, 27 February 2015 (UTC)
Your opinion and acceptance doesn't really matter in the grand scheme of things. The fact of the matter is that we have a credible authority in the field making the connection, and that's enough. Opinions of editors are irrelevant, other than assessing the credibility of the sources themselves. We are just minions collecting knowledge and our personal prejudices or affinities shouldn't get in the way of that. The relevant question is 1) Is information worthy of adding in terms of relevance 2) Is the source credible.
It's funny that you mention that projecting backwards thing - it's very common for "established" protolangs. Since all of the interesting stuff was done a century ago, researchers today are stuck with positing fanciful theories of protolang prehistory and making reconstructions on the flimsiest evidence. If you look at LIV and EIEC every other reconstruction has a question mark. Methodologically it's of course wrong, but there is always a possibility that a genuine PIE root was preserved only in one branch.
It would be nice to have a PII form instead, but unfortunately PII has not yet been adequately reconstructed in two centuries of scholarship. IEists seem to like to take shortcuts instead, skipping the middle step. Why blame Nostraticists for doing the same thing?
Note that I don't necessarily disagree with criticism, but if you apply the same scrutiny the entry *h₁er- itself should be deleted. It's at least as far-fetched as the Nostratic etymology thereof. --Ivan Štambuk (talk) 00:24, 28 February 2015 (UTC)
If we're going for the "just follow the credible sources" angle, I should hope for you to remember distinguishing between "a number of Dravidian words have been considered probably related to each other" and "the mentioned words come from a unique Proto-Dravidian root".
I would agree, yes, that *h₁er- is not exactly the most convincing proto-root out there; but it does have one major selling point over any given Nostratic root, namely being reconstructed for a proto-language that we at least know to have existed. --Tropylium (talk) 01:59, 28 February 2015 (UTC)
I agree with including Nostratic on -sche's terms. --Vahag (talk) 09:41, 8 February 2015 (UTC)
  • Nostratic, Altaic and other long-range etymologies have vibrant scholarly communities who publish books and peer-reviewed papers on them. There are even journals exclusively dedicated to them. The opposition usually comes from linguists who oppose it in principle, and are not against any long-range theory per se. In any case, it's not up to us to decide whether it's worthy of inclusion or not on the basis of whether the majority of linguists believe those theories to be true or not - the only thing that matters is the notability of the theories themselves. It seems to me that you're rather worried that poor readers would be mistakenly guided into believing that Nostratic is on the same level of credibility as PIE. Which is on one hand kind of ironic because the PIE reconstruction *h₁er- is itself dubious, just like two thirds of the entire PIE lexicon. In any case, should Nostratic be marked with some kind of extra-safe version of {{reconstructed}}, it would be fair that the same kind of scrutiny be applied to original research reconstructions done by CodeCat & co. --Ivan Štambuk (talk) 20:52, 8 February 2015 (UTC)

"It is important to pronounce it with a long á, otherwise it will sound like…"

This kind of a disclaimer seems to have been added to several Hungarian words that are vowel-length minimal pairs, mostly by User:Panda10. E.g. kén, kérés, kint, mély, méz, vágy, vét. Some cases also warn readers about consonant length, e.g. arra, száll.

Is there a point to this? On one hand, I guess this is semi-useful for English speakers prone to ignoring diacritics; on the other, it seems arbitrary to mention just these kinds of minimal pairs, and not pairs involving e.g. s/sz. We are not a language-teaching resource, and so this does not seem to generalize into any kind of a useful policy. --Tropylium (talk) 09:38, 7 February 2015 (UTC)

There were other editors who complained about this before, so apparently it is not useful. Feel free to delete them when you see them. I will do the same. --Panda10 (talk) 12:31, 7 February 2015 (UTC)

Trivia sections in entries

WT:ELE says "Other sections with other trivia and observations may be added, either under the heading “Trivia” or some other suitably explanatory heading. Because of the unlimited range of possibilities, no formatting details can be provided." However, in practice, we haven't accepted ad-hoc section headings or random encyclopaedic factoids in years, and we've done away with ==Trivia== sections, too. (There were 22, out of our 3 950 000 entries, in the last dump, containing stuff like this.) I suggest removing the clause. - -sche (discuss) 06:07, 8 February 2015 (UTC)

On the one hand, spelling bee trivia is pretty clearly outside what belongs in a dictionary, as is stuff like this and this (which was falsely marked as a "Usage note"). On the other hand, all the other trivia seems to be along the lines of assessees#Trivia, 鬱#Trivia and scrootched#Trivia. If not exactly dictionary material, it is still at least information that pertains to the word itself. 死ぬ#Trivia in particular contains useful and interesting information, and I think we should certainly note somewhere that 死ぬ is the only ぬ verb in modern Japanese. What I think we need is just something a bit more structured than "You can add anything you like under any title you like in any format that you like". Smurrayinchester (talk) 12:07, 8 February 2015 (UTC)
I think they'd be more palatable if they weren't called "Trivia". We already have a "Usage notes" section; what about making an "Orthographic notes" section for information such as is provided in the sections linked to above? —Aɴɢʀ (talk) 13:06, 8 February 2015 (UTC)
That sounds a bit too formal and academic for what is essentially word games. Equinox 14:18, 8 February 2015 (UTC)
In my experience, things like the note at 死ぬ generally get shoehorned into the Usage notes section. And that's fine, in my opinion. I'd be hesitant to create an "Orthographic notes" section for just a tiny handful of entries. OTOH, perhaps we could move anagrams under that header, enclosed in a template similar to {{homophones}} (which would reduce how much space they take up), and then the section wouldn't be so useless/little-used? Alternatively, perhaps we could just have a ====Notes==== section, perhaps even replacing ====Usage notes====? But my preferred solution is to shoehorn the dozen-or-so useful Trivia sections under Usage notes. I mean, it's not wrong to call a note that 死ぬ is the only verb a "usage" note... - -sche (discuss) 18:42, 8 February 2015 (UTC)
  • I support a ===Notes=== header in ELE to replace both this trivia business and ===Usage notes===. —Μετάknowledgediscuss/deeds 20:50, 8 February 2015 (UTC)
    This is the fr.wikt current practice (a Notes header). Lmaltier (talk) 20:53, 8 February 2015 (UTC)
I don't like "Notes" — too vague. It's like having an "Information" section. The whole entry is notes, or information, of various kinds. Equinox 21:01, 8 February 2015 (UTC)
I don't mind "Trivia", but it shows condescension. MW Online and some other dictionaries accommodate word games in their entries. Even in taxonomic names folks play word games (e.g. Iouea, Aa, Zyzzyzus).
How about "Miscellany" or a right-floating box placed so that it does not rise far above the ruled lines at the bottom of entries. DCDuring TALK 23:06, 8 February 2015 (UTC)

On inflections of extinct languages

Wiktionary has an interesting policy of only including Old Irish verb forms that are actually attested. Why is this? Sometimes I've wondered if a similar policy is appropriate for Ancient Greek as well, which has never seemed to have well-defined conjugations. Thoughts? ObsequiousNewt (ἔβαζα|ἐτλέλεσα)

I wouldn't call it a policy so much as my personal decision which no one has objected to. I made that choice for Old Irish because Old Irish verb forms are notoriously unpredictable. It is very hard, often impossible, to say what a given form of a given Old Irish verb will be unless it's attested. (Students of Old Irish are often left with the impression that all verbs in that language are irregular; that's an exaggeration, but only a small one.) For this reason, I thought it best if we don't even try to predict them, but merely list the attested forms. Ancient Greek, on the other hand, has comparatively well-behaved verbs: if you know the stem and the ending, you can glue them together to make the verb form. Even if that form is unattested, you can be quite certain that the predicted form is correct. Also, the Ancient Greek corpus is orders of magnitude larger than the Old Irish corpus, making it much more difficult to find out what is and isn't attested. —Aɴɢʀ (talk) 16:41, 10 February 2015 (UTC)

Wiktionary Culture

Some time ago I was told by email that my farewell message in the Beer Parlour had been greatly annotated and that there was a lot of support for what I was doing as a Wikipedia editor. At first I was inclined to ignore it but the lure of the Wiktionary's potential was such that I did look the Parlour item over. The experience was not encouraging as my position seems not to have been understood.
  My resolve to discontinue editing Wikipedia was very definitely not because of what was done to my contributions. Rather it was because of how it was done, that is, because of the despicable rudeness of not even telling me what was being done but leaving me to discover it.
  As a child of the Great Depression I was brought up to abhor rudeness: to avoid doing it and to shun people who do it. This was taught in the home and at both Sunday school and public school, and rudeness was severely punished in the latter, by strapping in the case of boys. Unfortunately the modern educational dogma of building self-esteem seems rather to build selfishness and thus encourage rudeness in many (see here).
  So it's not surprising to me that I should be subjected to rudeness in the Wikipedia quasicommunity, but the lack of surprise doesn't make it less abhorrent to me. In a way it makes it more abhorrent because it is accompanied by a great sadness.
  I know that modern culture is not my culture, but culture makes the person and is not easy to change, especially when you don't want to change it. When as a Wikipedia editor I would often come across modern sexual senses, with informal or transient attestation if any, I realised that this meant that the dominant culture of the Wikipedia editors was modern. Though I disagreed with the inclusion of such senses I left them alone. To undo them would have been rude, and with the predominant editorial culture I felt that raising such an issue in, say, the Beer Parlour would be a waste of time.
  Incidentally, I'm not WF, as was suggested in the Beer Parlour. But I must confess that I got my early Wiktionary coding skills from being an editor under another name. However I left the Wiktionary alone for quite some time after being very rudely lambasted and only returned when a growing enthusiasm for its potential overcame my distaste for the editorial culture.
  This message does not signify a return beyond a couple of items I will put below to convey my personal hopes for the Wiktionary. I will not log back in again and I will not read any emails coming from the Wiktionary editorial community.—ReidAA (talk) 03:45, 10 February 2015 (UTC)

Right, we'll stop posting "sexual" senses that you dislike, and await our strappings. Oh wait, my mistake, I meant to say "good riddance". There's nothing viler than somebody polluting a space with a big bitching rant topped off with the rotten cherry of "I won't be coming back though". If you're gone, don't post. Equinox 00:23, 12 February 2015 (UTC)
I can think of a great many viler things, even if I limit myself to behavior on online forums, and it's unclear to me if the above exaggeration is supposed to accomplish anything other than embellish how your disdain for this ex-editor runs deep indeed. --Tropylium (talk) 14:37, 13 February 2015 (UTC)
I found ReidAA's editing style pretty rude: he generally did not respond to user talk interactions in action, and went ahead as he saw fit regardless of disagreement. For an instance not involving me as the main actor, when Widsith asked him to stop switching or get consensus in October 2014, he continued regardless of the conversation. From this and my interactions with this user and from seeing his long-term pattern of behavior, I learned that he will need to be dealt with directly in the mainspace. And there is no riddance: User:Smuconlaw; last contribution: 15 February 2015. I find it pretty insolent to whine about how one is leaving the project and then go on editing under another user. I think I saw one more user used by the same person, but I cannot find it now. --Dan Polansky (talk) 20:31, 15 February 2015 (UTC) Let me strike out what is an inappropriate speculation, based on insufficient evidence; there is even some evidence to the contrary. --Dan Polansky (talk) 20:42, 15 February 2015 (UTC)
Er ... not sure what this is all about but I am not ReidAA. Smuconlaw (talk) 22:56, 15 February 2015 (UTC)

Suggestions for Examples

The following suggestions describe an approach to adding examples to the Wiktionary. The motivation for this approach is to exploit the possibilities for an online dictionary to use practically unlimited storage.
  The potential for examples lies in their benefit for learners of English, either children or non-native speakers of English.
  For such users of the Wiktionary, most of whom one would expect to not be logged on as an editor, any quotations would only be accessible by clicking on the individual quotation tags provided with each sense when quotes are available. Probably the option of seeing all quotes should only be offered to users who are logged on, though the ability to get at quotes for individual senses must be available to the learner whose curiosity about the word/sense background must be catered for, which is why quotations should be linked to sources and to online text where the context of the quote can be found (more on this).
  Ideally (I would hope eventually) every sense would have a few examples. Being primarily for learners, the examples should be short phrases or sentences, each with a distinct context within a sense.
  The user should be able to click on any example to hear it spoken. To be able to do this would be a tremendous help for learners. Maybe there should be options for learners to choose between male and female speech when a choice is available, and even for regional accents.
  If the Wiktionary becomes popular for learning to speak English, the learner could maybe choose to have their pronunciation checked and corrected by it.
  Another good option would be to have sign language used to back up the example. This might not be of all that much use to deaf people, but there is a school of thought that sign language should be taught to all students in their early education both for the mental benefits of being bilingual in this fashion but also because it is a distinct channel of communication to be used where speech is ineffectual, for example in noisy rooms or across long distances.
  In traditional dictionaries, like 1913 Webster's, very brief quotes have been used as examples. This is because of the limitations of the printed page and bound volumes. There is some benefit and interest in using such quotes as examples, particularly for obsolete/archaic senses where an archaic pronunciation would be appropriate, but such examples should be stripped of all their context, except for the date, and an improved version of the example provided as a quotation.—ReidAA (talk) 04:05, 10 February 2015 (UTC)

Suggestions for Quotations

The following suggestions describe an approach to adding quotations to the Wiktionary. The motivation for this approach is to exploit the possibilities for an online dictionary to use practically unlimited storage and to link to a rich and rapidly increasing source of related online data.
  Quotations (hereinafter "quotes") are in general of particular interest to two kinds of users.
  Primarily they are of interest to experienced readers and writers wishing to discover more about a word or a phrase, or one of its senses, in particular about its early use and its more recent use.
  Secondarily they are of interest to avid readers for whom the quote may spark an interest in a quote's author or source or context.
  For neither of these kinds of Wiktionary users would an abundance of quotes be appropriate for any sense upon initial presentation. Rather a maximum of three or four should be presented directly and these should be spread over a variety of dates and contexts. Should there be more available then all should be stored separately and linked to as a store for that sense's quotes.
  For neither of these kinds of Wiktionary users would very brief quotes be available for any sense; that's what examples are for.
  For both of these kinds of Wiktionary users at least three links should be provided as well as a date: to the author(s), to the work, and to an online source of the work. The Wikipedia will often provide the first two links and Wikisource or Gutenberg or Google books the third.
  It's very important that the works quoted from should be formally published and that the links used should be reliably very persistent.
 The following suggestions describe an approach to adding quotes, at least while they remain relatively scarce. Although the quote will most often be for a word, it should be remembered that the Wiktionary also contains phrases, though not very thoroughly, and quotes for these should be added where there is a quote gap.
  1. Choose a book to read in hard copy whose text is available online and preferably whose author is not yet quoted in the Wiktionary.
  2. In reading the book make a note of any interesting word and its location, and check (then or later) that it is consistent with the online version. The online version will sometimes need editing, or might be of a different edition to the one you are reading.
  3. Prepare an RQ template (with documentation) for the book and enter it into the (incomplete) Wikipedia table of quoted works.
  4. Prepare a skeleton of the code to be used for adding the quote for the first word you have noted for use in the Wiktionary, using a text editor so that you can easily copy and paste the entry into the Wiktionary. The skeleton will be in two parts, the first using the RQ template filled out for the word of your choice, the second holding the chosen word and the surrounding text, say forty or fifty words, which can be copied easily from the online text. Do not highlight the occurrence(s) of your chosen word.
  5. Paste a copy of your code quoting the chosen word into the appropriate sense in the Wiktionary and highlight the word wherever it occurs.
  6. Now go through the quote word by word and check whether there is a quote gap for each word's sense. There are very very many such gaps, even for very common word senses. If you find a gap, fill it in the same manner that your chosen word was used.
  7. The skeleton code can be modified for each of your chosen words so that the procedure above can be repeated for them.
  Note that the benefit of using an RQ template for a book is not just to simplify the adding of multiple quotes from a single source, but also to allow all quotes from a single source to be upgraded, for instance when a better online source becomes available, simply by upgrading the template.
  Another avenue for quote improvement in the Wiktionary is to focus on one of the sources used as a combination of example and quote in the original 1913 Webster's Dictionary. Often these are simply given with only a short author or source name which is explained here.
  One way to improve one of these is on the one hand to simplify it as an example, with no source and only giving a date if the sense is archaic or obsolete; and on the other hand to expand and link it. Many of the sources for such quotes are already supplied with an RQ template (see [2]). — ReidAA (talk) 04:06, 10 February 2015 (UTC)

A great tool to have for quotations is some kind of app, where at the click of a button, one can add quotations to a corresponding WT entry, coming in from Wikisource, Google Books, or another compatible medium. I'd pay good money for that! --Type56op9 (talk) 15:40, 10 February 2015 (UTC)
Ask, and ye shall receive. Smurrayinchester (talk) 15:53, 10 February 2015 (UTC)
Wow, that is an awesome gadget. It should be linked from Wiktionary:Quotations! --Type56op9 (talk) 11:46, 12 February 2015 (UTC)

Cool gadgets

After hearing recently about WT:QQ, a cool quotations gadget, I found myself wondering if we had some other cool gadgets that maybe some users don't know about. So, in hope of some civility, I think it would be appreciated if some other users mentioned here some cool Wiktionary gadgets that may not be known to all the communities, as a way of helping each other out. --Type56op9 (talk) 11:08, 15 February 2015 (UTC)

  • I'll start: WT:ACCEL is a nice gadget to quickly and semi-automatically create forms of words (plurals, conjugations, feminine forms etc.) in various languages. --Type56op9 (talk) 11:10, 15 February 2015 (UTC)
I am already working on that. here--Dixtosa (talk) 11:18, 15 February 2015 (UTC)

Let's categorize semantic loans


I think it is interesting.--Dixtosa (talk) 15:31, 15 February 2015 (UTC)

I have seen "semantic calque" being used synonymously with "semantic loan". We can assume semantic loans are a subtype of calques and include them in Category:Calques by language. --Vahag (talk) 17:22, 15 February 2015 (UTC)
There are also "phono-semantic" loanwords (sometimes adding new, funny senses), such as 馬殺雞马杀鸡. --Anatoli T. (обсудить/вклад) 00:45, 5 March 2015 (UTC)

WT:WE length

It was my impression based on this discussion that we were supposed to be slowly shortening the list on WT:WE. But given this diff, that is becoming very difficult. I like helping out with it, but it is slowly filling up with words I'm not able to add or that are so obscure that I can't find anything about them. May I request some aid shortening the list or removing unattestable entries? JohnC5 22:44, 18 February 2015 (UTC)

I've moved some apparent Translinguals out of there and -sche removed some blue links. You could move some of the non-English items that you don't know to the various WT:RE:lang pages, eg WT:RE:he. DCDuring TALK 03:33, 19 February 2015 (UTC)

Pitjantjatjara case marking

In Pitjantjatjara, the ergative case is indicated by the ending -ngku: watingku yuu palyaṉu / man-ERG windbreak make-PAST / The man made a windbreak. Straightforward enough; indeed, I created an experimental entry at watingku. However, case marking in this language strikes me as rather odd: the case ending only attaches to the last word of a noun phrase: wati ninti tjuṯangku yuu palyaṉu / man wise many-ERG windbreak make-PAST / The wise men made a windbreak.

Could -ngku be considered a clitic? Is it appropriate to create entries of the type watingku, given this situation? Given that a lot of nouns (wati being a prime example) are frequently found in their inflected form about as often as not, I worry that we would be doing our readers a disservice by not creating entries for these forms. This, that and the other (talk) 12:32, 20 February 2015 (UTC)
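
As a rough illustration of the pattern described in this thread, the following Python sketch attaches the ergative ending only to the last word of a noun phrase. It assumes the single ending -ngku taken from the two examples in the post; the function name `mark_ergative` is hypothetical, and any other forms the ending may take in Pitjantjatjara are not modelled.

```python
# Ending taken from the two examples in the post above; nothing else is modelled.
ERGATIVE = "ngku"


def mark_ergative(noun_phrase: str) -> str:
    """Attach the ergative ending to the final word of a noun phrase only."""
    words = noun_phrase.split()
    if not words:
        return ""  # nothing to mark in an empty phrase
    words[-1] += ERGATIVE
    return " ".join(words)


print(mark_ergative("wati"))              # watingku
print(mark_ergative("wati ninti tjuṯa"))  # wati ninti tjuṯangku
```

The sketch makes the entry question concrete: a form like watingku only arises when wati happens to be the last word of its phrase, so the ending behaves as if it belonged to the whole phrase rather than to the noun.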

New constructed languages

Why can't we include all constructed languages in Wiktionary, including Idiom Neutral?

Or my own constructed language, Sintelsk, at least somewhere in Wiktionary's appendix? My constructed language, which I'm writing about at my own Wiktionary that I created and run for fun: http://wikitoslav.monathevampirewiki.org/wiki/Wikitoslav:Frumpsida . It is a constructed language, mainly based on Danish, that has a precise pronunciation system, and looks more straightforward than most Germanic languages.

If you're interested in considering my advice about my constructed language at least, you may find these categories interesting: English lexicon, Danish lexicon, Spanish lexicon, French lexicon, Lexicon for Sintelsk itself, which includes definitions in Sintelsk and translations of its words into other languages, just like all other Wiktionaries. I built up my Wiktionary to make categories by using lots of templates. Also, my longest page on the wiki is on, which is a Sintelsk word meaning one, a, or an.

Also, let's consider definitely including Idiom Neutral into Wiktionary. NativeCat drop by and say Hi! 06:04, 21 February 2015 (UTC)

Minor Constructed languages can be included in the appendix namespace (for example Appendix:Sindarin). — Ungoliant (falai) 16:27, 21 February 2015 (UTC)
...within certain parameters. Most importantly, if the language is copyrighted (which many constructed languages are), we can't include too much of it or we're violating copyright; see Wiktionary:Beer parlour/2014/July#Inclusion_of_Dothraki. Secondly, if the language has no community of users, it's doubtful whether or not it should be included; many people have opposed including minor constructed languages that have no users. Lastly, if you made the language up yourself, it's a bunch of protologisms not suitable for inclusion except in your userspace. (As long as you're also making useful edits to Wiktionary, and as long as no copyright issues arise, people shouldn't complain about it if you put it in your userspace. Note that if you made the language up yourself, and copyrighted it, but you then publish it on Wiktionary, then per the disclaimer at the bottom of every edit window, "you irrevocably agree to release your contribution under the CC-BY-SA 3.0 License and the GFDL", which you might or might not want to do.) Disclaimer: I am not a lawyer, but we have lawyers here, and they weighed in on the discussion I linked to above. - -sche (discuss) 17:55, 21 February 2015 (UTC)

Manual adding of audio files

Hello. User:DerbethBot/February 2015 contains a list of audio files (and matching Wiktionary entries) that my bot was unable to add automatically - in most cases due to multiple etymologies (human needs to decide where an audio file belongs). Currently there are 675 audio files that can be immediately used to enrich entries in 30 languages. If you want to help, please check the page and remove entries that are done. --Derbeth talk 18:24, 21 February 2015 (UTC)
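
The reason given above for leaving these files to humans can be sketched as a simple check. This is only an assumption about how such a test might look, not DerbethBot's actual code: it counts level-3 "Etymology" headings in an entry's wikitext and treats anything with more than one as needing a human decision. The function names and the sample wikitext are hypothetical.

```python
import re

# Matches headings such as "===Etymology===" or "===Etymology 2===".
ETYMOLOGY_HEADING = re.compile(r"^===\s*Etymology(?: \d+)?\s*===\s*$", re.MULTILINE)


def count_etymologies(wikitext: str) -> int:
    """Count the level-3 Etymology headings in a page's wikitext."""
    return len(ETYMOLOGY_HEADING.findall(wikitext))


def can_add_automatically(wikitext: str) -> bool:
    """With several etymologies, a human must decide where the audio belongs."""
    return count_etymologies(wikitext) <= 1


single = "==English==\n===Etymology===\nFrom ...\n===Noun===\n"
multiple = "==English==\n===Etymology 1===\n...\n===Etymology 2===\n...\n"

print(can_add_automatically(single))
print(can_add_automatically(multiple))
```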

Subcategory for 'nyms?

Would it be acceptable to set up a class of subcategory of "Category:<language> names" for ethnonyms (endonyms, etc.)? The names category has the boilerplate description "<language> terms that are used to refer to specific individuals or groups," but it seems this category generally just has 2 subcats, for given names and surnames - no problem there but wondering if the intent of the names categories was to accommodate wider purpose. Or whether an ethnonyms category would go elsewhere. Or if this topic has previously been considered and ruled out. Aside from thinking this would be of general cross-language interest, I'm personally interested in possibly adding various Chinese ethnonyms, using a category to facilitate access to them as a sub-lexicon. I'm no expert on Chinese, but have noted with interest variant forms in hanzi for African ethnic groups. TIA for any feedback.--A12n (talk) 16:39, 23 February 2015 (UTC)

Simplification of topic categories adding

As the creator of {{zh-cat}}, I propose to generalize this template to {{cat}} and use it to add the topic categories. The syntax would be {{cat|en|CATEGORY_ONE|CATEGORY_TWO}}. I do not believe that there would be technical problems in creating the template, so I am only putting this here for consensus and discussion. --kc_kennylau (talk) 09:12, 24 February 2015 (UTC)

See {{catlangcode}}. Chuck Entz (talk) 13:25, 24 February 2015 (UTC)
@Chuck Entz: Oh, then I propose the automation of it and the name changing :) --kc_kennylau (talk) 16:28, 24 February 2015 (UTC)
Re automation (if I understand correctly that you mean a bot to go through and change existing cats to the new template): What would happen then if one wanted to keep specific categories with associated etymologies or word senses within a section for a particular language?--A12n (talk) 18:02, 26 February 2015 (UTC)
Support the simplification. I can also see benefits for languages requiring a non-default sorting order. Module:zh-cat does it already for Chinese - sorting by radicals. Ideally, Japanese would use a similar approach to sort by hiragana. That way, entries won't require code like this: [[Category:ja:Mammals|しし]] or [[Category:ja:Mammals|ひいばあ']] for kanji or katakana entries, e.g. in 獅子 or ビーバー. --Anatoli T. (обсудить/вклад) 00:41, 5 March 2015 (UTC)
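
To make the proposal in this thread concrete, here is a minimal Python sketch of what a call such as {{cat|en|CATEGORY_ONE|CATEGORY_TWO}} might expand to. It is an assumption, not the actual template or module: the function `expand_cat` and its `sort_key` parameter are hypothetical, and the output format is modelled on the [[Category:ja:Mammals|しし]] example quoted above.

```python
def expand_cat(lang, *topics, sort_key=None):
    """Expand a hypothetical {{cat|lang|A|B}} call into category wikitext."""
    links = []
    for topic in topics:
        target = f"Category:{lang}:{topic}"
        if sort_key:
            # A sort key follows the pipe, as in the manual code quoted above.
            links.append(f"[[{target}|{sort_key}]]")
        else:
            links.append(f"[[{target}]]")
    return "".join(links)


print(expand_cat("en", "CATEGORY_ONE", "CATEGORY_TWO"))
print(expand_cat("ja", "Mammals", sort_key="しし"))
```

In a real template the sort key would presumably be computed per language rather than passed in by hand, which is the benefit described for Chinese and hoped for Japanese.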

Category:Historical terms by language vs. Category:Terms with historical senses by language

There seems to be a category scheme renaming in progress here that has not been quite completed. Apparently the latter is where most things go these days. Is there any particular reason the former is still kept around as well, just for three Chinese, three English, two Spanish, one French and one Latvian term? Or is it just waiting for deletion once the articles have been edited to be in the latter category branch instead?

The only previous discussion I can find on this is a brief exchange from June 2011: English terms with obsolete senses, etc. --Tropylium (talk) 19:39, 27 February 2015 (UTC)

See also Wiktionary:Requests for moves, mergers and splits/Unresolved requests/2012#Category:English_terms_with_obsolete_senses. - -sche (discuss) 03:43, 28 February 2015 (UTC)

March 2015

Templatizing topical categories in the mainspace

FYI: Wiktionary:Votes/2015-03/Templatizing topical categories in the mainspace.

Let us postpone the vote as much as discussion needs.

This thread seems related: Wiktionary:Beer_parlour/2015/February#Simplification of topic categories adding. --Dan Polansky (talk) 21:32, 1 March 2015 (UTC)

How is this even close to being ready for a vote?

[Global proposal] m.Wiktionary.org: (all) Edit pages


Hi, this message is to let you know that, on domains like en.m.wikipedia.org, unregistered users cannot edit. At the Wikimedia Forum, where global configuration changes are normally discussed, a few dozen users propose to restore normal editing permissions on all mobile sites. Please read and comment!

Thanks and sorry for writing in English, Nemo 22:32, 1 March 2015 (UTC)

Thanks for the news. We forgive you for speaking in English. --Type56op9 (talk) 14:44, 5 March 2015 (UTC)

Sports logos in images

Happened to notice both woman and American have sponsorship logos clearly visible in the image thumbnails. If we need to illustrate these concepts, can we find images which aren't as corporatish? Pengo (talk) 07:16, 2 March 2015 (UTC)

We should also extirpate all national flags, political slogans, references to NGOs, religions, etc. not essential to the ostensive definitions the images provide. DCDuring TALK 12:32, 2 March 2015 (UTC)
Logos I'll grant that getting rid of a corporate logo for a generic concept like "woman" is probably a good idea but an American flag behind an American on the entry for "American" doesn't seem like a problem to me. In this case, the image contains the word "Toyota", which is the problem, not American symbols. —Justin (koavf)TCM 14:12, 2 March 2015 (UTC)
I agree that the American flag in [[American]] is OK. I've switched the entry's image to one which is similar in every way except that it lacks the Toyota logo. - -sche (discuss) 17:34, 2 March 2015 (UTC)


The documentation for {{l-self}} claims it does not support tr=, but a simple test reveals this is not the case. The question is then: should it? ObsequiousNewt (ἔβαζα|ἐτλέλεσα) 14:29, 3 March 2015 (UTC)

In principle there's no reason why it couldn't. —CodeCat 19:55, 4 March 2015 (UTC)
But are there any languages that use transliteration within inflection tables? ObsequiousNewt (ἔβαζα|ἐτλέλεσα) 20:06, 4 March 2015 (UTC)
Yes. And this template isn't used only in inflection tables. It's used for any template that includes links to the same language. And the underlying logic which omits links to the current page is also used by {{head}} for the inflections: {{head|en|noun|plural|fish}} on fish will not generate a link for the form. —CodeCat 20:19, 4 March 2015 (UTC)
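
The self-link behaviour described in this reply can be sketched in a few lines of Python. This is only an illustration of the idea, not the logic of {{l-self}} or {{head}}: the function `link_unless_self` is hypothetical, and details such as language sections, transliteration and how the unlinked form is styled are left out.

```python
def link_unless_self(current_page: str, term: str) -> str:
    """Return a wikilink to term, or plain text if term is the current page."""
    if term == current_page:
        return term  # no link to the page the reader is already on
    return f"[[{term}]]"


# On the page "fish", the plural "fish" stays unlinked; "fishes" is linked.
print(link_unless_self("fish", "fish"))
print(link_unless_self("fish", "fishes"))
```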

Parameter for Template:head to indicate that a form is missing

Several templates across a variety of languages have custom-written code to show a message like "missing" or "please provide" if one of the forms in the headword line is lacking. For missing genders, we already have a standard approach that {{head}} understands, which is to use "?" as the gender. I'd like to do the same for headword-line forms, so that the following will automatically generate a message and categorise the entry appropriately: {{head|en|noun|plural|?}}. Of course, templates written to use Module:headword or {{head}} can then use this themselves.

Of course, the downside is that you can't link to the entry ? in the headword line anymore, which is probably not normally going to be a problem, but there may be a few edge cases where it turns up. So an alternative way would be to include an extra parameter to indicate that a request should be included in case of a missing form. Something like this: {{head|en|noun|plural|f1request=1}} or perhaps the shorter {{head|en|noun|plural|f1req=1}}. This would then fit into the same fN... format that many of {{head}}'s parameters already use.

I don't expect there will be much opposition to this, but I'd like to ask anyway just in case. If you have a preference for one of the two proposed approaches, please indicate this. —CodeCat 19:54, 4 March 2015 (UTC)

The first one looks much better. Is there (or will there be) any edge case to start with? I don't think there would be any. --Z 08:13, 18 March 2015 (UTC)
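
The two approaches proposed in this thread can be compared in a small Python sketch. Everything here is assumed for illustration: the function `render_form`, the message text "please provide" (borrowed from the wording in the proposal) and the category name are hypothetical and do not describe Module:headword.

```python
# Hypothetical category name, used only for this illustration.
REQUEST_CATEGORY = "Entries with missing forms"


def render_form(label, form=None, request=False):
    """Return (display_text, categories) for one headword-line form.

    Approach 1: the form "?" marks a missing form.
    Approach 2: no form is given, but request=True (like f1req=1).
    """
    if form == "?" or (form is None and request):
        return f"{label}: please provide", [REQUEST_CATEGORY]
    if form is None:
        return "", []
    return f"{label} [[{form}]]", []


print(render_form("plural", "?"))           # approach 1
print(render_form("plural", request=True))  # approach 2
print(render_form("plural", "fishes"))      # ordinary form
```

Under the first approach a form that is literally "?" can no longer be linked, which is the edge case raised in the proposal; the second approach avoids that at the cost of an extra parameter.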

Min Nan loanwords

How should Min Nan loanwords from Japanese be written when they don't have any kanji/Chinese characters? Min Nan is usually written in Chinese characters or in POJ. Should they be written in Pe̍h-ōe-jī, or should they be written in hiragana/katakana? For example, the Taiwanese Min Nan word for ice cream is "ai55 sirh3 khu33 lin51 mu11" according to 臺灣閩南語常用詞辭典. Currently, I have written it as アイスクリーム (ai55 sirh3 khu33 lin51 mu11) in the translation box under ice cream. The problem with loanwords is that they don't follow tone sandhi and may not even have one of the 7 tones of Min Nan, which is problematic for POJ. Any ideas for this situation? Justinrleung (talk) 03:56, 5 March 2015 (UTC)

Min Nan terms should be written as they would be by Min Nan speakers. Unless Min Nan speakers use katakana to write the terms, we shouldn't. If we know the Japanese terms that are borrowed, those should be linked to in the etymologies for the Min Nan entries, but not be used in the names of those entries. Beyond that, I would refrain from meddling with a language I don't know. Chuck Entz (talk) 04:16, 5 March 2015 (UTC)
We should probably use the attestability for translations, just like for entries. I doubt "アイスクリーム" (Japanese for ice cream) can be attested to be Min Nan or any Chinese topolect, besides, it's a borrowing (ultimately) from English, so "ai55 sirh3 khu33 lin51 mu11" is a Min Nan pronunciation of "ice cream". Min Nan (Hokkien) is mostly a spoken dialect. If a written form is missing, then it shouldn't be added. As an example, Armenians use a lot of Russian words in speech but those terms lack a written form (ask User:Vahagn_Petrosyan). There are many other cases with diglossia or when a language/dialect lacks a well-developed written tradition.
The other issue is non-standard transliteration, as in tempura, see Min Nan translations 天麩羅 (thian35 pu55 lah3). As Justinrleung explained, it's not a standard tone sandhi but the source is only one online dictionary. --Anatoli T. (обсудить/вклад) 04:24, 5 March 2015 (UTC)
Are there any Min Nan speakers who can give any suggestions to this problem? Justinrleung (talk) 20:14, 7 March 2015 (UTC)
We have currently no native Min Nan speakers. The term may be derived from Japanese but katakana is not used to write Min Nan. It would be hard to attest both the Japanese spelling "アイスクリーム" and the "ai55 sirh3 khu33 lin51 mu11" since Min Nan, as I said, is mostly a spoken dialect. If it's written down, it's written in Chinese characters or Pe̍h-ōe-jī. The source above doesn't suggest the term is written in katakana in Min Nan. Here's what the dictionary says with English translations in brackets:
  • 詞目 ai55 sirh3 khu33 lin51 mu11 (dictionary item)
  • 日語假名 アイスクリ-ム (Japanese kana)
  • 日語羅馬拼音 aisukuriimu (Japanese rōmaji)
  • 釋義 冰淇淋(附錄-外來詞表) (meaning "ice cream" (appendix - table of loanwords))
From "ai55 sirh3 khu33 lin51 mu11" one can't really say that it's definitely from Japanese, not from English. I have recently added all translations of "ice cream" into Min Nan I could find in dictionaries and made アイスクリーム to be verified. Eventually, it should be deleted, since it's not verifiable as a Min Nan term. --Anatoli T. (обсудить/вклад) 21:57, 9 March 2015 (UTC)
  • Sorry for any confusion -- I wasn't making a case for アイスクリーム#Min_Nan. I agree with you that katakana, AFAIK, are only used to write Japanese. Instead, I just intended to ask if the etymology of the Min Nan term was EN > NAN, or EN > JA > NAN. ‑‑ Eiríkr Útlendi │ Tala við mig 23:47, 9 March 2015 (UTC)
  • I understood your question. It may be of Japanese origin, if it's a word in Min Nan. According to the dictionary it is. What I meant is that non-standard romanisation "ai55 sirh3 khu33 lin51 mu11" doesn't really indicate that it may be Japanese (except for "khu33"), it's very similar to how Mandarin words are transliterated using Chinese characters and phonology, note that (sī) and (mǔ) are some of the Chinese characters used in romanising loanwords with non-syllabic "s" and "m". Yes, Japanese words are or were well known in Taiwan and there are loanwords in colloquial Taiwanese Mandarin and Min Nan but this particular word may only have been used colloquially and may never have had a written form. Most words have Chinese character spellings or at least POJ. --Anatoli T. (обсудить/вклад) 00:04, 10 March 2015 (UTC)

Inspire Campaign: Improving diversity, improving content

This March, we’re organizing an Inspire Campaign to encourage and support new ideas for improving gender diversity on Wikimedia projects. Less than 20% of Wikimedia contributors are women, and many important topics are still missing in our content. We invite all Wikimedians to participate. If you have an idea that could help address this problem, please get involved today! The campaign runs until March 31.

All proposals are welcome - research projects, technical solutions, community organizing and outreach initiatives, or something completely new! Funding is available from the Wikimedia Foundation for projects that need financial support. Constructive, positive feedback on ideas is appreciated, and collaboration is encouraged - your skills and experience may help bring someone else’s project to life. Join us at the Inspire Campaign and help this project better represent the world’s knowledge! MediaWiki message delivery (talk) 19:22, 5 March 2015 (UTC)

What 20%? We don't have women on Wiktionary. Hos are not good at lexicography. --Vahag (talk) 20:31, 5 March 2015 (UTC)
Not many but we do have them. What about active ones like Hekaheka, CodeCat, Panda10, Fumiko Take (not 100% about the gender of others)? --Anatoli T. (обсудить/вклад) 22:18, 5 March 2015 (UTC)
@Vahag, despite your generalization I'll assume good faith* and direct you to read ho. Modern and Old Armenian, Russian, German, and English aren't enough to familiarize— well, even some native speakers of American English— with just how b£00d¥ ɟ∪ɔkᵻɳɢ INSULTING that word is. It has no place whatsoever in any Wikimedia project except to be discussed, never used. --Thnidu (talk) 00:43, 6 March 2015 (UTC)
* Whoops! I just fixed this link. --Thnidu (talk) 04:41, 7 March 2015 (UTC)
I think good faith can only be assumed in combination with an assumption of mind-boggling ignorance and/or stupidity. Either way: not acceptable. --Catsidhe (verba, facta) 00:51, 6 March 2015 (UTC)
Unanimi sumus, Catsidhe. Nonne clare videtur ira mea? --Thnidu (talk) 05:35, 6 March 2015 (UTC)
Not good at all. I think Vahag was just being silly. I didn't get what "hos" mean at first. --Anatoli T. (обсудить/вклад) 05:53, 6 March 2015 (UTC)
@Anatoli T. (I've switched our four-colon replies to maintain chrono order.) "Being silly" does not stretch that far. (... Боже мой, I envy your polyglottism!) Perhaps one has to live in the US or be in very close touch with its cultures to appreciate that word. Calling that "being silly" is like excusing groping a stranger's crotch as "just like a tap on the shoulder". Uh-uh. And look at the sexist remark the word is embedded in. --Thnidu (talk) 06:15, 6 March 2015 (UTC)
I've known Vahag for a long time, not personally though. He trolls from time to time and gets into trouble for that but he is not really a racist, sexist, homophobe and anti-Semite as he sometimes pretends to be with his silly jokes and comments. I think he just wants attention or create a stir. Not sure. Re: polyglottism - thanks for the praise but I am not as good with languages as you may think but I spend a lot of time on them. --Anatoli T. (обсудить/вклад) 06:35, 6 March 2015 (UTC)

North American English vs Canadian and American English

Some entries are labelled {{lb|en|North America}} and some are labelled {{lb|en|US|Canada}}, and these are categorized differently. This seems unhelpful — users have to check two categories to find all Canadian (or American) entries. Should we (a) make {{lb|en|North America}} an alias of {{lb|en|US|Canada}}, or (b) try to periodically change instances of {{lb|en|US|Canada}} to {{lb|en|North America}}?
The first option is obviously more practical, as the second would require the sort of vigilance and recurring effort that we don't always manage to muster. One might say that it's useful to have a category for words common to both the US and Canada, but the same could be said of "ambitransitive" verbs, yet we've made that label an alias of "transitive, intransitive".
- -sche (discuss) 00:00, 6 March 2015 (UTC)

I like having "North America" be an alias for the separate categories. It would be useful to periodically review definitions that were in {{lb|en|US}} and not {{lb|en|Canada}} and vice versa, but, as we have no practice of marking items as having been passed such a review, it seems to mean a lot of repeated coverage of the same issue. DCDuring TALK 03:19, 6 March 2015 (UTC)
OK, I've made the "North American" label an alias for "Canada, US". Wiktionary:Todo/North American is a list of entries which are labelled as either Canadian or American but not both. We could go through the list, removing entries as we checked them. Once all the entries were removed, we could restore the list to its original state, periodically compile new versions of the list, and compare them to that version to find out which entries were new and thus needed checking. That would hopefully avoid too much re-examination of the same entries. - -sche (discuss) 05:13, 7 March 2015 (UTC)
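The bookkeeping described above (expanding the alias, listing entries labelled for only one of the two countries, and comparing a fresh list against an earlier one) can be sketched in a few lines of Python. This is only an illustration: the mapping from entry titles to label sets, the sample data, and the names `expand_labels`, `one_sided` and `new_since` are all assumptions made for this sketch and do not correspond to any actual Wiktionary module.

```python
# Hypothetical alias table: one label that stands for several region labels.
ALIASES = {"North America": ("US", "Canada")}

def expand_labels(labels):
    """Replace aliased labels with the labels they stand for."""
    expanded = set()
    for label in labels:
        expanded.update(ALIASES.get(label, (label,)))
    return expanded

def one_sided(entries):
    """Return titles labelled US or Canada, but not both."""
    result = set()
    for title, labels in entries.items():
        regions = expand_labels(labels) & {"US", "Canada"}
        if len(regions) == 1:
            result.add(title)
    return result

def new_since(baseline, current):
    """Titles in the current list that were absent from the baseline list."""
    return sorted(set(current) - set(baseline))

# Made-up sample data, for illustration only.
entries = {
    "entry A": {"North America"},
    "entry B": {"US"},
    "entry C": {"Canada", "informal"},
    "entry D": {"US", "Canada"},
}
todo = one_sided(entries)
```

Comparing each newly compiled list against the saved original with something like `new_since` is what would keep already-checked entries from being re-examined.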

anchors for links from other Wikimedia projects

  • On occult, I've added a null-length HTML span with ID to the medical sense of the adjective, as a target for a link from Wikipedia:Occult (disambiguation)#medicine, there being no single appropriate WP page; see the Talk page there.
  • I've done similarly on several other definitions here before, generally noting the reason for the anchor. But this time it occurs to me to ask if there's any problem with my doing this.

Please message me to reply. --Thnidu (talk) 00:09, 6 March 2015 (UTC)

Ungoliant MMDCCLXIV has helpfully answered me on my talk page:
Nothing wrong with it, just use the template {{senseid}} instead of adding the html code manually.

--Thnidu (talk) 02:28, 6 March 2015 (UTC)

We generally discourage HTML, especially in principal namespace. In this case {{senseid}} is available and could be useful as a target for in-Wiktionary linking too. DCDuring TALK
Thanks, DCDuring. I'll try to go back over my contribs and templatize any HTML anchors. --Thnidu (talk) 05:41, 6 March 2015 (UTC)

Etymology: root or stem?

How should the words root and stem be used in an etymology? Are they interchangeable? E.g. "From a Proto-Ugric root *xyz-" or "from an imitative root with -asb suffix"? Google search returns more hits for "imitative root" than for "imitative stem" and 9 hits for "Proto-Ugric stem" (mostly from our Wiktionary), 7 hits for "Proto-Ugric root". It would be helpful to have a list of recommended usage. --Panda10 (talk) 18:23, 7 March 2015 (UTC)

Looking at the Lexicon of Linguistics and other references at root at OneLook Dictionary Search and stem at OneLook Dictionary Search, they probably should not be used interchangeably in a dictionary with our pretensions to technical precision. As I understand it a stem is the invariant, common part of a set of inflected forms of a word. I think it should only be used within a given language. I think root can be used to refer to something more basic than a stem within a language as well as in comparisons across language (I'm hand-waving here.). DCDuring TALK 18:58, 7 March 2015 (UTC)
I don't know about Proto-Ugric, but there is a clear distinction between root and stem in Proto-Indo-European. The root is the most basic lexical part, which has a canonical shape (one or two consonants followed by a vowel [almost always e] followed optionally by a sonorant consonant followed optionally by an obstruent consonant). A stem is in many cases a root (appearing in one of its "grades", full grade, o-grade, or zero-grade) followed by a suffix; the stem is what the endings are added to. A single root may form multiple stems, especially in verbs, which may have a present stem, perfect stem, aorist stem, etc., all formed from the same root but using different "grades" and different suffixes (or no suffix at all—some stems are identical to the roots they're formed from) and maybe other modifications like reduplication. See for example *gʷem-, a root, which forms the present stem *gʷm̥sḱé-, the aorist stem *gʷém- (which happens to be identical to the root in this case), and the perfect stem *gʷegʷóm-. —Aɴɢʀ (talk) 19:31, 7 March 2015 (UTC)
The Uralic languages (to which Ugric belongs) also have a distinction between roots and stems. There are two basic root types: (C)VCV and (C)VCCV, where the second vowel must be a, ä or e (i is also equivalent to e in non-initial syllables). So anything that does not ultimately have this structure is not a root in Uralic. The difference with PIE is that roots can be (and often are) words on their own, so we don't put a hyphen after them. If the root is a verb, we do add a hyphen. As for Ugric, I would be very cautious making reconstructions for it as there isn't actually agreement on whether Ugric even exists as a linguistic group with a definite ancestor (other than Proto-Uralic). User:Tropylium can tell you more. —CodeCat 00:32, 8 March 2015 (UTC)
The technical definition is indeed as Angr says: a root is an inanalyzable content morpheme, a stem is a root plus any possible (productive or fossilized) derivational suffixes. Some definitions may include epenthetic vowels or other morphophonological alternations as a part of a stem, but not as a part of a root; e.g. it would be possible to say that Hungarian hal (fish) has the root √hal, but in some inflected forms the stem hala-.
(The a/ä/e thing is probably not a useful criterion for Ugric, since original unstressed vowels are not distinguished in Hungarian.)
Within etymology, I'd suggest not calling proto-language items "stems", unless one is talking about proto-language morphology specifically. --Tropylium (talk) 01:11, 8 March 2015 (UTC)
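As a rough illustration of the Uralic root shapes described above, (C)VCV and (C)VCCV with a, ä or e (or i) as the second vowel, the check could be written as below. The vowel inventory, the function name `looks_like_uralic_root`, and the treatment of every segment as a single precomposed character are assumptions made for this sketch; it is not meant as a tool for validating real reconstructions, and per the comment above the final-vowel criterion is probably not useful for Ugric.

```python
import unicodedata

# Assumed vowel inventory for this sketch; reconstructions differ by author.
VOWELS = set("aäeioöuüë")
# Permitted second vowels; i is treated as equivalent to e in non-initial syllables.
FINAL_VOWELS = set("aäei")
ROOT_SHAPES = {"VCV", "CVCV", "VCCV", "CVCCV"}

def looks_like_uralic_root(form):
    """Check a reconstructed form against the (C)VCV / (C)VCCV template."""
    # Strip the reconstruction asterisk and the hyphen used for verb roots.
    form = unicodedata.normalize("NFC", form).lstrip("*").rstrip("-").lower()
    if not form or not all(ch.isalpha() for ch in form):
        return False
    # Map each segment to V (vowel) or C (anything else).
    shape = "".join("V" if ch in VOWELS else "C" for ch in form)
    return shape in ROOT_SHAPES and form[-1] in FINAL_VOWELS
```

For example, `looks_like_uralic_root("*kala")` passes the template, while a form ending in a consonant or in o does not.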

Thank you all for the helpful information. I have already started removing the words root and stem, using simply "From Proto-Ugric *xyz-" or "From Proto-Finno-Ugric *xyz". For the proto-language items, I am using two reliable references: one is Uralonet, an online Uralic etymological database of the Research Institute for Linguistics, Hungarian Academy of Sciences (take a look at kerül and its Uralonet entry); the other is a printed etymology dictionary. The challenge is to provide an accurate translation of the Hungarian text. --Panda10 (talk) 14:15, 8 March 2015 (UTC)

Women honoured in scientific names / Inspire Campaign

Estimates of the percentage of Wikipedia editors who are female range from 9% to 23%. (source) I imagine the stats on Wiktionary are similar. WMF are searching for ways to address the gender gap with their Inspire Campaign. I have little idea how to address that issue in any really useful way.

But if anyone's interested in making entries for women naturalists/biologists, etc who have been honoured in scientific names, like, for example, [[kingsleyae]], I can put together a candidate list of potentially eponymous specific epithets (e.g. the most common epithets ending in -ae which have no other declensions). Then it will be a matter of picking out the names of humans from the list (which will also include places and parasite hosts) and making entries for them. Perhaps some notable scientists who are missing Wikipedia entries could be uncovered, and so feed into efforts of Wikipedians looking for such entries to create. I might try making a test list, and if anyone's interested in adding their name to a proposal, I might write up something for IdeaLab. —Pengo (talk) 16:56, 8 March 2015 (UTC)

I take it that entries like idae#Translingual are not what you have in mind. DCDuring TALK 18:29, 8 March 2015 (UTC)
Looking through the "A"s (through "An") in my Dictionary of Scientific Bird Names, there are a fair number of women's names. Unfortunately, the yield of those who were not wives, daughters, innamoratae, patrons, mythological or historical figures, or unknown is not high, to wit, two: angelae and annae. I looked at every eponymous epithet in the range. I'm not really willing to go through the whole book with such a modest yield. DCDuring TALK 19:17, 8 March 2015 (UTC)
My impression from looking at hundreds of insect names is that people named tend to be: 1) The people who found and/or provided the type specimens, 2) colleagues (especially authors of invalid names superseded by the names published) 3) benefactors 4) friends and/or family 5) celebrities and/or historical figures 6) targets of disguised insults or other hidden messages. The earlier custom was to draw as much as possible from classical antiquity, which deteriorated into picking random names out of dictionaries as the number of new taxa outstripped the supply of meaningful figures to allude to. The sheer volume of taxa and the restriction on identical generic names or binomials has led to more and more frivolity such as puns, names from pop culture, etc.
Of the categories above, there are some really interesting people in the first category, including a surprising number of women. There are also a few surprises in the second category with some notable female scientists from a century or more ago. Chuck Entz (talk) 22:31, 8 March 2015 (UTC)
@DCDuring, Chuck Entz: — kingsleyae was actually my first find of a missing -ae named for a human, which gave me some hope. Most of the fish named for her seem to have been first discovered by her too. "idae" is kind of borderline, I guess at a minimum, finding who an entry is eponymous for is important (I'm guessing idae usually refers to an Ida of Greek mythology, though didn't find anything definite in my cursory search). Scientists was my initial focus, but there's nothing wrong with increasing the number of female historical figures, patrons, celebrities, and mythological figures too, and it's also quite possible family and innamoratae were also involved in research. —Pengo (talk) 00:15, 9 March 2015 (UTC)
@DCDuring: I don't suppose it would be any less tedious if you had "The Eponym Dictionary of Birds"? —Pengo (talk) 03:41, 9 March 2015 (UTC)
A favorite example is the whitefly genus Bemisia, described in 1914 in honor of Florence Eugenie Bemis, who was herself an expert on whiteflies. In 1904 she published a monograph on whiteflies of California in which she described 15 species new to science. I wish I could create a Wikipedia article on her, but I haven't been able to find biographical information, let alone citable references. Chuck Entz (talk) 04:08, 9 March 2015 (UTC)
What about focussing on species which were discovered by women? - -sche (discuss) 21:58, 8 March 2015 (UTC)
@-sche: Species discovered by women would be great, but I have no idea how to find or make such a list. Though it might be easier for plants. The International Plant Names Index (ipni.org) has a "forename" field for their "authors" database, so it could be possible to pick out the feminine names, e.g. Miriam Cristina Alvarez (who described Ditassa oberdanii Fontella & M.C.Alvarez, a dogbane from Espírito Santo, Brazil). Ok, so maybe I do have an idea for how to make such a list. Some of the authors in the database seem to be authors of research papers but don't appear to have any species associated with them, e.g. I.Blok (Ida Blok), which tripped me up a bit. I'm not sure where to find an International list of male/female names. I could try extracting them from Wiktionary and/or try to guess based on suffix. Maybe I should write up a grant proposal. —Pengo (talk) 00:15, 9 March 2015 (UTC)
  • Let's say we do this. Are we doing it so that we can show that we care? If so, how will anyone know what we've done? Do we need a set of women's categories to advertise what we've done? DCDuring TALK 23:07, 8 March 2015 (UTC)
  • @DCDuring: "Are we doing it so that we can show that we care?" Yep. (Also there's a tiny chance it might even encourage new editors, as these entries are fairly straightforward to create.) "If so, how will anyone know what we've done?" Write up some sort of summary on an IdeaLab item I guess. I'll have a go at creating the start of one soon. A category could help. We really ought to have one for eponymous specific epithets named for non-mythological humans or the like already. No idea if a category should be split by gender, but it's easy enough to pick the -ae's from the -i's anyway. —Pengo (talk) 01:13, 9 March 2015 (UTC)

First attempt: Here's a bunch of epithets ending in -ae, sorted by usage in books. Not sure how useful it is. —Pengo (talk) 00:15, 9 March 2015 (UTC)
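One way such a candidate list might be compiled is sketched below: keep epithets ending in -ae for which no sibling form with another ending occurs in the data, then sort by frequency. The list of "other endings", the function name `candidate_eponyms`, and the counts in the sample data are assumptions invented for this illustration; they do not describe how the list linked above was actually produced, and the output would still need the manual sorting of people from places and parasite hosts mentioned earlier.

```python
# Endings whose presence suggests an ordinary Latin paradigm rather than a
# feminine genitive eponym (an assumed and deliberately incomplete list).
OTHER_ENDINGS = ("a", "us", "um", "arum")

def candidate_eponyms(counts):
    """Return -ae epithets with no sibling forms, most frequent first."""
    candidates = []
    for epithet, n in counts.items():
        if not epithet.endswith("ae"):
            continue
        stem = epithet[:-2]
        # Skip epithets that also occur with another declensional ending.
        if any(stem + ending in counts for ending in OTHER_ENDINGS):
            continue
        candidates.append((epithet, n))
    # Sort by descending count, then alphabetically for stable output.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [epithet for epithet, _ in candidates]

# Invented usage counts, for illustration only.
sample = {
    "kingsleyae": 40,
    "annae": 120,
    "albae": 300,
    "alba": 900,
    "albus": 800,
    "sibogae": 15,
}
shortlist = candidate_eponyms(sample)
```

With the invented counts above, "albae" is dropped because "alba" and "albus" also occur, while "sibogae" survives even though it honours a ship, which shows why a human pass over the list is still needed.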

We have nearly 200 items in Category:Translingual taxonomic eponyms and I don't always remember to categorize the items there, so there could easily be fifty or a hundred more. DCDuring TALK 03:37, 9 March 2015 (UTC)
I got the total up to 660 without creating any new pages. Though only found 14 -ae pages to add (which includes a ship: sibogae). —Pengo (talk) 10:53, 9 March 2015 (UTC)

Here's the IdeaLab page, which I have created in my quixotic quest to gather more participants and interest. Please add your name or support it if you're even vaguely interested. Pengo (talk) 23:03, 11 March 2015 (UTC)

Show/hide broken

Some days the show/hide (inflections, conjugations, translations) functionality is gone and I can't view translations except by clicking "edit". What's going on? This started to happen one or two weeks ago, perhaps at the same time that "§" characters started to appear next to headings. I'm running Firefox on Linux. --LA2 (talk) 14:02, 9 March 2015 (UTC)

Even when it is broken, the content should always be viewable, so it's a double bug. It should not have anything to do with § though, since § is a new MediaWiki feature, and the "NavBars" (hide/show boxes) are created with MediaWiki:Gadget-legacy.js. If it happens again, could you check the log (Tools > Web development > Web console) to see if there is a JavaScript error? — Dakdada 15:02, 12 March 2015 (UTC)
Now I removed all cookies from my Firefox browser pertaining to en.wiktionary, and that solved the problem! Can you imagine that a cookie could cause this?! LA2 (talk) 19:05, 15 March 2015 (UTC)
Happened to me too, hours ago. I also deleted the cache, which did not solve the problem. Then I checked the "Delete cookies and other site data" checkbox, which solved the problem.
@Dakdada, I did look into the console log and I remember there was an error caused by Gadget-legacy.js
It seemed to me that the problem started when I clicked some buttons under the "Visibility" toolbox. --Dixtosa (talk) 19:16, 15 March 2015 (UTC)

Bad italics in comparative/superlative entries

Could someone please modify Template:en-comparative of and Template:en-superlative of so that they don't put the literal word in italics at the end? e.g. at civilest, it should say "most civil" entirely in plain text, not "most civil" with "civil" in italics. Equinox 19:16, 9 March 2015 (UTC)

Done. —CodeCat 19:21, 9 March 2015 (UTC)

Codifying sarcastic/ironic and some other rhetorical use as ineligible under CFI

Vote created at Wiktionary:Votes/pl-2015-03/Excluding most sarcastic usage from CFI

Every so often, a definition like "big: (sarcastic) small" finds its way to RFD. Sarcasm and irony are productive in the English language (and all other spoken languages, as far as I know) and there are effectively no restrictions on what can be twisted sarcastically. Standard practice has been to delete obvious sarcastic and rhetorical use (see e.g. talk:touché, talk:James Bond, talk:thanks a lot), but this isn't actually mentioned anywhere. Therefore, I would suggest adding something like the text quoted below to CFI.

As far as I can tell, this would only result in merging/deleting senses on two pages: great and pray tell, possibly also no kidding, thanks a bunch (which did survive RFD) and eon. Thoughts or improvements welcome. Smurrayinchester (talk) 16:57, 11 March 2015 (UTC)

What exactly is this referring to in "this can be explained in a usage note"? DCDuring TALK 21:12, 11 March 2015 (UTC)
I've tried to make that sentence a bit shorter and clearer. Smurrayinchester (talk) 21:22, 11 March 2015 (UTC)
I'd have guessed that, but it wasn't clear. Thanks.
I agree that it would be useful to be able to point to a policy something like what you've offered. Your draft would be good enough for me, but perhaps it can be further improved. DCDuring TALK 21:45, 11 March 2015 (UTC)
Sounds good to me, though I wonder whether there are cases where a word is now almost exclusively used in a sarcastic way, and rarely or never with its original meaning. If so, those might need special treatment. Equinox 15:51, 12 March 2015 (UTC)
  • I don't want to see this sort of long wording in CFI. I think the problem of sarcastic meanings is marginal anyway. Furthermore, each sarcastic meaning has to be scrutinized for how characteristic it is, and therefore, to what extent it has become lexicalized and thereby inclusion-worthy. The regulatory part (as opposed to explanatory) of the above seems to be largely captured in this: "The straightforward use of sarcasm, irony, understatement and hyperbole does not usually qualify for inclusion." The use of "usually" makes room for reasonable exceptions. If metaphor is intended to be on the list, it needs to be explicitly there; it is now conspicuously absent. Of course, inclusion of metaphor in the list would make this rather open to abuse. --Dan Polansky (talk) 19:09, 12 March 2015 (UTC)
Metaphor is a tricky case, as you say. Since it's a much more irregular process than the rhetorical devices listed above (or perhaps more accurately, sarcasm, irony, understatement and hyperbole are subtypes of metaphor), and since it's one of the main drivers of linguistic evolution, it would be daft to have a blanket exclusion. While it's a bit wordy, I think some explanatory verbiage is needed. CFI changes that just add a rule without giving any context to its application just seem to cause endless squabbling (look at the arguments WT:COALMINE caused). I've put a more pruned version below, which still (I hope) provides enough of the background to the rule to allow it to guide RFD debates effectively. Smurrayinchester (talk) 09:56, 13 March 2015 (UTC)
  • Oppose: Too blanket. There are some sarcastic/ironic definitions that we should have. Furthermore, some words/phrases are used sarcastically frequently, while most are used hardly at all. Purplebackpack89 20:21, 12 March 2015 (UTC)
Can you give an example of a term which would fail CFI under these rules, that should nevertheless be included? The cases that you mention are already covered by the sentence "Common rhetorical use can be explained in a usage note, a context tag (such as (Usually sarcastic)) or as part of the literal definition." Indeed, usage notes specifically exist to explain the nuances of usage that a definition cannot provide. Smurrayinchester (talk) 09:56, 13 March 2015 (UTC)

Rhetorical devices

The meaning of a statement always depends on context, and there are various rhetorical devices that speakers and writers use in order to convey a particular message without meaning what they literally say. These include sarcasm, irony, understatement and hyperbole. In speech, the use of these devices is often highlighted by a particular intonation, and in writing, this may be mimicked by the use of italics, quotation marks or exclamation points. Because the set of words and phrases which can be used rhetorically is almost limitless, and because separating ironic use from literal use is often difficult, the straightforward use of common rhetorical devices does not usually qualify for inclusion.

This means, for example, that big should not be defined as "(sarcastic) small", "(understatement) gigantic" or "(hyperbole) moderately large"; the fact that an English speaker might use the word this way is obvious and not especially noteworthy. Common rhetorical use can be explained in a usage note, a context tag (such as (Usually sarcastic)) or as part of the literal definition.

Figures of speech that are not obvious from their parts – for example, a euphemism which successfully disguises its true meaning, or a sarcastic turn of phrase which is more than a simple inversion of meaning – or which are never used literally are not covered by this rule, and can be included on their own merits.

Alternative wording

The straightforward use of sarcasm, irony, understatement and hyperbole does not usually qualify for inclusion: these are standard rhetorical devices which affect the meaning of a statement as a whole, but do not change the meaning of the words themselves.

This means, for example, that big should not be defined as "(sarcastic) small", "(understatement) gigantic" or "(hyperbole) moderately large"; the fact that an English speaker might use the word in these ways is obvious and not especially noteworthy. Common rhetorical use can be explained in a usage note, a context tag (such as (Usually sarcastic)) or as part of the literal definition. Figures of speech that are not obvious from their parts or which are never used literally are not covered by this rule, and can be included on their own merits.

Phonetic transcriptions (narrowness, number)

I have been informed that phonetic transcriptions on this site are only to be done on a certain level of depth. As I am personally interested in the variant pronunciations of languages, non-phonemic ones included, I would like to ask whether there are really any great arguments against giving a medium number of regional narrower pronunciations under a broad heading, like in the examples here and here. Korn (talk) 10:39, 12 March 2015 (UTC)

  • I feel like such fine phonetic detail doesn't belong in a dictionary because it's not a lexical property of the word in question. The fact that /ʁ/ is realized as [r] in Bavarian is a fact about the phonology of Bavarian, not a fact about robben. I also wonder how verifiable a lot of these pronunciations are. Who says that it's [ˈʁɔ.m̩], with a highly unusual and almost unpronounceable sequence of vowel plus syllabic consonant in northern and central German? I live in Berlin, and while I've certainly heard [ˈʁɔbm̩] (which isn't even listed), I don't think I've ever heard [ˈʁɔ.m̩]. I don't think I can even produce [ˈʁɔ.m̩] in a way that is reliably distinct from [ˈʁɔm]. And who says that the standard German pronunciation of Madrid is [ˈmadʁɪtʰ] with an aspirated [t] at the end of a syllable? I've never read a phonological description of standard German that permits aspirated consonants at the end of a syllable. I'm also curious about what inflected and derived forms of Madrid are attested to verify the claim that the final consonant is underlyingly /t/, i.e. that the word works in German as if it were spelled Madrit. —Aɴɢʀ (talk) 10:31, 14 March 2015 (UTC)
  • I lived in Berlin (north east) for five years and my impression is that [m̩] is by far the dominant Berlin and German pronunciation. -ben is certainly not pronounced with a fully released plosive like Bad and preventing [b̚m̩] from becoming [m̩] requires some carefulness in speech. When speaking carefully, though, I think people normally end up with some form ending in [n] again. Concerning Madrid: The adjective, 'madrider'. Hearing it pronounced with [d] would make me assume the speaker was from an area with intervocalic consonant voicing, i.e. Schwaben, Sachsen, the north et cetera. Its pronunciation with /t/ is based in the devoicing in the noun.
  • As for the lexical property, it could just as well be stated that the fact that /r/ is realised as [ʁ̞] in Western, Central and parts of Northern Germany is a fact about the phonology of Central, Western and parts of Northern Germany and not about the word in question. But at the end of the day, both pronunciations are permissible and widespread variants of the standard language and not features of a non-standard dialect. Hence, if either deserves a place in the list, so does the other. And a note about where they are used seems a reasonable service of convenience. Actively excluding them would mean blotting out a considerable portion of German speakers and creating a North-Central-centric bias in this dictionary. Especially with comparison to the English entries, which always differentiate between at least two or more variants (English, American, Australian, Canadian and American dialects), or Indonesian entries which list both /o/ and /ʊ/ (sarung#Malay) and /e/ - /ɪ/, there certainly is some precedent for, at the very least, more level of detail than just a phonemic description of one single accent; even when that accent is the one considered to be the educated regiolect in the cities where most of Germany's TV, radio and cinema is produced.
  • Lastly, as for the aspirated /t/, English Wikipedia cites the Duden Aussprachewörterbuch (which I don't have around to check) as a source for consonants having the same level of aspiration in all positions. It also mentions that initial-only aspiration is a distinctive feature of northern northern Germany, which is reasonable as the same has been said by Low German grammarians over a century before. Korn (talk) 13:59, 14 March 2015 (UTC)

Coupla new votes

Thanks to their recent vandal-fighting, I've started a couple of votes for adminhood to be bestowed upon Mr Granger and ISMETA --Type56op9 (talk) 12:43, 12 March 2015 (UTC)

SUL finalization update

Hi all, please read this page for important information and an update involving SUL finalization, scheduled to take place in one month. Thanks. Keegan (WMF) (talk) 19:45, 13 March 2015 (UTC)

Striking a Blow Against a Spammer

I just deleted an entry for the name of a business/its website domain name where the definition was a verbatim quote of a slogan from their website (I'm not going to mention the details to avoid giving them the search-engine-ranking boost they were aiming for- I've given enough information here so you can easily find them).

After deleting the entry and blocking the IP for 6 months as a spammer, I took it a step further: I noticed a yelp.com entry for their business, so I signed up there with an account under my own name and zip code and posted a negative review- citing only facts verifiable in the deletion log and noting the lack of direct evidence. Now, whenever anyone searches for the website, this review will come up. Unless I'm missing something, this tactic has the potential to remove some of the incentive/reward for search-engine spam in cases where a negative review would make a difference (this is an advertising/marketing business in Texas).

What does everyone else think about this? Chuck Entz (talk) 00:17, 14 March 2015 (UTC)

This could be an effective approach. There's always the possibility Person A would create an entry for rival Person B's business, knowing we'd delete it and smack Person B, but our historical experience suggests most spammers aren't that smart or else they would have realized by now we delete spam pages and they don't gain any SEO. - -sche (discuss) 02:53, 14 March 2015 (UTC)
I doubt it will make much difference, since spammers are such single-minded meatheads, but it can't actually hurt. If you feel you've got time to mess about filling various online forms then go for it. Equinox 02:59, 14 March 2015 (UTC)


Do we still need {{lang}}? Is there anything that {{lang|it|Nel mezzo del cammin di nostra vita}} does that {{l|it||Nel mezzo del cammin di nostra vita}} (note the two vertical bars after it) doesn't? If I want to put a link inside {{lang}}, e.g. {{lang|it|Nel [[mezzo]] del cammin di nostra vita}}, it doesn't even tell the link to go to the Italian section, while {{l|it|Nel [[mezzo]] del cammin di nostra vita}} does tell the link what language it is. —Aɴɢʀ (talk) 10:48, 14 March 2015 (UTC)

In which situations is {{lang}} used anyway? I’ve only seen it used in quotations, but I think we would benefit from a template specifically for that (one that works like {{usex}}). — Ungoliant (falai) 17:57, 14 March 2015 (UTC)
Besides quotations, I've sometimes used it in inflection-table templates for forms that don't need linking. —Aɴɢʀ (talk) 19:51, 15 March 2015 (UTC)
Looks like replacing it with {{l}} is the way to go. — Ungoliant (falai) 14:28, 16 March 2015 (UTC)
My gut is to keep both. As I've said time and again, merging and moving templates does little other than confuse a lot of editors. Purplebackpack89 14:33, 16 March 2015 (UTC)
The way this process should and used to work is that, if folks agree, the template is deprecated, then its use converted to some other, then deleted.
Deprecation can be preceded by discouraging use. Should we discourage use of this in any of its applications? In all of its applications? The discouragement can be in the form of changing the documentation, gradually converting some or all uses to some other template, as well as any adverse conclusion of discussions such as this. It also might be a good time to determine whether the replacement templates are as good as they could be and to review their documentation. It is a bit more work, but a gradual process should reduce the adverse effects on contributor habits, and extend the utility of edit histories that use older templates. DCDuring TALK 17:05, 16 March 2015 (UTC)
I don't really give a flying fox if we delete it or not; I just want to know if there's any particular reason I should keep using it. —Aɴɢʀ (talk) 21:23, 16 March 2015 (UTC)
Based on {{lang/documentation}} it's basically a shortcut to <span lang="LANGCODE"></span>, which I think is still needed because of browsers that don't work out the script for themselves. How useful it is for languages that use the Latin script, well, I think it only changes the HTML; to a human user, it's no different. Renard Migrant (talk) 20:29, 17 March 2015 (UTC)
  • Usability perspective:
My current understanding is that various accessibility and other tools can make use of linguistic metadata provided by {{lang}} to decide how to handle text. I've been using it for some time to specify that non-link text I am entering is not English.
From what I've been able to test, both {{lang|LANGCODE|$Text}} and {{l|LANGCODE||$Text}} produce identical output in the browser:
<span class="LANGCODE-SCRIPT" lang="LANGCODE" xml:lang="LANGCODE">$Text</span>
This proposed change would thus only 1) affect what templates editors use, and 2) require that someone go through and change all instances of {{lang}} over to use {{l}} instead.
I'm fine with that. I can't think of any other real downsides. ‑‑ Eiríkr Útlendi │ Tala við mig 23:15, 17 March 2015 (UTC)
There are some differences. Compare {{lang|ru|[[тест]]}} and {{l|ru||[[тест]]}}. —CodeCat 00:08, 18 March 2015 (UTC)
  • With the [[]] link brackets, {{lang}} produces:
<span class="Cyrl" lang="ru" xml:lang="ru"><a href="/wiki/%D1%82%D0%B5%D1%81%D1%82" title="">тест</a></span>
Meanwhile, {{l}} produces:
<span class="Cyrl" lang="ru" xml:lang="ru"><a href="/wiki/%D1%82%D0%B5%D1%81%D1%82" title="тест">тест</a></span> (<span lang="" class="tr" xml:lang="">test</span>)
Without the [[]] link brackets, {{lang}} produces:
<span class="Cyrl" lang="ru" xml:lang="ru">тест</span>
{{l}} produces:
<span class="Cyrl" lang="ru" xml:lang="ru">тест</span> (<span lang="" class="tr" xml:lang="">test</span>)
It looks like the key difference is addition of transliteration for those languages for which our infrastructure supports transliteration.
Query: Are there any use cases where users would want to 1) mark text as a specific language, but 2) not have any automatic transliteration? ‑‑ Eiríkr Útlendi │ Tala við mig 18:21, 18 March 2015 (UTC)
Our templates already support tr=- to suppress transliteration. So you only have to search for entries which have that. It's probably used mostly in inflection tables. —CodeCat 19:46, 18 March 2015 (UTC)
I think the idea is something like this, on aduire, where the intention is not to link. Renard Migrant (talk) 17:33, 19 March 2015 (UTC)
But that's where you'd use {{ux}}. —CodeCat 18:36, 19 March 2015 (UTC)
It's a citation, not a usage example. Renard Migrant (talk) 12:45, 21 March 2015 (UTC)


Whyyyy do we have both {{ux}} and {{usex}}??? ‑‑ Eiríkr Útlendi │ Tala við mig 18:38, 19 March 2015 (UTC)

See Wiktionary:Grease pit/2014/February#Template for eg over usex like label over context. —CodeCat 18:59, 19 March 2015 (UTC)
They work the same, one being a redirect to the other.
{{usex}} came first and its name is a bit more intuitive, so some users are accustomed to it and it might be a little bit easier for someone new to Wiktionary to figure out what was intended. As evidence of usex being more intuitive, it gets some use on our discussion pages as an abbreviation of usage example, whereas I don't recollect a single instance of such use of ux. OTOH, {{ux}} is shorter. If there were a big shortage of two-letter codes or a clearly better use for either of the template names we could revisit the matter. DCDuring TALK 20:54, 19 March 2015 (UTC)

Pronunciation formatting[edit]

Should phonetic or phonemic transcription be preferred, by default? WT:PRON appears to be silent on this. Yet this can be a relatively large difference for languages where a word's surface realization involves several phonological processes.

Also, {{IPA}} seems to link every pronunciation to the corresponding [[w:$LANG phonology]] article, even if one does not exist. This seems like a bad idea, given the policy that "[i]deally, every entry should have a pronunciation section". I would suggest instead directing it by default to [[w:$LANG language#Phonology]] (though it seems possible to contemplate defining a set of languages for which it instead links to the separate phonology article). --Tropylium (talk) 14:23, 16 March 2015 (UTC)

I prefer phonemic transcription because that's what most dictionaries use and because that's what's lexical. That said, the phonemic transcription need not be highly abstract; for example, if the distinction between two phonemes is lost in a certain environment, then the sound that surfaces can be transcribed even if an abstract analysis would regard the other sound as the underlying one. (For example, German Rad can be transcribed /ʁaːt/ rather than /ʁaːd/ since /t/ and /d/ are distinct phonemes in German, even though an abstract analysis would posit /ʁaːd/ as the underlying form.) But that's just my preference; we have plenty of examples of narrow transcription being used, and there's no reason we can't use both. —Aɴɢʀ (talk) 21:06, 16 March 2015 (UTC)
If allophonic differences cause the distinction between two phonemes to collapse, then that collapsed phoneme should really be treated as a new phoneme in itself, rather than either of the original phonemes. For example, in Eastern Catalan, unstressed /a/ and /e/ fall together as /ə/, and you can't really say which of the two it originally belongs to. It's a new phoneme altogether, albeit one that occurs in complementary distribution to both /a/ and /e/. For final devoicing, the same applies in principle, albeit that the phonetic realisation of the new phoneme coincides with the realisation of one of the two phonemes that it results from. But the distinction is definitely phonemic, and it's only when you go into morphophonemics, comparing related forms of a lemma, that the original /d/ arises. Another way to look at it is to ask: if Rad were the only possible form and had no other forms or related terms to compare it with, how would you know it was /d/ underlyingly? You couldn't, and therefore the phoneme is /t/. —CodeCat 21:33, 16 March 2015 (UTC)
I agree. —Aɴɢʀ (talk) 22:26, 16 March 2015 (UTC)
Phonemic, please! I don't think anyone wants to see a whole raft of vowel variants for Yorkshire, London, Manchester, Essex, Scotland, etc. — and that's just the UK! Equinox 21:17, 16 March 2015 (UTC)
I'll agree with everyone then. Renard Migrant (talk) 20:29, 17 March 2015 (UTC)
I'm not asking due to dialects as much as languages with several surface filters between phonemics and phonetics. For an example: Tundra Nenets леды (skeleton) is phonologically analyzable as IPA(key): /lediă/, phonetically realized as IPA(key): [lɤːðɨː]. Would you mandate transcribing the former? Or would you be OK with using "subphonemic" transcription where e.g. the vowel backing process, universal in all varieties of the language, is transcribed? How about the lenition of /d/, which is almost universal — would you consider the fact that there exist a few dialects that have [d] in this position sufficient grounds to not mark [ð] at all?
(For that matter, suppose I were to indicate an underlying phonemicization IPA(key): /lixt/ or even just IPA(key): /līt/ for light, citing w:The Sound Pattern of English…?)
"Do not put in tons of dialectal pronunciations" is not at all the same as "put everything in purely phonemic transcription". --Tropylium (talk) 00:04, 18 March 2015 (UTC)
That's a difficult case. On the one hand, you don't want to give such a highly abstract representation (like the SPE ones you mentioned) that the word would be unrecognizable to native speakers if pronounced the way it's transcribed. On the other hand, you don't want to overwhelm the user with a bunch of fine phonetic detail whose absence would probably not be noticed by native speakers. One rule of thumb I sometimes try to follow in cases like this is "How narrow a transcription can I get without using any IPA diacritics, superscripts, etc., but only the basic characters?" Obviously that rule can't be applied exceptionlessly in all cases, but if [lɤːðɨː] is unambiguous as it stands, then don't go overboard and transcribe it [l̪ˠɤ̽ːð̺ɨ̠ː] or whatever. —Aɴɢʀ (talk) 20:02, 18 March 2015 (UTC)
Would it not be possible to automatically generate phonetic transcriptions from the phonemic one? After all, it's predictable by definition. —CodeCat 21:11, 17 March 2015 (UTC)
In the past, a few users suggested using super-broad/"diaphonemic" transcriptions. Perhaps one day English entries will have expandable templatized pronunciation sections like Chinese entries, where phonemic and semi-narrow phonetic transcriptions into major dialects are shown by default, while smaller dialects' pronunciations, and super-broad/"diaphonemic" and super-narrow transcriptions, are shown when the template is expanded. (Check out the obscure dialect+chronolect in dirty.) PS I definitely agree that Rad should be transcribed as ending with /t/, not /d/. - -sche (discuss) 01:24, 19 March 2015 (UTC)
For what it's worth, I heavily support pursuing this idea. Not that my word seems to be worth much, as before I both asked about how to create such a collapsible template and just a bit further up this page asked more or less the same question about narrowness and diaphonemic/dialect IPA policy and was widely ignored both times. Korn (talk) 11:45, 19 March 2015 (UTC)
Automatic phonemic → phonetic transcription is in principle viable, but fully unified diaphonemic transcription is generally not. Consider examples like lava or pasta. --Tropylium (talk) 00:58, 1 April 2015 (UTC)
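As a rough illustration of the automatic phonemic-to-phonetic idea raised above, here is a minimal sketch in Python that applies an ordered list of rewrite rules to a phonemic string. The rules are deliberately simplified toy rules for English (aspiration of a word-initial voiceless stop, velarization of word-final /l/); the function name `phonemic_to_phonetic` and the rule table are hypothetical, and nothing here reflects an existing Wiktionary module.

```python
import re

# Ordered (pattern, replacement) rewrite rules.
# ASSUMPTION: simplified toy rules for illustration only, not a full
# phonological analysis of English or any other language.
RULES = [
    # A word-initial voiceless stop directly before a vowel is aspirated.
    (re.compile(r"^([ptk])(?=[aeiouɪʊæɛɔʌə])"), r"\1ʰ"),
    # Word-final /l/ is velarized.
    (re.compile(r"l$"), "ɫ"),
]


def phonemic_to_phonetic(phonemic: str) -> str:
    """Turn '/.../' into '[...]' by applying every rule in order."""
    body = phonemic.strip("/")
    for pattern, replacement in RULES:
        body = pattern.sub(replacement, body)
    return f"[{body}]"


print(phonemic_to_phonetic("/pɪl/"))
print(phonemic_to_phonetic("/spɪn/"))
```

Rule order matters in such a scheme, and a real implementation would need a rule set per language (and probably per dialect), which is where the difficulty discussed above comes in.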

Request for citations (!= RFV) for entries not in other dictionaries[edit]

We have a good number of entries (definitions in entries) not in other dictionaries that have no citations. They really should have some citations to confirm our definition and to make us look a little more systematic than Urban Dictionary. The RfV process gives urgency to the process of attestation, but that urgency may be excessive for many of these. Would it make sense to have {{rfcites}} (or something) for entries that were not in {{R:OneLook}}, {{R:Century 1911}}, or any glossary or dictionary in Google Books (template to be written)? I suppose it would be most productive for this to be applied first to entries. Attempting to determine whether a definition is or is not in another dictionary is much harder than determining whether a term is. DCDuring TALK 17:08, 17 March 2015 (UTC)

I like the idea of some sort of collaborative wiki project where you can grab any word/sense without the requisite three citations and go away and cite it, and it is then removed from the list. This would take a lot of organisation, and a bot. Still, it could be done, and I would rather that we generate a separate list, based on our current entries, than change those entries by adding yet more template markup to them. Equinox 19:43, 17 March 2015 (UTC)
There are {{rfquote}} and {{Template:rfquote-sense}}, created on 24 October 2007‎ and 22 October 2007‎. They categorize into Category:English entries needing quotation, which now has 10,750 items. --Dan Polansky (talk) 19:54, 17 March 2015 (UTC)
Thanks. I had looked at Category:Request templates, really. There are only about 60 uses of {{rfquote|lang=en}} AFAICT, somewhat fewer of {{rfquote-sense|lang=en}}. Would it be unreasonable to categorize the English ones into a specific category? DCDuring TALK 01:02, 18 March 2015 (UTC)
I sometimes put {{rfex}} on senses that strike me as dubious but don't seem worth an RFV. I don't think those categorise, but at least it's something editors will see while editing. Equinox 19:55, 17 March 2015 (UTC)
Would it be helpful to have {{rfex}} categorize into the same category as {{rfquote}} or into a different category or none at all? If it contained "en" or "lang=en", the new search could find it, even without a category, but someone would have to know search and know what to look for. DCDuring TALK 01:01, 18 March 2015 (UTC)
Tentative support but not on the main page, perhaps. Also, we need to consider normalisations of spellings and rare languages, for example, quoting Chechen word чӏогӏа (č̣oġa, strong) would be difficult for two reasons - non-standard spelling "чlогӏа" is more common (problems with palochka, especially lower case "ӏ") and Chechen doesn't have a lot of digitised books published. --Anatoli T. (обсудить/вклад) 01:24, 18 March 2015 (UTC)
{{newrfquote}} could be made less conspicuous, like {{rfelite}}, and placed at the bottom of the L2 section. {{rfquote-sense}} is relatively inconspicuous and could be made a bit less conspicuous.
As to the other problems, of course, I'm thinking mainly of English. Judgment needs to be applied for each language, indeed for each individual use. DCDuring TALK 01:47, 18 March 2015 (UTC)

Walser German and Swabian[edit]

It has been brought to my attention that we have Category:Walser German language and Category:Swabian language.

In my opinion, we shouldn't treat these two as separate languages. They are part of the Swiss German dialect continuum, which is covered via Category:Alemannic German language. There is no reason at all to keep Swabian. As for Walser German, it is the least intelligible of the dialect continuum, virtually incomprehensible even to other Swiss German speakers, but linguistic tradition has always treated it as just a variety of Swiss German, and there is no reason why we shouldn't follow suit. We can always use dialect labels to distinguish the different languages, and there are many, many more varieties of Swiss German (like Alsatian) that are not covered. -- Liliana 22:18, 17 March 2015 (UTC)

Yes, merge them into gsw (Category:Alemannic German language). There are lexical distinctions and phonological and hence orthographic distinctions that can be drawn between the lects, but none of them are so great that it would be sensible to treat the lects as separate languages. (And there are many equally distinct varieties of the Alemannic dialect continuum which have not been granted codes, as you've noted.) A cynic might wonder if the reason Ethnologue et al are so much quicker to grant codes to the dialects of other languages than to the dialects of English is that they all speak English well enough to recognize how silly it would be to consider da yooge boid ate da olykoek /də judʒ bɜjd eɪt də ˈ(oʊ~oə).lɪ.kʊk/ and the huge bird ate the doughnut /ðə hjudʒ bɝd eɪt ðə ˈdoʊ.nʌt/ different languages. - -sche (discuss) 20:01, 20 March 2015 (UTC)
Incidentally, I stumbled onto this today: "I notice that 'Walser' is counted as a separate language, but all the other Swiss German dialects are grouped under Swiss German. Does anyone happen to know why this is? My grandparents speak Walser and we have absolutely no trouble communicating (I speak a different, High Alemannic, dialect)."
I have merged Walser into gsw.
- -sche (discuss) 21:28, 17 April 2015 (UTC)


Any thoughts on Wordset? They open sourced their code and data recently and emphasize a structured data approach (in contrast to Wiktionary). Their claim that Wiktionary is "unstructured" is not really correct; there are a number of tools which can successfully parse the content (I contribute code to one of them). At best I would call Wiktionary "semi-structured". What I agree with however is that it is time to try out new ways to build a collaborative platform at scale. For instance there is a voting system built into wordset which is used to reach consensus on proposed changes. The big problem is that Wiktionary (and Mediawiki) can be quite intimidating to potential new contributors, the templating system is powerful but also complex. And it was obviously never designed to create a dictionary. On the other hand Wordset's data model is quite limited at the moment (for a project that aims to be more structured), and they only focus on English headwords, at least initially. Jberkel (talk) 18:08, 18 March 2015 (UTC)

  • I'm not impressed. SemperBlotto (talk) 08:03, 19 March 2015 (UTC)
  • It is quite easy to have a data structure when only focusing on one language and only looking for a definition. But try to do that with all languages (described in several languages), with much more diverse information to store and organize (pronunciations, etymologies, flexions, synonyms...) and it becomes very difficult. The semi-structured Wiktionaries allow us to have all of these, but at the cost of a real structure (also parsers only work to some extent, and usually only for one Wiktionary language), which indeed makes it difficult to reuse the data. Wikidata may be able to improve this, but it is going to be very difficult. Nonetheless, this Wordset site is open-source, including the definitions, and with a philosophy close to the Wikimedia projects, so we should not try to see it as an adversary. — Dakdada 09:16, 19 March 2015 (UTC)
    @BD2412: If they don't import content from dictionaries like us, it will take them a long time to achieve coverage. Some of their content is apparently from WordNet and is available on what looks to me like a non-standard license. Their content is "Creative Commons Attribution-ShareAlike 4.0 International License". Can they simply import our content given that license? Can we use their content provided we include them as a reference? DCDuring TALK 15:22, 19 March 2015 (UTC)
    Yes, and yes. They claim to use "the same CC license for the content as Wikipedia uses, CC-BY-SA", and we can hold them to that. Like everyone else in the world, they are free to copy and reuse our content so long as they credit us for it, and we are free to do the same as to theirs. I would not hold my breath on their providing anything that we can actually use, however. bd2412 T 19:07, 19 March 2015 (UTC)
    Thanks. They might have some particularly well-worded definitions and usexes from time to time. BTW, can we copy WordNet with acknowledgement or is their license a little different?
I've been thinking that it would be handy to have a definition-writer's custom edit interface that automatically generated links to various copyright-free and appropriately licensed dictionaries' entries for the headword being edited. Other links might be to various corpora and gateways. Standard boilerplate to credit the sources that needed crediting could be part of it too. At a very basic level templates like {{taxlook}} and {{REEHelp}} do a little of this, but a complete editing interface would be much better. DCDuring TALK 20:42, 19 March 2015 (UTC)
@BD2412: Indeed. In contrast Wordset needs a simple acknowledgement and link to their site. Or is our way of tracking changes not sufficient? DCDuring TALK 21:04, 19 March 2015 (UTC)
From a first glance, it doesn't look like their licence is compatible with ours, in particular the CC ShareAlike clause. ShareAlike means copyleft; they can't "add" restrictions on content that are not present on its original form on Wiktionary. If licencees have to include their copyright notice, it violates that, because such a requirement does not exist here. So Wordset cannot use Wiktionary content. —CodeCat 21:10, 19 March 2015 (UTC)
But our license is also CC ShareAlike. —Aɴɢʀ (talk) 21:39, 19 March 2015 (UTC)
I was referring to ours. Theirs is not, as far as I can tell. So it's probably incompatible. —CodeCat 21:47, 19 March 2015 (UTC)
It is the same, per their own words: "Specifically, we’re going to be choosing the same CC license for the content as Wikipedia uses, CC-BY-SA." [3] Dakdada 16:31, 20 March 2015 (UTC)
Yeah, it's confusing because we're simultaneously talking about Wordnet and Wordset in this thread. Wordset has the same license we do, but Wordnet doesn't. —Aɴɢʀ (talk) 16:41, 20 March 2015 (UTC)
Anyone else notice how they function on "yae" [sic] votes? Heh. Equinox 14:25, 19 March 2015 (UTC)

Interlanguage (interwiki) links[edit]

Does anybody know what the plan is for interlanguage links? Wikidata (as used in Wikipedia) is not yet used in Wiktionary. New articles lack interlanguage links (one example is lägel, created by me on February 21, which also exists on sv.wiktionary, but isn't linked) and existing articles here lack interlanguage links to newly created articles in other languages of Wiktionary, apparently because the interwiki bots have stopped. Should the bots be restarted? Or will Wikidata support come soon? --LA2 (talk) 21:58, 23 March 2015 (UTC)

@LA2: See d:Wikidata:Wiktionary. The fact that interwiki links aren't handled by Wikidata is pretty ridiculous, really. In (e.g.) Wikipedia, there won't be a direct one-to-one equivalent of every idea in every language edition and figuring out where all of them should point can be really tricky. In Wiktionary, it's irrelevant: the entry at wikt:en:foot and wikt:es:foot should link together no matter what (as long as neither of them is a redlink). This could all be accomplished painlessly in an afternoon. —Justin (koavf)TCM 01:52, 24 March 2015 (UTC)
"the entry at wikt:en:foot and wikt:es:foot should link together no matter what [...] this could all be accomplished painlessly in an afternoon": Indeed. And no shortage of Wiktionarians have pointed that out to the folks at Wikidata. They, in turn, have made it clear they are not going to do it. - -sche (discuss) 03:57, 24 March 2015 (UTC)
@-sche: Do you have links or diffs? I can't imagine that the Wikidata community refuse to make interwiki links on Wiktionary. —Justin (koavf)TCM 04:35, 24 March 2015 (UTC)
I think they would like to do all of Wiktionary at once, not just interlanguage links. And since defining a structure for Wiktionary linguistic data is really hard (and much discussed), they will probably not attack the problem until the other projects are converted to Wikidata.
Also, there are some small exceptions that we need to take care of for interlanguage links: see the table I made in d:Wikidata_talk:Wiktionary#First_and_second_phases, in particular the "apostrophe", "capital" and "other" interwikis. Those are due to different communities' typographic rules (and some errors). — Dakdada 10:24, 24 March 2015 (UTC)
This sounds like a deadlock situation. How sad! In the meanwhile, it couldn't hurt to restart interwiki bots, could it? I still have bot status (LA2-bot) on some languages of Wiktionary, so should I just go for it? LA2 (talk) 15:29, 24 March 2015 (UTC)
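The "same title, link them together" rule described in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `editions` mapping (language code to the set of page titles that exist in that edition) is hypothetical input, the function name `interwiki_links` is made up, and the apostrophe and capitalization exceptions mentioned above are ignored.

```python
def interwiki_links(title, home, editions):
    """Return wikitext interlanguage links for `title`.

    `editions` is a HYPOTHETICAL mapping of language code -> set of page
    titles existing in that Wiktionary edition. A link is emitted for every
    edition other than `home` that has a page with exactly the same title.
    """
    return [
        f"[[{code}:{title}]]"
        for code in sorted(editions)
        if code != home and title in editions[code]
    ]


# Made-up sample data for illustration.
editions = {
    "en": {"foot", "lägel"},
    "es": {"foot"},
    "sv": {"foot", "lägel"},
}

print(interwiki_links("lägel", "en", editions))
print(interwiki_links("foot", "en", editions))
```

An interwiki bot would additionally have to fetch the title lists and write the links back to each page; the matching step itself is just the exact-title lookup shown here.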


Hi. I'm not gonna be using this username anymore. Time for a change. See you soon with a new name. --Type56op9 (talk) 12:49, 24 March 2015 (UTC)

OK thanks for letting us know ♥ Soap (talk) 15:07, 26 March 2015 (UTC)
It wasn't one of your better names, really. Equinox 02:06, 27 March 2015 (UTC)

Entries from the GCIDE labeled "Webster 1913 Suppl."[edit]

So I've found entries in the GCIDE which are missing from Wiktionary, such as "Pimola":

 <hw>Pim*o"la</hw> <pr>(?)</pr>, <pos>n.</pos> <def>An olive stuffed with a kind of sweet red pepper, or pimiento.  </def><br/
 [<source>Webster 1913 Suppl.</source>]</p>

Apparently these are from the "Webster 1913 Suppl."

My question is this: Should these be copied into Wiktionary? Is it OK for me to copy this definition into Wiktionary, or are there license restrictions for the "Webster 1913 Suppl." ?

Oh, interesting. There are also other words missing which are labeled, simply, "1913 Webster", such as Pinxit:

 \'d8<hw>Pinx"it</hw> <pr>(?)</pr>. <ety>[L., perfect indicative 3d sing. of <ets>pingere</ets> to paint.]</ety> 
 <def>A word appended to the artist's name or initials on a painting, or engraved copy of a painting; <as>as, <ex>Rubens pinxit</ex>, Rubens painted (this)</as>.</def><br/
 [<source>1913 Webster</source>]</p>

Should these be copied in? "Pinxit" seems like a pretty useful word. Are there issues with using the GCIDE definitions?

It's out of copyright due to its age, so you can do what you like with it. Please add them! Equinox 13:33, 28 March 2015 (UTC)
The worst that could happen is that some of them won't meet our attestation standards. Add 'em and we'll sort that out eventually. DCDuring TALK 13:41, 28 March 2015 (UTC)
Oh, yea. You should register. It makes it easier for us to communicate with you in a friendly way. DCDuring TALK 13:44, 28 March 2015 (UTC)
Thanks, I'm registered now. User:Pnelsonmusic But obviously a newbie. I've been reviewing the GCIDE for a search engine linguistic processing project and have noticed these differences between it and Wiktionary. Maybe I'll write a program to identify all missing items. TALK 13:52, 28 March 2015 (UTC)
We look forward to your contributions. Equinox has done a lot of work on getting entries from Webster 1913. DCDuring TALK 15:43, 28 March 2015 (UTC)
Not all of us really approve of copying definitions from other dictionaries, even when they are out of copyright. Definitions are supposed to be our own work. But I have no objections on obtaining lists of words from ANY dictionary or similar source - I do that myself. SemperBlotto (talk) 09:15, 29 March 2015 (UTC)
But we see farther if we stand on the shoulders of the giants who preceded us. DCDuring TALK 09:43, 29 March 2015 (UTC)
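A program of the kind mentioned above (finding GCIDE headwords missing from Wiktionary) might start along these lines. This is a minimal sketch, assuming that headwords sit in `<hw>…</hw>` tags as in the quoted samples and that `*`, `"` and a backtick inside them are syllable and accent marks to be stripped. The set of existing Wiktionary titles is hypothetical input, and the helper names are made up.

```python
import re

HW_RE = re.compile(r"<hw>(.*?)</hw>")
# ASSUMPTION: these characters are syllable/accent marks inside <hw> tags.
MARKS = str.maketrans("", "", '*"`')


def headwords(gcide_text):
    """Extract headwords from GCIDE-style markup, minus accent marks."""
    return [hw.translate(MARKS) for hw in HW_RE.findall(gcide_text)]


def missing_headwords(gcide_text, existing_titles):
    """Headwords found neither as-is nor lowercased in `existing_titles`
    (a HYPOTHETICAL set of Wiktionary page titles)."""
    return [
        hw for hw in headwords(gcide_text)
        if hw not in existing_titles and hw.lower() not in existing_titles
    ]


sample = '<hw>Pim*o"la</hw> <pos>n.</pos>\n<hw>Pinx"it</hw> <pr>(?)</pr>'
print(headwords(sample))
print(missing_headwords(sample, {"pinxit"}))
```

The lowercase check is there because the dictionary capitalizes its headwords while Wiktionary page titles are case-sensitive; the resulting list would still need manual review before any entries are created.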

How can I add a "thank" note?[edit]

How can I add it to edits in entry histories, next to "undo"? Or is it visible to other users and not to myself? I'm more used to the "undo" function being used, or just tacit approval of edits. Donnanz (talk) 17:18, 28 March 2015 (UTC)

The person receiving thanks gets a notification. Others have to read the log. — Ungoliant (falai) 17:37, 28 March 2015 (UTC)
I understand that, I've done that myself and have also received thanks. But I'm afraid that doesn't really answer the question. It doesn't show in the entry history for edits I do. Donnanz (talk) 18:55, 28 March 2015 (UTC)
Maybe your page histories look different from mine, but when I look at a page history, the "thank" button is there for all diffs except my own and those made by anons. —Aɴɢʀ (talk) 10:16, 29 March 2015 (UTC)
Ah, I was beginning to suspect / think that. Thanks, I guess that solves that. Donnanz (talk) 10:20, 29 March 2015 (UTC)

Death and taxes[edit]

No entry for death and taxes? Really? ;) ~ heyzeuss 05:07, 31 March 2015 (UTC)

Damn! UD has it and we don't! We are doomed. DCDuring TALK 12:49, 31 March 2015 (UTC)
What is it supposed to mean; how would you define it? It's just part of a popular phrase about things that are inevitable, but it still only means, in that phrase, DEATH + AND + TAXES. Equinox 15:30, 31 March 2015 (UTC)

Bad Romanian translations[edit]

I know I keep sounding like a broken record - this is not the first time I bring this up - but I've been monitoring Romanian translations and entries, and BaicanXXX is adding incorrect entries again. A great number of contributions are direct translations and don't reflect Romanian equivalents. For instance answerphone is "robot telefonic", not telefon cu răspunzător de apel or telefon cu răspuns automat. The term traveling in basketball is translated pași and not pași greșiți. Baican's translations are four times out of five explanations and not Romanian equivalents. I've also recently found out that he has several operating sock puppets, most of which have been blocked by Romanian Wikipedia administrators because Baican kept using words that don't exist and users who opposed this user's way of contributing were harassed. I just want to know what to do; I usually correct mistakes when I see them, but it's hard to keep up. If his contributions are deemed to be ok, then I'll back off. I just want to know which policies apply. Thank you in advance, --Robbie SWE (talk) 19:59, 31 March 2015 (UTC)

This is definitely a worthy issue. I've given him a warning on his talkpage about it. The way you can help is to a) revert his incorrect/unidiomatic translations and fix them, b) tag his entries for WT:RFV if they're not actually used in Romanian or WT:RFD if they're just a sum of parts, and c) mention on his talkpage if he continues to make these errors so that any admin who may wish to block him in the future has a record of any offences that may occur after he was warned. —Μετάknowledgediscuss/deeds 06:39, 2 April 2015 (UTC)
Oh, and I forgot to mention: if he is operating sockpuppets here, please report them! Abusing multiple accounts is not okay (e.g. editing with one account if another has been blocked). —Μετάknowledgediscuss/deeds 06:53, 2 April 2015 (UTC)
Ok, I'll make changes where needed from now on. I blocked Baican's sock puppets (the ones I could find were Bon.line, LaPietre, Trepier and WernescU) back home in the Romanian Wiktionary after making sure that the administrators of the Romanian Wikipedia confirmed that these accounts truly belonged to the same user. Just a heads-up about Baican's modus operandi: he never responds in the language of the Wiktionary project he is active in, so don't expect him to answer in English. --Robbie SWE (talk) 18:26, 2 April 2015 (UTC)
Unless he's currently active, there's not a lot to discuss; they all need checking individually. It's not a policy issue. Many of them are so bad even I can confidently remove them. Many of them are sentences, like for marathon he'd add in Romanian race of 26.2 miles. That's a fictional example but many of them are as bad as that if not worse. Renard Migrant (talk) 23:11, 2 April 2015 (UTC)

April 2015

International Journal of Lexicography[edit]

Hi, I'm not sure where to post my question. Is there someone who has access to the International Journal of Lexicography? I'm interested in an article on Dictionary illustrations because we have some recent discussions about it on Czech Wiktionary (they are strictly prohibited there and routinely deleted) and this article is supposed to provide good insight. However I don't want to spend $28 on it (I tried the DeepDyve free trial subscription but they have this journal only since 2004). Thanks to anyone for any help. --Auvajs (talk) 05:38, 2 April 2015 (UTC)

@Auvajs: I have access. I've downloaded it and I'll be emailing you with the pdf shortly. —Μετάknowledgediscuss/deeds 06:12, 2 April 2015 (UTC)
Wow, that was so fast! Thanks very much! :) --Auvajs (talk) 06:19, 2 April 2015 (UTC)
Rádo se stalo. —Μετάknowledgediscuss/deeds 06:45, 2 April 2015 (UTC)
Images are really useful in dictionaries. I have an Italian language dictionary - sometimes I struggle to quite make out what the definition means, but then they have it in a page of pictures and I think "Oh! One of those!". SemperBlotto (talk) 07:06, 2 April 2015 (UTC)
@SemperBlotto: In case anyone didn't know: Wiktionary:Picture_dictionary. —Justin (koavf)TCM 07:15, 2 April 2015 (UTC)

Suggested modification to sign language syntax[edit]

In British Sign Language (BSL) there is a hand shape that is frequently used which can't be described using the current notation (usually called '10' http://www.signbsl.com/sign/10). I suggest adding ...i@... (for [i]nside or t[i]p) to Entry_names, Detailed description of phonemes used in sign entry names to fix this. ...i@... would be underneath ...p@... in the Detailed description of phonemes used in sign entry names for 'The thumb tip contacts the inside of a finger of the same hand'. Nathanael Farley (talk) 07:41, 2 April 2015 (UTC)

On etymology referencing[edit]

I recently added a recommendation on WT:ETYM that references for the reconstruction of proto-words should preferably be centralized on their appendix pages. User:Dan Polansky reverted this with the comment:

I see no discussion supporting such a policy, and it is easy to verify that Wiktionary by and large does not reference etymologies so the putative policy contradicts long-term common practice

This was re-reverted as minor by User:Angr. Fair enough I guess, the page is a draft proposal after all. But I support having some discussion regardless.

There is a vote Reconstructions need references in preparation, but rather than editing that right away I'd like to sound out the community on a few propositions — since, in addition to my comment on reference formatting it seems that we lack agreement on reference requirements in the first place as well.

Still, note that in principle these are separate questions. Wiktionary undeniably allows references, and my original comment applies to how they should be organized when present. But regardless, for starters, my thoughts on reference requirements:

  1. All etymological information should be verifiable. Which applies to several types of information, e.g.:
    • A given set of words being related to each other at all.
    • A given set of words being related in a specific fashion: usually, common descent or loaning in a specific direction.
    • Reconstructions (phonological, morphological and semantic) for proto-forms of words that are related by common descent.
  2. Wiktionary is a work in progress, and edits to etymology sections do not need to immediately have every possible detail sourced.
    • It would probably be a good idea to bring in a {{citation needed}} template and other similar ones for tagging information whose validity someone contests, or which someone thinks should just be cited more clearly. (Also, these two things probably need different inline tags.)
  3. Elementary synthesis of sources is allowed. E.g.:
    • Phonetic reconstruction: if we can source to Smorgle that *k in Proto-Foobar is reflected as Foo h, Bar k, and if we can similarly source to Zoop that the Foo word hu and the Bar word ku are related, at this point there is no problem in creating a Proto-Foobar entry *ku, even if we for the moment have no exact citation for this proto-form in the literature.
    • Phonetic reconstruction: if Smorgle gives a Proto-Foobar word *tata, Zoop gives a Steamy Foo reflex haha for this etymological set, and Shroobadoo has argued that Proto-Foobar *t should be reconstructed as *☕ whenever Steamy Foo has /h/, we can just fine reconstruct Proto-Foobar *☕a☕a rather than *tata, even if Shroobadoo never treated this particular word.
    • Etymological affiliation: if von Papperson states that word-initial *pr was forbidden in Proto-Foobar but evolved in early Bar; Böp states that in Swampy Foo there are loanwords such as prumf that have been acquired from Bar; and Zoop states that Swampy Foo pripi and Bar pripi are related, there is no problem in asserting that the former is probably a loan from the latter.

The third point might be the most contentious. Several further caveats may be required, e.g. all claims used as basis for synthesis of sources should probably represent mainstream scholarship. --Tropylium (talk) 07:44, 2 April 2015 (UTC)

I agree on all your points – a dictionary is more than definitions – these are methods that "keep an honest man honest" and separate fringe from scholarly interpretation of facts. Also, scholarly consensus changes and providing references or mentions places the etymology on a timeline for future contributors. BTW, User:Dan Polansky also doesn't like my documentation method of including content that is not an attestation of usage, and doesn't like me referring to Wiktionary:Citations and Wiktionary:References since they are not policies. You are not alone. —BoBoMisiu (talk) 13:32, 4 April 2015 (UTC)
  • Yes, "All etymological information should be verifiable" but that does not mean that the English Wiktionary should demand that parts of etymology are referenced using <ref> technique. I oppose such a demand, and so does the overwhelming previous practice. --Dan Polansky (talk) 13:41, 4 April 2015 (UTC)
@Dan Polansky: I noticed your subtle moving the goalposts from a discussion about references in an entry into a discussion about a particular format of references (wrapping references in <ref>s). Referencing helps future contributors, shows quality and credibility of the content, demonstrates veracity, and is valid rationale to defend against accusations of plagiarism and copyright infringement. Transparency is a good. —BoBoMisiu (talk) 15:31, 4 April 2015 (UTC)
Moving the goalposts would be making the standard stricter to exclude what passed the earlier one. As to the point at hand: you do tend toward excess in citing. You don't need to nail down every single detail in a definition with a reference. Etymologies need to be referenced, yes- but not definitions. A descriptive dictionary defines terms the way people use them, not the way reference works specify is correct. If people use water beetle to refer to a cockroach, so do we- even though a cockroach isn't a beetle. Technical terms such as taxonomic names are somewhat of a special case, since correctness as a technical term is relevant. Still, the taxonomic literature is full of misapplied taxonomic names, and of changes in meaning due to splitting and lumping. In my area, most of the scrub oaks referred to as Quercus dumosa are really Quercus berberidifolia- for a long time no one paid attention to the difference. While true Q.dumosa only grows along the coast, there are all kinds of references in the literature in numerous disciplines to Q.dumosa being found/used, etc. far inland. Wiktionary isn't a journal of record, and citing details in the definitions as if we were is a bit deceptive and adds to clutter. That's not to say we shouldn't have links to other, more comprehensive sources, but only in moderation. This is a dictionary, so we try to keep things structured and streamlined for ease of use. Chuck Entz (talk) 17:12, 4 April 2015 (UTC)
@BoBoMisiu: Dan Polansky is not 'subtly moving the goalposts' as what he refers to was part of the original reverted addition. I think the way the server interprets <ref></ref> syntax is ugly, and I try to avoid it. Also, people overuse it; they use it for citations instead of just writing out the citation next to the sense which is to be cited, which is the best way to do it. Renard Migrant (talk) 17:23, 4 April 2015 (UTC)
Also verifiable is not the same as verified. Verifiable means if you try to verify it, you can. That's what we want. If something's not contentious, don't add a reference for it, because we end up with references coming out of our ears and the actual definitions become hard to find even for experienced users. Renard Migrant (talk) 17:26, 4 April 2015 (UTC)
@Chuck Entz: I agree with you, "If people use water beetle to refer to a cockroach, so do we- even though a cockroach isn't a beetle." Yet, having an entry that doesn't clearly distinguish for the user between what is an authoritative sense and what is another just-as-real usage lacks something. As for "making the standard stricter to exclude what passed the earlier one", the three attestations of usage are unaffected, but it structures entries into the haves and the have-nots, i.e. those entries that are denied a connection to what is considered factual – that is a systemic problem that will not go away. As to ease of use, a collapsed section has all the content and yet provides a visually simple interface, which could be added by a bot, so that is not the same issue as ignoring what every school child is taught (to provide credit to your source and avoid plagiarism and copyright infringement).
@Renard Migrant: I agree with you, "verifiable is not the same as verified", that is not the same thing as not giving credit to your source. The notion that excluding things so it looks pretty is just style over substance that can be solved by collapsed sections. —BoBoMisiu (talk) 20:31, 4 April 2015 (UTC)
Etymological dictionaries usually do not give credit to sources on a per-entry level. The idea that you need to credit your source of information on a per-entry level or else you have a copyright infringement is wrong. Copyright protects expression, not information; you can take someone else's information but you cannot take their expression - a particular sequence of words or sentences; if you take someone else's expression, you are liable to have copyright infringement and referencing does not make it much better. Nonetheless, when I was entering etymologies from a public domain source, I nearly always stated my source in the edit summary; I saw many editors of etymologies not do even that much. --Dan Polansky (talk) 18:03, 4 April 2015 (UTC)

@Dan Polansky: "Etymological dictionaries usually do not give credit to sources on a per-entry level" because they are paper. Deciding what is or is not someone's creative work or potential discovery is beyond most contributors; transparency is not beyond most contributors. —BoBoMisiu (talk) 18:22, 4 April 2015 (UTC)

That's not just because they're paper, it's also because they have onymous authors who have reputations to consider and who can generally be trusted to do their research. We don't have that luxury. Because we're a wiki that anyone can edit, we have to show that our etymologies haven't been pulled out of our collective ass. —Aɴɢʀ (talk) 18:41, 4 April 2015 (UTC)
Bravo! —BoBoMisiu (talk) 18:47, 4 April 2015 (UTC)
I refuted the copyright infringement assertion, and none of the things added above counterargued that refutation. As for the other subject whether we want to use <ref> in order to show which sources were used, I object to making that a demand or the recommended practice. An editor was adding <ref> to etymologies of Czech entries, and while I did not like it, I did not revert it. But it is one thing to tolerate the use of <ref>, and an entirely different thing to pretend that the English Wiktionary has a policy that requires contributors of etymology to use <ref>; I oppose such a policy, and claim that the use of edit summaries to state sources should be enough; in fact, we have not been chastising editors who provided zero edit summaries and zero references using <ref>. --Dan Polansky (talk) 18:57, 4 April 2015 (UTC)
Nobody can copyright that the English word lemma comes from the Ancient Greek word λῆμμα (lêmma). Also you've confused excluding things with not including things. You're basically arguing that literally everything should be included in every entry, relevant or otherwise. That's nothing to do with style over substance, that's insanity over substance. Renard Migrant (talk) 19:06, 4 April 2015 (UTC)
@Dan Polansky: where did you refute it?
@Renard Migrant: I agree with you, but chains of assertions that are copied from sources are someone's work. —BoBoMisiu (talk) 19:17, 4 April 2015 (UTC)
Nothing at WT:ETYM requires contributors to use <ref> or to provide sources for their etymologies at all. Doing so is recommended whenever possible; it is not required. Etymologies without references are more likely to be removed as possible bullshit than ones with references, but it is neither the case that all unsourced etymologies are bullshit nor that all sourced etymologies aren't. —Aɴɢʀ (talk) 19:25, 4 April 2015 (UTC)
@Angr: I was reminded WT:ETYM is only a draft, so I started to ignore it, maybe I should follow it. I think references add transparency. Like you wrote, that bullshit still needs to be removed whether it is referenced or not. I disagree with Dan Polansky that an edit summary is equivalent to a fully cited reference. I think Tropylium's "Wiktionary is a work in progress" is valid and I suggest forking w:Template:Citation needed span from wikipedia. That template wraps the particular questionable content in a visually distinct box, making it easy to see where contribution is requested and what is suspect. A "synthesis of sources" is a good idea as long as it references where each piece of information comes from. That way a synthesis is transparent. I also think forking w:Template:Citation/core and changing the R: templates into wrappers with consistent parameters is a good idea and a step away from willy-nilly flat references toward structured machine readable data, but that is a future discussion. —BoBoMisiu (talk) 20:31, 4 April 2015 (UTC)
I think you'll find that forking templates from Wikipedia is not exactly popular around here. The people who propose doing so usually don't understand the difference between Wikipedia, with its large editor population, extensive bureaucratic infrastructure, and loosely-formatted, single-entry structure vs. Wiktionary, with its small editor base, simple rules/procedures and rigidly-formatted multi-entry structure, which requires that everything be concise and to the point. As for hiding the References section: that would still leave lots of superscripts, most of which link to statements of the obvious. If you really want to be thorough, how about word-frequency statistics? Binary checksums? Diagramming of the sentence structures? Those can all be hidden away in boxes, but will all lead to reader resentment for wasting their time if they follow the references. It reminds me of pulling over in the middle of nowhere because of a light on the dashboard: it was time for an oil change based on the mileage. If you have too much pointless information, hidden or not, the important information gets lost in the clutter. Chuck Entz (talk) 22:49, 4 April 2015 (UTC)
Tags like [citation needed] do not have communication between editors as their only function. They also alert the reader to pay attention to the quality of the information, and can help editors easily tell where they left off work previously.
I loosely agree with Dan that a citation provided in an edit summary is good enough to retain the edit, i.e. grounds to not revert it, but if other editors want more explicit referencing, it's not a reason to prevent them from adding some. --Tropylium (talk) 23:09, 5 April 2015 (UTC)

So a discussion about objective etymology referencing comes down to the subjective – a delicate aesthetic preference of looking at a pretty entry without seeing those brutish superscript numbers? That reminds me of the quote from Amadeus: "It's quality work. And there are simply too many notes, that's all."

@Chuck Entz: the solution to your concern "that would still leave lots of superscripts" is to allow the user to choose through a user preference toggle.
References are not just for statements of the obvious. Your propositions about "word-frequency statistics? Binary checksums? Diagramming of the sentence structures?", are an unintentional red herring about potential uses of collapsed sections, they are about different propositions than the one presented by Tropylium which is about references. The revision as of 19:26, 1 April 2015 is:

Etymologies should be referenced if possible, ideally by footnotes within the “Etymology” section, secondarily just by listing references in the “References” section. The Reference templates are useful in this regard.
If a word descends from a common root with several words in related languages, and an appendix page exists for the reconstructed proto-form, references on details of the reconstruction are best placed on that page, rather than duplicated across the cognate entries.

For six weeks, I misunderstood: Etymologies should be referenced if possible, ideally by footnotes within the 'Etymology' section, secondarily just by listing references in the 'References' section.
I, for six weeks, created sub-sections within entry etymology sections. I would like it changed to:
Etymologies should include Wiktionary:References if possible (for policy, see Wiktionary:Entry layout explained#References), preferably by footnotes within the 'Etymology' section (see Help:Footnotes), or simply adding the source to the entry 'References' section (for policy, see Wiktionary:Entry layout explained#References).

While Wiktionary:Entry layout explained is a policy, I would like to see a clear explanation of what the actual status of Wiktionary:References is. Is Wiktionary:References a policy?

I think that the other paragraph would be more understandable as: If a term descends from a common root with several other terms in related languages, and a page exists for that reconstructed term (for policy, see Wiktionary:Reconstructed terms), references about the reconstruction are preferably located on that reconstructed term page, and not duplicated in the cognate entries. BoBoMisiu (talk) 15:39, 5 April 2015 (UTC)

Ok, after that long tangent on <ref> tags that addresses approximately nothing about my initial post, this is more interesting. Your adjustment of "word" to "term" is obvious, and I've edited that and a couple other language adjustments in. (WT:ETYM is currently only a think tank; you don't need explicit community consensus to edit it, for simple copyediting at least.) Two other comments:
  • I am not convinced that reconstructed terms necessarily require formatting as separate entries, hence the more general expression "appendix".
  • For the same reason, I don't think WT:ELE should be the best place to refer to for reference policy. Appendices are not, necessarily, entries.
--Tropylium (talk) 22:28, 5 April 2015 (UTC)
@Tropylium: the reason I point to WT:ELE is that it seems to be the only policy in a vague hierarchy of how-to pages. Some administrators seem to place little value on reasoning that refers to other than policy pages. I have experienced that; examples include "Until the proposal gets past the draft stage and gets voted on, WT:ELE governs." (DCDuring) and "Using non-policies including Wiktionary:Citations and Wiktionary:References as an argument is rather unconvincing" (Dan Polansky). —BoBoMisiu (talk) 00:15, 6 April 2015 (UTC)
You're correct, but this doesn't change the fact that WT:ELE only governs dictionary entries. It says nothing about how things that are not entries should be formatted. If you think this is a problem (I for one would agree; that's the general project that this entire thread is about), the solution should be to work on establishing further policy, not to attempt scavenging policy bits elsewhere that kind of apply if you squint right. (I.e. a non-policy that states "do things as if policy A was in effect here" remains a non-policy.) --Tropylium (talk) 22:42, 21 April 2015 (UTC)

Finding old deletion discussions, and related points

An attempt to create the entry "get behind" ([4]) leads to a page that says "You are recreating a page that was previously deleted" and then provides a log entry that reads "deleted page get behind (Failed RFD, RFDO; do not re-enter)".

1. How do I find the deletion reason/discussion? The link to "RFD" in the log entry just throws me into the current version of that page, which is useless long-term. Furthermore, the supposed RFD archive page ([5]) says that the archive is no longer active, but that "The current procedure is to archive the RFD discussion to the corresponding article's talk page". Er ... can anyone else spot the flaw in that procedure??

2. The explanation of "Failed RFD; do not re-enter" at [6] says "This term (in a particular language) failed WT:RFD. Do not re-enter it. You may re-enter a different term, especially in a different language." Surely "Do not re-enter it" is too final? What if something previously rejected gains a valid meaning in the future? (I'm not suggesting this is the case for "get behind", but obviously it could happen.) Also "You may re-enter a different term, especially in a different language" is odd and pointless. 17:27, 2 April 2015 (UTC)

  • Subsequent to writing the above, I have discovered that the talk page http://en.wiktionary.org/wiki/Talk:get_behind exists even though the page "get behind" does not. I did not realise this was possible. Is this how it is generally supposed to work for deleted entries? This is unintuitive, and I propose to update the documentation to explain it. However, what is the expected procedure for navigating to such a page? I found http://en.wiktionary.org/wiki/Talk:get_behind by first navigating to a known existing talk page and then changing the last part of the URL. Is this what users are expected to do, or is there a more friendly way?
You're right that we should have some message that the deletion debate is on the talk page, or should eventually be on the talk page. Attempts to have talk pages archived by bot are underway. I completely agree that sometimes it's really hard to find deletion debates because of changes in archive methods over the years, and sometimes debates are not archived anywhere. Renard Migrant (talk) 23:16, 2 April 2015 (UTC)
I made a couple of wording changes here and here. 23:40, 6 April 2015 (UTC)

Possibility of a "Sister projects" report in the Wikipedia Signpost

Hello, all. I'm a volunteer at the Wikipedia Signpost, the Wikimedia movement's biggest internal newspaper. Almost all of our coverage focuses on Wikipedia, with occasional coverage of Commons, the Meta-Wiki, MediaWiki, Wikidata, the Wikimedia Labs; we have little to nothing to say about Wiktionary, Wikiquote, Wikibooks, Wikisource, Wikispecies, Wikinews, Wikiversity, or Wikivoyage. I'm interested in writing a special long-form "sister projects" report to try and address this shortfall. Is there anyone experienced in the Wiktionary project with whom I can speak, perhaps over Skype, about the mission, organization, history, successes, troubles, and foibles of being a contributor to this project? If so, please drop me a line at my English Wikipedia talk page. Thanks! ResMar 21:04, 5 April 2015 (UTC)

No takers? :(. If not I will contact highly active editors individually a little later on. Resident Mario (talk) 04:14, 9 April 2015 (UTC)
There are probably not many active wikiversians :) --Auvajs (talk) 04:21, 9 April 2015 (UTC)
@Resident Mario: Perhaps you might consider asking for them at Wikiversity instead? —Μετάknowledgediscuss/deeds 04:33, 9 April 2015 (UTC)
He is asking for representatives of all "sister projects" (including Wiktionary). I have responded on his WP talk page. bd2412 T 14:18, 9 April 2015 (UTC)
@Auvajs, Metaknowledge, BD2412: Apologies. This is a copy-paste error I made when I was sending out these messages, due to not paying enough attention. I went back through my messages and had thought I fixed this error. As you can see I did that wrong, too. Mass messaging manually is a pain... Resident Mario (talk) 18:10, 9 April 2015 (UTC)

Creating a new RFVA section for 2015

I have archived a few RFV discussions at Wiktionary:Requests for verification archive/2015. I wonder why nobody bothered to archive from 2014? Hillcrest98 (talk) 14:47, 6 April 2015 (UTC)

We archive RFVs on talk pages now (e.g. Talk:azur). — Ungoliant (falai) 15:01, 6 April 2015 (UTC)
  • Read the top of WT:RFV. Only closed discussions can be archived. In case of doubt, check the common practice. I have emptied Wiktionary:Requests for verification archive/2015. --Dan Polansky (talk) 15:16, 6 April 2015 (UTC)
    • Typically a successful RfV will result in citations being placed on the questioned page, so there will not be any reason to RfV it again. bd2412 T 14:19, 9 April 2015 (UTC)

Entries for typos

Consider oredaceous; in hindsight it seems an obvious typo for predaceous, but not so obvious that one could deny its existence on first principles. It could well be an example of pretentious use of an obscure but defensible technical term. So, on encountering the term I checked in case

(1) it meant something that I should know but didn't or

(2) it was illegitimately intended to mean something that I knew, but could not assign a meaning to, or

(3) it was an error.

I did a bit of googling and concluded that

1) It was indeed a typo, but

2) That a few people had indeed adopted it in their published (not necessarily peer-reviewed) documents as (possibly) legitimately meaning something along the lines of:

"'Oredaceous' sounds like 'predaceous', Did you mean predaceous when searching for oredaceous? 1. (a) Living by or given to victimizing others for personal gain" as very reasonably appears in Dictiosaurus.com.

NOW THEREFORE I am not a frequent lexicographer, even in WK and have not yet fallen foul of typos here. Would it be correct or acceptable to create an entry for a completely redundant finger-trouble typo such as oredaceous, if only as a redirection to predaceous, or to put it into the predaceous entry as a common typo, or do we leave it to users to blunder and assume that WK doesn't have such hard words, so it must be a very special term of great value to impress readers? JonRichfield (talk) 16:00, 9 April 2015 (UTC)

It does fit the general principle of WT:CFI: "A term should be included if it's likely that someone would run across it and want to know what it means." I think that including "likely" is inappropriate there, because we do include very rare terms that the average person is very unlikely to ever come across. That would certainly include a rare typo, which would nonetheless have readers baffled. If a word is a typo, we can't necessarily expect the reader to know what the correct word is. At the same time, it would be a rather insane task to start documenting all attested typos.
I think the best we can do is to make more of an effort to get the search function improved so that it recognises typos better. Right now it does not give any results for "oredaceous" even though it's only one (!) character away from predaceous, an existing entry. The fact that the search does not catch this is clearly a failing on its part and definitely needs improving.
However, it's also possible in other cases that a word is a typo in one language but a legitimate spelling in another language. Our search will then happily send the user to the entry for that other language, but the user will not find an entry there for the language they are looking for. They might think that Wiktionary is incomplete. —CodeCat 16:11, 9 April 2015 (UTC)
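The "only one character away" observation above can be made concrete with edit distance. What follows is only a minimal sketch, not a description of how Wiktionary's search actually works: the function names `levenshtein` and `suggest` and the small entry list are hypothetical, chosen purely for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions or substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def suggest(query: str, entries: list[str], max_distance: int = 1) -> list[str]:
    """Return existing entries within max_distance edits of the query, closest first."""
    scored = sorted((levenshtein(query, entry), entry) for entry in entries)
    return [entry for distance, entry in scored if distance <= max_distance]


# Hypothetical entry list, for illustration only.
entries = ["predaceous", "predacious", "audacious"]
print(suggest("oredaceous", entries))
```

A real search index would not compare the query against every entry one by one; the sketch only shows the distance criterion that a spelling suggester could apply.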
It also says in that sentence 'term'. Is a typo a 'term'? Surely not. I say we exclude them all together. All words in all languages does not mean all strings of characters in all languages. Renard Migrant (talk) 17:05, 9 April 2015 (UTC)
You're looking at it from a principled point of view. But a dictionary exists for a purpose; our purpose is to help our readers find lexicographical information that they are looking for. People will be looking for typos, so we have to help them find what they need, too. Just saying "sucks for you that the writer of the text you are reading made a mistake, too bad" doesn't get anyone anywhere. —CodeCat 17:09, 9 April 2015 (UTC)
I am adding to entries about the words Sophy, Sofie, Sofee, Soffi, Sofi, Sophi, Sophia, Sophie, etc. After several weeks, I still don't know at what point those words, which regularly described at least two senses, stop being spelling mistakes (or if they were in the first place) in over four centuries of attested usage and become {{alternate spelling of|Sophy}} and {{alternate spelling of|Sufi}}, the two more common contemporary terms.
My point about oredaceous, which is a different type of case, is that a reader could also not know if the search term is a spelling mistake, but the wiktionary search results page provides a link to "try searching the site using Google" which provides alternative wiktionary entries for the reader. Which, in my opinion, is adequate for just typos. Nevertheless, character recognition software also provides erroneous results, for example, oredaceous instead of the correctly written predaceous that the search engine then uses, and the reader still needs to interpret what is being presented. —BoBoMisiu (talk) 18:50, 9 April 2015 (UTC)
I think the way to deal with typos is using the search engine to suggest similar entries based on spelling. Not having entries. That's dubious as you have to start making assumptions about what word is intended. Renard Migrant (talk) 19:01, 9 April 2015 (UTC)
  • Isn't this something that software even better than our search engine should do? I could imagine approximate searching for each language based on whether something was heard (using a descendant of soundex), read from a scanned print document, read from a keyboarded document, read from a manuscript, all with the possibility of specifying date of the document and of any transformations and script and typeface. I really don't see that we are helping much with the paltry few errors we would have entered in a year or two if we were to go down this path. DCDuring TALK
  • Slightly off topic but on topic as per subject title: Now as before, I see zero added value for the user of the dictionary in our excluding attested high-frequency typos. Therefore, I oppose removal of high-frequency typos. Some people seem to distinguish misspellings from typos; for misspellings, we have a voted on policy in WT:CFI#Spellings. If misspelling is understood to include typo as a subclass or more specific case, our present policy excludes rare attested typos but not common, high-frequency attested typos. Let me remind all newbie readers that Wiktionary includes common misspellings as misspellings (declaring them as such) so as to not mislead the reader, e.g. in concieve. --Dan Polansky (talk) 19:13, 9 April 2015 (UTC)
I agree with Dan Polansky that misspelled words should not be excluded, and I agree with Renard Migrant that the search engine is the better way to provide suggestions, and I agree with DCDuring that, especially for terms like oredaceous, other software does a better job. The wiktionary search results page has the suggestion to contribute something that is not found in wiktionary. These are complementary processes. I don't think oredaceous is a misspelled word since I think it, oredaceous, does not involve people. I think it is a recognition software error that was scraped by other sites and will eventually correct itself; it's ephemeral, without usage outside the error and scrapings. —BoBoMisiu (talk) 20:09, 9 April 2015 (UTC)
With the prevalence of bad OCR in search engine databases, there are tons of scannos with "cl"="d", "1"="l"="!","o"="e"="a", "u"="n"="ii", etc., and web pages are stiff with transposed and/or missing letters, doubling of the wrong letters, etc. These are often amply attested, but utterly useless as entries.
I think the best solution would be a list of non-wikilinked examples of common typos and/or scannos for a term on the page for the term, but hidden. That way the page would show up in search results, but there would be little visual clutter or use of resources, and fewer clicks to get to the right pages.
I suppose we could put the list in a collapsible box, or we could wrap it in a template that would make it completely invisible, like the keywords you often find in html header sections. Chuck Entz (talk) 02:11, 10 April 2015 (UTC)
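As a rough illustration of the scanno classes listed above, two strings can be folded to a shared canonical form and then compared. This is a sketch under stated assumptions: the confusion classes are only the ones named in the comment ("cl"="d", "1"="l"="!", "o"="e"="a", "u"="n"="ii"), the names `canonical` and `could_be_scannos` are hypothetical, and the test deliberately over-matches (for example, it treats "cat" and "cot" as equivalent).

```python
# Single-character confusions, each mapped to one representative of its class.
SINGLE_CHAR_CLASSES = str.maketrans({"1": "l", "!": "l", "e": "o", "a": "o", "u": "n"})
# Multi-character confusions, applied after the single-character folding.
MULTI_CHAR_CLASSES = (("cl", "d"), ("ii", "n"))


def canonical(text: str) -> str:
    """Fold every character or sequence in a confusion class to its representative."""
    folded = text.lower().translate(SINGLE_CHAR_CLASSES)
    for sequence, representative in MULTI_CHAR_CLASSES:
        folded = folded.replace(sequence, representative)
    return folded


def could_be_scannos(a: str, b: str) -> bool:
    """True if the two strings differ only by the listed OCR confusions."""
    return canonical(a) == canonical(b)


print(could_be_scannos("clog", "dog"))
print(could_be_scannos("oredaceous", "predaceous"))
```

A canonical form like this could serve as a hidden search key for an entry without listing every individual scanno by hand, though whether that is desirable is the policy question under discussion.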
@Chuck Entz: Can you give an example of an attested, high-frequency scanno? To me, it seems like an unlikely occurrence. As for oredaceous, is it attested? Like, we don't count a scan showing a scanno as an attestation since what a human says about what they see in the scan prevails over what the scanning algorithm sees. As for frequency ratio, "oredaceous" does not show in GNV at all, so is not a common typo/scanno by any stretch, and shall be excluded per WT:CFI#Spellings: oredaceous, predaceous at Google Ngram Viewer. --Dan Polansky (talk) 05:01, 10 April 2015 (UTC)
Sorry- I didn't think that all the way through. No, the durably-archived part doesn't include the OCR, so scannos generally aren't attested per CFI. Still, if someone uses the clipping tool in Google Books, they get the OCR, rather than the visible text, and there are sources such as Project Gutenberg and the Germanic Lexicon Project which have OCRed texts in various stages of proofreading. The lack of CFI-compliant attestation is all the more reason to avoid making entries, but the possibility that people will use them in searches is reason to have them present in some form. Chuck Entz (talk) 19:57, 10 April 2015 (UTC)

Well, I hadn't intended to start so much reaction; my apologies. My own feelings are in tune with those such as CodeCat's, but I am not inclined to agitate for any major change of policy. At Finedictionary I found that they give lists of possible typos (apparently not necessarily known to have occurred) for their entries. Something along those lines had occurred to me, but I had rejected it as perhaps a bit over the top. Also I am not sure how well it would work in practice. It seems likely that such a feature would give nearly every entry such a long tail of "Containing..." entries that it might render the "Containing..." feature almost unusable. OTOH, I certainly would militate against creating headwords for every misspelling or typo. Bottom line... I probably must resign myself to the fact that noise happens and that sometimes one simply must be resigned to sacrifice an hour to guarding against accidents and illiterisms. Thanks everyone for your contributions. JonRichfield (talk) 12:14, 10 April 2015 (UTC)

  • I would oppose having entries for typos that are not actual instances of the writer mistakenly thinking this was the correct spelling of the word. bd2412 T 20:22, 10 April 2015 (UTC)
  • Same. Suggesting correct spellings is something for a search-engine heuristic, not something to be manually entered, IMO. We have enough work to do on improving the actual content. Equinox 21:29, 10 April 2015 (UTC)
    What about typos that are adopted as real spellings, like pwn for own, teh for the or kik for lol? —CodeCat 22:28, 10 April 2015 (UTC)
    They're words pure and simple, what in your opinion needs discussing? Renard Migrant (talk) 22:32, 10 April 2015 (UTC)
    Well, once they're adopted as real spellings, they are no longer typos, and become dictionary material. Equinox 22:40, 10 April 2015 (UTC)
    The difficulty would be in distinguishing these cases. For example, what if there are four attestations of one of these, but we can't tell which ones were intended (and thus meet CFI) and which were accidental? Depending on how we decide, it might meet CFI or it might not. —CodeCat 22:43, 10 April 2015 (UTC)

Stewards confirmation rules

Hello, I made a proposal on Meta to change the rules for the steward confirmations. Currently consensus to remove is required for a steward to lose his status, however I think it's fairer to the community if every steward needed the consensus to keep. As this is an issue that affects all WMF wikis, I'm sending this notification to let people know & be able to participate. Best regards, --MF-W 16:12, 10 April 2015 (UTC)

Excessive Footnoting in Etymologies

I'm posting this as separate from Tropylium's topic, above, because it's a separate issue and it would have overwhelmed the discussion there.

In the etymology for Sophy, User:BoBoMisiu has, by my count, 23 citations to 8 references in 8 places. This is nuts!

The first two references cover everything that would be needed to verify the accuracy of the etymology (maybe a third to verify the statement that it's not related to Sufi), and there's no need to cite the same source multiple times in the same etymology. I really don't care what dictionary was used to supply the Persian spellings used to replace the transliterations in one of the sources, nor do I care where the glosses for those terms came from. The other references might be useful for an encyclopedia article, but in an etymology, they're just clutter.

I don't know whether they want to show off their erudition, or they're afraid they're going to get audited by the dictionary police, but, as far as I'm concerned, this referencing style adds nothing to the dictionary that's worth the massive clutter.

What does everyone else think? Chuck Entz (talk) 22:23, 10 April 2015 (UTC)

The etymology is constructed but I can't read Turkish or Arabic to complete it, or for that matter to decide what could be eliminated in the chain of assertions. Not all sources show the same thing. —BoBoMisiu (talk) 22:26, 10 April 2015 (UTC)
The word that comes to mind is dim. The same citations are repeated within words of each other. Just stick 'em at the bottom like everyone else does. You can read the whole entry without scrolling anyway. And not that many anyway, five links saying the same thing aren't better than one, unless the one is unreliable. How many of your links do you consider to be unreliable? Renard Migrant (talk) 22:31, 10 April 2015 (UTC)
So your suggestion is to leave out the Turkish? Or not show that the Turkish is found in one English language source? I consider all the sources reliable, yet I don't know how the Turkish fits in; I did the work and a future editor, who has insight into the Turkish and Arabic, will not have to repeat the research for each step in the chain of assertions. Both of your suggestions are to limit the content – my suggestion is to give the reader more control. Like I wrote above (diff), seeing reference numbers should be a user preference for those who prefer not to see those brutish superscript numbers; it is a simple style sheet solution. The references section, in my opinion, should be collapsed by default, as should the etymology section and pronunciation section. —BoBoMisiu (talk) 23:34, 10 April 2015 (UTC)
  • I generally agree with the above by Chuck Entz: clutter with little added value. To my dismay, I have now realized that I am to blame in part since it was me who started using the ref technique in Sophy in diff; I even said in the edit summary that "it would be better to have a reference per individual etyological claim". As far as I am concerned, the ref technique can be abandoned in Sophy, and the number of references to be placed at the end of the entry can be reduced to the minimum needed to cover the information presented. I am still of the opinion that providing references in edit summaries is a useful practice that provides tracing to them without cluttering the entry. I admit that the detailed style of referencing currently used in Sophy would be useful to source claims that really matter; I don't think etymology matters so much to be worth the clutter. --Dan Polansky (talk) 06:28, 11 April 2015 (UTC)
Why aren't any of you addressing the differences between the three things that are commonly seen as different:
  1. the structure of an entry
  2. the content of an entry
  3. the presentation of an entry
The concept is "separating presentation from content and structure" (search on DuckDuckGo). You are all discussing how to solve an aesthetic preference, which is a matter of presentation, by altering the content instead of the presentation. "Clutter" is an aspect of presentation and not an aspect of content or structure. Removing content is not a solution for a presentation preference; adding a user preference toggle (set by style="visibility:hidden" for the tag) to control the visibility of the superscript reference numbers is a solution for a presentation preference. I believe none of you have any information about the readers' preferences about this other than your own opinions. The comment that "I don't think etymology matters so much to be worth the clutter" shows the misunderstanding caused by conflating issues of content, presentation, and structure.
As far as etymology goes, it is not a definition that a contributor crafts. It is usually based on expert, or at least informed, opinion which is someone's work and should be fully cited. —BoBoMisiu (talk) 16:30, 11 April 2015 (UTC)
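To illustrate the presentation-only toggle described above, here is a minimal sketch in Python. It assumes that footnote markers are rendered as <sup class="reference"> elements (typical of MediaWiki output, but treat that as an assumption), and the function name hide_reference_markers is hypothetical. It only adds the inline style to the markers; the references themselves stay in the page.

```python
import re

# Assumption: footnote markers are rendered as <sup ... class="reference" ...>.
_REF_SUP = re.compile(r'<sup\b(?=[^>]*\bclass="reference")([^>]*)>')


def hide_reference_markers(html):
    """Add style="visibility:hidden" to reference superscripts only."""

    def add_style(match):
        attrs = match.group(1)
        if "style=" in attrs:
            # Leave tags that already carry an inline style untouched.
            return match.group(0)
        return f'<sup{attrs} style="visibility:hidden">'

    return _REF_SUP.sub(add_style, html)
```

Note that visibility:hidden keeps the space the marker occupied; display:none would collapse it instead. In practice a per-user style sheet rule would probably be a simpler way to get the same effect than rewriting the HTML.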
References in this context aren't content themselves, they're there to support the actual content. Most of your references are useless in this context, so the discussion about clutter is to show that the unnecessary references aren't harmless filler that can be ignored, but actually harmful to the entry. Yes, you could probably find ways to make it less conspicuous, but you haven't really justified their presence in the first place. As for your last point: an etymology actually is something the editor crafts. Yes, it's based on expert opinions, but you don't need to reference every part of crafting the definition. Like I said, the first two sources cover the task of verifying the etymology just fine. Details like providing glosses and converting transliterations to the actual terms in the original scripts aren't the kind of thing that you need to reference- editors who know the languages in question often do that without consulting any sources at all. Yes, it's a good idea to be sure you know what you're talking about, and that might require consulting various sources- but that's for your benefit, not for everyone who reads or edits the entry in the future. Also, an etymology isn't like an article, where a point that needs referencing is separated by paragraphs of text from other places where the same reference is used. If a source supports multiple parts of the etymology, you only need to have one reference to the source, which can go at the end of the etymology. It's only when a reference narrowly deals with only one part of the etymology that you would put the reference inline in the middle of the etymology. Chuck Entz (talk) 17:53, 11 April 2015 (UTC)
@Chuck Entz: References are content. They are part of providing the curated selection of details about the subject of the entry, as are the selections of quotes used to attest usage. References are a selection of further reading directly connected to other parts of content. Selected external links are content for the same reason. All these things provide a user with content – that contributors compile and curate – that is insightful. I use a giant late 1960s dictionary as my go-to reference because I believe it provides me with reliable information about things – not just words – that I want to know about. It is reliable because it was fact checked. I use OED online for the same reason. Wiktionary is different because the content contributors add into etymologies or usage notes without high quality sources does not reach the same standard. I have seen many poor etymologies that lack credibility, and – of course – lack references. While Chuck Entz believes references are "not for everyone who reads or edits the entry in the future", I believe that anyone repurposing a fully referenced assertion does benefit by using reliable content and not spreading unvetted worthless bullshit like in the Chinese whispers game. I am not one of the "editors who know the languages in question" who "often do that without consulting any sources at all". I can reasonably assume that few contributors read Farsi, Turkish, or Arabic; providing a reference about something that uses a different script, that most readers cannot comprehend, demonstrates the reliability of the assertion. Including the original language term allows a reader to cut that term and paste it into a search engine. The reliability of each assertion (just a few words) in a chain of assertions demonstrates the reliability of the chain.
To say that I "haven't really justified their presence in the first place" is just assuming that a reader who visits Wiktionary is the kind of reader who doesn't want to know more about the minutiae of an etymology, and the kind of reader who doesn't want to know what is factually accurate through reliable references. I don't assume that. Please explain to me how those references are "actually harmful to the entry", other than as a matter of personal aesthetics. —BoBoMisiu (talk) 13:13, 14 April 2015 (UTC)
I would shorten the entry at the following points:
  1. No reference needs to be provided for the Italian and Spanish words; references for their origin should be located on their entries (although since they do not exist yet, I can understand for the time being leaving the referencing on the English entry). It's not obvious whether they need to be mentioned at all.
  2. It's unclear to me why صفوی (ṣafawī) is mentioned; its existence does not seem to be relevant to the etymology of the English term, which appears to come straight from ṣafī. (If there is some relevance, e.g. if Persian ṣafī is actually a back-formation from ṣafawī, this should be mentioned explicitly.)
  3. General sources, here the Century Dictionary, the OED, and perhaps Skeat's etymological dictionary, seem to provide nothing that the more specialized sources do not, and they could probably be safely cut away from the entry. References exist for the purposes of showing where a claim originates, not for showing who else has read the same source material.
--Tropylium (talk) 23:02, 21 April 2015 (UTC)

Trimming CFI slightly

I wanted to suggest a minor trim to WT:Criteria for Inclusion#Constructed languages, where it says:

That's true, but as it says above, any constructed language without consensus to be included is by default not included in the main namespace. There's no reason to list these in particular, and in fact the list is very arbitrary and the choice of mentioning these specific constructed languages makes it appear that we are giving them special weight or something. So it seems reasonable to delete that bullet point and thus make the section slightly simpler. Thoughts? —Μετάknowledgediscuss/deeds 01:05, 13 April 2015 (UTC)

I support that, but I would actually support a broader rule that disallows terms in languages that have not been passed between generations at least x times. That would exclude "new" conlangs. It's kind of like the "spanning at least a year" rule but for whole languages. —CodeCat 01:21, 13 April 2015 (UTC)
Maybe you misunderstood; I'm not proposing any change to policy, just a removal of redundancy in the CFI. —Μετάknowledgediscuss/deeds 02:01, 13 April 2015 (UTC)
@Metaknowledge: For what it's worth, listing the most common conlangs which are likely to be readded (such as Toki Pona) is probably helpful as it will provide documentation here saying that it's not accepted. Plenty of editors will see documentation here before checking an external list. —Justin (koavf)TCM 03:21, 13 April 2015 (UTC)
It would also exclude pidgins and creoles though. -- Liliana 23:21, 13 April 2015 (UTC)

Anybody should be allowed to nominate people for whitelist

At present, nominations for the whitelist are supposed to be only done by administrators. That seems:

  1. excessive (why should we need so many hoops to jump through with whitelist? It's harder to be whitelisted here than many other projects; to say nothing of the fact that many projects don't even bother with whitelisting)
  2. unfair (it grants too much power to sysops and not enough power to Joe users), and
  3. time-consuming (it'd be so much easier for people to self-nom).

I am seriously considering starting a vote about this matter, but I thought I'd discuss it here first. Purplebackpack89 17:43, 13 April 2015 (UTC)

Support, but with one small correction: not anybody, only autopatrolled users. Dixtosa (talk) 17:51, 13 April 2015 (UTC)
I think the logic behind having only admins nominate people is that admins are the only(?*) people affected by the whitelisting vs non-whitelisting of a user, since admins are the ones who have access to the "patrol" feature. (*Are there any edits Joe User could make if he were whitelisted that he couldn't make if he weren't whitelisted?) If we do expand the nomination franchise, I support the restriction that Dixtosa seems to be proposing, that nominations have to be made by a user who is already autopatrolled (this also prevents self-noms). - -sche (discuss) 18:16, 13 April 2015 (UTC)
@-sche: What's so inherently bad about self-noms, though? I have major problems with restricting nominating people to the whitelist to people who are already on the whitelist. This is an open community and groups should be open to everybody, not selected only by people who are already in those groups. Plus, there's the issue that sysops or whitelisters have to actively be looking around for people to whitelist, rather than potential whitelist candidates coming to us. FWIW, I also think the claim that admins are the only people affected is a stretch, as it defines "affected" pretty narrowly. At the very least, "affected" should include non-admins who are on the whitelist. Purplebackpack89 18:35, 13 April 2015 (UTC)
I think you're under some misapprehension that whitelisting is a special right or privilege. It isn't. It's just a way we patrollers (who are almost exclusively admins) ease the load of checking every edit, and it really doesn't affect the editor in question. I suspect that this is just related to your issues with the establishment and your own personal lack of power against people you annoy. —Μετάknowledgediscuss/deeds 19:54, 13 April 2015 (UTC)
If whitelisting isn't a special right or privilege, why did another editor fight to try and take it away from me? And why are you assuming that I'm doing this solely because I don't like getting kicked around by bullying admins (like the guy who tried in vain to take away my whitelist rights)? And why hasn't anybody given a good reason on why self-nom should continue to be forbidden? Self-nom doesn't mean you automatically get it. Purplebackpack89 20:18, 13 April 2015 (UTC)
"If whitelisting isn't a special right or privilege, why did another editor fight to try and take it away from me?" One could equally ask, if whitelisting isn't a special right or privilege, why are you fighting over its rules? Equinox 22:25, 13 April 2015 (UTC)
I think you're confused, @Equinox:. I'm the one suggesting it is a right or privilege. Metaknowledge is the one arguing that's a near-meaningless distinction. Purplebackpack89 22:37, 13 April 2015 (UTC)
I don't see a problem with anyone nominating editors for whitelisting, but approval for whitelisting has to be left to admins. In my experience, the people who need patrolling the most are the people who think their edits are perfect, in spite of massive evidence to the contrary. Usually such people are easily able to find others of similar temperament who are only too happy to support them. Please note that I'm not talking about you, but there are people who pop up from time to time, such as Gtroy/Luciferwildcat, Drago, and many, many others who would all have liked to have no one checking their edits so they wouldn't have people pointing out what they were doing wrong. Chuck Entz (talk) 01:26, 14 April 2015 (UTC)
Likewise, I don't have any problem with admins approving people for the whitelist, so long as anybody is allowed to nominate for it. Purplebackpack89 02:06, 14 April 2015 (UTC)
I have no objection to anyone nominating a user to be whitelisted but, really, why would they bother? It only affects the small subset of sysops who patrol Recent Changes. SemperBlotto (talk) 13:00, 14 April 2015 (UTC)
We don't really have any patrolers who aren't admins here. That's why it's only admins who can nominate and approve, as it only affects admins (since there are no other patrolers). I have no objection to changing that. I self-nominated on the Beer Parlour as a roll-backer and patroler, but nobody replied whatsoever. Renard Migrant (talk) 18:28, 14 April 2015 (UTC)
Hardly surprising. patroler / patroller?

Detailed linguistic maps

If they're of interest to anyone, I just stumbled onto this set of very detailed linguistic maps. - -sche (discuss) 21:28, 17 April 2015 (UTC)

A new language code needed

for Proto-Georgian-Zan. @Simboyd said the reason on my talk page. --Dixtosa (talk) 15:55, 18 April 2015 (UTC)

Ruakh requesting de-bureaucrating.

Given my activity level here, I don't think it makes sense for me to still be a bureaucrat — I'm never the first bureaucrat to respond to something, and nothing ever needs multiple bureaucrats — but I figured I should check in with the community before actually posting at [[m:Steward requests/Permissions]] to request that my access be removed, just so y'all don't feel blindsided or anything. (Note: SemperBlotto, Stephen G. Brown, and Hippietrail are all quite active at the moment, so I don't think I need to wait for a replacement to be appointed, or anything like that.) —RuakhTALK 03:29, 19 April 2015 (UTC)

@Ruakh: Trust once granted does not disappear with inactivity, I think. On the contrary, it is a certain type of activity that leads to trust being diminished or disappearing: activity that would show the trust was misplaced. It is only activity that can produce refuting instances against trust, not inactivity. I do not oppose or put obstacles in the way of your self-nomination for de-bureaucrating; I merely present an alternative view that may lead you to reconsider. --Dan Polansky (talk) 07:11, 19 April 2015 (UTC)
You still seem fairly active; doesn't look like a security risk; you might as well keep it. Equinox 04:39, 20 April 2015 (UTC)
Hmm. Well, OK. I guess it doesn't really matter one way or the other. —RuakhTALK 06:31, 21 April 2015 (UTC)
  • Weak oppose If we take away your tools, there's a pretty long list of other people who should lose tools, and not just for inactivity. Purplebackpack89 17:09, 20 April 2015 (UTC)

"Alt form of" caps

Did I dream it, or did "alternative form of..." use to begin with a capital A? Why was this changed? Equinox 04:39, 20 April 2015 (UTC)

It's a result of diff. You're not the first person to wonder why it was changed. I've undone the change. - -sche (discuss) 17:44, 20 April 2015 (UTC)
Thanks Equinox for this thread; thanks -sche for restoring the capital A; please keep it restored. --Dan Polansky (talk) 19:20, 20 April 2015 (UTC)
I actually preferred it with a small "a". You're never going to please everybody, no matter what you do. Donnanz (talk) 08:28, 21 April 2015 (UTC)
True enough. I don't mind either way, but I don't like arbitrary changes: they've ruined enough of my favourite Web sites already, usually because of idiot marketing departments, or a misguided idea that mobile phones are the primary target and computers are secondary. Get off my lawn. Equinox 02:09, 22 April 2015 (UTC)
There's been a lot of this going on. It's why I use nodot=1 and nocap=1 even in form of templates that don't have automatic dots or capital letters; because they're constantly being changed and the chance of such templates having an automatic dot, cap or both in the future is very, very high. Renard Migrant (talk) 08:42, 22 April 2015 (UTC)
nocap=1 is a useful tip. I have started using it. Donnanz (talk) 12:43, 26 April 2015 (UTC)
That came about after I requested it some time ago. It was because our form-of templates are horribly inconsistent. Should we have a vote to decide, once and for all, whether to start them with a capital or lowercase letter across the board? This, that and the other (talk) 08:46, 23 April 2015 (UTC)
I would support that, as it would be a non-foolish consistency. I would also support capitalizing the first word of definitions consistently throughout the project. bd2412 T 12:45, 23 April 2015 (UTC)
Me too.
@BD2412: Do you include definitions only in English, in Translingual, in all languages in your initial capitalization support? I do. DCDuring TALK 13:15, 23 April 2015 (UTC)
I would apply this only to (multi-word) English definitions. I view such a definition as a sentence answering the implied question, "what does foo mean". Translations are a bit different, since they ideally require only the single English word that corresponds to the foreign word, and we should avoid giving the impression that the word must be capitalized when written in English (unless, of course, it is a word that should be capitalized in English, like English or January). bd2412 T 13:21, 23 April 2015 (UTC)
What should be done for English terms that are just synonyms of another term? And what about non-English terms that require full definitions because there is no simple English equivalent? And what if a single entry contains a mix of different types, do we capitalise some definitions but not others? —CodeCat 13:29, 23 April 2015 (UTC)
If we decide to (continue to) capitalize English and non-English definitions and translations differently, we have the option of making the form-of templates use a capital+dot or not based on what lang= is set to (or we could make their display be language-independent). If we decide the templates should end with a dot, they should allow nodot=1 to suppress that dot in case additional information needs to be added manually after the template (see e.g. Karman street).
I capitalize English definitions, and don't capitalize non-English translations.
We could set up a vote with three sections, (1) English, (2) Translingual and (3) other languages, and have options under each like (a) always begin with an uppercase letter and end with a dot,* (b) never begin with an uppercase letter or end with a dot,* (c) begin with an uppercase letter and end with a dot if ___, otherwise do not. We could even have an option (d) make no formal rule. (*With exceptions for, pardon the tautology, exceptional circumstances, like if we for some unforeseen reason must begin a definition with "iPhone" or "isiZulu", or with "January" or "Chile", respectively.) - -sche (discuss) 21:27, 23 April 2015 (UTC)
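The language-dependent option mentioned above (capital+dot depending on what lang= is set to, with nocap=1 and nodot=1 as overrides) could look roughly like the following Python sketch. This is not how the actual templates are implemented; the function name format_form_of and the cap_dot_langs parameter are hypothetical, and which languages get cap-dot by default is exactly what a vote would have to decide.

```python
def format_form_of(text, lang="en", nocap=False, nodot=False,
                   cap_dot_langs=("en",)):
    """Render a form-of definition line.

    Assumption: languages listed in cap_dot_langs get an initial capital
    and a final dot by default; nocap/nodot suppress each one separately.
    """
    use_cap_dot = lang in cap_dot_langs
    if use_cap_dot and not nocap:
        # Capitalize only the first character; leave the rest as written.
        text = text[:1].upper() + text[1:]
    if use_cap_dot and not nodot and not text.endswith("."):
        text += "."
    return text
```

With these assumed defaults, an English entry gets "Alternative form of foo." while nodot=True leaves room for text added manually after the template.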
I'd like everything in cap-dot format (as I call it). I have no strong feelings though. Renard Migrant (talk) 21:34, 23 April 2015 (UTC)
  • Alt form and any other high-profile template should be supported in both capital and lowercase forms, for ease of editing. Purplebackpack89 20:55, 24 April 2015 (UTC)
Yes, totally agree, though I think the question here is the flip-flopping on the default settings. These seem to change every few months. Perhaps even every few weeks. Renard Migrant (talk) 15:34, 26 April 2015 (UTC)

Cross-referencing etymological root words to their descendants

Has there been any past discussion or efforts to make root words consistently link to words which mention them in their etymology? —This unsigned comment was added by Technical-tiresias (talk • contribs).

Sounds like you're looking either for "What links here" or for the descendant/derivative lists. The latter are obviously works in progress, and I've no idea how well they're up to date.
I'd certainly endorse adding a clause in our etymology guidelines that when adding an etymology for a word, one should also check for a backlink at the parent word.
(Also, I'd love having a software framework that did this automatically, but that is probably beyond what can be reasonably done on Wiktionary.)
--Tropylium (talk) 23:15, 21 April 2015 (UTC)
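As a rough idea of what an automatic backlink check might involve, here is a small Python sketch. It assumes the etymology and descendant/derivative data have already been extracted into plain dictionaries (that extraction is the hard part and is not shown); the function name missing_backlinks and the sample data are hypothetical.

```python
def missing_backlinks(etymology_parents, listed_descendants):
    """Return (parent, child) pairs where the child's etymology mentions
    the parent but the parent's entry does not list the child."""
    missing = []
    for child, parents in sorted(etymology_parents.items()):
        for parent in parents:
            if child not in listed_descendants.get(parent, set()):
                missing.append((parent, child))
    return missing


# Hypothetical toy data: "beer parlour" cites two parents in its etymology,
# but only "beer" links back to it.
etymologies = {"beer parlour": ["beer", "parlour"]}
descendants = {"beer": {"beer parlour"}}
gaps = missing_backlinks(etymologies, descendants)
```

Here gaps would contain the single pair ("parlour", "beer parlour"), i.e. the entry where a backlink is still to be added.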

Nominations are being accepted for 2015 Wikimedia Foundation elections

This is a message from the 2015 Wikimedia Foundation Elections Committee. Translations are available.

I am pleased to announce that nominations are now being accepted for the 2015 Wikimedia Foundation Elections. This year the Board and the FDC Staff are looking for a diverse set of candidates from regions and projects that are traditionally under-represented on the board and in the movement as well as candidates with experience in technology, product or finance. To this end they have published letters describing what they think is needed and, recognizing that those who know the community the best are the community themselves, the election committee is accepting nominations for community members you think should run and will reach out to those nominated to provide them with information about the job and the election process.

This year, elections are being held for the following roles:

Board of Trustees
The Board of Trustees is the decision-making body that is ultimately responsible for the long term sustainability of the Foundation, so we value wide input into its selection. There are three positions being filled. More information about this role can be found at the board elections page.

Funds Dissemination Committee (FDC)
The Funds Dissemination Committee (FDC) makes recommendations about how to allocate Wikimedia movement funds to eligible entities. There are five positions being filled. More information about this role can be found at the FDC elections page.

Funds Dissemination Committee (FDC) Ombud
The FDC Ombud receives complaints and feedback about the FDC process, investigates complaints at the request of the Board of Trustees, and summarizes the investigations and feedback for the Board of Trustees on an annual basis. One position is being filled. More information about this role can be found at the FDC Ombudsperson elections page.

The candidacy submission phase lasts from 00:00 UTC April 20 to 23:59 UTC May 5 for the Board and from 00:00 UTC April 20 to 23:59 UTC April 30 for the FDC and FDC Ombudsperson. This year, we are accepting both self-nominations and nominations of others. More information on this election and the nomination process can be found on the 2015 Wikimedia elections page on Meta-Wiki.

Please feel free to post a note about the election on your project's village pump. Any questions related to the election can be posted on the talk page on Meta, or sent to the election committee's mailing list, board-elections -at- wikimedia.org

On behalf of the Elections Committee,
-Gregory Varnum (User:Varnent)
Coordinator, 2015 Wikimedia Foundation Elections Committee

Posted by the MediaWiki message delivery on behalf of the 2015 Wikimedia Foundation Elections Committee, 05:03, 21 April 2015 (UTC)

"Adverbial forms" and cases of adverbs in Finnish

Including these in declension tables seems like a bad idea. The entire point of declension tables is to provide those wordforms related to the lemma that are generally predictable; not to list non-productive derivational items.

Some more detailed examples of problems:

  • ulkona is in no way an essive. It shares the ending -na, yes, but in its older locative sense (same as kotona or siinä).
  • ulko- is a prefix, not a nominative, despite both consisting of the unmarked word root.
  • yhä is not productively linked to yksi at all, and would be best treated as a derivative.

I get the impression that many of these categories ("situative", "oppositive") are original research, sort of. Is anyone else particularly attached to them, or shall I add them to my mental checklist of items to clean up at some point? --Tropylium (talk) 23:29, 21 April 2015 (UTC)

While I agree that these are not inflections, there is value in showing certain sets of derivations in a schematic way. So maybe a separate table would be preferable. —CodeCat 23:36, 21 April 2015 (UTC)
A tabular approach is informative, sure enough, but aside from relocation to derivatives it would need consideration of what to include. By comparison, for base verbs we could consider listing some of the more common categories, such as, taking hypätä (to jump) as the base, inchoative (hypähtää), frequentative (hypellä), habitual (hyppiä), and reflexive (none for this, but e.g. kaataa → kaatua), as well as the name-of-action noun (which is somewhat non-predictable: hypätä → hyppy, hypähtää → hypähdys, hypellä → hyppely, kaataa → kaato). The stacking of these gets complicated fast though (kaatua → habitual/frequentative kaatuilla → causative kaatuiluttaa → frequentative kaatuilutella → name-of-action kaatuiluttelu → …), so here we can easily see that trying to shoehorn every derivative into a single table would not be feasible. --Tropylium (talk) 00:58, 22 April 2015 (UTC)
We wouldn't have to show derivatives of derivatives, only immediate ones. —CodeCat 01:37, 22 April 2015 (UTC)