Wiktionary talk:Votes/pl-2010-05/Placenames with linguistic information 2


Purpose of the vote

This vote was created to address the problems discussed in the failed vote Wiktionary:Votes/pl-2010-03/Placenames with linguistic information are accepted. The wording is tentative and can be totally changed. Here are some items for discussion:

  1. Removing the attributive use CFI for placenames. This section becomes a stub, but adding new material or examples to it would be a subject of another vote.
  2. The 5 requirements remain the same here, since some voters found them too liberal, and others too strict.
  3. Use of "should" or "must" for the 5 requirements.
  4. Status of existing entries that don't meet the new CFI. Notice that many of them didn't meet the attributive use CFI either.
  5. Multiple word placenames - notice that according to the 5 requirements, they should also have non-identical translations.
  6. General definition of placenames, described as following the current practice.--Makaokalani 14:32, 14 May 2010 (UTC)

removing attributive-meaning criterion

See Dan's comment in the BP. Summarizing, he's concerned that this vote further establishes an attributive-use criterion for personal names, and suggests instead wording it as "There is no agreement on specific rules for the inclusion of names of specific entities, but there is a specific regulation for names of geographic entities, specified in the section 'Names of geographic entities'.". This sounds like a valid concern to me: that the current criterion on attributive use for personal names will become further entrenched: there are a number of people who don't like it. However, I don't see removing it as an option that will pass in this vote, either. So I suggest leaving it alone (not including it in the wording of this vote), and instead voting on this:

  1. Removing "For example: New York is included because “New York” is used attributively in phrases like “New York delicatessen”, to describe a particular sort of delicatessen. A person or place name that is not used attributively (and that is not a word that otherwise should be included) should not be included. Lower Hampton, Sears Tower, and George Walker Bush thus should not be included. Similarly, whilst Jefferson (an attested family name word with an etymology that Wiktionary can discuss) and Jeffersonian (an adjective) should be included, Thomas Jefferson (which isn’t used attributively) should not." and
  2. adding a placenames section worded as currently proposed.

Thoughts?​—msh210 17:08, 14 May 2010 (UTC)

So only the first sentence of the "Names of specific entities" would remain? Fine with me. I don't want to add new material there. It's difficult to have two votes on the same subject at the same time. Or can two votes run simultaneously, if the timeline is just a little different? If this vote should pass, I don't see how it could stop Dan Polansky from changing the wording later. --Makaokalani 09:06, 15 May 2010 (UTC)

Other kinds of specific entities?

This vote seems to presuppose that there are two kinds of specific entities, and it proceeds to change the handling of one (places) while leaving the other (persons) alone. But if that presupposition is wrong — if there are any other kinds of specific entities — then the proposed change also changes the handling of all those other kinds, by no longer addressing them. I imagine this would affect terms as diverse as the 1870s, John L. Sullivan (a famous elephant, according to Wikipedia), and Harry Potter and the Sorcerer's Stone. —RuakhTALK 01:37, 15 May 2010 (UTC)

Agreed. Some name-bearing specific entities other than persons and geographic entities:

  • name
    • mythological being name
      • god name – "Zeus", "Odin", "Ganesha" (see also Wikisaurus:god)
      • other name - "Calliope" (muse), "Cronus" (titan), "Medusa" (gorgon)
    • celestial object name
      • the Earth, the Moon, the Sun
      • Mercury, Venus, Mars, Jupiter, Saturn, Uranus, Neptune
      • the Milky Way
      • star name - Aldebaran, Arcturus, Betelgeuse, Proxima Centauri, Sirius
      • moon name - Atlas, Callisto, Calypso, Europa, Mimas, Triton, Umbriel
      • Halley's Comet
      • ...
    • name of a work of art
      • name of a play – "Lysistrata", "Much Ado About Nothing"
      • name of a novel – "The Old Man and the Sea", "The Lord of the Rings"
      • name of a short story – "Boule de Suif"
      • name of a statue – "Hermes and the Infant Dionysus", "The Statue of Liberty"
      • name of a painting – "The Adoration of the Magi"
      • name of a movie – "20,000 Leagues Under the Sea"; see also W:List of Academy Award-winning films
      • name of a computer game – "Manic Miner", "World of Warcraft"
    • various name
      • name of a human organization - "United Nations", "General Electric", "Microsoft", "Google", "Bell Telephone Company"
      • name of a theorem - "Pythagorean theorem"
      • name of a battle - "Ragnarok", "The Battle of Waterloo"
      • name of a ship - "Beagle"*.
      • name of a conference - "Wikimania"
      • ... probably a hard-to-overview set of names for varied types of individual things

--Dan Polansky 06:48, 15 May 2010 (UTC)

I have updated the list above with a dedicated sublist for works of art. --Dan Polansky 16:59, 31 May 2010 (UTC)

placename

Could we change that to place name or place-name? Michael Z. 2010-05-15 06:25 z

This seems not so critical and can be fixed later, yet: "name of geographic entity" seems better than "placename" or "place name", for the latter can be understood to include only settlements including cities, towns, and villages. This I judge from a cursory search in Google books. Of course, most Wiktionary editors will understand that "placename" refers also to continents, lakes, rivers, seas, etc. --Dan Polansky 06:43, 15 May 2010 (UTC)

A fairly common term seems to be "geographic name" instead of the longer "name of geographic entity". Again, this is not a reason for opposing this proposal, merely a point where it can be improved. --Dan Polansky 06:49, 15 May 2010 (UTC)

That's right, "geographic name" is better, because it's certainly not the intention to apply these rules to star names, or to names of mythical places (Valhalla, Narnia).--Makaokalani 09:06, 15 May 2010 (UTC)

Err... shouldn't the "Placename" header say "Names of specific places"? The words "hill", "church", and "school" could be termed placenames. That applies to the suggested "geographic name" as well. --EncycloPetey 01:04, 17 May 2010 (UTC)

If you read "name" as "proper name", which is what Wiktionary does, then neither "placename" nor "geographic name" applies to "hill". The defaulting of "name" to "proper name" seems to occur also outside of Wiktionary. The term "geographic name" looks flawless to me. It is also used in titles of some dictionaries: google books:"geographic names". --Dan Polansky 12:44, 17 May 2010 (UTC)
I have sprinkled some "geographic names" in the text. Can be fixed later. Place names are normally defined as specific places ("A river in Ireland"), because translations would not make sense otherwise. But is there a need to prohibit the definition "A place name"? Darlington and Hastings, given as examples in WT:CFI#Wiktionary is not an encyclopedia are defined as "A place name". They'd be easy to fix, though. --Makaokalani 16:23, 17 May 2010 (UTC)

First sentence

The first sentence to be placed to CFI reads:

  • "Placenames are subject to the same criteria for inclusion as other words and terms, extended with the following additional requirements."

The first sentence to be placed to CFI, of the previous vote, reads:

  • "Placenames are subject to the criteria for inclusion specified in the section "General rule", extended with the following additional requirements."

That was for a reason. It is not true that geographic names are subject to the same criteria for inclusion as other words and terms, merely extended. For instance, there is specific regulation for inclusion of brand names, to which geographic names are not subject. Hence the explicit reference to the section "General rule". --Dan Polansky 06:56, 15 May 2010 (UTC)

I took it from the "Given names and surnames" section, but I cannot see any harm in your wording either, so I'll change it (after the weekend, when I have more time).--Makaokalani 09:06, 15 May 2010 (UTC)

Requiring idiomaticity

So, if I understand the proposal correctly, we would have entries for Saint Louis, New York, and Little Rock, since these two word names cannot be figured out from their parts. However, Vatican City, Czech Republic, Bosnia and Herzegovina, and North America would be forbidden as entries since they're not idiomatic and can be figured out from their parts? --EncycloPetey 06:23, 17 May 2010 (UTC)

It's a bit ambiguous. Vatican City does not say it's a country, Bosnia and Herzegovina could be understood as Herzegovina and Bosnia, North America does not explain where the southern limit is. Czech Republic is easier to guess, except it's a new country. How would you restrict multiple-word names then? The five requirements actually say that (English) multiple-word place names must have a non-identical translation. Is that rule enough?
The real purpose of the multiple-word paragraph is to count out street names and the like, whether idiomatic or not. (Actually Msh210 added that they are not idiomatic.) Would it be enough just to say: "Multiple-word names of streets and other minor landmarks needed to create addresses, such as Madison Street and Elm Avenue, are not included if they can be recognized from their wording as street names and the like." (Never mind whether they're idiomatic or not, they just aren't accepted.)--Makaokalani 16:35, 17 May 2010 (UTC)
Arguably — and similar arguments have carried the day at wt:RFD in the past — Vatican City, Bosnia and Herzegovina, North America are idiomatic, as there is no way to determine from the parts that these places are, respectively, not a city, a single entity, and an entity (as opposed to component). I think you're right that Czech Republic would fail under the idiomaticity requirement. How would you word it in a way that allows Czech Republic but disallows St. Louis, Missouri and Elm Street? Or perhaps we don't actually want Czech Republic?​—msh210 16:55, 17 May 2010 (UTC)
You could argue that Czech Republic has no linguistic interest, it can be translated as sum of parts, so we could do without it. I don't want to go back to defining how important a place must be; those arguments never seem to lead anywhere. Can somebody think of a really horrific example, an important multiple-word place name that cannot be called idiomatic in any way?--Makaokalani 13:33, 18 May 2010 (UTC)
How about Northern Ireland, Mexico City, European Union, Roman Empire, and the Great Wall of China? I know some people don't like the idea of using "importance", but in all the years of discussion, that's one of the only two sensible approaches I've ever seen mentioned. The other is to make a list of what sorts of places are and are not allowed. I also disagree about Czech Republic, since (1) it's a synonym of Czechia, which would be allowed, and (2) it's the standard name used in the US for that country, with most Americans never having heard that latter term. --EncycloPetey 14:40, 18 May 2010 (UTC)
Not too horrific. European Union isn't even a place name. Great Wall of China is an exact description of what it is, but suppose you have no prior knowledge of its existence, couldn't you mistake it for something abstract, like a missile prevention system, or a psychological inability of westerners to understand China? How much prior knowledge do you expect from the reader? I have added the sentence "Common-sense exception can be made for well-known places that are rarely referred to without qualifiers". Defining well-known might be a problem, though, and the text of the vote is getting full of exceptions of exceptions of exceptions. Also, Elm Avenue can be recognized as a street name, but it says nothing about where it is, or even if there are elms, so calling it non-idiomatic is a bit shaky. Another way is not to mention idiomaticity at all, since it is already included in the "General rule", and turn the text into this:

Place names

Wiktionary includes place names, that is, names of geographic entities such as Asia, France, New York, Darlington, Pacific Ocean, Nile and Andes. Geographic names are subject to the criteria for inclusion specified in the section "General rule", extended with the following additional requirements. A place name entry should initially include at least two of the following:

  1. An etymology. This is insufficient as one of the two necessary items for a multiple-word place name, such as South Carolina.
  2. A pronunciation.
  3. Information about grammar, such as the gender and an inflection table.
  4. A translation that is not spelled identically with the English form. A place name that is in itself such a translation, like the French entry for Londres, also meets this requirement.
  5. An additional definition in the same language as something else besides a place name, for example as a surname.

Multiple-word names of streets and other minor landmarks needed to create addresses, such as Madison Street or Elm Avenue, are not included if they can be recognized as street names from their wording. Special rules may be given on language-consideration pages for languages where street names or the like are spelled as one word. This does not prevent including other idiomatic definitions that a street name has acquired, such as Harley Street.

Only minimal information about the place in question should be given. If the name is shared by several places, there should not be a separate definition line for every place with the name. For example, although London is the name of cities around the world, a single definition line as "Any of a number of cities in Anglophone countries" suffices. If there is a particular London which is interesting linguistically, for example as the etymological source of an additional meaning of the word London, as the city after which most other Londons are named, or for its name's pronunciation's or translations' being different from all other Londons', then it can also have a definition line. Separate entries such as London, Ontario should not be made.

Any thoughts which wording is better?--Makaokalani 12:46, 21 May 2010 (UTC)

Section "Names of specific entities"

Now:

  • Heading: "Names of specific entities"
  • Text: "A name should be included if it is used attributively, with a widely understood meaning."

I propose:

  • Heading: "Names of specific entities"
  • Text: "Unless it is a geographic name, a name of a specific entity should be included if it is used attributively, with a widely understood meaning. Geographic names are regulated in a dedicated section 'Geographic names'".

And then I would replace the section heading "Place names" with "Geographic names".

I am still going to oppose because of "A name should be included if it is used attributively, with a widely understood meaning." Nevertheless, this proposal should substantially improve the vote, it seems to me.

If you wait one more month or so, the vote that I have recently created and that starts tomorrow should be over. Then it will be perfectly secure to start another vote on geographic names. --Dan Polansky 16:35, 17 May 2010 (UTC)

The two votes cannot run at the same time, and the place name vote obviously needs fixing. Go ahead and start your vote if you think it's not too much of a risk. I hope it doesn't fail, then we have wasted a month:-) --Makaokalani 16:43, 17 May 2010 (UTC)
It is this vote: Wiktionary:Votes/pl-2010-05/Names of specific entities. It starts tomorrow. No one has commented on the vote so far; maybe no one really noticed the vote. I do not know whether it will pass, but there is some hope stemming from Wiktionary:Beer parlour archive/2010/April#Straw poll on the 'names of specific entries', April 2010. --Dan Polansky 16:55, 17 May 2010 (UTC)
Let me repeat what I have already posted to Beer parlour, adjusted to use the term "geographic name":
Once this text is accepted, it can be further amended as follows:

This section regulates the inclusion and exclusion of names of specific entities, that is, names of individual people, names of geographic entities, names of mythological creatures, names of planets and stars, etc. ¶ Many names of specific entities should be excluded while some should be included. There is no agreement on specific rules for the inclusion of names of specific entities, but there is a specific regulation for geographic names, specified in the section "Geographic names".

--Dan Polansky 17:05, 17 May 2010 (UTC)

"should"

What exactly is meant by "should" in "A place name entry should initially include at least two of the following"? Does this mean that where a place name entry does not include these, the entry should be deleted? --Yair rand (talk) 23:08, 18 May 2010 (UTC)

Old entries that don't meet the new requirements cannot be deleted without discussion. If somebody decides to rfd them one by one, they will be deleted unless somebody adds the missing information. Adding the information only takes a little research, and it will make this dictionary better. New entries by regular contributors certainly must include the required info. If an anon makes a new entry with nothing but "A town in Texas" it should be instantly deleted. If there is some information like etymology, and the contributor is a promising newbie, a patroller might decide to rfc the entry instead, and ask for the missing information. If it doesn't come, the entry should be deleted.
What's so dangerous about deleting Wikipedia stubs? It's easy to copy thousands of missing place names from the Wikipedia or a gazetteer. If you only want to know where the place is, or its history and population, use the "search" button and it will take you to the Wikipedia. A dictionary entry should be about the word, and it should be made by somebody who really knows the language in question.--Makaokalani 13:01, 21 May 2010 (UTC)
I'm mainly worried about foreign language place names, as very many of those have nothing but the translation. Hopefully I can safely assume that no one will go out of their way to rfd a zillion place name entries. This proposal is acceptable to me. --Yair rand (talk) 18:29, 21 May 2010 (UTC)

A proofreading

Every time I reread this proposal, it seems better to me. This kind of criteria is supported by Mufwene (1988, “Dictionaries and Proper Names”, in International Journal of Lexicography, v 1, n 3, p 268).

I have a few questions and concerns. If we're going to use examples, then we should use them specifically and thoroughly correctly.

Minor: why is there a pilcrow (¶) in para. 1?

Also, a multiple-word place name, such as New York and Middle East, must be idiomatic, — The reader is left wondering whether New York and Middle East are considered idiomatic. May as well use the examples to show how this would be resolved. (E.g. I would consider New York to be sum-of-parts, but someone else may consider it idiomatic because the name doesn't indicate which city is the referent. In either case, this guideline may justify its inclusion because the etymology should state that it was named after the Duke of York and Albany.)

Common-sense exception can be made for well-known places that are rarely referred to without qualifiers, such as Great Wall of China or Czech Republic, but not for street names and the like. — Confusing. Does this acknowledge that these names are not idiomatic, but may be included because they are well known? Or because they are used without a qualifier (and what does that mean? Examples?)?

What is said in the preceding paragraph does not prevent including other idiomatic definitions that a place name has acquired, such as Harley Street. — Isn't this a restatement of requirement no. 5? However, no. 5 implies that the place Harley Street would be defined as well, but this one implies that only the idiomatic name should be defined.

"Any of a number of cities in Anglophone countries" — Minor: maybe not an ideal example definition, because the Londons in Canada and Kiribati are not in exclusively Anglophone countries. In any case, Anglophone is a fact, but not defining. Why not just “any number of cities?”

If there is a particular London which is interesting linguistically, — Maybe set a more concrete criterion than interestingness, like “which carries important linguistic facts,” or “which is itself distinguished in the majority of the five items listed above.”

for example as the etymological source of an additional meaning of the word London, as the city after which most other Londons are named, — These are facts that could be mentioned in the etymology. No need to mandate a definition line for them. or for its name's pronunciation's — Likewise, this could be sufficiently noted with a qualifier in the Pronunciation section. No need to add a definition line if the only additional information won't be in the definition line, but in another entry section.

 Michael Z. 2010-06-17 15:28 z

To respond point by point (but not to all points):
The pilcrow: Fixed.
Harley Street and requirement 5: I think you read it correctly: requirement 5 says the term can be defined as a placename, the subsequent text removes that possibility for streets (which are named similar to the way Harley Street is), and the "What is said in the preceding..." text merely notes that nonetheless Harley Street can be defined — though not as a street — as a profession (or whatever).
Anglophone countries: Fixed.
Interesting linguistically: The text after that tries to give examples of interestingness, but you don't like them, either. You suggest a criterion of being "distinguished in the majority of the five items", but most of those face the same objection you raise to "for example as the etymological source...". (Really, all do.) I don't have a ready solution to this: I'm just pointing it out.
​—msh210 15:54, 17 June 2010 (UTC)
I have added that New York and Middle East are idiomatic. If you look up new + York or middle + East in this dictionary, you won't understand them. But you will understand Mississippi + river and St. Louis + Missouri.
Great Wall of China and Czech Republic are not idiomatic, but they can be included because those well-known places are rarely called anything else. See above #Requiring idiomaticity.
The sentence about Harley Street is a bit needless. Msh210 explained it the way I mean. It seems harmless but it's fine with me if you want to take it out.
The section giving examples about London says they can have a separate definition line; it doesn't mean they have to. I think we could let the editors decide which is the practical way to arrange information for a specific entry.--Makaokalani 11:40, 18 June 2010 (UTC)

Sense line per specific entity

re "There should not be a separate definition line for every place with the name. For example, although London is the name of cities around the world, a single definition line as "Any of a number of cities in Anglophone countries" suffices."

I will oppose because of this sentence, added in this series of edits by Msh210. It follows from this sentence that the entry "London" should have exactly one definition line for cities, a principle that I cannot accept.

In addition, the sentence is contradicted by this passage following it:

"If there is a particular London which is interesting linguistically, for example as the etymological source of an additional meaning of the word London, as the city after which most other Londons are named, or for its name's pronunciation's or translations' being different from all other Londons', then it can also have a definition line."

At first, I read that London should have only one sense; later, there is a sentence stating an exception to that. Each sentence to which exceptions apply should already have the exceptions stated as part of that sentence, or at least mentioned that exceptions are detailed in subsequent sentences. Counterexample: "Each dog has four legs. However, some dogs have three legs." Improvement: "Dogs generally have four legs, but some dogs have only three legs."

IMHO, the question of how many sense lines there should be per entry should be completely omitted from the vote. Thus, I would delete from the proposal this paragraph:

Only minimal information about the place in question should be given. If the name is shared by several places, there should not be a separate definition line for every place with the name. For example, although London is the name of cities around the world, a single definition line as "Any of a number of cities" suffices. If there is a particular London which is interesting linguistically, for example as the etymological source of an additional meaning of the word London, as the city after which most other Londons are named, or for its name's pronunciation's or translations' being different from all other Londons', then it can also have a definition line.

--Dan Polansky 09:37, 19 June 2010 (UTC)

Sorry, I didn't notice this section on Saturday. It's true that London is an illogical example since there is a particular London that needs a separate definition. I have shortened the paragraph, but I don't want to delete it, because there have been arguments whether place names should be defined as "A place name" or have a separate definition line for each place. I want to point out that there is a middle way, and actually we have been using it for years. Also the definition "A place name" would make translations difficult. But I removed the acceptable reasons Msh210 gave for having a separate definition line, because it's difficult to be precise about them. Better that editors use their common sense. --Makaokalani 11:48, 21 June 2010 (UTC)
Then again, this point should be better left unaddressed by the proposal, exactly because it is a point of contention. Some people have argued for having only "place name" as a definition, but they have been in a minority as far as I can tell. I cannot support a proposal that pleases an outspoken minority including Mzajac and msh210 only to disregard a majority. In particular, the current entry for "London" with five senses looks okay to me, while your current regulation would reduce it to two senses. I still think that the section that I have quoted should be deleted altogether. Even the first sentence "Only minimal information about the place in question should be given" is superfluous; it goes without saying that the definition line of a place should provide only minimal information sufficient to identify the place. If the omission of this sentence turns out to lead to some misunderstandings, this can be handled in a separate vote. --Dan Polansky 13:42, 21 June 2010 (UTC)
Hmm, the sentence "Only minimal information about the place in question should be given" is probably actually not superfluous, and should be detailed instead. This dictionary entry for "Prague" shows a section on the history of Prague, one that Wiktionary probably should not have. --Dan Polansky 14:02, 21 June 2010 (UTC)
I think there should be only two definition lines for London to avoid duplication of translation tables. As far as I know, all the non-Thames Londons are translated the same. Population doesn't belong in a dictionary, either. But I've made the example sentence more general, removing London as an example.--Makaokalani 11:06, 24 June 2010 (UTC)
I would like to see at least two non-summary senses in London, for London in U.K. and London, Ontario. Your recent edits in the proposal make this possible.
I am not so sure that population harms in a dictionary, or does not belong to it. It comes down to what characteristics you consider defining and what extra-definition. With a geographic entity, its entity type ("city", "river", etc.) seems key, as well as its approximate location. When the two classes of information are provided for a geographic entity, the size or magnitude of the entity--be it the length of a river, the population of a city, or the height of a mountain--may seem extra-definition and thus unneeded. However, so could gold be defined merely as "the chemical element with the atomic number 79", rather than "A heavy yellow elemental metal of great value, with atomic number 79 and symbol Au", or "A heavy yellow elemental metal, with atomic number 79". The general magnitude of a geographic entity seems so relevant and so succinctly stated that I see no need to exclude this sort of information from a dictionary definition.
I wonder what example real geographic dictionaries you are using to arrive at the claim that the population does not belong to a geographic dictionary.
To the specific changes in wording that you have made, I would make the following edit, to make it more explicit and positively stated:

If the name is shared by several places, [added: some of the places bearing the name can have a dedicated sense line, while other ones can be covered under a summary sense line such as "Any of a number of cities in Anglophone countries".] [deleted: there should not be a separate definition line for every place with the name. Places can be grouped together in a single definition line such as, for example, "Any of a number of cities in Anglophone countries".]

If you prefer "definition line" over "sense line", no problem. --Dan Polansky 14:43, 24 June 2010 (UTC)
In fact, when you read "there should not be a separate definition line for every place with the name" literally, you forbid dedicated sense lines for names that refer to only a single place: if you do provide that sense line, it becomes true that "there is a separate definition line for every place with the name". You basically require that there is at least one summary sense line for each geographic name, a requirement that you probably do not intend to make. --Dan Polansky 14:55, 24 June 2010 (UTC)
I agree that the issue of how many sense lines there should be and what should be in the definition line should be removed from the proposal. Personally, I think that sense lines should only be split where there are linguistic differences between senses (different etymologies, pronunciation, translations, synonyms, etc.), and that information about the places should only be given where senses need differentiation, where there's only one place, or where all senses share some common attribute(s), but the issue is not really relevant to the rest of the proposal, and might not even be a CFI issue. --Yair rand (talk) 19:17, 24 June 2010 (UTC)
The sentence has been watered down by now. In essence, it says that this is a dictionary, not a gazetteer. It's easy to make links to Wikipedia, where they have a separate article for every London, with population. And unlike other geographic dictionaries, we give translations, so they should be provided for.--Makaokalani 11:28, 30 June 2010 (UTC)

Idiomaticity 2

Re "Also, a multiple-word place name must be idiomatic, that is, its full meaning cannot be easily derived from the meaning of its separate components. For example, New York and Middle East are considered idiomatic, but Mississippi River and St. Louis, Missouri are not."

The predicate "idiomatic" seems not to meaningfully apply to proper names. I understand that "Mississippi River" and "Baker Street" have the common names describing their entity type ("river", "street") as part of the full name, and that it is these that the rule is trying to exclude. Also, "St. Louis, Missouri" contains the disambiguating "Missouri", which we may want to exclude.

Yet the fact that English considers "Mississippi River" (with capital "R" in River) a full name is conspicuous from the Czech point of view. In Czech, it is just called "Mississippi". I think not much harm can be done if "Mississippi River" gets included as a whole term, just like "Mount Everest", "Lake Ontario", and "New York City". The work California Place Names: The Origin and Etymology of Current Geographical Names has "Bigelow Park" and "Bighorn Lake", for example[1]. The work Nevada Place Names: A Geographical Dictionary has "Godfrey Mountain"[2]. It seems natural to include "Grand Canyon", in spite of "canyon" being the entity type of "Grand Canyon". I see no harm in including "Madison Street" and "Elm Avenue", given that the parts "Street" and "Avenue" are parts of the full name, as witnessed by the capitalization of "Street" and "Avenue".

It seems to me that all requirements of idiomaticity on place names had better be removed. If the requirement of idiomaticity is implicitly inherited from the "General rule" section, this could be remedied by explicitly stating that the requirement does not apply to proper names. Where I see no problem is in excluding terms modeled on "St. Louis, Missouri", meaning "<place>, <containing place>".

The specific implementation could be as follows:

Place names, that is, names of geographic entities, are subject to the requirement of attestability specified in the section "General rule", but not to the requirement of idiomaticity, and they are also subject to the following additional requirements.

And, there would be the additional requirement:

Also, each place name that contains a disambiguating place name after a comma, such as "St. Louis, Missouri", should be excluded.

--Dan Polansky 10:23, 19 June 2010 (UTC)

Well, I disagree. I see much harm in including street names. Do you realize how many there are? "Street", "Avenue" etc. can be affixed to almost any place name, surname and common noun. A dictionary is meant to explain words, not places. Aren't entries for Bigelow, Bighorn, Godfrey enough?
Of course Grand Canyon should be included. You cannot explain it by grand + canyon, and it's not a minor landmark. I thought that was clear. Just how should this vote be worded to make it understood? I don't want to take out the requirement of idiomaticity, but if defining idiomaticity is too difficult, what about the suggested text in the blue part of #Requiring idiomaticity? It specially excludes street names. I'll change the "Common exceptions" sentence a little, and move the day again since I cannot be back before Monday. --Makaokalani 11:14, 19 June 2010 (UTC)
Your criterion designed to exclude all street names is untenable, however. I have corrected the entry for Harley Street, from which you incorrectly removed the more literal definition. That definition, although mostly literal, is also idiomatic because it carries implications even when referring to the street itself. British writers mentioning "Harley Street" expect others to know why that particular street is different from all others; it is not simply another street, but carries implications in its usage. Placing that information in the etymology is wrong, because it's not etymological information. The etymology should explain where the word comes from; in this case it is an eponym from the name of an individual who developed the region. --EncycloPetey 16:04, 19 June 2010 (UTC)
If you want to exclude street names, you have to find another way, such as "All street names should be excluded". In Czech, street names usually consist of one feminine adjective, such as "Klíčová" or "Masarykova". In German, street names are written as one word, such as "Hauptstraße" or "Ringstraße". I don't know about other languages. Either way, your rule on idiomaticity does nothing to exclude Czech and German street names.
The exclusionist treatment of names containing entity type as part of the name stands no chance of preventing a huge expansion of place names in Wiktionary. Even with names of the form "<name> <entity type>" excluded, the section for place names in Wiktionary is going to overshadow by number the section for words found in general dictionaries. If your purpose with the rule on idiomaticity is to significantly reduce the number of included geographic names, you had better find a rule that actually achieves this.
I remain of the conviction that "idiomatic" does not apply to proper names in any way. And I have linked to two geographic dictionaries that do not omit the entity type part from geographic names.
--Dan Polansky 08:46, 20 June 2010 (UTC)
Defining what is idiomatic seems to be just as tricky as defining attributive use. I'll give up trying to define it in the "Place names" section. The word "idiomatic" has been removed from the vote. The present wording is much like the one I suggested in #Requiring idiomaticity. Any problems with multiple word place names should be solved by using the CFI section of idiomaticity. Removing the idiomaticity requirement for place names would be too drastic. I have moved up the date by a week again. By then I really want to open the vote and be done with it. We can never find a wording that would satisfy everyone.--Makaokalani 12:47, 21 June 2010 (UTC)
But this round seems more promising thus far to me because multiple people keep checking in and commenting. I have hopes that, although we may wrangle a bit before the vote, we may actually make progress this time around, even if we don't reach a "permanent" solution. --EncycloPetey 05:14, 22 June 2010 (UTC)

There is idiomaticity here, but that's not all we're talking about. There's a variety of forms, of course: Everest is also Mount Everest, but the Matterhorn is never Mount Matterhorn, while Mount Trudeau practically always is thus. But we are writing about the names here, not the things. Although Lake Ontario is never called just Ontario, the dictionary is not that concerned with the thing or place, just in the name Ontario. Do we really need entries for Lake Ontario, Western Ontario (not the western part of Ontario), Ontario South (historical electoral district), Ontario Place (amusement park), Ontario Street (probably dozens in Canada), Ontario Council for the Arts, Green Party of Ontario, etc?

At the very least, shouldn't we conventionally “lemmatize” proper names at their irreducible form? Why not redirect or form-of link Thames River, River Thames, &c, to Thames? Michael Z. 2010-06-22 05:46 z

Maybe we are trying to say too much on the CFI page? Maybe, if the vote should pass, we could create Wiktionary:About place names and give more specific rules there? --Makaokalani 13:47, 22 June 2010 (UTC)
If Lake Ontario is never called "Ontario" but rather always called "Lake Ontario" as you say, a dictionary should reflect that and include the name of the entity (that which the entity is called), hence "Lake Ontario". Interestingly, some online dictionaries have "Ontario, Lake" instead[3][4][5]. It is interesting to know that the name is "Lake Ontario" rather than *"Ontario Lake", which would be formed on the model of "Amazon River" or "New York City". --Dan Polansky 18:17, 22 June 2010 (UTC)
Come to think of it, that's not quite true. In the context, one might correctly say “the Great Lakes include Superior, Huron, and Ontario.” I presume Ontario, Lake would be for the sake of consistent and self-evident alphabetizing. Michael Z. 2010-07-03 15:48 z
The entry for Ontario could have ====Usage notes==== explaining whether it's Lake Ontario or Ontario Lake.--Makaokalani 10:35, 6 July 2010 (UTC)

Street names

I am opening a dedicated section on street names. In Czech, street names usually consist of one feminine adjective, such as "Klíčová" or "Masarykova". In German, street names are written as one word, such as "Hauptstraße" or "Ringstraße". It should be made clear whether the proposed regulation aims to include or exclude street names, and how it achieves this. --Dan Polansky 08:48, 20 June 2010 (UTC)

  • In UK English most streets are two words - a name followed by street, road, avenue etc., but many are a single word e.g. Eastgate. I've no idea what to do with them. SemperBlotto 08:52, 20 June 2010 (UTC)
The relevant Wikipedia articles include w:Street or road name, de:w:Straßenname, and nl:w:Straatnaam. There is in particular w:Street suffix, whose content is also included at w:Street or road name#List of street type designations, and includes "street", "alley", "drive", "promenade", etc. --Dan Polansky 08:46, 21 June 2010 (UTC)
The wording of this vote excludes multiple-word street names that can be recognized as such, that is, those that end in "Street" and other suffixes in your Wikipedia list. (And, for example, French street names beginning with Rue.) One-word street names like Eastgate, Masarykova and Hauptstraße are acceptable. A street named, for instance, Michigan Lake or One Two Three could also be included, because nobody can guess that it is a street. But if we would later want to exclude, for instance, German one-word street names ending in -straße, it could be done in Wiktionary:About German without a need to change the wording of WT:CFI.
I don't understand why anybody would look up Elm Street, Elm Avenue, Elm Road, Elm Square in a dictionary. The etymology and pronunciation are obvious. Translations, maybe? But they would rarely be translated in Latin script languages. I don't know how they are transliterated into Chinese. My guess is it's just SOP, elm + street. What we could do, though, is make separate entries for Street, Avenue (blue links! - but redirects), and so on, and define them as common second parts of street names, and maybe give translations or transliterations for them.--Makaokalani 12:04, 21 June 2010 (UTC)
(unindent) I would rather exclude street names altogether, except for the generics like Wall Street. My point is that I do not see much purpose in excluding some street names while including a host of other street names, and doing so in a highly uneven manner per language. The Wiktionary database can easily keep all street names. Excluding English street names while including German and Czech ones makes little sense to me; if you are trying to exclude street names per their being too numerous, you should exclude them per their being street names rather than per their containing entity type in their name.
I am unsure whether streets are considered geographic entities. I have seen no streets in geographic dictionaries (e.g. this one). There is the dedicated word "odonym"--an identifying name given to a street.
I would thus remove or replace the following section (from this revision):

Also, multiple-word names of streets and other minor landmarks needed to create addresses, such as Madison Street or Elm Avenue are not included if they can be recognized as street names or the like from their wording. One-word street names like Eastgate are acceptable. Special rules may be given on language-consideration pages for languages where the system of naming streets is different from the English one.

What is said in the preceding paragraph does not prevent including other definitions that a street name has acquired, such as of Harley Street.

The regulation that I would probably support could read as follows:

The name of a street, square, or a minor landmark is included only if it has acquired a generic meaning, as is the case with Wall Street.

The problem with this style of regulation is that it may deter street-inclusionists, if there are some. Positions on the inclusion of street names could be found out in a poll in Beer parlour. It could be worth it to clarify street names in a Beer parlour discussion, and get input from place-name-exclusionists on what sort of regulation they would like to see for street names.
--Dan Polansky 13:32, 21 June 2010 (UTC)

ON THE ONE HAND, why would we exclude streets, and not, for example, squares, villages, or city districts? The names of St. Mark's Place, the Mall (oddly lemmatized as The Mall), Sussex Drive, or the Champs-Élysées are as significant as those of Times Square, Trafalgar Square, the Left Bank, the East End, Kensington Market, etc.

ON THE OTHER, we will end up with 1,000 senses of Smith Street, identifying the various eponyms of Smith Streets throughout the anglophone world. (Or would it be Smith Street#Etymology 67?) Onomastic information isn't really the same as etymological—the word and name Smith has one etymology (Old English smiþ, from Germanic *smiþaz), but Smith Streets worldwide are probably named after hundreds of Mr. and Mrs. Smiths. This would be an encyclopedia-flavoured exercise.

Of course, this problem is not specific to streets. It also applies to municipalities, water bodies, regions, mountains, etc. Michael Z. 2010-06-22 05:11 z

I'm not so interested in street or square names, but in Russian there can be different forms of street names - adjectives like in Czech (улица Тверская - úlica Tverskája) or genitive case (улица Сергея Эйзенштейна - úlica Sergéja Ejzenštéjna), which often makes the translation difficult if you want to make them consistent with the original - is it Sergey Eisenstein Street or ulitsa Sergeya Eyzenshteyna in English? --Anatoli 06:03, 22 June 2010 (UTC)
But whichever way you choose, it can be translated as sum of parts? There's no need for an entry for every street name? I removed the "multiple-word" specification, because it's true that any English street name that can be recognized as a street name is a multiple word. That way the language-specific sentence can be removed, and all languages are equal. --Makaokalani 13:37, 22 June 2010 (UTC)
Why try to make the rules complex? In English or French, most street names are not considered as words, but some are, and they should be included only if they can be considered as words (e.g. Strand, Canebière or Champs). If all street names are considered as words in some language, then all of them should be accepted, but only with the usual attestation rules: citations should be required in this case. This makes sense: in some languages there are more words than in other languages because of how they create words. It's not because there are few words in Toki Pona that we should not include all English words.
This should apply to all placenames. Of course, London and New York are considered as words by linguists (the space does not change anything). Texas and North Carolina too. Placenames should be includable only when, usually, they are viewed as words (which is not the case for Excelsior Hotel). We should first adopt this principle, then discuss difficult cases (otherwise, there will be no end to the discussion). Lmaltier 21:43, 4 July 2010 (UTC)
The original draft for this vote began "Placenames are words..." and it was instantly challenged. So I'm afraid there is no magical solution that would end the discussion. But having some accepted CFI would make discussion more productive.--Makaokalani 10:43, 6 July 2010 (UTC)
It was challenged on the ground of the typographic sense of word. Word has several senses, including a typographic one and a linguistic one. I always use word in its linguistic sense, and, if you use this sense, New York is a word. CFI should always add "in its linguistic sense" after word (and explain this sense), to disambiguate what it states. Lmaltier 05:57, 7 July 2010 (UTC)
I want to add that Placenames are words is not always true. I think that nobody considers Excelsior Hotel as a word. The right question is in which cases are placenames usually considered as words (in the linguistic sense), and therefore includable? The criteria you provide are good clues, but they might allow such entries as Excelsior Hotel, and I would disagree. It's about the same as person names: Winston Churchill is a name, but is not considered as a word, while Confucius or La Fayette are words. Lmaltier 15:55, 10 July 2010 (UTC)

I would change the proposal to Placenames are accepted when they can be considered as words of the language. This would leave open the question when are they words?, but this would be a good beginning, this would solve most of the issues, and would be a good reason to accept The Mall and not площад Александър Невски (notoriety is not a good reason!). Lmaltier 21:18, 2 August 2010 (UTC)

I am responding to this in a dedicated section #Wordhood. --Dan Polansky 17:26, 3 August 2010 (UTC)

Wordhood

Responding to Lmaltier, 21:18, 2 August 2010, from the previous paragraph, on what constitutes a word:

You have yet to explain why "New York" and "Grand Canyon" are words, while "Much Ado About Nothing" (play) and "King Lear" (play) are non-words. Even with your notion of linguistic word that includes "black hole", what you mean by "word" remains unclear, to me anyway. What could help is if you would create an extensive list of various items that you consider words and an extensive list of various items that you consider non-words, focusing both lists on debatable items, or items in the grey area of wordhood. The black area of wordhood includes "cat", and I would hope that the white area of wordhood would include the Latin letter "a". The objects that need to be classified as words or non-words include those in the classes listed in the section #Other kinds of specific entities?. I wonder whether you consider proverbs to be words, the way you use the term "word". --Dan Polansky 17:26, 3 August 2010 (UTC)

It's more about perception like some users said. The definition of wordhood may not help with this exercise. New York is a word but New York City perhaps not (in my perception it's two words) but is Mexico City a word? We should probably concentrate on what part of a place name should be included. Is it Netherlands or the Netherlands? Wall Street is too well-known to be ignored and you can't remove Street from it, although I think it's two words. The question is difficult. --Anatoli 00:04, 4 August 2010 (UTC)

See vocabulary (defined as a collection of words). The key question is: is this a part of the vocabulary of the language? Another equivalent question is: could it be useful to someone learning the language to learn this (as part of the vocabulary) to be able to speak and to understand the language (or would it have been useful at the time the word was used)? Clearly, the answer is yes for New York or little boy. Much Ado About Nothing or Winston Churchill are famous names, but nobody would consider them as words (they are composed of words that may be combined at will, it depends only on the authors, or parents). There are intermediate cases, of course, and they may be difficult, but you understand the idea. Lmaltier 20:53, 14 October 2010 (UTC)