Appendix talk:Japanese Swadesh list

There's some information at Wiktionary:Japanese Swadesh list about usage notes that could be merged into this page. --Keene 21:07, 12 January 2008 (UTC)

Native Words Need to Be Used

Plain and simple: as with all Swadesh lists at Wiki, native words for each language need to be used. That's the whole point of a Swadesh list: to provide a very simple list of native lexemes which may be compared to other lists.

For example, the current Swadesh list for Japanese has the Sino-derived numerals meaning one through four. The native lexemes need to replace the Chinese-derived ones, or at least be included alongside them.

  • Yes, all kanbun-derived terms (a.k.a. kango, “Chinese terms”) are wholly irrelevant to any Swadesh-list comparison for Japanese. These were all borrowed from Chinese. For the learners out there, any word spelled in kanji and using the on'yomi does not belong in this list.
For instance, 沢山 (takusan, many) and 若干 (jakkan, few) are both essentially borrowings from Chinese. (沢山 may be a Japanese coinage, but if so, it was nonetheless coined using Chinese roots.) Neither of these terms belongs in this list. Both concepts, many and few, have perfectly good wago (“Japanese terms”, i.e. native Japanese terms not identifiably borrowed from anywhere else) that should be used instead -- 多い (ōi) and 少ない (sukunai).
Then there are the terms that have appeared just during the course of the historical record, such as 私 (watashi, I, originally just meant “private”) or 魚 (sakana, fish, originally a compound of 酒 (saka, alcoholic beverage) + 菜 (na, side dish), i.e. “snack when drinking”), that also do not belong on this list. These terms are relative innovations, and thus in no way represent root words that could possibly be useful in any interlingual comparison -- except for the very specific case of comparing word formation trends in different languages, but then that's not what Swadesh lists are generally for.
And then there are the terms listed as straight translations of Swadesh English terms, but that in Japanese are in fact inflections of other terms with different meanings, that just happen idiomatically to be used in roughly similar ways as the English gloss. As best I understand the principles of a Swadesh list, inflected forms have no place here either -- only root words. This rules out terms like 全て (subete, glossed as “all”, but actually the te form of verb 統べる (suberu, to gather together in a bunch)), or 縄 (nawa, rope, actually derived from underlying root verb 綯う (nau, to braid, twist, or plait together, such as into cord or rope)).
In short, this list is a complete dog's breakfast. I may poke at it some, but frankly, I'm much more interested in developing the actual term entries here at Wiktionary.
Pro Tip: If you are thinking of using this list as a basis for any real Swadesh list comparison, before doing so, do your own legwork and look up the etymologies of each of these terms. Note too that a single kanji spelling may have multiple possible readings -- in those cases, focus on the etymology for the kun'yomi (native Japanese reading). I'm doing my best to add etymology sections to all Japanese entries based on reputable sources, so hopefully at least some of these etymologies can be found right here in Wiktionary. Past there, caveat usor.
-- Eiríkr Útlendi │ Tala við mig 19:12, 1 June 2013 (UTC)
I don't think anyone has taken lexicostatistics seriously in forty years. All Swadesh lists are useless for determining the genetic relatedness of languages, regardless of whether they're filled with native words or loanwords, because the methodology has proven to be flawed. The only point in having Swadesh list appendices at Wiktionary is to provide a list of basic words that we need to have entries for. —Angr 20:44, 1 June 2013 (UTC)

Cleanup, Revamp

I thought some of this list looked different from how I remembered it. Poking around in the history of this Appendix page and the Wiktionary:Japanese Swadesh list page, I realized why -- apparently Croquant and I had had the same idea at nearly the same time back in 2006, and he launched this Appendix page, while I launched the Wiktionary:Japanese Swadesh list page four days later.

Comparing the two pages, this Appendix page has gotten a lot more editing traffic, but sadly appears to be less usable -- more Chinese-derived terms, more compounds, and more inflected forms (all inappropriate for a Swadesh list), and less useful information given (no Notes or Usage, for instance). Add to that the fact that the wikicode is harder to work with (as each column is given in a single huge list, but it's each row instead that the editor must work with).

With all that in mind, I'd like to propose that we merge this Appendix page with the Wiktionary:Japanese Swadesh list page, with a bias towards keeping the wikicode from Wiktionary:Japanese Swadesh list page and merging in any preferred data from the Appendix page. If no one objects, I may set to that task in a week or two. -- Eiríkr Útlendi │ Tala við mig 19:41, 1 June 2013 (UTC)

RFM discussion: June 2013–July 2016

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This page is quite a dog's breakfast. The list includes numerous terms that don't belong in a Swadesh list; see the list's Talk page for details.

Glancing over the list, I thought some of the content looked different from how I remembered it. Poking around in the history of the Appendix:Japanese Swadesh list page and the Wiktionary:Japanese Swadesh list page, I realized why -- apparently Croquant and I had had the same idea at nearly the same time back in 2006, and he launched the Appendix: page, while I launched the Wiktionary: page four days later.

Comparing the two pages, the Appendix page has gotten a lot more editing traffic, but sadly appears to be less usable -- more Chinese-derived terms, more compounds, and more inflected forms (all inappropriate for a Swadesh list), and less useful information given (no Notes or Usage, for instance). Add to that the fact that the wikicode is harder to work with (as each column is given in a single huge list, but it's each row instead that the editor must work with).

With all that in mind, I'd like to propose that we merge the Appendix:Japanese Swadesh list page with the Wiktionary:Japanese Swadesh list page, with a bias towards keeping the wikicode from the Wiktionary: page and merging in any preferred data from the Appendix: page. If no one objects, I may set to that task in a week or two. Once done, my sense is that we should delete Wiktionary:Japanese Swadesh list, or at least turn it into a redirect to Appendix:Japanese Swadesh list. If anyone feels otherwise, please chime in. -- Eiríkr Útlendi │ Tala við mig 19:47, 1 June 2013 (UTC)

This sounds more like a subject for WT:RFM. Chuck Entz (talk) 19:57, 1 June 2013 (UTC)

Angr's post here, copied from Appendix talk:Japanese Swadesh list:

I don't think anyone has taken lexicostatistics seriously in forty years. All Swadesh lists are useless for determining the genetic relatedness of languages, regardless of whether they're filled with native words or loanwords, because the methodology has proven to be flawed. The only point in having Swadesh list appendices at Wiktionary is to provide a list of basic words that we need to have entries for. —Angr 20:44, 1 June 2013 (UTC)
Is lexicostatistics as a whole discredited, or just Swadesh's approach? Is there any value in keeping these pages, then? Should we just remove them, if they're not to be maintained? We have plenty of other, more highly-trafficked lists that help us keep track of what terms we're missing and might want to add. Curious, -- Eiríkr Útlendi │ Tala við mig 23:27, 4 June 2013 (UTC)
Both. Lexicostatistics makes the assumption that language change is a homogeneous, continuous process that can be reduced to mathematical models. In the real world, there are things like regional and social variation, and things like sociopolitical and economic forces (not to mention blind luck) that often determine which form survives. Swadesh lists are interesting, and provide a rough view of variation between languages, so they're probably worth keeping in the appendices. I wouldn't base anything on them as evidence, though. Lexicostatistics is usually a lot better than flipping a coin, but there are too many ways for it to go wrong, since it depends on unverifiable past events. Chuck Entz (talk) 02:09, 5 June 2013 (UTC)
Considering the amount of work I've put in on the Burmese and Irish Swadesh lists and the amount of work I'm planning to put in on the Lower Sorbian, Old Irish, and Welsh Swadesh lists, I'd be opposed to deleting them. I don't know of any other lists of terms we need for those languages. —Angr 12:58, 5 June 2013 (UTC)

@Chuck, thank you for the detail. I'm clearly behind in my reading. :)

@Angr, understood. I'm fine with keeping them.

That said, if we are to keep them, I feel rather strongly that the lists should be cleaned up -- despite Angr's comment, known-borrowed words have no place in any such list, even if the methodology has been entirely discredited. At the bare minimum, drilling down to root forms for these concepts would itself give us a list of terms needed for etymological purposes. I'm working through JA terms to add etymologies, which is how I wound up coming back to this list in the first place. :)

Also, if we are to keep them, presumably we should only keep one per language, yes? And presumably in the Appendix: namespace?

Cheers, -- Eiríkr Útlendi │ Tala við mig 15:34, 5 June 2013 (UTC)

  • I'm sorry, I don't see any reason to exclude loanwords if a loanword is the most common term for a particular concept. The English Swadesh list itself has a large number of loanwords, including they, husband, animal, forest, fruit, flower, skin, egg, vomit, give, count, sky, mountain, and correct. One per language, yes, although there's nothing wrong with keeping some language-family lists too, provided these are kept within reason. Some currently existing ones run off the right edge of the screen because they contain so many languages. AFAICT all Swadesh lists are already in the Appendix: namespace, which seems like the best place for them to me. —Angr 16:05, 5 June 2013 (UTC)
  • Re: namespaces, my original comment in this thread concerns a duplication, with one such JA Swadesh list in Appendix:, and one in Wiktionary:.
  • Re: loanwords, my understanding was that the whole point of Swadesh lists was for historico-comparative research? If so, known loanwords would be irrelevant. I'm not opposed to keeping a list of modern terms for Swadesh concepts, but that wouldn't be a Swadesh list then, no? Would an acceptable compromise be to add a column with a header such as Modern equivalent? -- Eiríkr Útlendi │ Tala við mig 16:14, 5 June 2013 (UTC)
If anything, I'd rather have a separate column for native words that are now obsolete/archaic or that now have different meanings. To stick with the English examples, the native words deer, blossom, hide, spew, and reckon are all modern English too, so it would be odd to exclude them from the "Modern equivalent" column. They just don't mean "animal", "flower", "skin", "vomit", and "count" anymore, or are at least not the most common way of expressing those ideas. But they could be in a column to show that they are the modern English descendants of the Old English words that did have those meanings. —Angr 16:39, 5 June 2013 (UTC)
This doesn't seem to be a RFM issue...? - -sche (discuss) 23:29, 2 July 2016 (UTC)