Appendix talk:Afroasiatic Swadesh lists


Should this be at "Semitic Swadesh lists"? All the listed languages are Semitic, and "Hamitic" is an obsolete term as far as languages are concerned (the family is called "Afro-Asiatic"). -PierreAbbat 21:07, 4 Dec 2004 (UTC)

I'm thinking the same (5.5 years later). Better to call this list "Semitic Swadesh lists". Guaka 23:17, 7 August 2010 (UTC)

There are some serious problems with the Hebrew romanization here.

I would suggest using a romanization of Tiberian (Masoretic) Hebrew rather than Modern Israeli Hebrew or what we have here. The current romanization relies on the Masoretic text, which dates to the 9th century, and it reflects a stage of the Hebrew language (or at least its phonology) which had evolved much since the Biblical period. This is demonstrated by the Secunda of Origen's Hexapla, which preserves sections of the Hebrew Bible in Greek script, as it was pronounced in his day. The lexical material in the Hexapla is very different from the Masoretic text and still more different from the romanizations offered here. Geoffrey Khan has some great material on the phonology of Tiberian Hebrew.

I've removed the remark about the pronunciation of Hebrew tzaddi being a Yiddishism, as it isn't correct. Richard Steiner of Yeshiva University did a number of groundbreaking studies of the different realizations of *ts across the Semitic languages, and provided ample evidence that all of the Semitic sibilants were probably pronounced as affricates. The affricated *ts is shared by Hebrew with the Ethiopic languages, the South Arabian languages, certain Aramaic dialects, and perhaps even early Arabic, as it is described by grammarians such as Sibawaihi. There is also some evidence that this phoneme was affricated in Akkadian. The pharyngeal pronunciation of the "emphatics" (which is found in Arabic today and most of the languages with which it has come into contact) is actually quite late; this is suggested by the fact that Arabic uses Saad to represent *ts and *tsh in loanwords from languages possessing these phonemes, such as Persian.

Hebrew Romanization

I too believe that Tiberian Hebrew should be used, as it is better attested than earlier Hebrew, which must usually be reconstructed. The Arabic presented here is, at best, Classical Arabic, which dates to the same period as Tiberian Hebrew. However, the standard Romanization of Tiberian Hebrew is based on later (ca. 1200, R. Joseph Kimchi) grammarians' attempts to give Hebrew a system of five vowels of two lengths each, excluding the schwa and the hatafim. The earlier grammarians (ca. 900, b. Asher), however, classified the vowels into eight sorts differentiated by vowel quality, as early modern grammarians have done as well (see Gesenius, Chapter I, Sec. 7-8). Thus it would be preferable to vowelize the Hebrew text in the Tiberian manner and use the following transliteration:

Qamas = â, the open back vowel (â as in French)

Patah = a, the open front vowel (continental a)

Sere = é, the mid-close front vowel (French é)

Segol = è, the mid-open front vowel (French è)

Holem = ó, the mid-close back vowel (By analogy to é)

Qamas Qatan = ò, the mid-open back vowel (By analogy to è)

Hiriq = i, the close front vowel (continental i)

Shuruq = u, the close back vowel (continental u)

Schwa = ', Hataf Patah = a, Hataf Segol = e, Hataf Qamas = o

Meteg (stress) = underlined vowel, e. g. mèk

With the exception of the mid-open back vowel ò, the above applies to Syriac Aramaic as well.
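The vowel scheme above can be sketched as a lookup table keyed by Unicode vowel point. This is only an illustration under stated assumptions: the names `PROPOSED_VOWELS` and `vowel_symbols` are hypothetical, the Hataf symbols are entered as the plain letters listed above, and Shuruq is left out because it is written as Vav plus a dot rather than as a single vowel point. Many texts also encode Qamas Qatan with the ordinary Qamas code point, in which case it cannot be told apart automatically.

```python
# Hypothetical lookup table for the vowel symbols proposed above,
# keyed by the Unicode code point of each Hebrew vowel point.
PROPOSED_VOWELS = {
    "\u05B8": "â",  # Qamas
    "\u05B7": "a",  # Patah
    "\u05B5": "é",  # Sere
    "\u05B6": "è",  # Segol
    "\u05B9": "ó",  # Holem
    "\u05C7": "ò",  # Qamas Qatan (only when encoded separately)
    "\u05B4": "i",  # Hiriq
    "\u05B0": "'",  # Schwa
    "\u05B2": "a",  # Hataf Patah
    "\u05B1": "e",  # Hataf Segol
    "\u05B3": "o",  # Hataf Qamas
}


def vowel_symbols(word: str) -> list[str]:
    """Return the proposed symbol for each vowel point in `word`, in order.

    Consonants and any marks not in the table (e.g. dagesh) are skipped.
    """
    return [PROPOSED_VOWELS[ch] for ch in word if ch in PROPOSED_VOWELS]


if __name__ == "__main__":
    # Alef + Sere + final Mem
    print(vowel_symbols("\u05D0\u05B5\u05DD"))
```

A table like this only covers the vowels; the consonants, Rafe, and silent letters discussed below would need their own handling.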

As for the consonants: since, for the above-mentioned reasons, Tiberian Hebrew is superior for the Swadesh list's purpose, the Rafe consonants should also be indicated. The Rafe is often important for etymological purposes, as in malké, where the Rafe shows that a short vowel has been elided entirely before the k. However, notice that I transliterate the Rafe not with the symbols recommended for Arabic, in which they are distinct phonemes, but with underlined letters, e. g.: b, d, ĝ (for typographical reasons, it is impractical to underline here), k, p, and t. These have the additional advantage of being the symbols traditionally used for the purpose in Semitic linguistics (and therefore would be commendable even for Arabic).

As for the emphatics: these are not readable for many, even those equipped with Unicode fonts. Instead, I suggest that the strikethrough be used: t, t, s, d. This symbol is based on an alternate IPA marking for pharyngeals, a tilde ~ superimposed.

Silent letters (e. g., Aleph at the end of a syllable, He finally without Mappiq, Yod in דְבָרָיו or similar words) should be written in parentheses when orthographical, and otherwise not at all (e. g., simħâ(h) not simħâ(t)).

Usage of Tiberian Hebrew also removes the affricate/sibilant problem.

I also propose that Hebrew nouns be shown in several forms, e. g. singular, plural, construct, abstract, etc., since these often have important etymological bearings and assist in comparison with other languages. Should this prove to be impractical, at the very least an additional stem should be shown.

If no one protests, I shall begin implementing these changes. Ratzd'mishukribo 02:54, 5 May 2006 (UTC)

Simplification of the preceding: replace â with a, which represents that sound in English; and therefore, for disambiguation, replace former a with à (why not? though any other diacritic would do); former è with e, again for familiarity for English speakers and also to facilitate typing; and former ò with o, with the same rationale as the replacement of è. For the emphatics, reversion to a previous version used on this page would also be an option, i. e. D S T T etc. Ratzd'mishukribo 16:57, 21 May 2006 (UTC)

Additionally, Wikimedia programming apparently does not support under-dotted letters in links, which is an obvious disadvantage if every word is to have its own entry. This seems to exclude the possibility of using the strike-through for pharyngeals as well. That being said, it seems capitalizing pharyngeals is the only option. Thus a replacement must be made for the pharyngealized t, for which I suggest the capital thorn, Þ.

Though there is no protest to the suggested changes, I see no support or any notice at all, so I cannot take silence as approval. Do you agree to these proposals or not? Ratzd'mishukribo 18:08, 17 November 2006 (UTC)

No, in Swadesh lists we only want the simplest form, generally the masculine singular. Plurals are very helpful indeed, but they do not belong in the Swadesh list. Plurals, synonyms, grammatical info, etc., all belong on the separate pages for each individual word: e.g., אֵם. —Stephen 19:58, 17 November 2006 (UTC)
At last someone gives an opinion. I wondered if anyone would notice had I transcribed Arabic in Swedish. What do you think about the rest, i. e. vowels and consonants? Ratzd'mishukribo 00:12, 20 November 2006 (UTC)
I completely agree with the vowels and consonants suggested for Hebrew, but I have been using these letters for Arabic:
Template:ARchar = ’ (glottal stop); Template:ARchar = ð; Template:ARchar = j; Template:ARchar = ħ; Template:ARchar = χ; Template:ARchar = š; Template:ARchar = ʂ; Template:ARchar = ɖ; Template:ARchar = ʈ; Template:ARchar = ʐ; Template:ARchar = ʕ; Template:ARchar = ğ. (I had decided on ’ for the glottal stop because too many people mix up ʔ and ʕ.)
As I mentioned before, the plurals, constructs, abstracts, etc., are extremely useful as well, but that sort of detail, along with indication of gender, belongs on the individual pages for each word. —Stephen 13:54, 21 November 2006 (UTC)
As many different versions are proposed above, I review and clarify the considerations and options:
  • Considerations:
    1. No innovation. Though we cannot use every existing system, we should at least use one that the reader would encounter elsewhere, or a collage of several systems, if necessary.
    2. Consistency. There should be one symbol per phoneme, with no variance between languages.
    3. Browser compatibility. If there be a choice of symbols used, that more widely legible should be chosen.
Transliterations of words already linked to in their native script, i. e. Arabic, Hebrew, and Syriac, need not be linked to again. This should solve several previous difficulties.
  • Options:
    • Vowels as above, a, â, é, e, i, ó, o, u, ', a, e, o, loosely based on the online edition of Gesenius. The same for the corresponding Syriac vowels; Arabic long vowels take the macron, ā, ī, ū, and for Akkadian, ē.
    • Stress should perhaps be ignored.
    • Raphe consonants, for consistency, should be transliterated as the similar Arabic phoneme, for clarity: v, f, ĝ, x, ð, þ. As mentioned above, Tiberian Hebrew is used. Again, the same for Syriac.
    • Emphatics are problematic as both the underdot and the hook are illegible in many fonts. A solution must be found. Until then, the underdot is continued.
    • 2 and 3 are retained.
Ratzd'mishukribo 20:43, 24 November 2006 (UTC)
Minor correction: read ğ for ĝ above Ratzd'mishukribo 00:26, 28 November 2006 (UTC)

Question

The page says - ħ; stands for the typical rough h sound of Semitic languages.; could someone please tell me:

  • The place of articulation (it's not glottal because you can't voice a glottal fricative)
  • The manner of articulation (a fricative, I presume)
  • The voicing (voiceless, I presume)
  • Anything else special about this sound

My guess is that this represents a voiceless pharyngeal fricative, but that's just an educated guess. 200.77.83.133 20:21, 25 July 2006 (UTC)

Yes it is - see Wikipedia, Heth. Ratzd'mishukribo 21:13, 25 July 2006 (UTC)

Links with Niqqud

The links to the Arabic entries are not vowelized, nor are the Syriac ones. I suggest removing the vowelization from Hebrew as well. Furthermore, many links do not redirect to their unvowelized versions. Ratzd'mishukribo 00:20, 1 December 2006 (UTC)

They can of course be made to redirect, but I think it’s much better not to use vowel pointing. In the Arabic, in particular, words almost never receive full pointing, and when there is a need to disambiguate some word, only one or two points will be used. Therefore, there are many different possible combinations for each word, so I never use vowel points in links. Even so, it is still problematic because some letters such as yā’ and tā marbuta may be written with or without dots in most cases, and hamza may be left off of alef. The software developers are gradually addressing such problems, and hopefully the search software will one day learn to ignore vowel points and tashdids and will consider certain letters such as yā’ and alef maksura to be equivalent to one another. In the meantime, I do not write the vowels, or I write them so that they do not link: Template:ARchar. —Stephen 18:14, 1 December 2006 (UTC)
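For anyone wanting to derive unpointed link targets from pointed text, the stripping step could look roughly like the sketch below. It is not how the wiki software behaves; the function name `strip_points` is hypothetical, and the approach simply drops every character in Unicode category Mn (nonspacing mark), which covers Hebrew niqqud and the Arabic short-vowel marks and shadda. It does nothing about the yā’ / alef maksura or missing-hamza variation mentioned above.

```python
import unicodedata


def strip_points(text: str) -> str:
    """Return `text` without nonspacing marks (Unicode category Mn).

    Assumption: every vowel point to be removed is a nonspacing mark;
    base letters and punctuation are left untouched.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    pointed_hebrew = "\u05D0\u05B5\u05DD"  # Alef + Sere + final Mem
    pointed_arabic = "\u0643\u0650\u062A\u064E\u0627\u0628"  # kitab with kasra and fatha
    print(strip_points(pointed_hebrew))
    print(strip_points(pointed_arabic))
```

The pointed form can then stay in the displayed text while the stripped form is used as the link target.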

Transliteration and Completion

I suggest adjusting the transliteration to more scientific standards. Right now it looks more or less like chat romanization and isn't very comfortable to read. I can do this for Arabic and Geʿez, but would need help with Hebrew.

Especially the Geʿez list is lacking many words. I'll also try to take care of this as soon as I can spare some time.

merhawi 87.145.101.239 16:52, 14 April 2011 (UTC)

Complete rewrite

This is the worst Swadesh list I have ever seen. First of all, this template should have been used. What's up with the chat symbols? Use either ISO romanisation standards or IPA. Lastly, there are some very bad word choices, like ħanaš for a snake. I will create a list in my user domain and move it here when I'm done.--Rafy 09:05, 12 June 2011 (UTC)

It is your own fault that this template was not used, so don’t blame other editors for your shortcomings. Knowledgeable improvements are very welcome, but if you can’t work without insulting the other editors who have contributed their time and effort, then you will have trouble getting your own work accepted. This page was created on June 8, 2003, and on that date your preferred Swadesh template did not exist in its current form. This Swadesh list was made by using this template as it existed on June 8, 2003.
In any case, ħanaš was a recent change (last September) made by a single editor and I am sure that he had a reason and felt that it was a reasonable change. I agree with you that the original ثعبان should be restored.
As for "chat symbols", discuss any changes and modifications that you have in mind here on the discussion page first. There is a consensus for the current transliterations, and you should obtain the agreement of the other editors before you make a big change. And no more insults. —Stephen (Talk) 09:47, 12 June 2011 (UTC)
In particular, consider what Ratzd'mishukribo discussed above re Considerations:
  1. Consistency. There should be one symbol per phoneme, with no variance between languages.
  2. Browser compatibility. If there be a choice of symbols used, that more widely legible should be chosen.
If you change the transcription of the Arabic to IPA or some other system, you must carry the same changes across to all of the other languages. On this page we are ignoring the language-specific transliteration systems that we consistently use elsewhere on Wiktionary, because here we want no variance between the languages. If you change the transliteration of the Arabic to a different system but do not change the other languages, it will probably be reverted, so make such changes separately or you chance losing other, unrelated, edits that you made at the same time. —Stephen (Talk) 10:30, 12 June 2011 (UTC)
I apologise if my remark was seen as an insult; it certainly wasn't intended as one. I will add some Syriac (eastern variety) words in the coming days; I have also replaced some archaic ones (pre-10th century) with more common modern words.--Rafy 11:21, 13 June 2011 (UTC)

There must be a Berber language!

Böri (talk) 11:45, 18 May 2012 (UTC)

I've just added Berber, along with Coptic (which shares a lot of isoglosses with Berber), Hausa, Oromo, and Somali. Now we just need some more Chadic languages, perhaps some more Cushitic, and of course, the fascinating Omotic languages of SW Ethiopia. — Stevey7788 (talk) 08:59, 23 February 2015 (UTC)
I feel like, of all Berber languages, Rifian is too innovative (in fact it's one of the most developed of all) for the list, and it's one of the most Arabised too (just count the number of Arabic loanwords, even the numerals!). I have a list of 100 words in the Ghadames language, which is not only one of the most geographically central but also one of the most conservative (it preserves Proto-Berber *β). Should I include it or replace Tarifit?

RFC discussion: January 2014–October 2015

The following discussion has been moved from Wiktionary:Requests for cleanup (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


This currently doesn't say anything about actual languages. It only lists words represented in various scripts. Presumably, Arabic and Hebrew script stand for Arabic and Hebrew language, but there's nothing that says so. —CodeCat 22:16, 13 January 2014 (UTC)

They are just languages mislabeled as scripts. --WikiTiki89 01:08, 14 January 2014 (UTC)[reply]
Seems to be OK now (unstrike the header if it's not). - -sche (discuss) 06:50, 29 October 2015 (UTC)


Separate columns for romanizations?

Is there a specific reason why the romanizations for some languages have their own columns rather than just being in brackets immediately after the native spelling (as they would be in a translation box on an English entry)? It seems to me it would be cleaner and leave more room for additional languages. I figure this is just a holdover and nobody's bothered to put in the time to do it so far; if no one has a problem, I'll be merging the columns in the coming days. --334a (talk) 04:27, 2 August 2019 (UTC)