Wiktionary talk:Reconstructed terms


Archived discussions

advanced/extended/hypothetical etymologies - Proto-Indo-European

This discussion was moved from Wiktionary:Beer parlor.

Is it appropriate to add advanced etymological information to words? By this I mean tracing words back to their hypothesized Proto-Indo-European roots, where possible. Thus, joke comes from the Latin iocus, which in turn is the offspring of the Proto-Indo-European iek. I would like to see this, along with a standard format for adding PIE roots, and, eventually, structured categories of root words, so that there might be a "Category:PIE_iek" and "Category:LA_iocus" (is "Latin" LA?) and "Category:DE_ger" (the root word for jest is strictly Germanic, if I recall correctly) or something similar for words that likely descended from whatever language roots they may be from. I think this could be an incredibly helpful linguistic tool and I'd like to help slowly build upon it manually in my spare time using what references I have at my disposal. Thoughts, please. Jxn 20:42, 2 April 2006 (UTC)

The problem with all proto-languages is that they are highly speculative. Worse, the speculations are largely unverifiable without doing original research, which is outside the scope of Wiktionary. In fact, doing original research is quite hard in itself, since there are virtually no sources to do any research on. Not that it really matters to us, since anything put here must be possible for anybody to verify without doing any original research.
Secondly, I really can't see any reasonable use for it. Looking at List of Indo-European roots just makes me go "OK, and knowing that two seemingly totally different words are related is useful in what way?". Having a category for any words that are cognates because they trace their etymology back to, say, a specific Latin word is one thing. I don't think it is that useful, but at least it is a way to find missing etymologies by looking at the category and realizing that some words are missing. But then you can click on "What links here" on the page for, say, some Latin word if you are really interested.
I think it's great that somebody is interested in adding etymologies, but including proto-languages is taking it too far. It would possibly be useful to have "reverse etymologies" on the page of a word or in the appendix to show how words "fan out" from a common "ancestor", but since words inflect or combine with other words to form new ones, this would very likely look too horrible to be of any real use. Let's start worrying about adding normal etymologies first. BTW there is a page Wiktionary:Etymology that badly needs updating. Feel free to update it or discuss any updates on the talk page. --Patrik Stridvall 22:22, 2 April 2006 (UTC)

You can put a reconstructed proto-form in the etymology. (If you do, please spell it according to the source and cite the source, so people don't quibble with it later.) Yes, Latin is la. There are already a few categories like this; equus is in Category:Latin root equ (there are more categories like this on the Latin Wiktionary; try la:Categoria:Radices, specifically la:Categoria:Radices Latinae). A large problem with putting reconstructed forms in page (or category) titles is their instability—the form varies from source to source and theory to theory. It would be best to put it under the attested forms of the root (and then possibly link them together with an infobox style template, similar to the one at la:Categoria:Radice equ). —Muke Tever 22:42, 2 April 2006 (UTC)

We don't have a policy against including roots as Patrik's comment suggests, and there are quite a few etymologies, e.g. on bark, which include PIE roots. Ncik 22:58, 2 April 2006 (UTC)

I didn't say that we have a policy against it. But as Muke says, "the form varies from source to source and theory to theory", so I think it would be inappropriate to have such things on the pages for English words or, for that matter, on the pages of any living language. Speculation about the further origin when you reach the "end of the line" is OK, so in the case of bark the speculation can go on the page of the Old Norse word, with references. So while we don't have a policy right now, I think we should. And that policy should be "Nothing that is too speculative on the pages of living languages". --Patrik Stridvall 09:21, 3 April 2006 (UTC)
What I find more interesting is adding "cognates" in the etymology section, mentioning related words in related languages. I've done this in a couple of instances, experimentally; see eten or accoil. — Vildricianus 12:00, 3 April 2006 (UTC)
Yes, that might be useful. Many, not to say almost all, English words of Old Norse origin exist in Swedish and, I suspect, in Danish, Norwegian and Icelandic as well. What worries me is that some words of Latin or Greek origin exist in basically all Romance and Germanic languages and possibly also the Slavic languages. In any case, we should probably try to agree on some appropriate formatting for the cognates. --Patrik Stridvall 14:18, 3 April 2006 (UTC)
That "the form varies from source to source and theory to theory" is not to me a problem. The form of "colorise"/"colorize"/"colourise"/"colourize" also varies from source to source and theory to theory but we have a way to handle it: Alternative spellings. We can handle it even better for these since the sources and theories have names. — 19:04, 3 April 2006 (UTC)
For "colorise"/"colorize"/"colourise"/"colourize" we can and eventually will dig up quotes to support each form. For reconstructed proto-forms this is, for obvious reasons, not possible. The credibility of the proto-forms depends entirely on the credibility of whoever reconstructed them, and it can't be verified without doing original research, which is outside the scope of Wiktionary. Actually, it is not really possible to verify them at all by any reasonable definition of the word verify.
You're thinking inside the box. Naturally it is impossible to find quotes for proto-forms in the way that it is for written languages, but it is certainly possible to find citations. A citation for such a form would be to state what dictionary or other scholarly work it is discussed in. Your disclaimer about the credibility of proto-forms is justified and I would recommend a page somewhere to explain this, but original research is not needed, even though Wiktionary, unlike Wikipedia already has plenty of minor original research. — Hippietrail 00:21, 5 April 2006 (UTC)
Note that I have not argued against having the proto-forms on separate pages starting with "*", and I have no problem with pages for dead languages mentioning them as well as linking to them. See also below. -Patrik Stridvall 08:46, 5 April 2006 (UTC)
Filling the pages of dead languages with references to existing theories is one thing, but doing it on living languages is taking it too far. Anybody who has any use for them in research will not trust us as a source, and anybody else will have very little, not to say no, use for them. Cognates are another matter. It is much easier to remember foreign words that have cognates in your native language than other words. I'm trying to improve my French, and words that have cognates are much easier to learn. Sure, most of the time I can guess myself, but knowing for sure would be even better. I'm not trying to tell anybody how he or she should use his or her time, but please, let's spend time on adding the verifiable etymologies first. --Patrik Stridvall 20:01, 3 April 2006 (UTC)
But this is your opinion which you are entitled to. None of your arguments hold up though since they could all be used to deny entry of all kinds of things Wiktionary already includes. And as we all know, contributors spend their time here just exactly how they want to, and they surely always will. — Hippietrail 00:21, 5 April 2006 (UTC)
It is not about denying entry, it is about keeping the pages of living languages free from things that only are useful to a very small minority. But as I have said, I have no problem with having it on the pages of dead languages. As for policy in general, just because we allow entry for something doesn't mean that we should allow anybody to put it anywhere. All prioritizing regarding presentation is to some extent POV but that is unavoidable. In this case, I think my proposal is a reasonable compromise. --Patrik Stridvall 08:46, 5 April 2006 (UTC)

I have done a lot of work on etymologies myself, especially on the Old English entries. I like to see a full Etymology section, including on living languages. However, I note that pages are now being created in ‘Proto-Indo-European’. Personally, when I use PIE forms I do not link them, I just put them in italics. They are in my opinion too conjectural and vary too much from authority to authority. So, although I don't really mind if someone wants to create all these pages, I hope care will be taken that existing etymologies will not have their PIE forms changed just to fit in with the ‘standards’ now being created. Widsith 06:12, 4 April 2006 (UTC)

Thanks all for the input/information. I would like to add that my preference is not to have the *proto-language roots included with the basic definitions of words, but to have possible proto-language roots (perhaps even conflicting hypothesized roots from different sources) included in a separate "extended etymology" section or similar at the bottom of the page. I think adding them above the definition would detract from the main purpose of Wiktionary--word meanings, but I think that well-sourced proto-roots should be included, if only to help spur thought on the interrelationships between words. I think this could make Wiktionary a much more powerful source of information than most standard dictionaries (and, if people are active and watchful enough, allow it to offer high-quality etymological data that not even the OED includes). What are others' thoughts on a standardized separate section for this type of etymological data, if sourced properly? (With the caveat, of course, that users be somehow informed--perhaps by key words such as "possible" or "theorized"--that these proto-roots are not necessarily correct or confirmed.) Jxn 01:59, 8 April 2006 (UTC)

Pages for reconstructed words?

This discussion was moved from Wiktionary:Beer parlor.

Over the past few months, I've noticed a lot of broken links, and never a working one, for the many reconstructed words that linguists so adore. So a few hours ago I tried making a few such pages, and I think it worked pretty well. These are the first ones on Wiktionary that I'm aware of, and both pages have two Proto-Indo-European roots each: *gerə-, *od-. I also made some categories that seemed to be lacking, including Category:Roots, Category:Proto-Indo-European language, and Category:Reconstructions. Before I spend too much more time making more pages like this (which I'll gladly do if there's any interest), I decided to come here and see if anyone has any suggestions, criticism, comments, etc. regarding these changes and additions. Thoughts? -Silence 10:02, 4 April 2006 (UTC)

Reconstructed words are not supposed to be linked in etymologies. They (with perhaps a very few exceptions) specifically fail our criteria for headword inclusion. —Muke Tever 22:39, 4 April 2006 (UTC)
Is there a language code for this language? Does this language have a name (perhaps Proto-Indo-European)? If so, then just put the words in under that language heading. Forget the asterisks. And you'd have to successfully argue (or assert) a change in the Criteria for Inclusion.
If it doesn't have an official language code, then arguably it doesn't belong here, but rather in a WikiBooks text book on Proto-Indo-European roots. IMHO. --Richardb 13:09, 4 April 2006 (UTC)
Reconstructed languages are specifically excluded from the scope of ISO 639 language codes. —Muke Tever 22:39, 4 April 2006 (UTC)
Information. There is an Appendix that already deals with this. Maybe it should just stay as an appendix Wiktionary Appendix:Proto-Indo-European roots--Richardb 13:12, 4 April 2006 (UTC)
"Is there a language code for this language ?" - Not that I know of, but the code "ine" (for "Other Indo-European language") could perhaps work for it. Additionally, just about any linguist in the world will be aware of what language you're talking about if you use the abbreviation PIE, so I don't see any reason we couldn't simply use "PIE" for these purposes, since that doesn't seem to be taken (and probably never will be by anything else, considering the confusion it'd cause) on any of the language code lists I know of. I'm sure that the only reason it doesn't have a code is because it's a reconstructed language, and language codes are generally for attested languages (i.e. ones directly found in speech or writing, not derived from systematic commonalities between existing languages, even when those derivations are extremely sound and near-universally accepted, which is often the case for PIE); it's certainly noteworthy enough, and indeed, increasingly often, many major dictionaries (such as the American Heritage Dictionary) put a great deal of focus on PIE roots (in fact, I bet the main reason most don't is because of space concerns, which isn't an issue for us!), despite AHD being an English-specific, and not general Indo-European, dictionary, so this stuff can be highly valuable even to someone only interested in their own language, and not in comparative linguistics or what-have-you.
"Does this language have a name (perhaps Proto-Indo-European?" - The name that was originally used for it by its native speakers is lost, but Proto-Indo-European is indeed the name that is almost invariably used for this language, yes. (The main other name I've seen used is "Pre-Indo-European".)
"Forget the asterisks." - I'd considered this, but after thinking about the matter, it is my opinion that the asterisks are vitally important, not just for PIE, but for all languages where a form is not directly found in any text, but is just theorized (sometimes very strongly, almost to the point of certainty) to have existed based on etymological evidence. Without the asterisk, which is a very common linguistic convention, that extremely necessary aspect of these words could be lost to casual readers. For example, the Latin verb inodiare is not attested in any source, but we're almost certain that it existed because in odio is attested and later verbs in early French (which later developed into words like "annoy" and "ennui") are very similar both in form and meaning to this, suggesting, based on comparison to similar developments where there are attested "middle stages" for word evolution, that inodiare was probably a Vulgar/Late Latin innovation. But it's still a reconstruction, so we should have a page on it at *inodiare, not inodiare (and since there are no other words spelled the same, a redirect from the latter to the former would probably be merited here), to distinguish it from the numerous Latin verbs that are attested, like amare. The same applies to every language; the convention, already very common on Wiktionary (but not formalized, apparently), of using * before words like Proto-Germanic reconstructions (which I feel should also have their own pages where noteworthy and widely-accepted, for the same reason as PIE), is an important indicator of their nature.
"If it doesn't have an official language code, then arguably it doesn't belong here, but rather in a WikiBooks text book on Proto-Indo-European roots. IMHO." - I disagree. Anything that belongs in a comprehensive dictionary belongs in Wiktionary, and there's no argument one could make that the most successful ancestor language in the history of mankind (that there is any strong indication existed), which hundreds of thousands of words in dozens of languages have clear links to, isn't a valuable part of any comprehensive dictionary. Relegating PIE to WikiBooks would be rendering it completely useless to Wiktionary, as we couldn't then have separate pages for individual roots and thus couldn't effectively discuss or cite which form to use for each, when forms are disputed and when they're widely-accepted, etc. -Silence 15:13, 4 April 2006 (UTC)
As I commented somewhere above, I don't see the need for these forms to exist as pages of their own. But if they do go in they should retain the asterisks, which are important markers of their conjectural nature. Widsith 13:15, 4 April 2006 (UTC)
If not as pages of their own, then as what? I'm certainly willing to help put together a list, but surely you realize that a list has a ridiculous number of limitations, such as being extremely difficult to search through, to link to specific entries in, and to provide detailed information regarding each root. Imagine if instead of having individual entries for English, we just provided an alphabetical list of every English word on a single page, and expected people to track down the word they wanted by scrolling through it. Well, for PIE this would be even worse, since most people will only be familiar with the forms by means of their derivations, and thus an alphabetical list will be nearly useless to all but the most die-hard of linguists. Moreover, many of the phonemes in PIE, such as laryngeals, don't fit in the Latin alphabet and would need to be placed somewhat arbitrarily in such a list. I'm not saying that I oppose using a list, but I oppose only using a list, because it's much less valuable to our readers. -Silence 15:13, 4 April 2006 (UTC)
I do sympathise with what you're saying, and I have a great interest in PIE studies myself. But one reason why, as you say, ‘most people will only be familiar with the forms by means of their derivations’ is because there is little consensus over what the PIE forms should look like. How will you arbitrate between different authorities on the subject? Don't get me wrong – by all means go for it if you are willing to try and deal with this minefield. I am just not sure how useful it can be while there is so much hypothesis and disagreement involved. Widsith 15:35, 4 April 2006 (UTC)
  • I wrote a compelling and masterful eight-paragraph response to all the points on this page, elaborating a large number of my earlier comments, explaining the flaws in all the problems and alternatives (such as the list and category) to having individual pages, and overall demonstrating perfectly why including entries for reconstructed roots would be an absolutely fantastic addition to this dictionary that would open up a whole world of new etymological depth and value to readers. But that was all deleted, so, screw that. Can I just say that I'm right and somehow convince you just with that? =_= I do so love discussion and would love more feedback on the specific pages I've tried out above, but right now I want to stab my eyes with glass and nails. I hate the world.
  • To summarize the last paragraph: People aren't familiar with PIE for the same reason they're not familiar with most etymology: because linguistics isn't exactly a casual, everyday interest for most people. How much consensus there is regarding the forms has nothing to do with how well-known they are, and indeed, if everyone already knew all PIE forms, then there'd be no point listing it because it wouldn't be providing anyone new information. There's really less controversy over many PIE forms than you suggest, but we will arbitrate between different authorities in the same way Wikipedia deals with different POVs, and in the same way we already deal with variant forms for things like color and colour: we provide all widely-attested, noteworthy forms of words, and explain why and how they differ. Just look at *gerə- for an example of this (and of the types of references we can use in general). As for "I am just not sure how useful it can be while there is so much hypothesis and disagreement involved."—about as useful as etymology in general is. If absolutes are what you're looking for, you'll probably be disappointed; a form that's 95% likely, rather than 100%, still merits mentioning, as there are so few things in this world that are truly certain. -Silence 17:23, 4 April 2006 (UTC)

Oh well, I can't see why these don't deserve inclusion. But they'll need to be flagged with something in order to separate them from "standard" entries, they'll need plenty of references and so forth. They should also keep the asterisk. That's what I think, at least. — Vildricianus 09:02, 5 April 2006 (UTC)

I also think we should have entries for PIE roots as long as those are carefully researched and extensively referenced. Ncik 13:33, 5 April 2006 (UTC)

  • My understanding was that we do allow these entries, in the Wiktionary Appendix: pseudo-namespace only. (Richardb stated this above, right?) This obviously means they shouldn't be linked from main namespace entries. --Connel MacKenzie T C 13:56, 5 April 2006 (UTC)
    • I'm willing to negotiate on whether the PIE roots should be in the main encyclopedia-space (which I mainly thought was a good idea because the asterisk already takes care of distinguishing them from normal entries and because it's a lot easier and faster to type than some 20-letter phrase like "Appendix:Proto-Indo-European root" at the start of each page!) or in the Appendix; I can certainly see some advantages to keeping reconstructed forms in a separate namespace. However, regardless of whether they're in the main namespace or the Appendix namespace, it would not be acceptable to not link to those pages! Cross-namespace links are 100% appropriate in this context, as they are directly content-relevant; in fact, hundreds of Wiktionary articles already link to Appendix entries, as if they didn't, how on earth would anyone find the Appendix page they need?! If we were a paper dictionary, we'd direct users to the appropriate Appendix entry when a relevant PIE root was mentioned in an etymology; as an online dictionary, the exact equivalent is to provide a hyperlink leading directly to the appropriate root. I see absolutely no benefit to not linking fully to PIE roots in dictionary entries, where they are directly relevant.
    • Also, one thing that still hasn't been addressed is what to do with reconstructed words that are not PIE, and belong to a language where not all the words are reconstructed: like my example, inodiare (a reconstructed Latin word), above. If the PIE roots go in the appendix, presumably these words will also go in the appendix (where they are noteworthy enough for their own page), but if that's the case, what will the format be for naming them? I can't think of any clear, non-awkward way to consistently name, except for simply "Appendix:*inodiare" (which is pretty convenient, though having the * immediately after the : doesn't look good and could cause the asterisk to be missed). I'm willing to continue this work in Appendix entries, even though I feel it's rather more bureaucracy and convolution than is needed in this case, but only if (1) we decide on a consistent naming scheme for all reconstructed words and roots, not just PIE ones, and (2) we still link to those individual pages from non-Appendix dictionary entries.
    • Also, I agree with Vildricianus. References and the asterisk are both essential. -Silence 14:12, 5 April 2006 (UTC)
  • The way you have done it in *gerə- seems fine to me, as far as it goes. Perhaps it could have a very simple banner somewhere (by template) that states what Proto-Indo-European is about, and a request for translations to NOT be added, and any other applicable restrictions / instructions.--Richardb 15:33, 5 April 2006 (UTC)
    • A banner sounds fine to me, as a good way (that doesn't require excessively long page titles like "Wiktionary Appendix:Proto-Indo-European root *seH₂wel-" (vs. simply "*seH₂wel-" or "*sāwel-")) to make it clear to readers that this is about a reconstructed proto-language root, not an attested form; it could be used to explain the "*" terminology, and perhaps a link to an Appendix page that gives general information on PIE phonetics.
    • I understand concerns with including non-attested forms in the dictionary (though they are attested in the sense of being recorded by numerous reputable and noteworthy publications; they just aren't directly attested in their original language), but I just feel that purely as a matter of practicality and user-friendliness, "*" is just as good as "Wiktionary Appendix: Proto-Indo-European root *" for telling readers that a certain page is about a reconstructed root, and * has the advantage of being infinitely more concise (and thus both easy to link to and to search for on Wiktionary, requiring fewer redirects and piped elaborate links overall) while still carrying the data "reconstructed form" in its title in the form of the simple * (since, as far as I know, we don't use "*" for anything else in titles except *). However, if there's more support for keeping reconstructions in the Wiktionary Appendix: namespace, I'll go along with that, as long as we can decide on how to name (and organize) pages for all reconstructed roots, including ones that aren't PIE. Really, everything gets a lot more complicated if we decide to reserve reconstructed forms for the dictionary's appendix, since we also have to decide on what to change in the layout (i.e. if we have the name of the language in the page title, as is the case with "Wiktionary Appendix:Proto-Indo-European root *X-", we presumably won't also use a section header with the language's name within the page; also, what's to be done when there's more than one reconstructed root with the same form, which is the case with quite a few PIE roots, including the two prototypes I created above?) and deal with all sorts of novel issues, when we already have a well-established, working system at hand in the form of ordinary dictionary-style entries.
    • The only thing I disagree with is "a request for translations to NOT be added"; I don't see why a banner on PIE roots (or reconstructed roots in general) would need such a disclaimer. If a significant translation is lacking, it should certainly be added (preferably with a citation for verification), as is the case with all information on all pages. -Silence 16:55, 5 April 2006 (UTC)
I also don't see what's wrong with translations. For the time being we should allow all sorts of stuff to be added, and then, if problems arise, we can discuss again. But imposing restrictive policies before having any practical experience seems unreasonable. Ncik 02:41, 9 April 2006 (UTC)
  • The asterisk character is used and does have special meaning. It is unacceptable (for this) in the main namespace. Categories in particular are hosed by these entries by default. And not having an ISO 639 code means they do not meet our criteria for inclusion. Therefore, having them with the prefix "Wiktionary Appendix:Proto-Indo-European root *" is appropriate. Having them in the main namespace without that prefix is not appropriate.
I'm absolutely in favour of having these entries in the main namespace. If prefixing hypothesised forms with an asterisk causes technical problems, we should consider dropping it, though. The criteria for inclusion need to be modified. Ncik 02:41, 9 April 2006 (UTC)
  • I don't like the idea of including them in etymologies, as that gives the impression that they are words when in fact they are only theories. --Connel MacKenzie T C 02:44, 7 April 2006 (UTC)
Hypothesised words should of course be linked. That's common sense and we've always done so. Ncik 02:41, 9 April 2006 (UTC)
Ncik, that is simply not true. We include all words in all languages. But to be included here, it has to be a word. A theory about a word is not a word. --Connel MacKenzie T C 02:49, 9 April 2006 (UTC)
As I said, the criteria for inclusion might need to be changed. Arguably, a hypothesised word is a word. Ncik 03:09, 9 April 2006 (UTC)
Well then, go convince Eclecticology. These are certainly more questionable than say Klingon, or Quenya. But even then, it is only a theory being attested, not a word. --Connel MacKenzie T C 06:58, 9 April 2006 (UTC) (edit) 07:03, 9 April 2006 (UTC)
I haven't really stepped in on this yet, but I principally agree with Connel. These PIE hypotheses have no place in the main namespace. Even as an appendix I have doubts. At the very least any such entry for a PIE form should be properly verifiable to ensure that these spaces are not being used for someone's original research on his pet theories of linguistic roots. Eclecticology 18:44, 10 April 2006 (UTC)
  • Since this is obviously an issue that will require a lot of discussion before any sort of consensus can be reached, and since I don't want to go too far (even though I'm eager to waste hours of time creating such pages and linking to them from their derivative pages as soon as possible) with working on PIE roots before there's any agreement on how (or even if) we're to use them, I've created a new page to centralize discussion on this issue so we can eventually hammer out an agreement on this. It's at Wiktionary:Reconstructed terms, which is currently just a place to discuss, but may eventually become a Wiktionary guideline if we can work out an agreement on how Wiktionary should handle unattested, reconstructed words, roots, and phrases.

New discussions


So, the discussion goes on. Anyway, the talk page looks good, but I think the *gerə- page is a better choice than *od-, since it's more thoroughly worked on. Wakuran 10:50, 16 April 2006 (UTC)

  • I agree. The only reason I put *od- at the top is because it doesn't use any unusual characters, and I want a very ordinary-looking page (other than the * and -, of course) so as to avoid any potential confusion. Perhaps I'll make a new PIE page that doesn't have any laryngeals or other special characters in its title, just as an example to put there. -Silence 14:30, 16 April 2006 (UTC)


*gerə- is a nice page. It's good to see that language names and families are now being given, which wasn't the case with *od-. One thing that bothers me is the fact that you are listing borrowed terms as well as etymological descendants. English geranium for example is listed under Greek γεράνος which makes it look as though it evolved from the Greek word. This is obviously not the case in this instance, but with other situations it might be confusing. In Romance languages for example, most words have descended naturally from Latin, but some have been borrowed piecemeal at a late date, and these should be distinguished. For example, the Spanish words palabra and parábola both come from Latin parabola, but while palabra was a natural development of the Latin and was in continuous use, parábola was borrowed into the language in the fifteenth century. Perhaps a way of including these borrowed terms while also distinguishing them would be to put them in brackets? Widsith 08:04, 25 April 2006 (UTC)

I agree that noting borrowings would be beneficial; the only reason I didn't at least mention it in italics in the original is because, as you said, it's obvious in the contexts thus far, since English is not a daughter language of Greek and yet is listed in an indented offshoot from it. Most borrowings will probably be similarly clear: an Italian borrowing from French, for example, will make it clear that the word is not directly taken from Latin because the Italian derivative will be indented from the French one, rather than being at the same level. I think there are a lot of issues with the current derivational layout we could change, though, not just that one: the current system is adequate, but not exceptional, and has some inconsistencies. It is also a little difficult to navigate, since people don't seem to be clear on whether the forms should be alphabetized based on the language, alphabetized based on the word, or just shoved in piecemeal. If we could find a way to integrate a table format into the current system of indentions, that would also probably make it easier to note much of the information that's difficult to fit in right now, like derivational data and meanings. All of the layout, obviously, is entirely fluid right now: the only reason that *gerə- differs at all from *od- in layout is because I haven't bothered to implement some of the later ideas and suggestions there. (I also haven't bothered to move *gerə- to *gerH2 yet.) -Silence 23:56, 17 May 2006 (UTC)

PIE vs. other reconstructions

We need a page to decide on specifically PIE conventions. These will be independent of other reconstructions, such as post-PIE proto-languages (Category:Proto-Germanic) and non-IE reconstructions (Proto-Semitic). PIE entries will require a lot of redirecting because of varying spelling conventions, but agreeing on some convention will greatly reduce that nuisance. Ideally, we should try to stick closely to the conventions already in use on Wikipedia's w:PIE.

I strongly recommend using the "* namespace". It should be clearly apparent that an entry is a reconstruction, but why bother with clumsy entries like Appendix:Proto-Indo-European root *bheu- when simple *bheu- does the job? There can still be links to the "*" entry from bheu and bhew. Entire lists, like Appendix:List of Proto-Indo-European roots, of course do belong in Appendix:. Dbachmann 12:24, 8 December 2006 (UTC)

I've made a brief proposal at Wiktionary:About Proto-Indo-European -- in rather telegraphic style I'm afraid, should be fleshed out. Dbachmann 12:50, 8 December 2006 (UTC)

Result of policy vote

Policy vote concluded 22 January 2006, confirming existing policy: terms in reconstructed languages do not meet CFI, may be entered in Appendices. Robert Ullmann 12:03, 26 January 2007 (UTC)

Wiktionary:Requests for deletion/Others - kept

Kept. See archived discussion. 09:31, 19 January 2008 (UTC)


I've rolled back the most recent edit to this page. In that last section, it says the entries don't go in the main namespace (which that vote did conclude). The added paragraph, I think, was intended to say that PIE is linked from entries in the main namespace (which is already obvious) but was mis-worded to say that those forms go in the main namespace (wrong). --Connel MacKenzie 18:39, 3 February 2008 (UTC)

Connel, you have utterly missed the difference between terms in reconstructed languages e.g. PIE, which is what the vote was about, and what goes in the appendix; and reconstructed terms in existing languages, which do *not* go in the appendix, such as an inferred Latin term. Which is what the added text was about.
Please go read it again, note that it does not say that PIE goes in the main namespace (which would be wrong), and put the text back. Robert Ullmann 08:37, 4 February 2008 (UTC)
I rephrased it a bit. --Connel MacKenzie 09:13, 4 February 2008 (UTC)
Original discussion held here: User talk:Hippietrail#Links to reconstructed terms in non-proto languages.
Please read the very first sentence of this page:
Reconstructed terms are words, roots, and phrases which are not directly attested in their respective languages, but have been reconstructed by linguists through etymological evidence.
And now suddenly the distinction is made between the "terms of reconstructed languages", and the "reconstructed terms of existing languages".
I repeat what I said on Hippietrail's talk page: reconstructed language = set of reconstructed terms. A set of terms reconstructed from the (Vulgar) Latin descendants by means of the comparative method is a language, call it Proto-Romance or whatever.
By what criteria do reconstructed Proto-Romance terms merit inclusion in the main namespace, as opposed to other protolanguages? --Ivan Štambuk 14:20, 7 February 2008 (UTC)
Well, I agree with Ivan's comment on HT's page that unattested terms can't pass RfV and therefore shouldn't appear in the main namespace. That being said, the difference between Vulgar Latin and, say, Proto-Indo-European is that we do have writing samples of Vulgar Latin ("Proto-Romance" ≠ "Vulgar Latin"). Some terms can be attested and some can't. What do we do with a language with terms split across the two namespaces (NS:0 for the attested terms and Appendix: for the unattested/reconstructed terms)? Maybe that's ok, and Category:Late Latin will have NS:0 and Appendix: members? --Bequw¢τ 22:08, 5 March 2008 (UTC)
Encyclopædia Britannica says that "Vulgar Latin is also sometimes called Proto-Romance, although Proto-Romance most often refers to hypothetical reconstructions of the language ancestral to the modern Romance languages rather than to the Vulgar Latin", although the tables on this page make quite a clear distinction between Proto-(Italo)-Western and Proto-Romance. There was a vote a while ago that forbids adding all reconstructed terms into NS:0 (even with an asterisk, in which case they could collide with normal entries such as *nix), which is reasonable, because reconstructed terms are usually not definitely shaped, and could have several different forms depending on the author/notation.
And what category would reconstructed VL terms go into? Proto-Italo-Western Romance or Proto-Western Romance? Maybe putting all of those into :Category:Proto-Romance language, and separating among individual dialects with the title= parameter of {{proto}} and the appropriate subcategories of :Category:Proto-Romance language? --Ivan Štambuk 17:28, 6 March 2008 (UTC)


(Not modifying main page, as it’s policy.)

Note that in etymologies, you can use {{proto}} as usual to refer to earlier forms in the same or other languages; to categorize correctly when the earlier form is in a different (proto-)language, use the appropriate (Wiktionary-specific) language code (these are listed in Category:Appendix-only language templates), such as lang=gem-pro for Proto-Germanic. (This seems to be as of mid-2011.)


—Nils von Barth (nbarth) (talk) 13:16, 16 September 2011 (UTC)

Layout proposal

Entries in "bottom-level" proto-languages, and possibly in all reconstructed languages, should have their ===Etymology=== section replaced with a more general section such as ===Reconstruction notes===. Currently, Etymology sections are often used for discussing points of contention in how a reconstruction is established and how its descendants are derived, but this seems technically incorrect. This type of information, on the other hand, is vital for readers' ability to assess the validity of any proposed reconstruction, and it needs an official place in a protolanguage entry. --Tropylium (talk) 04:46, 3 February 2015 (UTC)