Appendix talk:Romance doublets

From Wiktionary, the free dictionary

@Word dewd544, Ungoliant MMDCCLXIV and whoever is interested: do you think this might be interesting? This is obviously very ugly, messy and incomplete, but the most important question is: is it useless? --Per utramque cavernam (talk) 00:22, 12 December 2017 (UTC)[reply]

It’s definitely interesting. It is somewhat useful, as long as it retains information that is not easily found on the doublet categories. (pinging also @Romanophile).
I propose that we add two extra columns: semi-learned borrowings and indirect borrowings (especially the former). — Ungoliant (falai) 00:28, 12 December 2017 (UTC)[reply]
It could potentially be useful since it more clearly compares/contrasts them, as opposed to someone having to identify them as part of the long category lists. I made some quick fixes to what was there so far also. Again when it comes to the topic of borrowing, at some point we're going to inevitably run into some words that we're unsure of, so it may be tricky sorting some. The semi-learned category could be useful (I'm unsure about where Portuguese empregar would fit). Also I found out there was a variant of Italian liberare with a 'v' used in the past, with a different meaning, which raises some questions about the standard form. Either way, it's not something I'm gonna devote too much time to for the moment. Word dewd544 (talk) 02:36, 12 December 2017 (UTC)[reply]
@Ungoliant MMDCCLXIV, Word dewd544: Yes, I just took some clean cases to illustrate the idea, but if we want to be exhaustive and accurate, it's going to require more columns, and a lot more work and musing.
I've also created Appendix:French doublets (hopefully Appendix:Italian doublets, Appendix:Portuguese doublets, etc., will follow), where I'm toying with different ideas and ways of looking at things, but as you can see it's a big jumble, and there's a lot to disentangle. --Per utramque cavernam (talk) 02:58, 12 December 2017 (UTC)[reply]
Sounds good. One thing to consider is how far are we stretching the definition of doublets? There could be quite a few in one language (counting loanwords from sister languages), in which case they wouldn't technically be doublets anymore but rather triplets, quadruplets, etc? Also, if a language has a descendant in either the borrowed category or the inherited one but not both, I'm guessing it shouldn't be listed here? Trying to consider how to handle the case of Romanian pansa (dress a wound; bandage), for example, which was taken from French panser, a specialized use of penser. Technically it would still be a doublet of păsa, although for some reason it feels a bit odd or awkward to consider it so. Guess that's why Ungoliant mentioned having a separate category for indirect borrowings... Romanian complicates things as many of its neologisms are simultaneously borrowed from French, Italian, and Latin, and adapted partly to existing Romanian rules. But in this case it was clearly taken solely from French. Anyway I added this case but I'm not sure I like it like that (however adding a separate third column for an indirect borrowing just for Romanian would make it look a bit weird wouldn't it?). Then there's also the case of rare or archaic words and the question of whether they should be mentioned/listed here simply because they technically qualify as doublets (for now I just put a qualifier in parentheses after). Feel free to mess around with these and see what you think may work best to make a clear, decipherable presentation overall. Word dewd544 (talk) 07:17, 12 December 2017 (UTC)[reply]
Edit- I must say I also like the French doublets page a lot. There's a lot of detail in there that can be quite useful. Word dewd544 (talk) 07:48, 12 December 2017 (UTC)[reply]
Another question: do you want to limit the languages to the “big 6” or can any Romance language be added? — Ungoliant (falai) 10:55, 12 December 2017 (UTC)[reply]
@Ungoliant MMDCCLXIV: I certainly want to put everything there, not just the big six! I only used those out of convenience: I tend to stay away from lesser known languages, for which we often don't have entries yet (see ratio, there's interesting data there, but lots of red links), and whose orthography isn't always standardised. I'm really not all that knowledgeable anyway; I know Latin, French, a little bit of Italian, and that's it!
The big six should perhaps be displayed more prominently than the others, though. Maybe we could always put these first, at the expense of the alphabetical order (Friulian below Portuguese, for instance)? --Per utramque cavernam (talk) 12:44, 12 December 2017 (UTC)[reply]
That works. Or we could embolden them. — Ungoliant (falai) 12:49, 12 December 2017 (UTC)[reply]
I was originally thinking of just sticking to the big six, but I suppose there's no real good reason to do so. Word dewd544 (talk) 18:05, 12 December 2017 (UTC)[reply]
@Ungoliant MMDCCLXIV, Word dewd544:
  • Bolding isn't very easy on the eye, but neither are the tables themselves so I don't know :p. But it's definitely strange to have the names of the big six bolded in some tables, and not in others.
  • Does Sardinian have a standardised orthography? Wikipedia says that "Some attempts have been made to introduce a standardized writing system for administrative purposes by combining the two Sardinian varieties, like the LSU and LSC, but they have not been generally acknowledged by native speakers." Does it mean that each of the two varieties has its own standard, or that it's an absolute free-for-all?
  • Also, I'm going back to the Wiktionary:Beer_parlour/2017/October#Section_"Descendants" convo again: ordering the descendants historically and not simply alphabetically makes it slightly harder to parse when there are doublets. Compare these two versions of recuperāre: historical version (I'm not sure about the Italian words, nor about Portuguese recobrar) vs. alphabetical version. And of course you don't notice as easily that there are doublets, but maybe that's not a problem if we have these Appendices pages?
  • Perhaps @Robbie SWE will be interested as well? --Per utramque cavernam (talk) 17:57, 12 December 2017 (UTC)[reply]
I shudder to think how insane it's going to be to handle the case of magister... Word dewd544 (talk) 18:56, 12 December 2017 (UTC)[reply]

Thanks for the consideration @Per utramque cavernam! You guys are doing an awesome job so far, I'll take a look and see where I can contribute. --Robbie SWE (talk) 20:09, 12 December 2017 (UTC)[reply]

Reference talk on semicultismos

The more I think about it, for many if not most cases of "semi-learned" words or semicultismos, the main thing that separates them from the "full" borrowing category here is simply the passage of time within the language... like I suspect with Portuguese espadua. They gradually adapted to some of the changes of the language, but not all, since they entered at a later time than the inherited ones. I mean, technically, can't even French état count as a sort of semi-learned word? It entered in Old French as estat and by modern French lost the 's' (with the final consonant becoming silent; not sure when exactly this development happened in French's timeline), as would also happen to inherited terms. Unlike the other Romance borrowings from Latin status, which either preserve the word intact or add an initial 'e', the French one stands out and is much more "Frenchified"/"Gallicized". Of course there are other semi-learned terms that were purposely adapted to a given Romance language when borrowed, which is a different case. Word dewd544 (talk) 03:01, 14 December 2017 (UTC)[reply]

To do

Iulius, spatula, palus (*padule), factus, plenus, flamma, caput, fabulor, lacuna, magister, solidus, fluxus, benedictus, pacifico, ligo. — Ungoliant (falai) 20:42, 12 December 2017 (UTC)[reply]

Regarding blasphemo, I'm unsure about the Portuguese lastimar. Some sources mention it as a borrowing from Spanish. Anyone have any other sources? Word dewd544 (talk) 00:12, 13 December 2017 (UTC)[reply]
Where did you find that? My sources do not mention a borrowing from Spanish (except a single sense used in Rio Grande do Sul). — Ungoliant (falai) 00:16, 13 December 2017 (UTC)[reply]
What's a good source for Portuguese etymologies that is freely accessible? I know I had a decent Brazilian one that went into more depth (had it bookmarked on an old computer) but I can't seem to find it anymore. The one I was talking about was this https://www.infopedia.pt/dicionarios/lingua-portuguesa/lastimar, but I've found that it can occasionally be a bit inaccurate or questionable. Anyway, in general the shift of bl to just l in Ibero-Romance isn't exactly normal, is it? Word dewd544 (talk) 05:36, 13 December 2017 (UTC)[reply]
@Word dewd544, Antenor Nascentes’ Dicionário etimológico da língua portuguesa is by far the best available online (archive.org), and Dicionário Aulete (www.aulete.com.br) is very comprehensive (very rarely I run across an etymology that is a bit iffy; it’s always a good idea to double check with Nascentes if it looks hard to believe). You can also search for articles that discuss individual words at Google Scholar, but you are unlikely to find individual words you’re looking for. I have access to some etymological material at my local library. — Ungoliant (falai) 11:32, 13 December 2017 (UTC)[reply]
I see, thank you. Word dewd544 (talk) 17:18, 13 December 2017 (UTC)[reply]
noverare v. trans. ora non pop., numerare.
Annoverare (Dante); annovero (ant.) « conto » (Seneca, Pist.).
Da nŭmĕrāre (lat.). Da nŭmĕrus venne nòvero (ora non pop.) « numero » (G. Vill.), mentre, per introduzione letteraria, ne venne número (Giamboni), detto anche dell’armonia del verso e della prosa (Varchi). Numerare da nŭmĕrāre; numeroso da nŭmĕrōsus.
nòvero v. s. noverare.
and
nòvero Forma varia di NUMERO (= fr nombre) con sostituzione di V a M.
Numero.
Deriv. Annoveràre.
REW doesn’t list any other form with a V, but there are some pretty convoluted developments: embruar, lombrare, orná, drombär, armnär, ... — Ungoliant (falai) 01:52, 13 December 2017 (UTC)[reply]
Dissimilation between the two nasals makes the most sense to me. I recall some other cases of m > b > v, which isn't that unusual. Also, this brings up another issue: regarding numerare, the Italian seems to be the only significant one with a doublet in this case, most of the others being either borrowings or inherited; the case of Old Spanish is still unclear. Should we mention a Latin word on this appendix where the doublet is only in one language? Word dewd544 (talk) 05:36, 13 December 2017 (UTC)[reply]
@Word dewd544: I meant to answer that question sooner. What I had in mind with this page was to list only pairs of doublets that are shared by at least two Romance languages. But doublets that are only found in French (grêle–gracile) belong to Appendix:French doublets, in Italian (noverare–numerare) to Appendix:Italian doublets, etc. Otherwise this page would become huge, and redundant with the single language Appendices. --Per utramque cavernam (talk) 12:20, 13 December 2017 (UTC)[reply]
Also, to answer another part of your question above: I think we should ideally list only true doublets; i.e. one is a borrowing (or semi-borrowing, or semi-learned form) from Latin, and the other the proper inherited form of that language. That would make us exclude the Romanian pair pansa–păsa (not inherited, but borrowed from French) from the table for pensāre, or Portuguese preensão–prisão (not inherited, but again borrowed from French). What would be interesting to me would be to see all the cases of independent appearance of doublets. The idea is: "there is no law that says 'if Portuguese has a pair of doublets for this word, French must have it too', so it's funny that it's been the case so many times anyway".
I don't know if that makes sense? But I'm interested in what you guys think of course. --Per utramque cavernam (talk) 12:40, 13 December 2017 (UTC)[reply]
Finally, about the case of rare or archaic words: I do think they belong here; we have the footnotes and parentheses to introduce the necessary caveats.
In Appendix:French doublets, I created a separate table for them (#6 in the table of contents) because it's true that I don't want to clutter the main list with obscure words (as I explained on the talk page). I think it's good to keep a list of simple, straightforward and unequivocal examples, which could possibly be useful for students or laymen. (rédemption–rançon is a very nice and illuminating example, but let's face it, who's gonna be interested in a pair such as mense–moise? I didn't even know those terms before...).
But here I don't think we have to put them away; it's truly a comparative list at heart, and what matters is that the words exist, not that they're common, etc. --Per utramque cavernam (talk) 13:03, 13 December 2017 (UTC)[reply]
I agree with all of that. — Ungoliant (falai) 13:06, 13 December 2017 (UTC)[reply]
Understood. Word dewd544 (talk) 17:18, 13 December 2017 (UTC)[reply]
  • A little side note — would Latin aquila fit here? I'm mainly thinking of Romanian acvilă vs. aceră, but I suspect the situation might be the same for the other Romance languages too. --Robbie SWE (talk) 21:02, 13 December 2017 (UTC)[reply]
    Well the problem with that is most of the other languages don't have doublets afaik, except for perhaps the case of Old French, but we're not using Old versions of languages are we? As a general rule, I guess we're agreeing to list those that affect at least two languages. Word dewd544 (talk) 01:48, 14 December 2017 (UTC)[reply]
    I see what you mean @Word dewd544. Keep up the good work! --Robbie SWE (talk) 19:09, 14 December 2017 (UTC)[reply]
    @Word dewd544: To be honest it bothers me a little that we don't display Old French, Old Spanish, etc. But I can't think of a clever layout. --Per utramque cavernam (talk) 18:13, 14 March 2018 (UTC)[reply]
    @Per utramque cavernam: Yeah but it's also a bit inconsistent how some languages have "Old" versions and others don't. Like Spanish, French, and Portuguese do, but Romanian and Italian don't. And I guess Catalan does too, but it's not used often, compared to Old Occitan. So for languages without Old stages, everything will fall under that actual language as just archaic or obsolete, regardless of the dating being on par with that of say Old Spanish. It's true Romanian only started being attested in the early 16th century, by the time other languages were already out of their "Old" phase, so it doesn't apply as much to it, but Italian has been written for considerably longer. Word dewd544 (talk) 19:43, 14 March 2018 (UTC)[reply]
  • Another good one to add is crassus. I always thought that the forms beginning with 'c' in Romance were borrowings, while those with 'g' were the inherited ones, through a Vulgar Latin *grassus, like French and Romanian gras. But oddly, the TLFi doesn't explicitly mention French crasse as a borrowing... Any thoughts on this? Word dewd544 (talk) 20:35, 14 December 2017 (UTC)[reply]
    @Word dewd544: Could the forms with initial g be analogical with gros (the obvious problem being that it works only for French... or does it? What's the etymon of gros?)?
    If French crever is inherited from crepo, or crêpe from crispus, it means there's no sound law for initial voicing, so crasse could be inherited. A rather hasty reasoning, but what do you think? --Per utramque cavernam (talk) 12:08, 16 December 2017 (UTC)[reply]
    That's true. As other languages' dictionaries say, the forms with 'g' are likely due in part to analogy with grossus, which was probably of ultimately Germanic origin. But it was absorbed early enough to affect Romanian as well. Also, maybe the senses of crasse (and Old French cras) that had to do with "fat", "thick", "dense" were inherited, but I'm not sure about the more abstract sense of "crass". Oftentimes the meaning changes somewhat in inherited words. Anyway maybe French is unique in this, since the Spanish etymological dictionary I used certainly mentions craso as a "cultismo", and with Italian, apparently grasso was attested as early as the 12th century in the TLIO, while crasso was not attested until 1357. And in Romanian's case, it was obviously taken from French recently. Word dewd544 (talk) 17:50, 16 December 2017 (UTC)[reply]
  • I'm a bit bothered by the TLFi etymology of French assembler. Do we really need to reconstruct a Vulgar Latin *assimulo? After all, assimilo–assimulo are attested. TLFi writes that "assimilavit se rattache à similis « semblable » et non à simul", but similis is itself related to simul, so that's a bit weak. Is there such a semantic gap between Classical assimilo–assimulo and Vulgar *assimulo that we need to posit a second verb coined entirely anew?
Classical assimilo: "to make similar" > "to compare, to put side by side" > "to put together, to bring together" doesn't seem extremely far-fetched to me. --Per utramque cavernam (talk) 12:27, 16 December 2017 (UTC)[reply]
Yeah I don't know how I feel about making a separate new VL. entry for it, especially as it was attested. But I noticed that has happened with some other words, if the Romance descendants have a somewhat different or specialized meaning. Also, I removed the Ibero-Romance ensamblar since those were taken from Old French ensembler. Word dewd544 (talk) 17:50, 16 December 2017 (UTC)[reply]
  • Regarding Italian artiglio... Most Italian dictionaries seem to list it as a borrowing from Old Provençal, and it does look to at least have some kind of Gallo-Romance influence, as the expected form would be something like *artecchio. However a few do just list the Latin source without an intermediate. Perhaps it's one of those odd cases like coniglio (which also looks to have foreign influence upon first sight, with an expected *conecchio, but perhaps not?). Whatever the case may be, this would seemingly affect the status of Sicilian artigghiu too, which seems a parallel form of the Italian. Word dewd544 (talk) 17:08, 22 December 2017 (UTC)[reply]
    About French article, I wonder if we should move it to the "semi-learned borrowing" column. --Per utramque cavernam (talk) 20:35, 22 December 2017 (UTC)[reply]

Created entries that need etymology

I'm glad y'all compiled this. But I don't really do etymologies, so here you go! Ultimateria (talk) 22:32, 17 December 2017 (UTC)[reply]

Counterparts in English

Nice list. I noticed that several of these had counterparts in English that were not yet marked as doublets, such as entire and integer, so I added {{doublet}} to their etymology sections. — Eru·tuon 22:05, 23 December 2017 (UTC)[reply]

@Erutuon: I suspect English could easily be the language with the most doublets, but I find them a bit less interesting since they often simply reflect the Romance (especially French) situation. What I'd be very interested in would be cases of Germanic doublets. --Per utramque cavernam (talk) 20:01, 24 December 2017 (UTC)[reply]
@Per utramque cavernam: There are some Germanic doublets that involve palatalization: ditch and dike, church and kirk, shirt and skirt, witch and wicca, edge and egg; to stretch the concept a bit, seek and beseech. Unfortunately, these aren't quite as interesting as some of the Romance ones. — Eru·tuon 23:41, 24 December 2017 (UTC)[reply]

Technical question

@Ungoliant MMDCCLXIV Would you know how to fuse the cells of the lines "See also"? --Per utramque cavernam (talk) 14:48, 17 December 2017 (UTC)[reply]

Like this:
| colspan="3" | See also
Ungoliant (falai) 15:02, 17 December 2017 (UTC)[reply]
@Ungoliant MMDCCLXIV: Thanks! There are quite a few Portuguese red links in there, if you feel like doing some of them... :p --Per utramque cavernam (talk) 15:21, 17 December 2017 (UTC)[reply]
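
For readers landing on this thread later, here is a minimal sketch of how the colspan row suggested above could sit inside a three-column wikitable. The column headings and the French row (taken from the rédemption–rançon pair mentioned elsewhere on this page) are illustrative assumptions and may not match the appendix's actual layout.

```wikitext
{| class="wikitable"
! Language !! Inherited !! Borrowed
|-
| French || rançon || rédemption
|-
<!-- one cell spanning all three columns -->
| colspan="3" | See also
|}
```

The value of colspan should equal the number of columns in the table; a smaller value leaves the remaining cells of that row empty.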

Module error

@Chuck Entz I've no idea why this page appears in CAT:E. Too many links? --Per utramque cavernam (talk) 23:24, 24 December 2017 (UTC)[reply]

Mh, sorry, hadn't looked close enough. Too many expensive function calls indeed. We'll have to find a solution, I guess... --Per utramque cavernam (talk) 23:27, 24 December 2017 (UTC)[reply]
@Per utramque cavernam: The only solution right now is to add the page's title to the list of exceptions in Template:redlink category. — Eru·tuon 19:55, 26 December 2017 (UTC)[reply]

Possible Romanian doublet at coagulō

Is it appropriate if I add Romanian coagula (borrowed) and închega (inherited, possibly through Vulgar Latin root *inclagāre, from metathesis of *incoāglāre, alternatively, from în- + *clagāre < coāgulāre)? --Robbie SWE (talk) 17:59, 28 December 2017 (UTC)[reply]

@Robbie SWE: It's tempting, but I would advise against it because of the prefix in închega. You could put a dash for the inherited term and mention închega in a footnote, though. --2A02:2788:A4:F44:788A:2C1E:651C:944F 22:15, 28 December 2017 (UTC)[reply]
@Robbie SWE: I think it's fair to list it. The thing about words prefixed with în- in Romanian is that in many cases they may very well have been added later, as an internal development in the early language (it's fairly common), rather than being directly from a supposed Vulgar Latin construction. A lot of the entries in the DEX mention a root *incoagulare simply because they see the prefix, which is pretty superficial. I tend not to use the first few entries because I've come to see that they're not always the most accurate; rather I look for the more detailed ones with the source 'Dicționarul etimologic român'. The more in depth etymologies do mention that it could have been a later addition to the base Vulgar Latin derivative of coagulare. Word dewd544 (talk) 17:14, 3 January 2018 (UTC)[reply]

error in module

I'm not sure why this error came up when I tried adding a new entry for cannabis. Lua error in Module:doublet_table at line 186: No language with name .

The code looks exactly the same format as the preceding and following ones. I don't get it. Did something change when you guys re-arranged the layout of the code? Word dewd544 (talk) 19:28, 19 January 2018 (UTC)[reply]

@Word dewd544: Fixed; it had to do with the number of columns. --Per utramque cavernam (talk) 19:32, 19 January 2018 (UTC)[reply]
Thanks I got it now. Word dewd544 (talk) 19:53, 19 January 2018 (UTC)[reply]

Error

@Word dewd544, your last edit caused an error. Thanks! —*i̯óh₁n̥C[5] 19:01, 8 March 2018 (UTC)[reply]

I know but I'm too lazy and can't really be bothered to fix it. Besides whoever wrote the script for that is probably at fault and made a mistake, not me. Word dewd544 (talk) 21:49, 8 March 2018 (UTC)[reply]
Sorry if I sounded rude there, I was just frustrated with that template not working. Does it not accept five columns? Everything seems right, compared to the other ones. The other reason I said that is because I don't even think that word is a good candidate for this list, given that it doesn't have more than one (if that) inherited descendant. Word dewd544 (talk) 04:33, 9 March 2018 (UTC)[reply]
@Word dewd544: Lol, I actually thought you were joking! There was one pipe too many: see diff. But I've removed the table anyway now. --Per utramque cavernam (talk) 18:11, 9 March 2018 (UTC)[reply]
@Word dewd544: Now the module will complain if you enter the wrong number of cells, and thus restore my reputation. — Eru·tuon 20:21, 9 March 2018 (UTC)[reply]
@Per utramque cavernam: Ah, right, you don't have to end the line with a pipe. Whoops. And good, that works for both of us; it'll save me a lot of frustration. Nice to know what exactly it is you're doing wrong haha. Word dewd544 (talk) 17:49, 10 March 2018 (UTC)[reply]
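
A stray pipe of this kind is easy to miss because MediaWiki treats every pipe in a template call as the start of another parameter, so a trailing pipe passes one extra, empty value. A minimal illustration, assuming a hypothetical template named "example row" (this is not the template or module the appendix actually uses):

```wikitext
<!-- three positional parameters -->
{{example row|fr|rançon|rédemption}}
<!-- four positional parameters: the last one is an empty string -->
{{example row|fr|rançon|rédemption|}}
```

A module that counts cells per row, as described above, would see the second call as one cell too many.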