Talk:essersi


RFD discussion: July 2022–December 2023

The following discussion has been moved from Wiktionary:Requests for deletion (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


SOP. I can't find a specifically reflexive entry for this verb in any dictionary. essersi is extremely common e.g. in reverso.net, but all examples are of the form essersi PAST-PARTICIPLE, where si is simply raised up from the past participle and attached to essere. Benwing2 (talk) 01:19, 2 July 2022 (UTC)

Does SOP apply considering that some forms are written without spaces? Even if it always appears as an auxiliary, we have entries for forms like wouldn't, which is an auxiliary where the semantic negation generally can be seen as applying to the main verb. If kept, the definition might need to be adjusted to indicate that this is not used by itself but only occurs in larger constructions. Urszag (talk) 01:30, 2 July 2022 (UTC)
@Urszag Good point. However, if this is the case, then pretty much any transitive verb in Italian can have a reflexive entry for it that does nothing but define it as the reflexive equivalent of the base verb, which seems pointless and is definitely contrary to the way that we currently handle Italian verbs. Benwing2 (talk) 02:39, 2 July 2022 (UTC)
There is a larger point here about morphology that is written without spaces but is entirely predictable, cf. Turkish or Arabic. Benwing2 (talk) 02:41, 2 July 2022 (UTC)
The same logic applies to a large number of inflections that are entirely predictable and do nothing but define themselves as the X equivalent of the base lemma. Where do you draw the line? Theknightwho (talk) 05:24, 2 July 2022 (UTC)
That is a policy question that has never been answered – perhaps also because it is difficult to phrase an applicable and coherent policy. When we say “all words in all languages”, I (choose to) interpret that as “all lexical terms”. That includes in my perception terms that are written with a space, while excluding some that are written without a space, although attestable. Some languages don’t use spaces anyway, so being spaceless is not a usable criterion. An example of an SOP spaceless term is the Turkish noun boşluksuz, which is regularly formed as boşluk (space) + -suz (-less). This is attestable in this sense through actual uses,[1][2][3] but if it was not, that would purely be so by accident: any speaker of Turkish will understand it immediately and not consider it in any way peculiar, its meaning being precisely the sum of its parts. We can contrast this with the equally regularly formed boşluk = boş (empty) + -luk (-ness). Its saving grace is that it has a specialized meaning not fully covered by “emptiness”. If we are to include boşluksuz, why stop there? Why not boşluksuzluk (“spacelessness”)? I assume that a user looking up a term in a foreign language will not be looking for an isolated word picked from a document they cannot understand anyway, but has enough basic understanding of the language to interpret the text if they know the meaning of this specific term, unknown to them, just like ESL learners may need to look up druthers in the sentence, “But if you had your druthers, you wouldn’t use women at Frank Lee at all, right?”[4] and then, having seen our definition, will understand the sentence. Any TSL learner will learn the suffix -suz early on, and will see that boşluksuz is formed like boşluk + -suz; if they encounter the word orospusuz and do not know its meaning, it is because they don’t know the meaning of orospu.  --Lambiam 17:27, 2 July 2022 (UTC)
I think, at the end of the day, it just boils down to whether we want to have a dictionary that can be used by absolutely anyone for anything or rather one that requires some basic knowledge (A1~A2) in a language to act as a complete reference. What speaks for the former is that there is legitimate use for a service like that. I have seen people on here say that they do use Wiktionary as a tool to translate foreign texts by looking up every single word on Wiktionary.
Our current treatment is that almost anything written without spaces goes (in languages whose orthography uses spaces). Notable exceptions are Latin -que and Turkish predicatives (and perhaps Turkish -ir -mez forms; they're written with a space but they are not SOP, their predictability comes from the scheme they belong to, see Talk:aşar aşmaz). On the other hand, there is a strong consensus (one that I personally strongly agree with) that we retain German "SOP" compounds (see WT:ADE#Criteria_for_inclusion). If we were to attempt to formulate cross-lingual rules for exclusion (excluding all "words" belonging to a certain derivational scheme), I'd start with the following criteria:
  1. The scheme must be 100% predictable in terms of morphology, phonetics, semantics and syntax
  2. The scheme must be applicable to every word of a certain PoS
  3. The scheme must be applicable by anyone possessing only fundamental knowledge in a language (very subjective, I know, but nevertheless I think it's a good criterion; if only experts could analyze a certain scheme then we'd be a dictionary for only experts in that language)
If any criterion is violated by a particular scheme, no form belonging to that scheme is excluded (any policy that allows half the derivations to be included but not the other half is absurd in my eyes).
To discuss some examples:
  • Inflectional schemes (for all PoS) in most (natural!) languages are included by their failure to meet 1 and 2 (because there usually is some kind of irregularity)
  • Inflectional schemes for some constructed languages (apart from nominal plurals schemes because they fail to meet 2 thanks to the existence of uncountable nouns) are excluded
  • Turkish predicative forms, -siz, -le (and probably a whole host more) are excluded (-lik however is included by failing to meet 1: the construction can have multiple meanings but not all meanings are invoked in every single derivation)
  • German compounds are included by their failure to meet multiple criteria: the irregular interfixes, the often more restricted meaning than what follows from the constituents, the difficulty of splitting them into their constituents even for people that have some fundamental knowledge etc.
The second criterion would have to be further relaxed (to "every word of a certain PoS or a subclass thereof") to be applicable to essersi (namely reflexive verbs) but note that at that point, Turkish (along with every other language with 100% regular plurals) plurals are also to be deleted.
I also want to point out again that MediaWiki is fundamentally the wrong software to build a dictionary with. If we had better software, we could generate virtual pages on the fly whenever somebody looked up essersi or boşluksuzluk (see Wiktionary:Beer_parlour/2022/January#Should_we_have_entries_for_Turkish_predicative_forms? for more thoughts on this point).
Pinging also @Surjection because I'm curious about the perspective of a speaker of a different highly agglutinative language. — Fytcha T | L | C 19:09, 2 July 2022 (UTC)
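As a rough illustration of the three criteria proposed above, the exclusion test can be sketched in Python. This is only a sketch: the names Scheme and is_excluded are hypothetical, and any criterion value not stated in the discussion is an assumption, marked as such in the comments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """A derivational or inflectional scheme judged against the three criteria."""
    name: str
    fully_predictable: bool     # criterion 1: morphology, phonetics, semantics, syntax
    applies_to_whole_pos: bool  # criterion 2: applicable to every word of a PoS
    usable_by_beginners: bool   # criterion 3: applicable with only fundamental knowledge


def is_excluded(scheme: Scheme) -> bool:
    """A scheme is excluded only if it meets all three criteria.

    If any criterion is violated, no form belonging to the scheme is excluded.
    """
    return (
        scheme.fully_predictable
        and scheme.applies_to_whole_pos
        and scheme.usable_by_beginners
    )


# Values follow the examples in the discussion where stated.
# Values marked "assumed" are not stated there and are illustrative guesses.
schemes = [
    Scheme("Turkish -siz", True, True, True),
    Scheme("Turkish -lik", False, True, True),                 # fails 1; 2 and 3 assumed
    Scheme("natural-language inflection", False, False, True),  # fails 1 and 2; 3 assumed
    Scheme("German compounds", False, True, False),            # fails 1 and 3; 2 assumed
]

excluded = [s.name for s in schemes if is_excluded(s)]
included = [s.name for s in schemes if not is_excluded(s)]
```

Because the test is a plain conjunction, a single failed criterion is enough to keep every form of a scheme, which matches the all-or-nothing treatment argued for above.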
@Fytcha I’m broadly in agreement with your points, and would support a proposal to make that part of CFI - at least after a bit more discussion to iron out any kinks. Another exception (that fits your criteria) is that we exclude English possessives ending -'s, but we don’t exclude plurals despite 95% of them following the schema -s. Theknightwho (talk) 23:01, 2 July 2022 (UTC)
@Benwing2, @Lambiam, @Fytcha, @Urszag, @Theknightwho: apparently there is an archaic essersi that's a pronominal intransitive verb (the only verbs with pronouns attached that are worth having in a dictionary, since their meaning can't be immediately derived from the sum of their parts) and meant the same as essere but with a "valore intensivo" (intensive meaning). I never heard of this verb before looking it up two minutes ago. I don't think the current definition of essersi refers to this verb though. The current essersi is just essere + si and has no place in a dictionary, like @Benwing2 said above. It's a pure SOP and can be deleted. Sartma (talk) 20:38, 2 July 2022 (UTC)
@Sartma, Benwing2: WT:SOP only applies to multi-word expressions (either spaced or hyphenated but not concatenated). The issue with essersi is a different one. We have in the past deleted "put-together things" that are written without spaces but such considerations should be done on a per-class basis, not on a per-word basis. Either exclude all such words or none. And whatever the consensus ends up being, it should be documented in WT:AIT. — Fytcha T | L | C 20:50, 2 July 2022 (UTC)
@Fytcha, Sartma My thoughts are this:
  1. If we really put every form without spaces in Wiktionary, it will start to become unusable in some languages due to the proliferation of non-lemma junk.
  2. The problem with essersi is not only that it's entirely SOP but that it's not even a constituent in most cases. Some examples from context.reverso.net (quoting the first 9 examples without any filtering):
    Qualcuno potrebbe essersi ispirato al killer del camion frigo.
    Deve essersi trovato sotto la pioggia.
    Potrebbe essersi prelevato il sangue per mesi.
    Chiunque potrebbe essersi rintanato in quella fattoria.
    I danni sembrano essersi limitati all'addome inferiore.
    Quel tipo sembra essersi pentito veramente.
    Deve essersi rifugiata nel seminterrato fino alla riapertura.
    Potrebbero essersi finti volontari dei soccorsi.
    Cioè il killer potrebbe essersi tagliato mentre l'aggrediva.
In every one of these examples, si is raised from the following past participle or even further (see clitic climbing). The first example is from ispirarsi (to be inspired). The second example from trovarsi (to be found). In the third, the si is not even a logical part of the verb prelevare (to withdraw) (and there is no non-SOP verb prelevarsi) but is a sort of reflexive of possession (I don't know the proper term) that is logically attached to sangue (blood). You might compare it to Latin -que or -ve, which are lowered from their logical position and attached to the first word of the following constituent, or to English 's in a friend of mine's car; in none of these cases is the clitic attached to the constituent it is logically part of. Benwing2 (talk) 23:56, 2 July 2022 (UTC)
@Benwing2: I agree with everything you wrote. (On a side note, the si in prelevarsi ("to take blood from oneself") is just the normal reflexive pronoun ("to/for oneself"). From an Italian point of view, it's not really attached to sangue, it's more about the action of "taking blood" being done by the subject for the subject, so on a logical level it goes together with the verb.) Sartma (talk) 00:26, 3 July 2022 (UTC)
@Fytcha: That's very true. It's not even an actual SOP. It's just the combination of the auxiliary verb essere + the pronoun of a following pronominal verb... delete. Sartma (talk) 00:05, 3 July 2022 (UTC)
No double voting. Imetsia (talk) 16:22, 8 July 2022 (UTC)
Keep. It's a word you might run across and might want to look up, in a language not known for agglutination (unlike Turkish or indigenous American languages). It's not a clitic either, unlike -'s or Latin -que. I don't really buy the arguments about proliferations of non-lemma forms - that might be the case for other terms in other languages, but this verb only appears to have 13 single-word forms (5 persons of the infinitive + 5 persons of the gerund + 3 imperatives), unless I am missing something. This, that and the other (talk) 07:23, 8 July 2022 (UTC)
Delete for the points made above. Imetsia (talk) 16:22, 8 July 2022 (UTC)
(Notifying GianWiki, SemperBlotto, Ultimateria, Jberkel, Imetsia, Sartma, Catonif): @Fytcha, Theknightwho, This, that and the other I wonder if we can come to a conclusion about this. I just discovered venirsi, for which I think the same issue exists as essersi, doversi and potersi (and there are probably others). If we are to keep these forms, we need a way of indicating that they are non-constituents that happen to be run together according to the spelling rules of the language. The current definitions e.g. of venirsi are radically false: (1) reflexive of venire; (2) to come, to arrive. What about some variant of {{it-compound of}}, which would classify them as non-lemma forms, under CAT:Italian combined forms? Should we modify the categories to emphasize the non-constituent nature of these terms, or is CAT:Italian combined forms enough? Benwing2 (talk) 05:47, 10 January 2023 (UTC)
I'm reading this discussion right after having talked about this exact problem here, in which I vote for delete not for just this, but for every combined verb and verb form with no additional meaning. This is not the RFD of combined forms though, so since those now exist, this essersi, along with venirsi, doversi and potersi should be treated as combined forms, as Benwing suggested. CAT:Italian combined forms seems like the right place. Catonif (talk) 15:28, 10 January 2023 (UTC)
@Benwing2 I'm satisfied with the solution currently in place, but the non-lemma forms like potermi seem like they need attention. Would it not be better to define these using {{it-compound of}} too? This, that and the other (talk) 06:17, 28 December 2023 (UTC)

I'm calling this RFD-resolved - a lot of text, not many actual votes, and an implemented resolution (conversion to form-of entries). This, that and the other (talk) 06:21, 28 December 2023 (UTC)