Wiktionary:Beer parlour/2018/December: difference between revisions
::: The issue gets complicated however because there is not only code-switching for Wiktionary but there is also Translingual: You could make a case for “marketing” being Translingual and not only English. I have argued already ({{section link|User talk:Fay Freak#Translingual}}) for grammatical terms like ''genitivus absolutus'', ''status constructus'' and the like being Translingual in the first place. Maybe “marketing” is translingual because teachers of business and marketing have made it so ex cathedra, which is why it is used in Greek, never able to become Greek. [[User:Fay Freak|Fay Freak]] ([[User talk:Fay Freak|talk]]) 20:02, 10 December 2018 (UTC) |
== Linking elements of a term in <nowiki>{{en-noun}}</nowiki> == |
Revision as of 20:03, 10 December 2018
References for Vietnamese readings listed under Template:vi-readings
I would like to add superscript references for readings of Vietnamese Han characters using the following code as a suggestion:
{{vi-readings|rs=老04 | hanviet = giả - tdcn | nom = giả - tdcn;gdhn, giã - tdcn, giở - tdcn, rả - tdcn, trả - gdhn;btcn, dã - gdhn }}
The abbreviations used are:
tdcn = {{vi-ref|Nguyen (2014).}}
gdhn = {{vi-ref|Trần (2004).}}
btcn = {{vi-ref|Hồ (1976).}}
The desired output using 者 as an example is as follows:
Han character
者: Hán Việt readings: giả[1]
者: Nôm readings: giả[1][2], giã[1], giở[1], rả[1], trả[2][3], dã[2]
References
Currently, this is also achievable using the bulkier code below:
{{vi-readings|rs=老04 | hanviet = [[giả#Vietnamese|giả]]<ref name="tdcn">{{vi-ref|Nguyen (2014).}}</ref> | nom = [[giả#Vietnamese|giả]]<ref name="tdcn"/><ref name="gdhn">{{vi-ref|Trần (2004).}}</ref>, [[giã#Vietnamese|giã]]<ref name="tdcn"/>, [[giở#Vietnamese|giở]]<ref name="tdcn"/>, [[rả#Vietnamese|rả]]<ref name="tdcn"/>, [[trả#Vietnamese|trả]]<ref name="gdhn"/><ref name="btcn">{{vi-ref|Hồ (1976).}}</ref>, [[dã#Vietnamese|dã]]<ref name="gdhn"/> }}
If possible, could someone edit Module:vi so that the suggested code in the first paragraph would give the desired output? KevinUp (talk) 15:49, 1 December 2018 (UTC)
- @Suzukaze-c Hi. If you have the time, would you mind comparing the desired output above with 者#Vietnamese? I can't figure out how to implement this within the module. KevinUp (talk) 06:50, 7 December 2018 (UTC)
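The parsing that the suggested shorthand would need can be sketched as follows. This is only an illustration in Python, not the actual Module:vi (which is written in Lua); the function names `parse_readings`, `number_refs` and `render` are hypothetical, and the sketch assumes that readings are separated by commas, that a reading is separated from its reference keys by " - ", and that several keys are separated by semicolons, as in the suggested code above.

```python
def parse_readings(value):
    """Split 'giả - tdcn;gdhn, giã - tdcn' into [(reading, [reference keys])]."""
    result = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        # Text before " - " is the reading; text after it lists the reference keys.
        reading, _, keys = item.partition(" - ")
        refs = [key.strip() for key in keys.split(";") if key.strip()]
        result.append((reading.strip(), refs))
    return result


def number_refs(*parsed_lists):
    """Number reference keys in order of first appearance (like named <ref> tags)."""
    numbers = {}
    for parsed in parsed_lists:
        for _, refs in parsed:
            for key in refs:
                numbers.setdefault(key, len(numbers) + 1)
    return numbers


def render(parsed, numbers):
    """Render readings followed by bracketed reference numbers."""
    return ", ".join(
        reading + "".join(f"[{numbers[key]}]" for key in refs)
        for reading, refs in parsed
    )


hanviet = parse_readings("giả - tdcn")
nom = parse_readings(
    "giả - tdcn;gdhn, giã - tdcn, giở - tdcn, rả - tdcn, trả - gdhn;btcn, dã - gdhn"
)
numbers = number_refs(hanviet, nom)
print(render(hanviet, numbers))
print(render(nom, numbers))
```

Numbering the keys by first appearance is what makes tdcn, gdhn and btcn come out as [1], [2] and [3] in the desired output; a real implementation would emit named `<ref>` tags and leave the numbering to the reference list.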
unchanged plural
What exactly does "unchanged plural" mean in the Usage note for craft? That's not the general terminology used in Wkt, is it? --Backinstadiums (talk) 16:38, 2 December 2018 (UTC)
- I've changed it to be explicit: "The plural craft is used to refer to vehicles. All other senses use the plural crafts." Ultimateria (talk) 19:12, 2 December 2018 (UTC)
Inevitable discussion about reference works from non-Latin cultures
Given the situation where issue |lang= in {{quote-web}} in the Grease-pit page of this month insinuates opening all reference templates it has become opportune to uniformize their content. It has caught my eye that there are lurking multiple fashions of displaying references for cases of a work published in a script that is not the one of the Romans, namely, the author name was written in a certain script and the title of course too, but to my great surprise and contrary to Wiktionary’s usual laudable Unicode- and internet-standard compliance I encountered that there were reference templates here already created that did not even include the original title but wrapped it in {{xlit}} so that only a transliteration of it remained, and the same has also been done with titles of their authors, so that I could not recognize any of the books and almost did not find already created templates by Wiktionary’s search function, being already prepared – in vain – to create the templates.
So I reasoned that, since we are in late 2018 and our letter case is unlimited in what concerns languages that have in the Modern Age been used for pursuing science, the templates must all be uniformized so that the original title is displayed, opining also that transliterations are to be discarded for scripts that are unambiguous since they are no gain for anyone (if you don’t know the language you don’t know the transcription either, short of negligible cases when one is literate in Latin script only but not the actual script for a non-Latin-script-written language one knows) and “En.Wiktionary entries already have too much wasted space”, as @-sche acutely observed on the Grease pit page of this month and also has been voiced as a cause of displeasure.
There might be little experience in reference sections in any works containing non-Latin references, but naively and naturally and looking at how my computer does it, I always ordered references by the Latin names first and then the Cyrillic ones, and so I have come to the belief that the original-script author names can be had easily. People might however be more appealed to by Latin-transliterated names, but even then I am apprehensive of those being less iconic, but this is of limited importance for names. It can very well be grave in logographic writing systems some of which are still in use, for particularly people’s names can have the most arbitrary characters and it would be utterly impossible to reconstruct the original name without browsing the web again only to find a name which a Wiktionary editor has needlessly left out. Currently the Japanese reference templates have all formats.
So how does Wiktionary look upon all these factoids? What should references have, perhaps with distinctions by writing systems? I’d like to see completely removed the transcriptions of the titles of alphabetic and syllabaric scripts because they have no non-theoretical uses and would sort references by Unicode (I don’t actually know how Chinese sort their Chinese reference sections, and perhaps one feels that Japanese titles transliterated could somehow help, I avoid talking about those scripts). Plus why have people even thought that |title= and the author parameters would be the correct place to put transliterations or transcriptions? This would easily be different parameters |tr-title=, |tr-author= and so on that can be expanded for those who need it (whose existence I deny), and this would make reference templates use more expected parameters. Which of course entails as a minimum that we have original-script titles – come on, are readers supposed to reverse-transliterate titles? Author titles perhaps in both since one might not know the script but the author from other publications in other, Latin-written languages? But this is not generally true, though there are often adapted author names around. Avicenna is quite iconic, no need for اِبْن سِينَا (ibn sīnā), but that’s more often for classics and applicable to quotation templates. What does iconicity tell us here? And I have not even mentioned how often title-translations should be done, which have a parameter already. There is still this issue around of quotation templates containing bare long titles, and there are a few “click to expand” solutions for these as I remember. Pinging some people I find interesting to hear or interested: @Sarri.greek, Eirikr, Sgconlaw, Dan Polansky. Fay Freak (talk) 00:29, 5 December 2018 (UTC)
- I’m sorry, could you summarize all that? I’m having trouble understanding what your concerns are. — SGconlaw (talk) 01:51, 5 December 2018 (UTC)
- @Sgconlaw I wanted to uniformize references of books written in a non-Latin alphabet a bit, pointing out the questions whether the original script of a) the author name b) the title should be shown, and c) whether transliterations of the author names should be shown d) whether transliterations of the titles should be shown. I was just formulating much pros and contras. My result has been to vehemently affirm b), deny d) (hardly valuable clutter), lean to a), I am rather open to c), but it would need to look good enough (like on the Chinese reference page KevinUp has linked it is great but we need |tr-author= for this I think). Fay Freak (talk) 19:35, 5 December 2018 (UTC)
- I say show both the original and the transliteration, in the future we will be able to customize this to everyone's satisfaction with css magic. Crom daba (talk) 03:42, 5 December 2018 (UTC)
- Yes, and at the least transliteration does not belong to |title=, otherwise there won’t be CSS magic. There need to be separate fields for original titles and author names and their transliterations, I don’t think I can be wrong here, @Sgconlaw. Now supra there are the arguments for displaying. The decision about display should not be influenced by limited forms of saving the information. Fay Freak (talk) 19:35, 5 December 2018 (UTC)
- Here are the formats used for Chinese references: Wiktionary:About Chinese/references, Korean references: Wiktionary:About Korean/references and Vietnamese references: Wiktionary:About Vietnamese/references. Also, all Chinese quotations and usage examples (whether it is cited from a book, song, video or the web) are provided using Template:zh-x. A list of abbreviations for well known references used by this template can also be found at Module:zh-usex/data. KevinUp (talk) 04:08, 5 December 2018 (UTC)
- @KevinUp The Chinese reference page is great. Until the point where I find: “Starostin, Sergei (1989). Rekonstrukcija drevnekitajskoj fonologicheskoj sistemy (A Reconstruction of the Phonological System of Old Chinese)”. Why is the Russian title not given in Russian script but the Chinese titles are given in Chinese script only (and not in Pinyin)? No logics.
- There is also the issue of some titles being translated and some not, but that’s minor. Fay Freak (talk) 19:35, 5 December 2018 (UTC)
- @Fay Freak: I'm not sure why the work by Sergei Starostin was not written in the Cyrillic script. I tried to trace the source of that work, and this is what I managed to find: [1]. Unfortunately I was unable to trace the original source. Perhaps someone else could help by looking up the bibliography of Sergei Starostin.
Phonological reconstructions for Early Zhou, Classical, and Middle Chinese are based on Sergei Starostin's version as originally published in: [Starostin, Sergei. Rekonstrukcija drevnekitajskoj fonologicheskoj sistemy [Reconstruction of the Phonological System of Old Chinese]. Moscow, 1989.] Particular reconstructions are transliterated into the UTS from S. Starostin's etymological database of Chinese characters (bigchina.dbf), available online at http://starling.rinet.ru.
- As to why Chinese titles are given in Chinese script only and not in Pinyin, this may have been done to prevent a cluttered appearance of the reference works. Also, it seems that pinyin tone marks are omitted for Chinese reference works in Yale University Library's Quick Guide on Citation Style for Chinese, Japanese and Korean Sources: APA Examples. KevinUp (talk) 16:32, 6 December 2018 (UTC)
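The point about keeping transliterations out of |title= can be illustrated with a small Python sketch in which the original-script fields and their transliterations are stored separately, so that display becomes a pure formatting decision. The field names `tr_author` and `tr_title` mirror the |tr-author= and |tr-title= parameters proposed above; they are not existing template parameters. The `Reference` class and `format_reference` function are hypothetical, and the Cyrillic strings are back-transliterated from the romanized title quoted in this thread, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reference:
    author: str                              # original script
    title: str                               # original script
    year: int
    tr_author: Optional[str] = None          # transliteration, kept separate
    tr_title: Optional[str] = None           # transliteration, kept separate
    title_translation: Optional[str] = None


def format_reference(ref: Reference, show_translit: bool = False) -> str:
    """Always show the original script; add transliterations only on request."""
    author, title = ref.author, ref.title
    if show_translit and ref.tr_author:
        author += f" ({ref.tr_author})"
    if show_translit and ref.tr_title:
        title += f" ({ref.tr_title})"
    text = f"{author} ({ref.year}). {title}"
    if ref.title_translation:
        text += f" [{ref.title_translation}]"
    return text


# Assumed back-transliteration of the entry discussed above.
starostin = Reference(
    author="Старостин, Сергей",
    title="Реконструкция древнекитайской фонологической системы",
    year=1989,
    tr_author="Starostin, Sergei",
    tr_title="Rekonstrukcija drevnekitajskoj fonologicheskoj sistemy",
    title_translation="A Reconstruction of the Phonological System of Old Chinese",
)

print(format_reference(starostin))
print(format_reference(starostin, show_translit=True))
```

Because the transliteration never overwrites the original title, either display preference (or a later CSS toggle) can be served from the same stored data.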
Adding pinyin for numbers in Chinese (Mandarin?) example sentences
@Dokurrat, KevinUp, Justinrleung, Suzukaze-c, Tooironic, Wyang & co. (alphabetically organized) I added Pinyin for the numbers in a Mandarin Chinese example sentence, and that pinyin was removed- see [2]. I think we should give the pinyin for the numbers (maybe?). I'm okay either way- in fact I don't think we need to do all sentences one way (no pinyin for numbers in example sentences) or all the other way (pinyin for all numbers in example sentences). But I'm not sure. idk. I'm just putting it out there for y'all to discuss. Any which way is fine to me. --Geographyinitiative (talk) 04:30, 5 December 2018 (UTC)
- No, I don't think we should add pinyin for Arabic numerals. Dokurrat (talk) 04:41, 5 December 2018 (UTC)
- I like the idea. I usually do it for Japanese. —Suzukaze-c◇◇ 04:42, 5 December 2018 (UTC)
- I'd like to see the numbers as pinyin, because they are read according to their Mandarin pronunciation. Also, depending on context, they can be read as cardinal numbers or standalone digits: 365天 ― sānbǎiliùshíwǔ tiān ― Three hundred and sixty five days.
- 員工365失踪了。 [MSC, trad.]
- 员工365失踪了。 [MSC, simp.]
- Yuángōng sānliùwǔ shīzōng le. [Pinyin]
- Employee no. 365 is missing.
- ^ this. —Suzukaze-c◇◇ 05:18, 5 December 2018 (UTC)
- Agreed that we should add pinyin conversion for Arabic numerals. ---> Tooironic (talk) 06:09, 8 December 2018 (UTC)
- It has to be added manually, of course, otherwise we are asking for possible future errors in conversion. Perhaps re-transliterated numbers need to be displayed differently, so that e.g. sānbǎiliùshíwǔ for "365" is known to mean to stand for 三百六十五 (sānbǎiliùshíwǔ, “three hundred sixty five”) or 三六五 (sānliùwǔ, “three six five”). A different colour or underlined? Also, maybe a trick is needed to use a hidden "三百六十五"/"三六五" but display "365", so that a manual pinyin is not required? BTW, @KevinUp: I have suppressed the display of "365" in your example with @. --Anatoli T. (обсудить/вклад) 07:15, 8 December 2018 (UTC)
- @Atitarev: Automatic pinyin transliteration of Arabic numerals can be done by adding pronunciation data of 0-9 to data.polysyllable_pron_correction in Module:zh-usex/data. However, this would render "365" as 三六五 (sānliùwǔ, “three six five”). Manual input would still be needed if "365" is intended to be read as 三百六十五 (sānbǎiliùshíwǔ, “three hundred sixty five”). KevinUp (talk) 14:45, 8 December 2018 (UTC)
- @KevinUp: I understand. As I said, what we need is, a new method in the module to use the transliteration of hidden characters, in this case "三百六十五" for transliteration purposes only - "sānbǎiliùshíwǔ" but display unlinked "365" in the Chinese text. --Anatoli T. (обсудить/вклад) 04:16, 9 December 2018 (UTC)
- This seems to be slightly complex, so we may have to add this to Wiktionary:About Chinese/tasks. KevinUp (talk) 04:25, 9 December 2018 (UTC)
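The difference between the two readings of "365" discussed above can be sketched in Python. This is not how Module:zh-usex works; the names `read_as_digits`, `read_as_cardinal` and `transliterate_number` are hypothetical. The sketch assumes citation tones only (no tone sandhi for yī), always uses èr rather than liǎng, and only covers cardinals from 0 to 999.

```python
DIGITS = ["líng", "yī", "èr", "sān", "sì", "wǔ", "liù", "qī", "bā", "jiǔ"]


def read_as_digits(number_text: str) -> str:
    """Read each digit on its own, e.g. an employee number."""
    return "".join(DIGITS[int(ch)] for ch in number_text)


def read_as_cardinal(n: int) -> str:
    """Read n as a cardinal number (sketch limited to 0-999)."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only covers 0-999")
    if n < 10:
        return DIGITS[n]
    hundreds, rest = divmod(n, 100)
    tens, ones = divmod(rest, 10)
    parts = []
    if hundreds:
        parts.append(DIGITS[hundreds] + "bǎi")
        if tens == 0 and ones:
            parts.append("líng")               # placeholder zero, as in 105
        elif tens:
            parts.append(DIGITS[tens] + "shí")
    else:
        # 10-19 are read "shí...", without a leading "yī"
        parts.append(("" if tens == 1 else DIGITS[tens]) + "shí")
    if ones:
        parts.append(DIGITS[ones])
    return "".join(parts)


def transliterate_number(number_text: str, reading: str = "digits") -> str:
    """Default to digit-by-digit; the editor must opt in to the cardinal reading."""
    if reading == "cardinal":
        return read_as_cardinal(int(number_text))
    return read_as_digits(number_text)


print(transliterate_number("365"))              # sānliùwǔ
print(transliterate_number("365", "cardinal"))  # sānbǎiliùshíwǔ
```

Since only context decides which reading is meant, the choice stays an explicit argument here, which matches the observation that manual input is still needed for the cardinal reading.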
Wiktionary lemmas written in a nonnative script
As Wiktionary grows, I have noticed some unusual entries written in a nonnative script, such as 0.5#Chinese and の#Chinese, that qualify for Wiktionary:Criteria for inclusion and may have also passed Wiktionary:Requests_for_verification due to their widespread use in a particular language or region. However, I think that it might be better to list such entries (that have passed RFV) in an appendix or separate namespace, or to put a banner right below the language header to inform our readers that the lemma is written in a nonnative script, along with categorization. KevinUp (talk) 15:14, 5 December 2018 (UTC)
- Out of curiosity, do we have Arabic, Greek, Hebrew, Hindi, Russian lemmas that are written in the Latin script, for example? I've also found Category:Terms written in foreign scripts by language, but only Chinese, Japanese and Korean are listed in this category. KevinUp (talk) 15:24, 5 December 2018 (UTC)
- These entries are rather interesting: fighting#Chinese, friend#Chinese, part-time#Chinese. Yes, I've heard these terms used in real life, such as in TVB dramas, but I am surprised to see these entries included in Wiktionary. I would like to propose for such terms to be listed in an appendix or separate namespace, because such entries are more likely to be found in an informal dictionary such as an A-Z pocket slang dictionary, rather than a formal dictionary. KevinUp (talk) 15:55, 5 December 2018 (UTC)
- The issue has come up before, with marketing being used (in Latin script) in Greek texts. Wiktionary:Beer parlour/2017/September § Modern Greek terms spelt with Latin characters. See also this revision history for a recent disagreement. I'm not comfortable at all with including that sort of things. Per utramque cavernam 16:15, 5 December 2018 (UTC)
- Foreign script is a strong argument for code-switching. Even when it is used constantly in Greek it can be the case that it never passes into Greek, and it is no loss not to add it either because the English entry suffices (you read a Greek text, look up a word here but find it as English, that’s enough, you don’t expect anyway that all that you read is in the dictionary as Greek). Fay Freak (talk) 19:39, 5 December 2018 (UTC)
- Script is secondary to the actual spoken language, and usage of words should be analyzed for codeswitching, and for what-language lexicon a word belongs to. French has fr:American way of life#Français and fr:web design#Français, and Japanese has サード (sādo, “third”) and ホエールウォッチング (hoēru wotchingu, “whale watching”); are these "acceptable"? —Suzukaze-c◇◇ 19:42, 5 December 2018 (UTC)
- Maybe we need to find a way to represent code-switching? It would seem like a common pattern for a foreign word to have a code-switched variant (with foreign pronunciation, in a foreign script) and a nativized one (being closer to the native language's phonology, spelled in the language's native script) with the first one being extremely common and the second at the edge of attestability, but due to our policies we only include the second one and create a distorted picture of actual usage patterns.
- I remember @Vahagn Petrosyan having something to say about this. Crom daba (talk) 20:06, 5 December 2018 (UTC)
- I create a Usage note, as in վարագույր (varaguyr). --Vahag (talk) 12:18, 6 December 2018 (UTC)
- Yes, I think that we need to find a way to represent code-switching. Rather than using foreign script as an argument for code-switching it might be better to decide based on the pronunciation of the entry.
- I would like to suggest for entries such as (1) part-time#Chinese, (2) PK#Chinese, (3) SUS#Japanese that have been nativized to become closer to the phonology of the language it was borrowed into (despite retaining its nonnative script) to be accepted as legit entries whereas entries such as (1) fighting#Chinese, (2) fr:American way of life#Français, (3) の#Chinese that are found mostly in written form but rarely in spoken conversations are to be put under some sort of banner to inform our readers that such entries are of unconventional usage and are mostly written for stylistic effect. KevinUp (talk) 16:32, 6 December 2018 (UTC)
- Alternatively, we should set up some sort of guideline to decide whether or not an entry is considered code-switching or not. KevinUp (talk) 06:50, 7 December 2018 (UTC)
- Yes, language-specific CFI are needed. --Anatoli T. (обсудить/вклад) 07:17, 8 December 2018 (UTC)
- I think that the issue of the script is a bit of a red herring. Take the originally English word online, which has become commonplace in many languages, including Serbian. Now when Danas, a major newspaper, uses the word, they write for example ”Srbi sve više kupuju online”. The Politika newspaper is also written in Serbian but uses Cyrillic script; when they use the word, they write for example “Политика Online”, as they in fact do on every page of their website. It would be strange to consider the use by Danas a loan word but the use by Politika a case of code switching, merely because one happens to use Roman script and the other Cyrillic for what is the same language. --Lambiam 17:30, 8 December 2018 (UTC)
- In this particular case, the spelling is a strong indicator of code-switching, as Serbian orthography is phonemic and (unlike Croatian) strongly prefers transcribing foreign names and terms. You could consider onlajn (abundantly attested) a nativized variant, although arguably the choice between these spellings is a matter of personal style. Crom daba (talk) 18:00, 10 December 2018 (UTC)
- For an example in English, Москва is citeable (Citations:Москва) but was deleted (Talk:Москва), and Citations:ἄρχων is also citeable (as are, I expect, Arabic-script forms of Allah and PBUH, etc). (An older Chinese example is Talk:Thames河, deleted in 2011.) - -sche (discuss) 17:54, 8 December 2018 (UTC)
- When I read, “With absolute confidence I can boast that my Frittelle di Fiori di Zucca are the best in the world”, I don’t think, “Oh, perhaps we should consider including an entry for the English term frittella di fiori di zucca.” No, I think this is an instance of code switching, and in this case one of a very common type. I think we should not have an English entry oliebol either. Although the term can be found in English texts, it is obviously a Dutch word. There is a need for a test or criterion for when the use of a foreign term is simply code switching, and when the term becomes part of the lexicon of a borrowing language. As I’ve tried to argue above, being written in a different script is not a litmus test. Being included in quotation marks is a strong indicator of not being seen as part of the lexicon, but not all authors will use these when code switching. When the imported term becomes subject to local inflection, or can serve as a component to form new compound words, this is a strong indicator of having become lexicalized, but as a test this does not work for analytic languages like Mandarin. --Lambiam 12:33, 9 December 2018 (UTC)
- In personal experience, code switched fragments can very easily be inflected and are likely to be joined in compounds to attach them to native sentence structure. Also, lexicalized loans are likely to have defective inflection.
- Pronunciation is also no good, since it is extremely speaker and context dependent, and lexicalized loans can themselves have a special phonology. Crom daba (talk) 18:07, 10 December 2018 (UTC)
- I don't think we can have a coherent policy or test across different languages. Speakers of different languages will absolutely differ in their criteria for what counts as a native word. This is even more difficult with global languages like English where different communities are in contact with a huge variety of other languages from which to borrow. DTLHS (talk) 18:23, 10 December 2018 (UTC)
- Good words, @Crom daba. I want to point out how language is really written on the internet. In printed works, or works inspired by print practices, many things don’t happen that are unproblematic elsewhere, in unrestrained speech where people can develop their own standards or their own morals, unspooked by societal expectation, to put it in Stirnerian language. Remarkably, nowadays in Russian chats, and I mean those where discussions take place and people try to write correctly, one just writes foreign words in the foreign script and then immediately attaches Russian endings in Cyrillic script to them. It’s also the way I think and do it: writing Russian in Germany, referring to things in Germany without having a notion of a Russian equivalent, I just write German or English words in Latin script and decline them as Russian, with the endings in Cyrillic script (without a space, I mean, you understand; most iconic, I think), and this does not make them Russian. I often think “Is this word Russian already?” There are some obvious ones that do exist: everyone uses the word терми́н (termín) in reference to appointments in Germany, a word that does not exist in Russia, and for a long time I did not even know that it doesn’t, it seems so indispensable. This middle ground of dubiosa (is this English or Latin, huh? Not English, because it lacks spread) is often left out by me and other dictionary editors only because these words have limited relevance to the wider world and one would look them up in German dictionaries anyway (as I said earlier, an entry in one language suffices; a Greek entry marketing is otiose), plus they are CFI-problematic (the best one can do is quote them from fora and comments under articles, perhaps with archive links, but that’s it; these Soviets here don’t produce a corpus that would help to quote Russian as spoken in Germany).
Separating the words is even more difficult if you look at inter-Slavic conversations: is Russian менто́вка (mentóvka, “mint liquor as popular in Bulgaria”) Russian? It is used in Russian texts here and there, and obviously with Russian endings then, but is it perceived as Russian? (To use a German legalese term with no equivalent in English: how does the Verkehrsanschauung or Verkehrsansicht see it?) I have also read quite a lot of strange words from Russian expats in Serbia and the like; you could make large lists of such words if you wanted to. Theoretically this could lead to having words in Russian written with Cyrillic characters we thought did not exist in Russian. Here I make the strange observation that Latin-script words with foreign diacritics pass more easily into texts of other languages, whereas the Cyrillic-script languages tend to transcribe everything, i.e. a Russian text with ђ is way more weird than Vietnamese diacritics, Semitic transcriptions and whatever else you can imagine in English texts. And that’s only in Europe; elsewhere things become crazier, which others can describe better.
- On the phonetic point, note that legit French words contain pharyngeal fricatives, like hebs (“prison, can”), hnouch (“popo, bacon”). Here we also have an issue if we know that a word has passed into French or English and you can attest it from songs (which have been printed on CD, or are buyable as downloads, or are otherwise unlikely to vanish, so durable). The flip side of words written in a non-native script are words which have passed but cannot be written in the native script, or only with uncertainty. English example: gwop (“moolah”).
- Normal dictionaries to a large part avoid such problems because they leave out exotisms, i.e. words for things that do not exist in an area where there is a community of the documented language. With this I lean towards the following ground for exclusion: if a word in English is for a foreign thing and the Verkehrsanschauung does not see the word as English, then it is not English. Confer mesdemet! This is “not really English”. What applies to abstracta then, what is Greek marketing then? The criterion I have just stated becomes difficult for foreign “ways of life”. Maybe Greek marketing is not actually Greek because he who uses such a word ceases to think like a Greek, regardless of the script it is written in. There are many gross things written and said in Arabic or Hindi texts that I would for this reason see as not-Arabic and not-Hindi. And the same criterion can apply to determine if a word has passed from German into Russian.
- The issue gets complicated however because there is not only code-switching for Wiktionary but there is also Translingual: You could make a case for “marketing” being Translingual and not only English. I have argued already (User talk:Fay Freak § Translingual) for grammatical terms like genitivus absolutus, status constructus and the like being Translingual in the first place. Maybe “marketing” is translingual because teachers of business and marketing have made it so ex cathedra, which is why it is used in Greek, never able to become Greek. Fay Freak (talk) 20:02, 10 December 2018 (UTC)
Linking elements of a term in {{en-noun}}
At l'esprit de l'escalier, should the individual elements of the phrase in {{en-noun}} be linked to French words, like this: {{en-noun|head=[[l'#French|l']][[esprit#French|esprit]] [[de#French|de]] [[l'#French|l']][[escalier#French|escalier]]}}? (Pinging @Per utramque cavernam as we discussed this on the entry talk page.) — SGconlaw (talk) 17:11, 5 December 2018 (UTC)
- No. They should be linked in the etymology. DTLHS (talk) 17:13, 5 December 2018 (UTC)
- In the case at hand, the link is to an entire French term, esprit de l’escalier. Where should the individual elements be linked, or do we just not link them in this case? I was thinking that since the elements of a term in {{en-noun}} are usually linked by the template anyway, it makes sense to include the links to the French words manually. — SGconlaw (talk) 17:20, 5 December 2018 (UTC)
- Those links are one click away. Theoretically it can be different if a French phrase exists only in English or another language, the French not being CFI-compliant as French. Fay Freak (talk) 19:43, 5 December 2018 (UTC)
- I usually link to component multi-word terms of a term if they reflect the sense of that term, eg, black sugar maple would link to black and sugar maple. And, as Fay Freak says, the individual words are just one more click away. It seems unhelpful to make a user guess at whether there are multiword components and which grouping leads to a possible entry. DCDuring (talk) 23:08, 5 December 2018 (UTC)
- The {{en-noun}} template links to English terms, though. In this case, the terms are French, so it's not appropriate to link. —Rua (mew) 10:45, 6 December 2018 (UTC)
- Generally, yes, but arguably not exclusively. For example, sometimes when an element is not present in the Wiktionary (for example, a person's name), I've seen a link to an English Wikipedia article. I see no reason why links can't be to other languages where appropriate. — SGconlaw (talk) 12:01, 6 December 2018 (UTC)
- Because, again, {{en-noun}} creates English links. If you put a French word in there, it will still be an English link. A dead link, moreover. —Rua (mew) 18:43, 10 December 2018 (UTC)
New sinograph QIOU "poor and ugly"
How should this situation be dealt with in terms of lexicography?
--Backinstadiums (talk) 00:26, 6 December 2018 (UTC)
- The same way we deal with any other word or sinograph — add it if it is attested in durably archived media, spanning over a year, etc. (It doesn't look like this is.) —Μετάknowledgediscuss/deeds 02:01, 6 December 2018 (UTC)
quadrumanus appears in the Cambridge Grammar of the English Language, page 1663; is it a typo or a variant of quadrumanous? --Backinstadiums (talk) 15:58, 6 December 2018 (UTC)
- (This sounds like a Wiktionary:Tea room question. — SGconlaw (talk) 16:02, 6 December 2018 (UTC))
- It is a taxonomic designation (as in Chiropsalmus quadrumanus). Highly unlikely to be an English adjective because of the spelling. Equinox ◑ 16:40, 6 December 2018 (UTC)
- The authors were probably looking for a word that began with quadru and was not formed in Latin, as they are talking about "marginal vowels" as English morphological elements, which in the case of 'quadr' can be i, a, or u. Why they didn't choose quadrumane or quadrumanous for the purpose is beyond me. We could ask them. Maybe it was a typo. DCDuring (talk) 18:16, 6 December 2018 (UTC)
New Wikimedia password policy and requirements
The Wikimedia Foundation security team is implementing a new password policy and requirements. You can learn more about the project on MediaWiki.org.
These new requirements will apply to new accounts and privileged accounts. New accounts will be required to create a password with a minimum length of 8 characters. Privileged accounts will be prompted to update their password to one that is at least 10 characters in length.
These changes are planned to be in effect on December 13th. If you think your work or tools will be affected by this change, please let us know on the talk page.
Thank you!
CKoerner (WMF) (talk) 20:02, 6 December 2018 (UTC)
Programming languages
Since Wiktionary includes all languages, does it also include programming languages? --2A01:112F:742:C00:14B9:E7A5:D1B3:F0B3 09:23, 8 December 2018 (UTC)
- No, as they aren't human language (though a few words may rarely get borrowed into English grammar). Equinox ◑ 10:23, 8 December 2018 (UTC)
- Is tlhIngan Hol a human language? --Lambiam 19:26, 8 December 2018 (UTC)
- Eh, it's clearly a totally different kind of thing from a programming language. The only programming language I've ever seen that even inflects verbs is Inform 7. Equinox ◑ 19:34, 8 December 2018 (UTC)
- Programming languages are determined by a language specification, not by usage. That falls under "documentation", not lexicography. DTLHS (talk) 17:31, 8 December 2018 (UTC)
- But the reference manuals for a programming language use terms from that language as if they were English, French etc - so we really ought to have them somehow. SemperBlotto (talk) 14:23, 10 December 2018 (UTC)
- We've had this discussion before. Early programming languages only had a few keywords, but now there are hundreds of frameworks with thousands of named classes (e.g. ExecutionEngineException, HttpMessageInvoker) and each class may have hundreds of named properties, methods and fields. These, too, are listed in manuals and guides. Equinox ◑ 14:30, 10 December 2018 (UTC)
- See Wiktionary:Requests_for_verification/English#caddr as well. - TheDaveRoss 14:31, 10 December 2018 (UTC)
- Take this sentence from a book on conversational French: “Bonjour is usually used until around six p.m., whereas bonsoir is used after six p.m.” In a book on French you can expect to find French words used as nouns in English sentences. Only, they are not used with their French meaning. They stand for themselves. So these sentences mention the word in the sense of the use–mention distinction. Likewise, the English sentence “esac is case spelled backward, rather like fi is if spelled backward” only mentions these keywords. To understand the sentence you don’t have to know the meaning of any of these words. On the other hand, grep, originally just another computer command, can be used as a verb (“I grep, he greps, we grepped”), so it clearly has become lexicalized and merits to be included. --Lambiam 18:12, 10 December 2018 (UTC)
According to the description: "This appendix provides detail to sources linked by Wiktionary. It is to be linked from reference templates." It contains three items, all created by User:Dan Polansky. Is this a new policy? The only reason I've noticed it is that Dan changed one of the Hungarian reference templates. I'd prefer to link directly from the template to its corresponding website and not to an appendix. Was there a Beer parlour discussion or vote on this? Panda10 (talk) 15:25, 9 December 2018 (UTC)
- It is not a new policy, not anything mandatory and rigid. If you don't like my change in Template:R:TotfalusiEty 2005, please revert it. The point of the appendix is to provide more information than comfortably fits in the mainspace, e.g. English rendering of the title. Some reference templates link to Wikipedia, which is similar in that it does not lead to the main website of the reference. --Dan Polansky (talk) 15:31, 9 December 2018 (UTC)
- Dan, thanks for your prompt reply. I do see your point, but for now, if you don't mind, I will revert the changes until it is decided by the community how to standardize reference templates. Panda10 (talk) 16:46, 9 December 2018 (UTC)