Wiktionary talk:About Ancient Greek


Nasal infix

A category that should be created under Category:Ancient Greek terms by etymology is Category:Ancient Greek words with nasal infix or something like it, for categorizing Ancient Greek words formed with a nasal infix. This includes many present-tense stems like λαμβάνω (lambánō), λανθάνω (lanthánō), μανθάνω, and so on. Most of these also have the suffix -ανω. Perhaps I will do this myself sometime. — Eru·tuon 06:01, 5 January 2015 (UTC)

A link to the relevant section in Smyth's Greek Grammar, listing examples.

Does anyone know off the top of their head of any etymological templates that do something similar to this (i.e., note a derivation of a word by prefix or infix or something and categorize the entry)? Eru·tuon 08:29, 5 January 2015 (UTC)

Cross-posted at Wiktionary talk:About Latin#Nasal infix. Check there for my comments regarding {{infix}}. Eru·tuon 09:59, 5 January 2015 (UTC)

I'm not sure if the nasal infix should be considered an etymological feature. Of course it's inherited from PIE, but it seems to me that the category's purpose is to list verbs that synchronically have a nasal infix? —CodeCat 14:24, 5 January 2015 (UTC)
Here's my suggestion from my posting on the Latin page, transferred to a Greek example. In the Etymology section of λαμβάνω, phrasing like this could be used: (The present stem λαμβάν- originates) from *lh₂⟨n⟩gʷ-: zero-grade of *sleh₂gʷ- with nasal infix *n. Thus this would be a "diachronic" template, categorizing words with a certain PIE morpheme, when these words can be traced back to a particular PIE form. Eru·tuon 17:48, 5 January 2015 (UTC)

After a little thought, I think this idea would work better as a general PIE morphology template that includes more than just nasal infix. I posted that idea in About Proto-Indo-European. Eru·tuon 18:10, 5 January 2015 (UTC)

Template:grc-pron: phonemic transcription of pitch accent

Hey, this topic has probably been beaten dead already, but here goes. I noticed that in {{grc-pron}}, pitch accent on two-mora syllables is transcribed using the diacritics for high and mid pitch. Thus, ὗς ‎(hûs) is transcribed /hy᷇ːs/, χείρ ‎(kheír) /kʰe᷄ːr/. This represents a change from {{grc-ipa-rows}}, which transcribes the latter word as /kʰe͜ér/.

Since this is intended to be a phonemic transcription, the latter is better: the mid pitch is not phonologically distinctive in Ancient Greek, but only the high pitch. Words are distinguished by which mora the high pitch is placed on: for instance, τόμος τομός /tó.mos to.mós/ "slice, sharp"; ἤτε ἦτε /ɛɛ́.te ɛ́ɛ.te/ "or, you are".

Perhaps the idea is that writing in mid pitches is phonetically accurate, but that's not actually true. Unaccented morae did not have level mid pitch; rather, according to Allen, the morae before the high-pitched mora had a rising pitch contour, and the morae after had a falling pitch contour. (I suspect the actual reality was a little more complex, and depended on sandhi, position in syntactic units, etc., but I haven't read up about this.) Marking unaccented syllables with the mid-pitch diacritic seems to indicate that this rising and falling did not occur, and thus is misleading.

So I propose that we return to the earlier transcription: /hý͜ys kʰe͜ér/, for instance. This transcription is phonemically accurate, since it breaks things into morae and only marks the accented mora. Using symbols for pitch contours would make the phonemic contrasts less clear (since it doesn't show how pitch contours derive from high pitch on a single mora), but is at least phonetically accurate: [hŷːs kʰěːr].

I suspect part of the reason for the change is the oddity of representing a long vowel or diphthong with two vowel letters, and of using the tie, and the question of ambiguity between diphthongs and two-vowel sequences. However, there's no ambiguity, even without the tie, if all syllable breaks are written; then any two vowel letters without a syllable break between them count as one syllable: for instance, in φιλέεις /pʰi.lé.ees/; Homeric ἐύ /e.ý/, Attic εὖ /éu/. Using both non-syllabic diacritics and syllable breaks — [pʰi.lé.ee̯s éu̯] — would be redundant.

Of course, even if there's technically no ambiguity, it would still be confusing to non-specialists, so maybe a non-syllabic diacritic or tie would be necessary to help people out.

If we wanted to have a phonetically accurate transcription, I think it would be much more complex: rising pitch before the accented mora would have to be transcribed, and falling pitch after — though perhaps only the falling pitch, since it is more important: αἰτία would be transcribed [aǐ.tí.âː] or [ai.tí.âː]. I have a theory that the second is more accurate stress-wise, and might therefore be best as a phonetic transcription, but regrettably I have not fully formulated or published this, so that's just an aside. Eru·tuon 01:52, 23 January 2015 (UTC)
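The moraic notation proposed in this post can be checked mechanically. What follows is a minimal Python sketch, assuming a transcription in which syllables are separated by full stops, every vowel letter stands for one mora, and the high pitch is written as a combining acute. The function name `morae_per_syllable` and the vowel inventory are illustrative assumptions; nothing here is taken from {{grc-pron}}, and length marks (ː) are deliberately not handled.

```python
import unicodedata

# Vowel letters assumed for this sketch: a e i o u y plus U+025B and U+0254.
VOWELS = set("aeiouy\u025b\u0254")
ACUTE = "\u0301"  # combining acute accent, marking the high-pitched mora


def morae_per_syllable(transcription):
    """Return one (mora_count, accented_mora) pair per syllable.

    accented_mora is 1-based, or None when the syllable has no high pitch.
    """
    pairs = []
    for syllable in transcription.strip("/").split("."):
        count = 0
        accented = None
        # NFD separates a precomposed letter such as é into e + combining acute.
        for ch in unicodedata.normalize("NFD", syllable):
            if ch in VOWELS:
                count += 1
            elif ch == ACUTE and count > 0:
                accented = count  # the acute follows the vowel letter it marks
        pairs.append((count, accented))
    return pairs


print(morae_per_syllable("/p\u02b0i.le\u0301.ees/"))  # φιλέεις
print(morae_per_syllable("/e\u0301u/"))               # Attic εὖ
```

For /pʰi.lé.ees/ this gives `[(1, None), (1, 1), (2, None)]`, and for /éu/ it gives `[(2, 1)]`: two vowel letters with no syllable break between them are counted as one two-mora syllable, which is the disambiguation rule described above.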

Declension table cleanup

This conversation got a little long, so I moved it to Wiktionary talk: About Ancient Greek/Declension table cleanup. ObsequiousNewt (ἔβαζα|ἐτλέλεσα) 13:46, 17 March 2015 (UTC)

Unmarked vowel length

@ObsequiousNewt you commented on vowel length in ὑμεῖς (humeîs), and I thought I'd answer here. I also noticed that the LSJ doesn't mark the length, and I'm stumped as to why. From line 2.75 of the Iliad (ὑμεῖς δ' ἄλλοθεν ἄλλος ἐρητύειν ἐπέεσσιν), which I noted in my edit summary, it's clear the υ is long, and also from the fact that ὔμμες, the Aeolic form of the word (used in Homer), has a double μ. This is a case of the s in a Proto-Indo-European sonorant cluster being elided and causing compensatory lengthening of the preceding vowel (Attic) or the following consonant (Aeolic): usm- > uum- or umm-. But I'm not sure why the LSJ doesn't mention it. Eru·tuon 00:22, 26 March 2015 (UTC)

Double rhos

The page says:

There exists a convention in some older works of adding a smooth and rough breathing mark to internal double rhos. Ancient Greek on Wiktionary prefers unmarked internal rhos. Consequently Βορρᾶς ‎(Borrhâs) is correct, and Βοῤῥᾶς ‎(Borrhâs) is incorrect.

I think that's unnecessarily rigid, and doesn't help people who might be looking up terms that have the breath marks. What do people think of having things like Βοῤῥᾶς hard-redirect to Βορρᾶς rather than prohibiting them? Even the new search function doesn't recognize -ῤῥ- as a form of ρρ, so there's no automatic redirect from the search box (the way searching for bóring, for example, automatically takes you to boring). —Aɴɢʀ (talk) 18:40, 24 May 2015 (UTC)

I'm fine with this. Could we get automatic linking to strip the breathing marks from double rhos? Also, what is our policy on graves, since the way they are used is based on an arbitrary Byzantine Greek rule that encodes no phonetic information? —JohnC5 01:42, 25 May 2015 (UTC)
Do we know for sure that it doesn't? I'm pretty sure some people have analyzed it as having phonetic reality in the loss of a high pitch accent. As for automatic redirects, I think that comes straight from MediaWiki and we at Wiktionary have no control over it. We could ask in the Grease Pit just to make sure, though. —Aɴɢʀ (talk) 09:10, 25 May 2015 (UTC)
Really? I'd love to know more. Benjamin Fortson claims in his chapter on AG that it is merely a typographical convention, and Smyth §154 and §155 describe the accent's placement, not its effects. Regardless, wouldn't it only affect Byzantine Greek onward if true? —JohnC5 09:21, 25 May 2015 (UTC)
I think he was talking about linking (controlled by Module:languages#Language:makeEntryName using tables in Module:languages/data3/g) rather than redirects (controlled by MediaWiki). The stripping of diacritics is on a character-by-character basis, so it would require code to recognize a multi-character pattern, which might not be a good idea. Chuck Entz (talk) 15:34, 25 May 2015 (UTC)
Diacritic stripping in links would only make sense if we wanted to write "Βοῤῥᾶς" on other pages while still keeping the pagename itself Βορρᾶς, but that isn't the case anyway. As for the grave accent, it doesn't matter for our purposes if it has any phonological significance or not, we still have to decide what we want to do with forms like δὲ, καὶ, Ἀττικὸς, and πατὴρ. (I see we already have hard redirects for the first two of those.) —Aɴɢʀ (talk) 16:11, 25 May 2015 (UTC)
Yeah, I support this idea. The dumb thing is that the search function used to do that kind of replacement, and then it got "updated".... >_< ObsequiousNewt (εἴρηκα|πεποίηκα) 20:16, 25 May 2015 (UTC)
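The point made in this thread, that stripping the breathings from -ῤῥ- needs a multi-character pattern rather than a character-by-character table, can be illustrated outside MediaWiki. The Python sketch below is only that: `redirect_target` is a hypothetical helper, not the behaviour of Module:languages or of the search function. It assumes that a marked double rho, once decomposed, is the sequence rho + smooth breathing + rho + rough breathing, and it also maps a grave to an acute, as the existing hard redirects for δὲ and καὶ do.

```python
import re
import unicodedata

RHO = "\u03c1"
SMOOTH = "\u0313"  # combining comma above (smooth breathing)
ROUGH = "\u0314"   # combining reversed comma above (rough breathing)
GRAVE = "\u0300"
ACUTE = "\u0301"

# After NFD, a marked double rho is rho + smooth + rho + rough.
MARKED_DOUBLE_RHO = re.compile(RHO + SMOOTH + RHO + ROUGH)


def redirect_target(title):
    """Return the page title a marked spelling would redirect to (sketch)."""
    decomposed = unicodedata.normalize("NFD", title)
    # Multi-character pattern: only the rho-rho sequence loses its breathings,
    # so a word-initial rough-breathing rho is left alone.
    decomposed = MARKED_DOUBLE_RHO.sub(RHO + RHO, decomposed)
    # Single-character replacement: a grave becomes an acute.
    decomposed = decomposed.replace(GRAVE, ACUTE)
    return unicodedata.normalize("NFC", decomposed)


print(redirect_target("\u0392\u03bf\u1fe4\u1fe5\u1fb6\u03c2"))  # Βοῤῥᾶς
print(redirect_target("\u03b4\u1f72"))                          # δὲ
```

Working on the decomposed string keeps the two rules independent of each other and of any other diacritics on the word; recomposing at the end gives a title in the same normalization form that page names use.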

DI‑GAMMA / VAU : Smooth‑breathing & SIGMA / SAN : Rough‑breathing

This conversation is a bit long, so I am moving it to the subpage Wiktionary talk:About Ancient Greek/Digamma & sigma theory. —JohnC5 22:02, 24 June 2015 (UTC)


Proposition: put comparative and superlative forms of adverbs on the headword line, like so:
ObsequiousNewt (εἴρηκα|πεποίηκα) 17:02, 6 July 2015 (UTC)

Make it so! —JohnC5 00:49, 7 July 2015 (UTC)

Citing inscriptions and papyri

We need a good way to do this. Module:Quotations may be acceptable for this purpose; I'm not sure, but if not (and even if so), we need to figure out how to format it. Things I (at least) think a good citation should include:

  • Trismegistos number. No question about this in my view: Trismegistos is the single best collection of papyri in existence. In the unlikely event that something doesn't have a TM number... I'll deal with that if it comes up.
  • Year.
  • Location.
  • Probably the publisher—although in many cases there will be more than one of these. Perhaps the original publisher is best. If the publisher is included the associated numbers should also be included, although they should be formatted in a readable fashion.
  • Not the location (inventory) of the papyrus: this is subject to change, and in any case is accessible through TM.

If somebody who has experience with citing things would like to step forward, great, otherwise I can probably come up with some sort of proposal. —ObsequiousNewt (εἴρηκα|πεποίηκα) 18:54, 30 September 2015 (UTC)
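To make the list of fields above concrete, here is one way the data behind such a citation could be modelled. This is a sketch only, written in Python rather than as a template: the class name, the field names and the output format are all assumptions, and it reads "Location" as the place the text comes from, since the current inventory location is explicitly excluded above. The only value used is the TM number 175215 that is cited further down this thread; no year, place or publication is filled in for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InscriptionCitation:
    """Hypothetical record for citing a papyrus or inscription."""

    tm_number: int                     # Trismegistos number (always required)
    year: Optional[str] = None         # kept as text, since dates may be ranges
    location: Optional[str] = None     # place of origin, not the inventory location
    publication: Optional[str] = None  # original publication with its numbering

    def format(self):
        """Join the fields that are present into one readable line."""
        parts = ["TM " + str(self.tm_number)]
        for value in (self.year, self.location, self.publication):
            if value:
                parts.append(value)
        return ", ".join(parts)


print(InscriptionCitation(175215).format())
```

Making everything except the TM number optional reflects the priorities in the list: the number is the stable identifier, and the rest can be looked up from it.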

Pinging Acronym, JohnC5, Angr, and, uh, also, how should we cite scholia? Can it be done with the quotations module? —ObsequiousNewt (εἴρηκα|πεποίηκα) 15:49, 9 October 2015 (UTC)
How would you deal with coins or monumental inscriptions? Sometimes we cite those and then put in a picture of the object being cited in the entry (when available). —Μετάknowledgediscuss/deeds 15:52, 9 October 2015 (UTC)
Trismegistos has inscriptions (175215 is an example of a coin, although unfortunately no picture is available). My only concern with TM is that it doesn't have everything... I'm right now trying to put it to the test by attempting to locate a Coan inscription—even so, it has a lot of data, which is always nice. And yeah, putting in pictures sounds like a great idea. —ObsequiousNewt (εἴρηκα|πεποίηκα) 16:27, 9 October 2015 (UTC)
@ObsequiousNewt, Metaknowledge: Re citing scholia, I've cited a scholion once here on Wiktionary; you can see the format I chose at Citations:κεχηνώδης. As for citing inscriptions, I don't feel tremendously well qualified to comment; I still have yet to figure out how to look up authorities' citations of the Corpus Inscriptionum Latinarum:-S  — I.S.M.E.T.A. 17:43, 9 October 2015 (UTC)
Um. One, I was looking for a way to do that in a templated manner, preferably using Module:quotations, two, why are there so many sentences there? —ObsequiousNewt (εἴρηκα|πεποίηκα) 18:38, 9 October 2015 (UTC)
@ObsequiousNewt: Oh, I see; never mind. I added those sentences for context, since I was trying to work out what κεχηνώδης means. That many do no harm when they're from a public-domain source and when they constitute the only thing on a Citations (rather than mainspace) page. — I.S.M.E.T.A. 11:36, 10 October 2015 (UTC)

Vote: Using macrons and breves for Ancient Greek in various places

FYI of various Ancient Greek editors: Wiktionary:Votes/pl-2015-09/Using macrons and breves for Ancient Greek in various places. --Dan Polansky (talk) 06:51, 17 October 2015 (UTC)

Second element of diphthongs

Not sure if I should be posting this here, or on the template talk page, but {{grc-IPA}} currently renders αι αυ as /aɪ aʊ/ (and so with other diphthongs). This should be changed to /ai au/, because there's no indication that the second element of diphthongs was lax like the English diphthongs /aɪ aʊ/ supposedly are. In fact, before vowels, according to W. S. Allen, the second element was pronounced as a doubled semivowel, very much not lax: οἷος, παιδεύω would be something like [hôjjos paiděwwôː]. — Eru·tuon 20:45, 11 February 2016 (UTC)

Fixed. —ObsequiousNewt (εἴρηκα|πεποίηκα) 15:27, 16 February 2016 (UTC)
Is this supposed to hold true for iota subscription, because in ᾍδης ‎(Hā́idēs), a glide may not bear the accent? —JohnC5 16:30, 16 February 2016 (UTC)
Or βαίνω ‎(baínō) for that matter. Also, the coärticulation mark below now no longer makes sense. —JohnC5 16:34, 16 February 2016 (UTC)
Oops, I wasn't very clear. The diphthongs should end in /i u/ in the phonemic transcription. The offglide is realized phonetically as [jj ww] only between vowels, but that isn't supposed to be reflected in the phonemic transcription. The phonemic transcription of οἷος, παιδεύω should be /hói.os pai.deú.ɔː/. (Since no one's responded to my comment above about accent, you can ignore the way I render the pitch accent for now.) I also agree that the ties should be removed, since syllable breaks are enough to show that two successive vowels are a diphthong. — Eru·tuon 19:53, 17 February 2016 (UTC)
I'm assuming the non-pre-vocalic diphthong in βαίνω ‎(baínō) would still be /báj.nɔː/ and not /baɪ́.nɔː/, correct? —JohnC5 21:16, 17 February 2016 (UTC)
It should be /baí./ rather than /baɪ́./ or /baj́./. Using the high vowel symbol /i/ makes it clear that the diphthong is a sequence of two vowels and equivalent to a long vowel. — Eru·tuon 21:50, 17 February 2016 (UTC)
I'd opt for /a͜í./ with the elision before non-vowels, but /áj.j/ before vowels. Also, what do we do for something like ζῴου ‎(zṓiou) (the genitive of ζῷον ‎(zôion))? Perhaps /zdɔ́ːj.joː/? —JohnC5 21:59, 17 February 2016 (UTC)
I'd avoid using /j/ and /w/ in phonemic transcription since ("standard" Attic-Ionic and Koine) Greek doesn't have those phonemes, though if we want to add a narrow phonetic transcription they'd be fine there. I'd also prefer using the nonsyllabic marker rather than the coarticulation mark, e.g. /bái̯.nɔː/, /háːi̯.dɛːs/, /zdɔ́ːi̯.oː/. —Aɴɢʀ (talk) 07:40, 18 February 2016 (UTC)
I'd be fine with this. Newt? —JohnC5 15:21, 18 February 2016 (UTC)
At the risk of seeming overly fussy, I'd prefer not having a tie or non-syllabic diacritic, because it's superfluous when syllable breaks are marked, but otherwise I'm fine with this rendering of the diphthongs. — Eru·tuon 22:15, 18 February 2016 (UTC)
I'd still like the non-syllabic marker but no tie. Though I think the /z͜d/ or /z͡d/ could use one. We do need to mark the non-syllabicity of the glides, and the accent needs to be over the vowel and not the glide. —JohnC5 00:33, 19 February 2016 (UTC)
Actually, the accent has to be over the glide when the diphthong has an acute accent. It can only be over the first part of the diphthong when there's a circumflex. οἶκοι ‎(oîkoi) and οἴκοι ("homes" and "at home"), for instance, were distinguished by which of the two morae of the first diphthong was accented: /ói.koi/ and /oí.koi/. (There might also have been a difference in how long the final diphthong was.) Similarly with ἦ ἤ: /ɛ́ɛ ɛɛ́/. So if the offglide has the non-syllabic diacritic, it must still be able to have accent.
Why do you want a tie on /zd/? Standard IPA usage of the tie is only for affricates (d͡z) and double articulation (k͡p), not for consonant clusters that happen to be written with one letter. The cluster has to be divided between syllables in words like φράζε ‎(phráze), which is scanned long–short: /pʰrás.de/.
Also, ζ is phonemically /sd/. If I remember right, Allen says the original affricate /d͡z/ underwent metathesis in Attic to make it similar to the cluster σβ /sb/ (and remove the complexity of an affricate phoneme from the phonological system). The voicing-assimilated forms [zd zb] should only be used in the phonetic transcription that we don't have yet. — Eru·tuon 20:15, 19 February 2016 (UTC)
We currently use /ɛ᷇ː/, /ɛ᷄ː/ for ἦ ~ ἤ. I suppose we could use /ɔ᷇i̯/, /ɔ᷄i̯/, but that seems a bit odd. I'm just uncomfortable with putting an accent on a non-syllabic character. As for the ζ, I think I was drunk when I wrote that. You are quite right and please ignore me. —JohnC5 20:29, 19 February 2016 (UTC)
I have no objection to omitting the nonsyllabicity mark and just using /oi/, etc.; as Erutuon mentioned, the syllabic break marks are sufficient for distinguishing diphthongs from vowels in hiatus. I merely meant that I prefer using /oi̯/ etc. to /oj/ etc., not that I prefer /oi̯/ etc. to /oi/ etc. —Aɴɢʀ (talk) 21:27, 19 February 2016 (UTC)
That is acceptable. —JohnC5 21:35, 19 February 2016 (UTC)
Good. (Nevertheless, there's actually nothing wrong with putting a pitch accent mark over a nonsyllabic character. Lithuanian accent, for example, can fall on the consonants /r l m n/ as well as /j w/ and the vowels.) —Aɴɢʀ (talk) 21:50, 19 February 2016 (UTC)

Really? I always assumed that the Lithuanian accents were just conventional. If that's the case, then we should keep the nonsyllabic mark. —JohnC5 23:47, 19 February 2016 (UTC)

There are actually lots of languages where pitch or tone falls on nonsyllabic sonorants. But anyway, how do we want to distinguish acute and circumflex accent on long vowels? Something like /ɛ̀ɛ́/ for ή vs. /ɛ́ɛ̀/ for ῆ? Or just /ɛɛ́/ vs. /ɛ́ɛ/? —Aɴɢʀ (talk) 09:00, 20 February 2016 (UTC)
I prefer the latter merely for legibility. I assume we don't like /ɛ᷇ː/ vs. /ɛ᷄ː/ then because it doesn't represent the phonemic representation of the accent? —JohnC5 18:43, 20 February 2016 (UTC)
I don't like /ɛ᷇ː/ and /ɛ᷄ː/ because I find them difficult to interpret. —Aɴɢʀ (talk) 18:55, 20 February 2016 (UTC)
I also prefer the one with a single acute accent, because (as I said above) it does accurately represent the usual phonemic analysis of the accent.
However, since Allen says the accent consists of high pitch and falling pitch, falling pitch could be added to the phonemic transcription. That's more complex, so maybe best to go with the single acute solution for now. — Eru·tuon 19:27, 20 February 2016 (UTC)
The phonemically distinctive factor is the syllable the accent is placed on, and, for long vowels and diphthongs, whether it is of one type or the other. So that's the minimal amount of detail that should be in a phonemic transcription. Everything else is extra phonetic detail. So all we have to do is decide which two IPA symbols to use to denote accent types 1 and 2. I think that the accent mark should be placed on the first vowel, and long vowels denoted with a length mark rather than by writing the vowel twice. —CodeCat 19:44, 20 February 2016 (UTC)
I agree that the surface contrast is acute vs. circumflex vs. nothing, but usually this is analyzed as representing a contrast among three options defined by morae: high pitch on the last or only mora (á aá), high pitch on the first mora of a long vowel (áa), and no high pitch (a aa). Do you disagree, or just think we shouldn't try to give the underlying representation in the phonemic transcription? — Eru·tuon 20:03, 20 February 2016 (UTC)
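The three-way contrast described in the last comment (high pitch on the last or only mora, high pitch on the first mora of a two-mora nucleus, or no high pitch) can be written out as a small function. This is a Python sketch of the "single acute" option discussed in this thread, not of what {{grc-IPA}} currently outputs; the function name `nucleus` and its parameters are assumptions made for illustration.

```python
ACUTE = "\u0301"  # combining acute: high pitch on the preceding vowel letter


def nucleus(first, second=None, accent=None):
    """Transcribe a syllable nucleus mora by mora.

    first  -- vowel letter of the first (or only) mora
    second -- vowel letter of the second mora: the same letter for a long
              vowel, i or u for a diphthong, None for a short vowel
    accent -- None, "acute" or "circumflex" (the written Greek accent)
    """
    if accent not in (None, "acute", "circumflex"):
        raise ValueError("unknown accent: " + repr(accent))
    if accent == "circumflex":
        if second is None:
            raise ValueError("a circumflex needs a two-mora nucleus")
        return first + ACUTE + second  # high pitch on the first mora
    if accent == "acute":
        if second is None:
            return first + ACUTE       # high pitch on the only mora
        return first + second + ACUTE  # high pitch on the last mora
    return first + (second or "")      # no high pitch


# οἶκοι "homes" versus οἴκοι "at home"
print(nucleus("o", "i", "circumflex") + ".koi")
print(nucleus("o", "i", "acute") + ".koi")
```

The two calls print /ói.koi/ and /oí.koi/ respectively, the pair quoted earlier in the thread. Because the accent is attached to a mora and not to the syllable, no separate symbol for "circumflex" is needed in this notation.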


I created a table of correlatives. It started from Smyth's table, then grew as I fed it with whatever I could scrounge up from LSJ. It might need some pruning (not sure if everything is actually correlative), but where could it be put so that it would be available to people? Might also be nice to have some way that correlatives corresponding to a given pronoun are automatically displayed in entries; for instance, so that the entry on the medial demonstrative τοιοῦτος displays the relative, interrogative, etc. that correspond to it. — Eru·tuon 19:44, 20 February 2016 (UTC)

I left some comments on the talk page of your table. Your idea to display them automatically is nice, but you should consider that each cell in the table is a combination of two axes. So while we can display the relative, interrogative that are on the same row as a medial demonstrative, we can also go vertical and list all medial demonstratives in addition. —CodeCat 19:53, 20 February 2016 (UTC)
Well, the two axes would have different uses. Demonstratives and relatives, from the same row, would sometimes be used together as correlatives (τοῖος ... οἷος ...), while words from the same column would be used together only by accident. So the columns would have to be put in a different section of the entry than the rows: perhaps in "related terms". — Eru·tuon 20:29, 13 March 2016 (UTC)
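The two-axis lookup discussed here can be sketched with a nested mapping. The Python below is illustrative only: the table holds just six well-known forms, the row labels "quality" and "quantity" and the function names are assumptions, and it says nothing about how the real table or an entry template would be implemented.

```python
# A tiny excerpt of a correlative table: rows are semantic series,
# columns are pronoun types. Row and column labels are illustrative.
TABLE = {
    "quality": {
        "interrogative": "ποῖος",
        "demonstrative": "τοῖος",
        "relative": "οἷος",
    },
    "quantity": {
        "interrogative": "πόσος",
        "demonstrative": "τόσος",
        "relative": "ὅσος",
    },
}


def locate(word):
    """Return the (row, column) of a word, or raise KeyError."""
    for row, cells in TABLE.items():
        for column, form in cells.items():
            if form == word:
                return row, column
    raise KeyError(word)


def row_mates(word):
    """Correlatives proper: other forms in the same row, keyed by column."""
    row, column = locate(word)
    return {c: f for c, f in TABLE[row].items() if c != column}


def column_mates(word):
    """Forms of the same type from other rows (candidates for related terms)."""
    row, column = locate(word)
    return [cells[column] for r, cells in TABLE.items()
            if r != row and column in cells]


print(row_mates(TABLE["quality"]["demonstrative"]))
print(column_mates(TABLE["quality"]["demonstrative"]))
```

Keeping the two lookups as separate functions mirrors the suggestion above that rows and columns belong in different sections of an entry.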

Splitting dual and plural pronouns from singular

Perhaps I should've asked for feedback beforehand, but I split νώ (nṓ) and ἡμεῖς (hēmeîs) from ἐγώ (egṓ). It seems like they were only described in the same entry because that's what LSJ does, but that doesn't seem to be the way it's typically done on Wiktionary (see I and we) when the pronouns have different roots. Now they can more neatly be given separate etymologies, and so on.

Let me know if I'm being a bit too bold. Otherwise, I will probably split σύ too.

And I don't know how to do this, but {{grc-decl}} should be made to allow a dual-only table in νώ ‎(nṓ).

It would be nice to have a personal pronoun navigation box like on German Wiktionary, or an expandable table that can be put in the See Also section. — Eru·tuon 23:09, 29 February 2016 (UTC)

I split the second-person pronouns today. Now we really need etymologies for σφώ ‎(sphṓ), ὑμεῖς ‎(humeîs), νώ ‎(nṓ), and ἡμεῖς ‎(hēmeîs). Anyone have access to Beekes?
Also an entirely unrelated thing: I wonder if ἄριστος ‎(áristos) and ἀρετή ‎(aretḗ) are related. The etymology sections say the first comes from *h₂er- and the other from *h₂erh₁-. Wonder if the final *h₁ in the second could be an ending added to the first. Perhaps this question should be placed somewhere else. — Eru·tuon 08:46, 3 March 2016 (UTC)
grc-decl is fixed. Beekes says of ἀρετή that "connection with ἀρείων is semantically attractive, but formally not clear." He does not reconstruct *h₂er- for any relevant roots (only ἀραρίσκω) and does not give a certain reconstruction for either ἀρείων or ἄριστος. I don't think there's any precedent for *h₁ as a suffix, though. —ObsequiousNewt (εἴρηκα|πεποίηκα) 16:23, 4 March 2016 (UTC)

Dialect of entries

We need a way to specify what dialect the main form of an entry belongs to. For instance, ἠώς ‎(ēṓs) is Ionic and Epic, but there isn't any place to say that on the page. And ῥέζω ‎(rhézō) is a purely poetic verb. Perhaps the headword templates should be modified to allow this, because I'm not sure where else the information could go. In ῥέζω ‎(rhézō), I've added a blank entry line to say that all uses are poetic: messy and not satisfactory. — Eru·tuon 19:29, 14 March 2016 (UTC)

This is not at all a difficult thing to do: just use {{label}}. We should not be making blank lines when we already have perfectly functional infrastructure for marking dialects. —Μετάknowledgediscuss/deeds 21:06, 14 March 2016 (UTC)
You mean, put {{label}} in the {{head}} line? I thought it was only for defs. But that would be a simple solution. — Eru·tuon 00:55, 15 March 2016 (UTC)
It is. Considering that different dialects can have the same spelling with different definitions, that's not always a bad thing. Also, I've never used it, but there's also a template called {{term-context}} designed specifically for placement to the right of the headword. Chuck Entz (talk) 01:39, 15 March 2016 (UTC)
Well, I was suggesting that you put the {{label}} on each definition. But you could also label the headword-line; standard practice is to use {{term-label}} for that, although not many languages make use of it. —Μετάknowledgediscuss/deeds 01:41, 15 March 2016 (UTC)
ISMETA suggested something like this a while ago for Ecclesiastical and Medieval Latin. —JohnC5 01:47, 15 March 2016 (UTC)
Hmm, I like the idea of using {{term-label}}. It would allow us to add labels for Greek dialects to Module:labels/data. But is {{term-context}} supposed to be used, or is it deprecated like {{context}}? — Eru·tuon 02:18, 15 March 2016 (UTC)
It does look like the two bear the same relationship as {{label}} to {{context}}. I hadn't realized that when I mentioned {{term-context}}. Chuck Entz (talk) 02:42, 15 March 2016 (UTC)

Prioritized dialect

Should we say something about which dialect form we prioritize? Obviously whenever Attic, Ionic, and Koine all agree on a form, that's our primary lemma, but I'm not sure about cases where they don't agree. I have the impression that we basically always take the Koine form, is that right? In other words, we follow Attic and Koine in prioritizing ἡμέρᾱ/οἵᾱ/νέᾱ over ἡμέρη/οἵη/νέη, and we follow Ionic and Koine in prioritizing γλῶσσα over γλῶττα and ξένος over ξεῖνος. Is there ever a time when we prioritize a form other than the one that appears in Koine? —Aɴɢʀ (talk) 14:12, 19 March 2016 (UTC)

The practice in established dictionaries, as I've understood it, is to use the Koine form as the citation form. There are several points of divergence with Ionic (no quantitative metathesis over ϝ, less contraction, ευ for εο, and most notably η for ᾱ even after ε/ι/ρ). There are significantly fewer points of divergence with Attic, however; the main ones are ρρ for ρσ and ττ for σσ, which are specifically Attic features. But note that specifically Attic words are used as the citation form: LSJ has θάλασσα, ἄρσην, but ἕτερος, δέχομαι, γίγνομαι. —ObsequiousNewt (εἴρηκα|πεποίηκα) 00:03, 20 March 2016 (UTC)
Yes, it's correct that we usually choose the Koine form, and it would be good to explain this somewhere. In general we're following the practice of the LSJ. In the case of nouns in -ᾱος, like λᾱος ‎(lāos), the Koine form isn't the Ionic, but the Doric; and for contracted verbs, we're choosing the uncontracted form, which is Ionic and not Attic or Koine. It would probably be accurate to say either Attic, Koine, or uncontracted. Hmm, one counterexample would be γίγνομαι ‎(gígnomai), for which the Koine is γίνομαι ‎(gínomai).
I have also created a few entries on aorist forms — ἔτλην ‎(étlēn), ἠρόμην ‎(ērómēn), ἦλθον ‎(êlthon) — where the LSJ instead chooses to make an entry on the nonexistent, dialectal, or suppletive present-tense forms *τλάω ‎(*tláō), εἴρομαι ‎(eíromai), ἔρχομαι ‎(érkhomai). — Eru·tuon 23:58, 19 March 2016 (UTC)
It makes much more sense to use the uncontracted forms, as they provide more information. The split between "contracted" and "uncontracted" is not as simple as "Attic" and "Ionic"—both dialects used both forms, but in different quantities, and in different paradigms (e.g. some dialects did not contract over ϝ; most dialects did not contract σ-stem nouns but would contract verbs, and the treatment of *-klewēs is a goat rope.) I don't know whether you meant to say λαός or λᾶας, but λαός is basically the Attic form (λεώς being archaic even in Attic-Ionic) and λᾶας does not seem to survive into Koine.
LSJ's usage of τλάω as a lemma makes sense because all of the other tenses of *telh₂ are attested. εἴρω is not really the lemma in LSJ—all the other terms are listed under ἐρῶ. Putting suppleted tenses on the same entry as their citation forms is general practice everywhere. Usage of the aorist as a citation form for *telh₂ and *werh₁ makes sense (and, honestly, probably more sense than using the future tense), but I see no reason not to use the present as a citation form in ἔρχομαι. —ObsequiousNewt (εἴρηκα|πεποίηκα) 00:32, 20 March 2016 (UTC)
Hmm, sorry, I meant to say λᾱός ‎(lāós). Perhaps νᾱός ‎(nāós) vs. νεώς is a better example?
I just placed etymology at ἦλθον ‎(êlthon), not inflection, which is still at ἔρχομαι ‎(érkhomai). I don't know, it would almost make more sense to me if ἔρχομαι were described at ἦλθον since only the present tense has the root ἔρχ-. — Eru·tuon 01:03, 20 March 2016 (UTC)


A few items I'm not sure how to add to the transliterator module Module:grc-translit: ι̯, ῾, ᾿. The first is iota with non-syllabic diacritic (0x32E), equivalent to *y in PIE notation: two characters that should together be transliterated as y. The last two are Greek dasia and psili (0x1FFE, 0x1FBF). Right now the module only recognizes combining comma and reversed comma above, which aren't appropriate for mentioning the breathing diacritics in Etymology sections. Could someone add those for me? — Eru·tuon 21:14, 26 March 2016 (UTC)

I don't think that "y" would be widely understood; people might think it represents upsilon. How is it currently transliterated? —CodeCat 21:16, 26 March 2016 (UTC)
As , as above. I don't know, currently upsilon is transliterated as u, and I think the transliteration should match Proto-Hellenic and Proto-Indo-European. But maybe you're right. — Eru·tuon 21:19, 26 March 2016 (UTC)

@ObsequiousNewt, could you add the two breathing marks? — Eru·tuon 20:06, 29 March 2016 (UTC)

I can, but why do we need to use non-combining breathing marks in such a manner? —ObsequiousNewt (εἴρηκα|πεποίηκα) 15:39, 30 March 2016 (UTC)
I'd use the rough breathing in etymologies that explain the absence of the rough breathing in a given word through Grassmann's Law or analogy; for instance, in ἄλοχος ‎(álokhos). ἔχω ‎(ékhō) and ἄκοιτις ‎(ákoitis) should also have a note. Preferable to saying /h/, which could leave readers wondering what grapheme represents the sound. — Eru·tuon 18:39, 30 March 2016 (UTC)
I think using /h/ is preferable. When we say "ἁ- ‎(ha-, alpha copulativum) +‎ λόχος ‎(lókhos, lying down)", the h is right there in the transliteration of ἁ-, and anyway, we can probably assume that anyone interested enough to be reading the etymology sections of Ancient Greek words knows how h is written in the language. —Aɴɢʀ (talk) 18:55, 30 March 2016 (UTC)
I don't think we can assume that for all Ancient Greek words. Hopefully there are some people looking because they're interested in the etymologies of English words that derive from Ancient Greek, who haven't started learning Greek. That is plausible in the case of the prefix hydro-. I would certainly hope that some people who haven't learned anything about Greek would be curious enough to be interested in the etymology sections. And it doesn't do good to presume they won't be interested and therefore make the etymology sections unintelligible to them. That's just a way to discourage any spark of curiosity about Greek in readers. — Eru·tuon 19:13, 30 March 2016 (UTC)
It seems to me that readers would either know what grapheme represents the sound, or not care what grapheme represents the sound. Regardless, I see nothing wrong with saying e.g. "From ἁ- (ha-) + -λοχος. The initial *h was lost due to Grassmann's law." [note 1: I suspect this is strictly an exocentric rather than endocentric compound, given the difference in gender; note 2: why is Grassmann's law acting over two syllables?] —ObsequiousNewt (εἴρηκα|πεποίηκα) 20:55, 30 March 2016 (UTC)
Hmm, I never paid attention to the "next syllable" part of the law before. LSJ just says "dissimilation from a following aspirate" in the entry on ἀ-. Then this must not count as Grassmann's law in the strict sense. But there's another example of long-range dissimilation of aspirates: ἀδελφός ‎(adelphós). Perhaps this is an isolated and only Greek phenomenon.
Well, no way to figure out which of us is right about our readers, so I'll leave that question alone. But now that I think a little more, there is another reason I sometimes want to use the rough breathing. Using *h looks like Proto-Hellenic orthography, and would imply that epenthesis or deletion of an h sound occurred in Proto-Hellenic. But using the Greek grapheme simply implies the synchronic question of the presence or absence of the sound or grapheme, without the diachronic question of when loss or addition of the sound occurred. For ἄλοχος ‎(álokhos) it might have happened in Proto-Hellenic, but there are cases where it definitely didn't happen in Proto-Greek or we don't know: ἡμέρα ‎(hēméra), ἠέλιος ‎(ēélios), ὕδωρ ‎(húdōr). (These probably gained or lost the sound sometime during the branching out of the Greek dialects from Proto-Hellenic.) There using the Greek grapheme shows that we're talking about the synchronic question without implying anything diachronic. So I would appreciate it if I have the option of using the Greek grapheme. — Eru·tuon 08:19, 3 April 2016 (UTC)
(edit conflict) @Erutuon: I also am not sure about ι̯. What book uses this to represent /j/? Buck (1910) uses the Latin letter j; Pamphylian and Argolic use ι, Cyprian uses 𐠅 (syllabary). —ObsequiousNewt (εἴρηκα|πεποίηκα) 20:34, 30 March 2016 (UTC)
Smyth uses it, for instance in [1], which discuss what is now called palatalization in Proto-Hellenic. He also uses υ̯ for ϝ in §20. Maybe it's a bit dated. I don't remember what symbol Sihler uses. — Eru·tuon 20:39, 30 March 2016 (UTC)
Ah, you mean Proto-Hellenic, not actual Greek. But why are we writing Proto-Hellenic in the Greek script? —ObsequiousNewt (εἴρηκα|πεποίηκα) 20:55, 30 March 2016 (UTC)
Heheh, good point. This comes from -ια ‎(-ia). I suppose the etymology should give the Proto-Greek or slightly pre-Greek form, which is probably *-ya. But it needs verification, and I don't have Sihler or Beekes. — Eru·tuon 21:05, 30 March 2016 (UTC)

Redirects from baria forms for oxytones[edit]

Would it be desirable to have hard redirects for all Ancient Greek oxytones for their corresponding forms with a baria (grave accent) on the ult? And could this be instituted by bot? ObsequiousNewt and I discussed this at User talk:Erutuon#Special:Diff/37805790 and both of us consider it desirable. — I.S.M.E.T.A. 20:13, 2 April 2016 (UTC)

I'm in favor. —Aɴɢʀ (talk) 21:01, 2 April 2016 (UTC)
The only reason not to would be to prevent terms that aren't exactly equivalent being redirected, but the relationship is pretty tight in Ancient Greek and I don't think any other language uses the Greek alphabet with that accent, except maybe Katharevousa Greek. Chuck Entz (talk) 22:43, 2 April 2016 (UTC)
I think it's a good idea, but I worry that some weird ancient Greek-script language is going to throw a wrench in this (though I doubt it, because those attestations don't normally have diacritics in the first place). I'm tagging some people who would be able to allay my fears, and might want to contribute to this discussion anyway: @Vahagn Petrosyan, Ivan Štambuk, JohnC5, Liliana-60 Μετάknowledgediscuss/deeds 23:38, 2 April 2016 (UTC)
If there are collisions, we can simply put a note on the page like abit#English. I'm all but certain there aren't any, though. —ObsequiousNewt (εἴρηκα|πεποίηκα) 00:37, 3 April 2016 (UTC)
Sounds like a good idea, though I feel like there should be some kind of note in the entry explaining to readers that yes, they found the right entry, even though there's an acute rather than a grave on the word, and explaining under what conditions the grave appears. Some readers will know the grave accent rule, some won't.
As for other languages using the Greek alphabet, I can't imagine why they would ever need to use the grave accent, unless they have multiple pitch accent types like Serbo-Croatian, Lithuanian, and Latvian; and I doubt there's any such language that uses the alphabet (besides Archaic to early Koine Greek, of course). — Eru·tuon 01:20, 3 April 2016 (UTC)
I'm honestly inclined not to. We don't have the time nor the space to explain every small general truth about a language. If someone is redirected to an entry, it is obvious that that entry is the correct form. —ObsequiousNewt (εἴρηκα|πεποίηκα) 02:39, 3 April 2016 (UTC)
Well, I suppose a long note explaining grave accent would be obtrusive, and I'm not sure where it would fit in an entry. But maybe if we have an appendix on accent, we could direct readers there for an explanation; maybe by linking a superscript breve after a form with final acute, like ψῡχή(`)? Kind of tacky, and it would have to be repeated in many places throughout declension tables, but it's basically the way Latin entries link to the appendix on IPA for Latin (see for instance magnus). — Eru·tuon 06:31, 3 April 2016 (UTC)
I'm fine with this. I cannot think of anything that would conflict, but then my knowledge of AG is only recently acquired. —JohnC5 02:52, 3 April 2016 (UTC)
Ancient Greek isn't the problem. There were some non-Greek languages contemporary to Ancient Greek that used the Greek alphabet, but I have my doubts that they would have borrowed the accents, which were fairly marginal until Byzantine times. Medieval and modern Greek used the same accents as Ancient Greek until a century or two ago, though I would guess the rules for the grave would probably be pretty much the same. There are some Greek dialects/languages such as Tsakonian that have used the polytonic Greek script, some Turkic languages spoken by Greeks, and other languages in the Balkans such as Albanian and some South Slavic ones that have used variants of the Greek alphabet, but I don't know if they used the grave accent. @Saltmarsh may know more about the Greek part of the picture. At any rate, I would suspect that there would be very little, if any, overlap on the longer words (I noticed τθὸ, τὸ and other such monosyllables, as well as ἁπὸ, in a sample of Tsakonian at w:Tsakonian language). Chuck Entz (talk) 03:33, 3 April 2016 (UTC)
I know less than little about polytonic forms of Greek, but my suggestion would be that if a wordform is attested it should be included as a "form of" entry (or fuller entry) - if it isn't attested we ignore it? I thought that avoided redirects. I haven't read all of the previous discussion, so apologies if this point has already been made!   — Saltmarshσυζήτηση-talk 05:50, 3 April 2016 (UTC)

A related question that was also discussed on my talk page is whether to have redirects from words with an enclitic-triggered final acute: for instance, ὄνομά ‎(ónomá) in the phrase τὸ ὄνομά μου ‎(tò ónomá mou, my name). Either ὄνομά ‎(ónomá) should have an entry that says "form of ὄνομα ‎(ónoma) used before an enclitic", or it should simply redirect to ὄνομα ‎(ónoma). For those who don't know about the Greek rules of accent, this final acute is used in (roughly) the opposite environment from final grave: final grave appears everywhere except before a pause or an enclitic, but this enclitic-triggered final acute only appears before an enclitic. So, this question is similar to the question regarding final graves. — Eru·tuon 06:31, 3 April 2016 (UTC)

I'm in favor of hard redirects in both cases, i.e. τὸ → τό and ὄνομά → ὄνομα. —Aɴɢʀ (talk) 16:45, 3 April 2016 (UTC)

@Angr, Chuck Entz, Metaknowledge, ObsequiousNewt, Erutuon, JohnC5, Saltmarsh: This is all very encouraging. FWIW, I'd be happy with soft redirects (form of entries), if hard redirects are felt undesirable. The main thing, I think, is that this be bot-enforceable, since manually creating these redirects would be a painfully tedious, mammoth task. Does anyone here have the will and the wit to write such a bot? — I.S.M.E.T.A. 12:43, 4 April 2016 (UTC)

I can look into it later, but I'm busy trying to rewrite Module:grc-conj at the moment. —ObsequiousNewt (εἴρηκα|πεποίηκα) 23:06, 4 April 2016 (UTC)
@ObsequiousNewt: Thank you; that would be great. Good luck with rewriting Module:grc-conj! — I.S.M.E.T.A. 14:32, 13 April 2016 (UTC)
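
For whoever ends up writing the bot, here is a minimal Python sketch of the two string transformations discussed in this thread: oxytone to baria form (τό to τὸ) and enclitic-accented form back to its base (ὄνομά to ὄνομα). It is only an illustration, not an existing bot. The function names `baria_form` and `enclitic_base` are made up for this example, and it assumes that "oxytone" can be detected as "an acute on the last vowel and no earlier acute or circumflex", which may not cover every edge case.

```python
import unicodedata

ACUTE = "\u0301"       # combining acute (oxia/tonos after decomposition)
GRAVE = "\u0300"       # combining grave (baria)
CIRCUMFLEX = "\u0342"  # combining Greek perispomeni
VOWELS = "αεηιουω"


def _final_vowel_marks(decomposed):
    """Return (start, end) of the combining marks on the last vowel, or None."""
    for i in range(len(decomposed) - 1, -1, -1):
        if decomposed[i].lower() in VOWELS:
            end = i + 1
            while end < len(decomposed) and unicodedata.combining(decomposed[end]):
                end += 1
            return i + 1, end
    return None


def baria_form(word):
    """Return the grave-accented form of an oxytone, or None if not an oxytone."""
    d = unicodedata.normalize("NFD", word)
    span = _final_vowel_marks(d)
    if span is None:
        return None
    start, end = span
    marks = d[start:end]
    if ACUTE not in marks:
        return None
    # An earlier accent means this is an enclitic-accented form, not an oxytone.
    if ACUTE in d[:start] or CIRCUMFLEX in d[:start]:
        return None
    return unicodedata.normalize("NFC", d[:start] + marks.replace(ACUTE, GRAVE) + d[end:])


def enclitic_base(word):
    """Strip an enclitic-triggered final acute; None if the word has none."""
    d = unicodedata.normalize("NFD", word)
    span = _final_vowel_marks(d)
    if span is None:
        return None
    start, end = span
    marks = d[start:end]
    if ACUTE in marks and (ACUTE in d[:start] or CIRCUMFLEX in d[:start]):
        return unicodedata.normalize("NFC", d[:start] + marks.replace(ACUTE, "") + d[end:])
    return None
```

A bot could run `baria_form` over every Ancient Greek page title and create a redirect from each result that is not `None`. Whether those are hard or soft redirects is a separate decision that this sketch does not address.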

Derived vs. Related terms[edit]

I've been vacillating over whether to place words deriving from the root of a verb in Derived terms or Related terms: for instance, δεῖγμα ‎(deîgma), which derives from δεικ- ‎(deik-), the root of δείκνῡμι ‎(deíknūmi). I expect that term is unambiguously a derived term, because it looks like it began to be used in Attic Greek, so it was probably created by native speakers from the root of the verb and the suffix -μα ‎(-ma) (with regular sound changes) rather than inherited. But words like Πειθώ ‎(Peithṓ) must be considerably older, and no new words are readily formed from the suffix *-ώ ‎(*-ṓ). And then there are words with a particular ablaut grade, such as λόγος ‎(lógos); should it be considered as derived from λέγω ‎(légō) or simply related? And then there are words like θέσις ‎(thésis) that could have been inherited, but could also have been created pretty easily, because the root and suffix were both productive in Greek. And πίστις ‎(pístis) and πεῖσις ‎(peîsis) both derive from the same root or verb, but the first was likely inherited (if unusual sound changes or lack thereof are an indication) and the second was more likely created. Perhaps then the first is related and the other derived?

So, lots of confusing cases. Is there any way to decide between derived and related in reference to verbal roots? I guess the question is, does derived mean coined during a certain time period of the language (Proto-Hellenic, pre-Greek alphabet, Archaic Greek, Classical, or Koine), or can it include creations that occurred during PIE? — Eru·tuon 01:23, 9 April 2016 (UTC)

I would go with "Related terms", because derived terms are also related- though many would assume that "related" means the terms are independent of each other. I would reserve "Derived terms" for clear cases of one term being derived from another. Chuck Entz (talk) 02:06, 9 April 2016 (UTC)
I have always used "Derived terms" for words that are formed with productive suffixes/prefixes (or compounds), and "Related terms" for any other etymologically related words. I don't know how in accordance with policy this is, though. —ObsequiousNewt (εἴρηκα|πεποίηκα) 19:15, 10 April 2016 (UTC)

Dialectal forms in inflection tables[edit]

I'm starting to think that maybe we should do this differently.

We obviously want to show κούροισῐ(ν) and not **κούροις in the inflection table for κούρη. This is non-negotiable in my eyes. But the tables we have are, well, first of all, they're huge. Anywhere from half to two-thirds of the cells have footnotes. There are upwards of a dozen footnotes per adjective, or as many as forty per verb, and keep in mind this is only a summary—I'm generalizing the dialects into Attic, Epic, "Aeolic", and ""Doric"" and simplifying many of their features, while leaving out completely dialects like Boeotian or Arcadian or Cretan. The space for dialectal forms is nearly as tall as (or even taller than) the table itself, even when split into two columns. Second of all, it's an overload of information, information which is probably irrelevant to most viewers, who either (a) don't know Ancient Greek and don't care about dialectology, or (b) are classicists and also don't care about dialectology. Third of all, the program currently puts every marker in by default, which means that even hapax legomena (and, we had a discussion once about this, and I support putting whole tables for them) have full information on what the word would look like in the dialects, which is simply not information that we can confidently provide. Plus it means we have to do things like explicitly set dial=~dor/aio for any word that has a non-original eta in it. Every single derivative of δῆμος should not need to be qualified that way.

Therefore, I propose that we get rid of the "dialectal forms" box. We should probably replace it with a note like "For inflection of dialectal forms, see Ancient Greek dialectal inflection." which isn't great wording but you get the idea. —ObsequiousNewt (εἴρηκα|πεποίηκα) 05:00, 16 August 2016 (UTC)

Pings only work when followed by a signature in the same paragraph. Chuck Entz (talk) 02:26, 17 August 2016 (UTC)
Seems like they also have to be added in the same edit as the signature. — Eru·tuon 04:16, 18 August 2016 (UTC)

Let's try this again: User:I'm so meta even this acronym, User:JohnC5, User:Angr, User:Erutuon ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 04:30, 18 August 2016 (UTC)

I don't like the ugly mess of dialectal forms at the bottom of Attic–Koine inflection tables, but it's really cool that Wiktionary actually displays the forms. I don't know of any other website where you can see them. Perseus can recognize dialectal forms, but it doesn't display paradigms (or maybe I'm just not technologically adept enough to figure out how to do it).
I would rather see the dialectal forms split into separate tables, at least for the dialects that are actually attested in literature. Then you can view coherent paradigms, rather than a jumble of forms.
That would make Wiktionary more useful for someone who's reading Homer or some other work not in Attic or Koine, and wants to figure out what case and number (or person, number, voice, etc.) a form is, or wants to get a feeling for what the paradigm looks like in a particular dialect. If you're reading Homer, you could click on the Epic table and view the inflection.
Maybe that's a tiny fraction of Wiktionary users, though. Hard to know. If so, perhaps the work involved in making sure δῆμος ‎(dêmos) doesn't get Aeolic and Doric endings stuck onto it isn't worth it. I like to be able to see dialectal forms (actually, pretty much just Epic; I don't know Aeolic and Doric), but I'm probably atypical. — Eru·tuon 06:01, 18 August 2016 (UTC)
There's always the option of nested collapsible boxes, so that those who don't want to see the extra content don't have to. —This unsigned comment was added by Chuck Entz (talk • contribs). 13:19, 18 August 2016 (UTC)
I think showing separate tables for different dialects is a good idea. Certain forms always correlate to certain other forms. For example, an Ionic writer will always use forms with ē and never those with ā, so it's not really possible to find one case form with ē next to another case form with ā in the same Ionic text. —CodeCat 14:32, 18 August 2016 (UTC)
@ObsequiousNewt, Chuck Entz, Erutuon, CodeCat: I would also be sad to see all the dialectal variants go, but agree that the current presentation is rather cluttered; Chuck's idea of presenting them in nested collapsible tables is the best solution to this that I've heard so far. — I.S.M.E.T.A. 17:35, 18 August 2016 (UTC)
@I'm so meta even this acronym, Chuck Entz, Erutuon, CodeCat: I like this solution better than what we have now, but I'm not sure I like it enough to be satisfied with it. Pursuant to my third reason, I would prefer to limit it to those dialects that are actually attested, rather than showing everything by default. For example, μοῖρα would have collapsible tables for Epic and Ionic (but not Aeolic?) but τετράμοιρος, being a hapax limited to Attic, wouldn't have any. The problem with this is that figuring out every dialect that something exists in can be difficult (I had to search through a dozen words until I found an example that I could easily check was Attic only) and only changes the form of my third problem with the current situation. There's also the problem that, as I stated in my first reason, there are a lot of dialects I'm leaving out, some of which vary by just one form, so it's wise to keep in mind that there are more than five dialects that we're dealing with... I think Buck has something like two dozen, plus groups ("Doric", "Aeolic", Northwest, Ionic, East, West, etc.) which should ideally be all considered, and this not only increases the amount of space that will be necessary, but also the amount of work that will have to be done when looking for attested dialects. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 20:21, 18 August 2016 (UTC)
Well, only some of the dialects have literature written in them. Other dialects, I assume, are only found in inscriptions: Arcadocypriot, Cretan, Elean, Pamphylian, Boeotian, Euboean, Northwest Greek, etc. At the minimum, the tables should cover literary dialects. I don't know where one would find information on epigraphical dialects anyway, and what the purpose of covering them would be. — Eru·tuon 20:37, 18 August 2016 (UTC)
Some people study epigraphy. Myself, for example. Here is a database of what is probably most inscriptions, and here is a database of what is probably most papyri. "Literary dialects" are important, to be sure—it's where the Attic/Epic/Ionic/Doric/Aeolic distinction comes from—but I see no reason to say they are more important than non-literary dialects. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 00:26, 19 August 2016 (UTC)
Okay, so there would be a group of people interested in inscriptions and their inflected forms. I would still say literary dialects are more important in the sense that there are likely to be more people reading Ancient Greek literary works than inscriptions. I wouldn't object to including non-literary dialects. It would be interesting. — Eru·tuon 01:29, 19 August 2016 (UTC)
I think we should have a distinction between dialects that are displayed by default and dialects that are not displayed by default. The former would be automatically populated unless suppressed for that table, while the latter could either be only manually populated, or automatically populated only when enabled for that table. Chuck Entz (talk) 01:33, 19 August 2016 (UTC)
@ObsequiousNewt, Erutuon, Chuck Entz: The declension tables have (or, at least, used to have) a footnote saying "Not all forms, especially dialectal forms, are necessarily attested. Use with caution." Does that not mitigate this concern? — I.S.M.E.T.A. 16:50, 19 August 2016 (UTC)
I'm sure there are dialects with small corpora where many words simply aren't attested in any form. I'm not familiar with the workings of the templates, so I could be mistaken, but- in such cases, aren't we showing inflections for words that may have never existed at all in those dialects? After all, it's one thing to apply attested inflectional morphemes to attested roots to produce unattested forms, but applying them to unattested roots risks possibly masking regional distribution patterns. Chuck Entz (talk) 18:04, 19 August 2016 (UTC)
@Chuck Entz: Even for widely attested dialects, it's generally impossible to search the entire corpus to find if a word is attested, which is why I don't think any dialect should be shown by default. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 02:53, 22 August 2016 (UTC)

Okay, so I feel right now like the best solution is to get rid of showing any dialect by default, but have—honestly, I would prefer just having separate tables rather than trying to do something with Javascript, as it's easiest—for any dialect explicitly mentioned by LSJ. So, to pick a word at random, οὐρανός would have tables for Attic (unmarked), Severe Doric, Bœotian, and Æolic (which I'm inclined to call Lesbian per Buck) and rely on an Appendix:Ancient Greek dialectal declension for any other forms, because even though they exist (e.g. Epidauran), searching for all possible forms is difficult to impossible—I'd have to check at least eleven forms in Packhum, which wouldn't even be exhaustive because of line breaks, *and* there are plenty of texts (e.g. most papyri) that aren't digitized or public or searchable. Is this solution acceptable to everyone? Pinging again, since it's been a month: @I'm so meta even this acronym, JohnC5, Angr, Erutuon, CodeCat, Chuck Entz ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 18:02, 16 September 2016 (UTC) — IFYPFY. — I.S.M.E.T.A. 23:10, 12 October 2016 (UTC)

That is acceptable to me. —JohnC5 18:08, 16 September 2016 (UTC)
Me too. And don't worry too much about obscure dialectal forms. If something is attested in a papyrus that isn't digitized or public or searchable, then it isn't "likely that someone would run across it and want to know what it means". —Aɴɢʀ (talk) 18:35, 16 September 2016 (UTC)
I was going to suggest that editors explicitly select which dialects to display, and then the module would generate separate tables for each. The current behavior of selecting which dialects not to display and sticking all but the Attic in a bunch of footnotes is a bit odd. Regarding οὐρανός ‎(ouranós), it should also have Ionic and Epic. — Eru·tuon 19:24, 16 September 2016 (UTC)
@ObsequiousNewt: This is not my favorite solution, but I am content with it, especially because of the excellent and detailed survey of dialectal variation afforded by Appendix:Ancient Greek dialectal declension. One thing, though: how do I get multiple dialects to appear? I was thinking of Δῑδώ ‎(Dīdṓ), for which I need the table to display both the Attic and the Ionic forms; I tried adding a |dial=att/ion parameter, but that didn't work. What's the secret? — I.S.M.E.T.A. 23:30, 12 October 2016 (UTC)
@I'm so meta even this acronym: There currently isn't one, but I think it should be... at least moderately easy to implement. I'll get onto it sometime in the near future, I have a few things on my plate—I'm currently working on rewriting grc-translit to make it more editor-friendly and also probably more accurate, and I might prop up a survey regarding it as soon as I have finished reading through Gignac's grammar of Koinê (and whatever else I find to cross-reference it; if you know any good resources I'd be grateful.) ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 23:46, 12 October 2016 (UTC)
@ObsequiousNewt: Cool. In the meantime, I'll just add a second table, redundant as it might seem. Just holler at me when they can be combined; no rush, of course. I wish I could help re Koinê, but you are definitely the superior of the two of us when it comes to Greeking, so I couldn't even recommend anything. Sorry. — I.S.M.E.T.A. 00:02, 13 October 2016 (UTC)

Most common words[edit]

Wiktionary doesn't have a frequency list for Ancient Greek. Is there any place we can get some frequency lists, say for different types of literature: Attic prose, tragedy, Epic poetry, or the Septuagint and New Testament?

Perseus has a Vocabulary Tool, but it doesn't let you select all the AG works on the list. (I tried doing a list from Plato's works, and the output is messy, with many repetitions and some doubtful results. Not sure how to clean it up.) I do have an Attic Greek vocabulary book from the University of St Andrews that I could type up a list from, but that's a lot of work. I generally use the Dickinson Greek Core Vocabulary when trying to come up with example words, or to get some inspiration on which entries to work on, but the site doesn't explain how that list was compiled.

Some lists would be helpful, because they would help me determine which entries most need work, with the goal of helping readers of AG works. (Other somewhat different goals would be creating entries for words that are used as roots for English words or taxonomic names.) — Eru·tuon 06:19, 8 October 2016 (UTC)

Frequency is hard, because inflected forms. Perseus' vocabulary tool does not work like that; it just estimates based on I don't know what. It is 100% unreliable. In terms of what is useful, I'm more inclined to fix existing entries, and add important entries which are missing, than to try to compile frequency-lists. That's the kind of thing I'd want a TLG subscription for anyway, although Perseus does have a fairly good corpus. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 07:03, 8 October 2016 (UTC)
Ancient Greek has big holes in its attestation, so frequency is a poor indicator of importance. Besides, there are Ancient Greek writings that codify core values of Western Civilization, and others that are corpus filler. A hapax legomenon in the former is a big deal, but almost everything in the latter isn't. Chuck Entz (talk) 15:54, 8 October 2016 (UTC)

Perhaps I'll type up the St Andrews lists then. They probably have some sensible way of selecting the most useful Attic vocabulary. — Eru·tuon 03:41, 25 October 2016 (UTC)

Perseus under PhiloLogic has word frequency tools, but I haven't figured out how to get them for more than one document yet. But you could certainly grab them for the Iliad and Odyssey, and then that's already two huge corpora. In fact, here is a page of Homer. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 16:29, 25 October 2016 (UTC)

Hi @Erutuon, I just saw this discussion. I have made these types of lists in the past. It's possible to somewhat mitigate the problem with the Perseus lists using a Bayesian filter. The other issue with Perseus is the slowness and brittleness of the interface, especially when using multiple works. I will post something here soon, but it will be preliminary. I was going to do this eventually for the purpose of guiding some LSJ imports I have been contemplating. Isomorphyc (talk) 12:31, 26 October 2016 (UTC)

Here is a preliminary list: User:OrphicBot/Sandbox/Naive Perseus Concordance. It stands to be improved, but it is probably pretty close to what one would get from selecting everything in Perseus. Isomorphyc (talk) 18:20, 26 October 2016 (UTC)
A few more:
* User:OrphicBot/Sandbox/Naive Perseus Concordance - Homer
* User:OrphicBot/Sandbox/Naive Perseus Concordance - Plato
* User:OrphicBot/Sandbox/Naive Perseus Concordance - Attic Drama
* User:OrphicBot/Sandbox/Naive Perseus Concordance - Koine
Isomorphyc (talk) 18:57, 26 October 2016 (UTC)
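
To illustrate the "inflected forms" problem raised above, here is a rough Python sketch of counting by lemma rather than by surface form. It is not how the OrphicBot lists or the Perseus tools were produced. The form-to-lemma table is a hypothetical toy; a real list would need a full morphological analysis, and ambiguous forms are not handled at all. Final graves are folded into acutes before lookup so that καὶ and καί count as the same form.

```python
import unicodedata
from collections import Counter


def normalize_form(token):
    """Lowercase, turn any grave into an acute, and return NFC text."""
    decomposed = unicodedata.normalize("NFD", token.lower())
    return unicodedata.normalize("NFC", decomposed.replace("\u0300", "\u0301"))


# Hypothetical toy table: inflected form -> lemma (assumption, for illustration only).
RAW_LEMMAS = {
    "λόγος": "λόγος",
    "λόγου": "λόγος",
    "λόγον": "λόγος",
    "λέγω": "λέγω",
    "λέγει": "λέγω",
}
LEMMA_OF = {normalize_form(f): normalize_form(l) for f, l in RAW_LEMMAS.items()}


def lemma_frequencies(tokens, lemma_of=LEMMA_OF):
    """Return (counts per lemma, counts of forms missing from the table)."""
    known, unknown = Counter(), Counter()
    for token in tokens:
        form = normalize_form(token)
        if form in lemma_of:
            known[lemma_of[form]] += 1
        else:
            unknown[form] += 1
    return known, unknown
```

Keeping the unknown forms in a separate counter makes the gaps in the lemma table visible, instead of silently dropping them or counting each inflected form as its own word.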

Symbol to mark apocope[edit]

What symbol do we want to use to mark apocope? I just created δ' and τ' using ' (U+0027 APOSTROPHE), with hard redirects from forms using ’ (U+2019 RIGHT SINGLE QUOTATION MARK) and ᾿ (U+1FBF GREEK PSILI). Then I found ἀλλ᾽ using ᾽ (U+1FBD GREEK KORONIS). So what should we use as the primary form? Whichever we pick, I do think there should be hard redirects from the others. —Aɴɢʀ (talk) 23:30, 25 October 2016 (UTC)

I like the look of the coronis better, but the practice you described elsewhere relating to French, of using the plain apostrophe in entry names and a nicer-looking character in headword templates, would make sense here too. — Eru·tuon 23:34, 25 October 2016 (UTC)
I prefer the coronis. The argument that is usually employed in favour of ' vs. ’ is that the former is a lot easier to type on QWERTY keyboards than the latter; that doesn't really apply to polytonic Greek. I'm happy with redirects from the ' form, of course. Are the psile redirects necessary? Do spellings with it ever occur? — I.S.M.E.T.A. 23:38, 25 October 2016 (UTC)
Well, they occur at User:ObsequiousNewt/freq-hom and at τε#Derived terms, so certainly some people use psili in digital renditions of Ancient Greek. —Aɴɢʀ (talk) 23:49, 25 October 2016 (UTC)
Perhaps we should ask the question of which is most commonly used. I searched my text files of the whole Iliad and Odyssey (since that's the only giant work that I have a text file of), and they have far more ugly regular apostrophes than pretty right single quotation marks (i.e., about 8–10,000 versus fewer than 100). The spacing smooth breathing and spacing coronis did not occur at all. If there are redirects, it doesn't really matter which one we use, but hey, the ugly apostrophe seems to be more common. — Eru·tuon 03:49, 26 October 2016 (UTC)
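
For anyone who wants to repeat this kind of count on another text file, a small Python sketch follows. The function name `count_elision_marks` and the choice of candidate characters are assumptions made for this example, drawn from the characters mentioned in this thread. It counts every occurrence of each character, so apostrophes used as quotation marks would be counted too.

```python
from collections import Counter

# Candidate characters mentioned in this discussion.
CANDIDATES = {
    "\u0027": "APOSTROPHE",
    "\u2019": "RIGHT SINGLE QUOTATION MARK",
    "\u1fbd": "GREEK KORONIS",
    "\u1fbf": "GREEK PSILI",
    "\u02bc": "MODIFIER LETTER APOSTROPHE",
}


def count_elision_marks(text):
    """Count how often each candidate character occurs in the text."""
    counts = Counter(ch for ch in text if ch in CANDIDATES)
    return {f"U+{ord(ch):04X} {name}": counts[ch] for ch, name in CANDIDATES.items()}
```

Reading a file with `open(path, encoding="utf-8").read()` and passing the result to `count_elision_marks` gives one tally per character, including zeros for characters that never occur.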
Pinging ObsequiousNewt for his opinion since he hasn't stated one yet. —Aɴɢʀ (talk) 15:29, 26 October 2016 (UTC)

The character used by Perseus is U+1FBD, the coronis, which oddly doesn't normalize to any other character but does normalize to space + psili (U+0020 U+0313). However, the coronis is properly the symbol used to mark crasis, not elision. The symbol used to mark elision is rather the "apostrophe", which does seem to imply we should be using U+0027. I put coronides on my list because I did not know this. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 19:44, 26 October 2016 (UTC)

Still, if you look at a modern printed text (e.g. OCT or Loeb), the apostrophe looks identical to the smooth breathing and the coronis, and true crasis always has a nonspacing coronis over a vowel (doesn't it?), not a spacing coronis next to a letter. Since Unicode provides us with a spacing coronis, why not make use of it? Also, I notice Perseus uses U+1FBD GREEK KORONIS as its apostrophe. —Aɴɢʀ (talk) 20:43, 26 October 2016 (UTC)

@Angr, Erutuon, ObsequiousNewt: The derived term «οἷός τ᾿ εἰμί» was added to τε by Erutuon in this revision (in case that matters at all). ObsequiousNewt stated that he “put coronides on (mostly) everything > 100” in User:ObsequiousNewt/freq-hom, although the character that actually occurs there is ᾿ (U+1FBF GREEK PSILI); I think this is because ObsequiousNewt's coronides underwent NFKC normalisation (᾽ [U+1FBD GREEK KORONIS] and ᾿ [U+1FBF GREEK PSILI] both undergo compatibility decomposition to ␠ [U+0020 SPACE] +  ̓ [U+0313 COMBINING COMMA ABOVE]; ␠ +  ̓ then undergoes canonical recomposition to ᾿ [U+1FBF GREEK PSILI]; see w:Unicode equivalence). As Erutuon said “if there are redirects, it doesn't really matter which one we use”, but using a properly-displaying character in the PAGENAME at least means that we don't need to specify that character using |head= parameters. ObsequiousNewt correctly states that the coronis properly marks crasis, not elision; however, crasis is just a kind of contraction, which leads me to regard the analogy of English (and other languages), wherein the apostrophe indicates both omission and contraction, as encouraging the use of the coronis for marking elision as well as crasis. The psile is unsuitable for marking elision because it indicates smooth breathing, whereas some cases of elision involve rough breathing (e.g. ἐπί → ἐφ᾽ and κατά → καθ᾽); the dasia is even less suitable because it indicates rough breathing (whereas most cases of elision involve smooth breathing) and because it isn't even the right shape. To my mind, our choice is between ᾽ (U+1FBD GREEK KORONIS) and ʼ (U+02BC MODIFIER LETTER APOSTROPHE), with the latter arguably representing the better semantic choice (though cf. my argument from the analogy of other languages, above); to my eyes, the former looks right, whilst the latter is the wrong shape. — I.S.M.E.T.A. 22:07, 26 October 2016 (UTC)
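
The normalisation behaviour described here can be checked with Python's standard `unicodedata` module, as sketched below; the helper name `describe` is made up for this example. The results depend on the Unicode data bundled with the interpreter. As far as I can tell, NFC leaves both U+1FBD and U+1FBF unchanged, and NFKC turns both into U+0020 + U+0313 without recomposing to U+1FBF, so the recomposition step may be worth double-checking and the psili in the frequency list may have another origin.

```python
import unicodedata


def describe(ch):
    """Return the code points of a character under NFC and NFKC normalisation."""
    return {
        form: [f"U+{ord(c):04X}" for c in unicodedata.normalize(form, ch)]
        for form in ("NFC", "NFKC")
    }


if __name__ == "__main__":
    # Koronis, psili, plain apostrophe, right single quote, modifier apostrophe.
    for ch in ("\u1fbd", "\u1fbf", "\u0027", "\u2019", "\u02bc"):
        print(f"U+{ord(ch):04X}", unicodedata.name(ch), describe(ch))
```

Only the two Greek spacing marks have compatibility decompositions, so the other three candidates are unaffected by either normalisation form.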

Could Module:headword or Module:headword/templates be made to automatically replace a plain apostrophe with coronis? Then we wouldn't have to specify the character in the |head= parameter. — Eru·tuon 22:14, 26 October 2016 (UTC)
Why do we need to use U+02BC and not just U+0027? In fact, I'm pretty sure that U+02BC is meant for glottalization, ejective, etc., it being a modifier letter. ObſequiousNewtGeſpꝛaͤchBeÿtraͤge 23:38, 26 October 2016 (UTC)