
Deletion debate

The following information passed a request for deletion.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

Was marked as {{delete}}, but since I don't know anything about Japanese... Mglovesfun (talk) 08:53, 6 September 2009 (UTC)

Added an essentially same entry 歩いて to the subject.
In my opinion, it is a combination of two words, a verb form 歩い and a particle て, and should be considered as SoP. Including this kind of combinations can lead to a disastrous situation, exactly like that in other areas where we dare to exclude sum-of-parts entries. --Tohru 14:06, 6 September 2009 (UTC)
Contrary to the entry, this is not an adverb in Japanese. It is the verb 歩く (aruku) conjugated to aruki with the -te suffix. Popularly called the "te-form", the medial -k- drops out in colloquial language.
Japanese verbs (and adjectives) conjugate and various suffixes attach to those conjugations. Listing all of those patterns is neither practical nor very realistic. Thus the norm found in all dictionaries is to list it in a base form recognized by Japanese speakers. That is 歩く here. But just to give an idea how impractical it is to list other forms, no matter how useful they may be to learners, here is a basic list of entries that would need to be created just for this one verb:
  • 歩いて
  • 歩いた
  • 歩いたら
  • 歩いたり
  • 歩かぬ
  • 歩かず
  • 歩かない
  • 歩かなかった
  • 歩ければ
  • 歩けれど
  • 歩けれども
  • 歩きます
  • 歩きました
  • 歩きません
  • 歩きませんでした
  • 歩け
  • 歩けよ
  • 歩こう
  • 歩かなくて
  • 歩ける
  • 歩けない
  • 歩けます
  • 歩けなかった
  • 歩けません
  • 歩けませんでした
  • 歩かせる
  • 歩かせない
  • 歩かせます
  • 歩かせません
  • 歩かせませんでした
  • 歩かれる
  • 歩かれない
  • 歩かれなかった
  • 歩かれません
  • 歩かれませんでした
  • 歩かせられる
  • 歩かせられない
  • 歩かせられなかった
  • 歩かせられます
  • 歩かせられました
Also, you will need to create entirely hiragana versions for each as well. And then romanized versions as well. We have now just tripled the list. And this list is hardly even comprehensive; there are many more patterns and variations. Now duplicate for each of the hundreds (maybe thousands) of verbs. This is insane and needs to be avoided. Bendono 14:26, 6 September 2009 (UTC)
Why is this a problem? We already add conjugated verb forms in Italian, Latin, Spanish, and French. Latin has more than 100 inflected forms for a regular verb, yet the number of inflected forms hasn't been an impediment to creating those entries. Since the forms follow patterns, we use bots to generate the forms. There's no reason I can see for not doing the same in Japanese. --EncycloPetey 16:21, 6 September 2009 (UTC)
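As a rough illustration of the bot-style generation mentioned above, here is a minimal Python sketch that derives a handful of the listed forms from the dictionary form 歩く. The names `STEM_ENDINGS`, `PATTERNS`, and `generate_forms` are hypothetical, the pattern table is a small hand-picked subset, and nothing here reflects how any actual Wiktionary bot is written.

```python
# Hypothetical sketch: derive a few forms of a regular verb ending in く.
# The tables below are illustrative assumptions, not an existing bot's data.

# Each stem ending replaces the final く of the dictionary form.
STEM_ENDINGS = {
    "a": "か",      # 歩か-
    "i": "き",      # 歩き-
    "e": "け",      # 歩け-
    "o": "こ",      # 歩こ-
    "onbin": "い",  # 歩い- (the -k- drops before て/た)
}

# (stem key, suffix) pairs; every resulting form appears in the list above.
PATTERNS = [
    ("onbin", "て"),
    ("onbin", "た"),
    ("onbin", "たら"),
    ("a", "ない"),
    ("a", "なかった"),
    ("i", "ます"),
    ("i", "ました"),
    ("i", "ません"),
    ("e", ""),
    ("e", "る"),
    ("o", "う"),
    ("a", "せる"),
    ("a", "れる"),
]


def generate_forms(dictionary_form):
    """Return the forms in PATTERNS for a verb whose dictionary form ends in く."""
    if not dictionary_form.endswith("く"):
        raise ValueError("this sketch only handles verbs ending in く")
    root = dictionary_form[:-1]
    return [root + STEM_ENDINGS[key] + suffix for key, suffix in PATTERNS]


forms = generate_forms("歩く")
print(forms)
```

A purely mechanical table like this would mishandle exceptions: 行く, for instance, has the te-form 行って, so a real generator would need a list of irregular verbs alongside the regular patterns.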

Yes, trying to include all words of all languages is insane, but we try to do it nonetheless. The number of forms you mention is not very large compared to Italian verbs (no usual dictionary would list all Italian forms included here). This is a general comment, because I don't know Japanese. Lmaltier 14:35, 6 September 2009 (UTC)

Then here are my candidates, though this is still far from completion:
I guess the list can be longer than 1,000 entries for sure (I will do so if such a demonstration is actually needed). This is the situation we have to handle per each Japanese verb when accepting such combinations. And I just don't know how to set appropriate criteria for them. --Tohru 14:48, 6 September 2009 (UTC)
Thanks for the laugh, Tohru. 散歩でも歩きましょう Bendono 14:57, 6 September 2009 (UTC)
If you can learn to use a bot like the one we use for generating forms of Spanish verbs, then you could successfully create as many Japanese verb forms as you like in a very short time. --EncycloPetey 16:21, 6 September 2009 (UTC)
If this is a completely regular agglutinative action, then other than a lack of spaces, I don't see a difference between these and potential English entries such as have not been speaking, or the decision to not include the entries for English possessives. As the issue is presented I support deletion. However, if there exist any irregular combinations, I would conclude that including them would be needful, tho all but the irregular cases could easily be handled with bot support as EP points out. — Carolina wren discussió 16:39, 6 September 2009 (UTC)
You're missing the point: almost all of those, including the entry up for deletion, are not verb forms. They are sum of parts. More specifically, the verb aruk- (walk) has only four distinct forms: aruk-a, aruk-i, aruk-u, and aruk-e. (No, I am not forgetting aruk-o; leave me a message if you are curious.) Everything above is derived by attaching various suffixes to these forms, occasionally followed by specific phonological changes. What would the POS be? Verb is not appropriate. Perhaps Quasi-Verb Phrase? Partial Predicate? The whole concept of a headword for Japanese is completely screwed up here on Wikipedia. Creating entries for the above would only compound the problem further yet. Bendono 16:41, 6 September 2009 (UTC)
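The analysis in the comment above (four stems, suffixes attached to them, and an occasional sound change) can be sketched in romanized form as follows. This is only an illustration under simplifying assumptions: `ROOT`, `ONBIN_SUFFIXES`, and `attach` are hypothetical names, and the only sound change modelled is the loss of -k- before the suffixes -te, -ta, -tara, and -tari.

```python
# Illustrative sketch of "stem + suffix, then a sound change" for aruk- (walk).
# The suffix set and the rule are simplified assumptions for this example.

ROOT = "aruk"
STEM_VOWELS = ("a", "i", "u", "e")

# Suffixes before which aruki- surfaces as arui- (the medial -k- drops).
ONBIN_SUFFIXES = {"te", "ta", "tara", "tari"}


def attach(stem_vowel, suffix):
    """Attach a suffix to one of the four stems aruk-a, aruk-i, aruk-u, aruk-e."""
    if stem_vowel not in STEM_VOWELS:
        raise ValueError("stem vowel must be one of a, i, u, e")
    stem = ROOT + stem_vowel
    if stem_vowel == "i" and suffix in ONBIN_SUFFIXES:
        stem = ROOT[:-1] + "i"  # aruki- becomes arui-
    return stem + suffix


print(attach("i", "te"))    # the form under discussion
print(attach("a", "nai"))
print(attach("i", "masu"))
print(attach("i", "tai"))
print(attach("u", ""))
```

Keeping the sound change as a separate step after attachment mirrors the description "occasionally followed by specific phonological changes"; whether the result counts as one word or two is exactly what the discussion is about, and the sketch takes no position on that.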
Plus, I should have noted that I omitted variant forms that Bendono mentioned above, from the list. Once counting them, the number will easily reach ten thousand. Please don't forget it is the number of entries belonging to one verb. --Tohru 17:09, 6 September 2009 (UTC)
I hate to point this out, mainly because I'm against it, but we have lots of Spanish 'contraction' entries like llámame (call me) which I'd quite like to see deleted, but nevertheless they're here. Mglovesfun (talk) 17:49, 6 September 2009 (UTC)
It may seem impractical and terribly difficult to you, but we have exactly the same situation in hundreds of other languages, many much worse. In Arabic a verb can easily have over 20,000 different forms, and each form can be spelled in a multitude of ways. We include these forms, certainly the more standard forms such as your -te verbs. The part of speech would be verb form, with an explanation in the definition line that it is the conjunctive of 歩く. Just as we do with Spanish and French, these Japanese verb forms can be handled by a bot. —Stephen 18:36, 6 September 2009 (UTC)
RE Bendono: I am not missing the point. The exact situation you describe exists in Hungarian, where the plan is to create the entries, even though we don't (yet) have a bot to handle those. Hungarian uses attached postpositions (like suffixes) instead of prepositions. Words formed by the attachment of a suffix are viable as entries here. Especially so since a non-native or learner of the language may not recognize the suffix for what it is. --EncycloPetey 20:21, 6 September 2009 (UTC)
<joking> Well there's a way to get ahead of French wiktionary, just add hundreds of thousands of verb forms from every language imaginable . . . :) </joking> I don't really have an opinion on whether or not these get added/stay or not, possibly tending toward keep. L☺g☺maniac chat? 21:11, 6 September 2009 (UTC)
I think there is a key distinction between suffixes and particles, certainly for languages such as Korean and Japanese (and, I think, Hungarian). My understanding has been that, while we include root+suffix forms -- that is, true inflections -- regardless of quantity, we do not include word+particle, which are just two words that happen to be written as one, as with the English 's. In the name of sanity and all that is holy, I hope we will continue to maintain this distinction. I don't know how/if this issue applies to the above list. In Korean, at least, some "verb forms" are real forms and some are not -- 하겠어 is a true inflection, but 하겠어요 is inflected form + polite particle. -- Visviva 23:26, 6 September 2009 (UTC)
How do you define "particle" for this distinction? Most of my Latin books dealing with the subject indicate there is a very fuzzy line between suffix, particle, and inflectional ending. Would you modify your definition when you consider that many, many Latin verbs are formed by prepending a preposition to a base verb? (See the derived terms under Latin sum (I am), for example). These aren't formed from prefixes in Latin, as similar words are in English, because the prepended item is a word (preposition) in its own right. Likewise, Hungarian adds postpositions to its nouns, and functionally these are inflectional case endings (and the endings are treated as such in grammars). For Spanish, a pronoun (or two) is often added to the end of a verb, and we include these words (e.g. dímelo) even though the added ending is not much different from the "polite particle" you mention. --EncycloPetey 23:32, 6 September 2009 (UTC)
I would define particle as "something that authoritative grammars agree is a particle". :-) That is, I think that any decision on what does and doesn't count as a word in a language needs to be bookended by a serious review of native-language and Western grammatical literature. The less similar a language is to English, the more critically necessary such a review is. The fact that I haven't seen anybody citing even, say, Martin's Reference Grammar of Japanese gives me pause that we would be making any sweeping judgments here. I don't know the first thing about Hungarian, but previous discussions here had suggested that the particle/suffix distinction was fairly strong in Hungarian grammar. If that's true, I hope that we would take this distinction seriously, rather than striking out on our own.
In the case of Korean, the South Korean and Western grammatical traditions, AFAIAA, concur in distinguishing particles from suffixes, which actually create new words/forms. (North Korean grammarians tread a somewhat different path, as one might expect, but the NK grammar texts I have managed to acquire are not really authoritative.) My initial efforts at treating Korean noun-particle combos as declined noun forms were rebuffed, and I have come to believe that this is correct -- both as a matter of grammatical fact and a matter of best Wiktionary practice. Regarding the polite particle , which can glom onto anything, verb, noun, adverb or determiner, with consequences that are pragmatic/discursive rather than semantic or syntactic, I can't imagine what purpose including its compounds would serve. -- Visviva 04:32, 7 September 2009 (UTC)

Keeping an eye on the bigger picture here, we're trying to be a useful dictionary, i.e. a resource to which someone can turn when they see an unfamiliar word (or idiom) and need a definition. If inclusion of the above will enable someone to obtain that benefit, and said phrase can not be readily understood by reference to its component parts, then we should include it. Quite frankly, since this is the English Wiktionary, and our readers are less likely to be able to figure out how to put together strings of Asian characters, then we should lean towards being more inclusive of such matters. bd2412 T 03:39, 7 September 2009 (UTC)

But if it can be demonstrated that these are simply collocations -- that is, independent words written together, which would be unsurprising in a spaceless language like JA -- there is an easy solution; include any collocations that are common enough to be plausible searchterms in the entry for the content word. Problem solved: no spurious entries, and users can easily find the information they need -- indeed, more easily than if we had a separate, content-free entry for each such collocation. Again, I would just like to see some authoritative sources on which items from the above list are, in fact, inflected forms. Might be all of them for all I know. -- Visviva 04:32, 7 September 2009 (UTC)

(e/c) User:Carolina wren has made a good comment. All of the above forms are completely and automatically generated by a regular agglutinative process of attaching various particles and suffixes. So far there have been many comments about comparison with other languages, but besides myself and Tohru, few from anyone who actually speaks Japanese. So to give an idea what some of the above phrases mean, here is a brief selection with English translations:

  • 歩かせられなかった
    was not made to walk
    歩か ― 歩く (aruku), verb, imperfective form
    せ ― せる (seru), auxiliary verb, imperfective form
    られ ― られる (rareru), auxiliary verb, imperfective form
    なかっ ― ない (nai), verb, continuative form
    た ― (ta), auxiliary verb, terminal form
  • 歩いたときでなくても
    even though not the time that (I) walked
    歩い ― 歩く (aruku), verb, continuative form
    た ― (ta), auxiliary verb, attributive form
    とき ― (toki), common noun
    で ― (da), auxiliary verb, continuative form
    なく ― ない (nai), adjective, continuative form
    て ― (te), continuative particle
    も ― (mo), binding particle
  • 歩いたとするならば
    if (one) assumes that (I) walked
    歩い ― 歩く (aruku), verb, continuative form
    た ― (ta), auxiliary verb, terminal form
    と ― (to), case particle
    する ― する (suru), verb, terminal form
    なら ― (da), auxiliary verb, hypothetical form
    ば ― (ba), continuative particle
  • 歩いていただかなければ
    if (I) could not have (you) walk
    歩い ― 歩く (aruku), verb, continuative form
    て ― (te), continuative particle
    いただか ― 頂く (itadaku), verb, imperfective form
    なけれ ― ない (nai), auxiliary verb, hypothetical form
    ば ― (ba), continuative particle
  • 歩きたいかな
    (I) may want to walk
    歩き ― 歩く (aruku), verb, continuative form
    たい ― たい (tai), auxiliary verb, terminal form
    か ― (ka), sentence-final particle
    な ― (na), sentence-final particle
  • 歩きたくないときにも
    even when not wanting to walk
    歩き ― 歩く (aruku), verb, continuative form
    たく ― たい (tai), auxiliary verb, continuative form
    ない ― ない (nai), adjective, attributive form
    とき ― (toki), common noun
    に ― (ni), case particle
    も ― (mo), binding particle

As BD2412 said, we are trying to create a "useful dictionary, i.e. a resource to which someone can turn when they see an unfamiliar word (or idiom) and need a definition." I fully agree. However, a learner of English should not be able to expect to look up non-idiomatic phrases such as even though not the time that I walked and find a definition and translation any more than the reverse situation. Bendono 04:45, 7 September 2009 (UTC)

FYI, I segmented the above phrases based on a practical version of the Japanese school grammar, UniDic [1], which is used by The National Institute for Japanese Language [2] to annotate the biggest Japanese corpus ever built (Modern Written-Japanese Balanced Corpus [3]). You can see something similar is going on here between these Japanese and English constructions. --Tohru 17:04, 8 September 2009 (UTC)
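To make the segmentation concrete, here is a small Python sketch that stores one of the analyses above (歩きたいかな) as data and checks that the pieces concatenate back to the written string. The `Morpheme` type and the `headword_candidates` helper are hypothetical names for this illustration; they are not part of UniDic or of any corpus tool, and the labels are copied from the list above.

```python
from typing import NamedTuple


class Morpheme(NamedTuple):
    surface: str  # the piece as written in the phrase
    lemma: str    # base form as given in the list above
    pos: str      # part-of-speech label from the list above
    form: str     # conjugated form label, empty for particles


# Segmentation of 歩きたいかな, "(I) may want to walk", as listed above.
SEGMENTATION = [
    Morpheme("歩き", "歩く (aruku)", "verb", "continuative"),
    Morpheme("たい", "たい (tai)", "auxiliary verb", "terminal"),
    Morpheme("か", "(ka)", "sentence-final particle", ""),
    Morpheme("な", "(na)", "sentence-final particle", ""),
]


def headword_candidates(segmentation):
    """Return lemmas of pieces that are neither particles nor auxiliary verbs."""
    return [
        m.lemma
        for m in segmentation
        if "particle" not in m.pos and "auxiliary" not in m.pos
    ]


phrase = "".join(m.surface for m in SEGMENTATION)
print(phrase)
print(headword_candidates(SEGMENTATION))
```

Under this representation the only piece a reader would look up as a content word is 歩く; the remaining pieces are function items that attach to it, which is the point the segmented examples were meant to show.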
he'D>FUZY BOUNDARYw/PHRASEBOOK here--onceUSERhasthatINFO>pushthe buton>wp,books etc,integrated oras isnow,4further elaboration'n'EFICIENTlearnin.
btw,awcanweNOT'v wii as aJ-entry??[i'd2go2wp to c the kana&ipa4engl,grrr..:(--史凡>voice-MSN/skypeme!RSI>typin=hard! 05:57, 7 September 2009 (UTC)

Delete. First of all, I always try to defer to those informed on the language in question, who all seem to be favoring delete. Additionally, both Carolina wren and Visviva have made some prudent and subtle distinctions. Take the first word of γλαῦκ’ εἰς Ἀθήνας for example, it is an abbreviated form of γλαῦκες. The dropping of the last couple letters does not form a distinct word, but is a regular feature of Ancient Greek morphology. As such, we absolutely cannot make an entry for γλαῦκ’, as every single word in Ancient Greek (and every inflection of those words) is subject to the same possible droppings. To be sure, we are not a paper dictionary, and can include a lot more than paper can, like inflected forms, but we need the SOP rule to make the project feasible. We need to expect a minimum of knowledge about the language from our readers, otherwise we'll end up having to have an entry for all possible sentences in the language. -Atelaes λάλει ἐμοί 06:19, 7 September 2009 (UTC)

Japanese natives have a very weak sense of what a word is in Japanese. Since the writing contains no spaces, words are not delimited in the spelling. Speakers of Indo-European languages, OTOH, have a strong sense of what a word is. When we transcribe Japanese to Roman script, we invariably spell these forms as a single word, not as a verb plus a particle. We write mite, ite, tabete, kite, shite, hataraite, aruite, itte, hanashite, atte, kaette, notte, sunde, yonde, katte, de. We NEVER write mi te, i te, tabe te, ki te, shi te, and so on. Some call this the conjunctive form, but most grammars that I have seen simply refer to it as the gerund (like English -ing words). In my experience as a linguist, particles are always separate words. The postpositions may be considered particles: ga, wa, o, ni, e, kara. Also the sentence-ending words such as ne, ka, zo, yo are particles. In my definition of particles, inflexions and suffixes, the -te of the gerund (hataraite, shite, sunde, kite) is a suffix, not a particle. —Stephen 09:37, 7 September 2009 (UTC)
I can go with this. If the people who actually work on Japanese entries consider that this is not a word, then delete, absent some strong evidence that they are mistaken. Now, if people only could have shown that same consideration on some Korean RFDs with hideous results.... -- Visviva 07:43, 7 September 2009 (UTC)
Let me explain how, in my opinion, a good online dictionary could handle this. A pop-up dictionary, Perapera-kun (Mozilla Firefox Japanese dictionary plug-in), knows that 歩いて is a form of 歩く and translates it as such, i.e. "to walk". NJstar Japanese Word Processor displays the following (like with any verb form): 歩いて【あるいて】 <Verb - Gerund>; (v5k,vi) to walk; (P). It can even generate verb forms from a dictionary form. It would be ideal if we had this here and not just for the Japanese language but the implementation seems complicated. --Anatoli 08:55, 7 September 2009 (UTC)
This would be an excellent way for a mirror or other reuser to handle it, IMO. And IMO, if we take care of the content that matters, mirroring will take care of itself. Being prisoners of this incarnation (on ill-suited software running on servers administered by an organization that cares little for our needs), there is only so much we can do for the end-user. -- Visviva 10:16, 7 September 2009 (UTC)
If we collectively present this problem to the WMF people, I think they'll try to help us solve it. If I'm understanding you correctly, what we want is something that works like Google Translate-plus-definitions, yes? bd2412 T 15:58, 7 September 2009 (UTC)
Not sure Google Translate can always handle all forms correctly but there's sure some AI there, which looks not just at the dictionary forms of words. Can you give an example, please? --Anatoli 11:18, 8 September 2009 (UTC)
Give an example? Not really, no. I'm not a programmer! bd2412 T 04:37, 14 September 2009 (UTC)
This doesn't really seem that different from having bot-created entries for each inflected form. Which is what I assume we would want, long-term. -- Visviva 09:52, 11 October 2009 (UTC)

Keep, I'm the one who originally made the 歩いて page, and I was always taught that it was a verb form of 歩く. A lot of people post an extreme number of possible verb forms and I suppose it would be absurd to include all of those but I don't agree that those are verb forms! I would possibly call 歩きたい a verb form but 歩きたくなかったときには is not and I find it a misleading reductio ad absurdum. In the previous example 歩きたくなかった is the negative past form of 歩きたい, but とき and には are separate forms. This is my idea: The conjugation table for 歩く is not ridiculous:

The form in question is both included in the conjugation table and it has an equivalent translation in a lot of languages which makes it a good candidate simply for the reason of linking to "by foot". Simply let users create entries for forms in the conjugation table. --BiT 11:25, 22 September 2009 (UTC)

Strong keep. As Petey mentioned above, Latin has quite a considerable number of verb forms. I think Lithuanian has more, if you count the participles and their forms. Lithuanian adjectives can have 150+ forms. Multiply 2 genders x 2 numbers x 7 cases x 3 degrees of comparison. Some of the forms coincide with each other, but not as many as say Slovenian. The issue here is differentiating between a verb form and verb phrase. "Aruite" is a verb form whereas 歩きたくないときにも "arukitakunaitokinimo" is a verb phrase. Verb forms should always be included, verb phrases should not (with conditions). — [ R·I·C ] opiaterein — 15:25, 8 October 2009 (UTC)

OK, now we're getting somewhere. Taking my own advice from above, I've taken an uninformed look into Martin's Reference Grammar of Japanese [4]. Martin has a rather lengthy discussion of these V-te forms in section 9.2, beginning on page 475; he uses the term "gerund". There does not seem to be any question that he considers these to be verb forms rather than verb+particle compounds; he calls them verb forms, writes them as a single word (which he does not do for particles), etc. Unless somebody has a better grammar that comes to a different conclusion, this is good enough for me. Keep, as ===Verb===. -- Visviva 09:52, 11 October 2009 (UTC)
Not really, and you are mischaracterizing Martin. Like grammatical past tense or other such constructions, Martin talks about a grammatical gerund. He writes this very appropriately as V-te (ex: kai-te, kai-de, kasi-te, kat-te, kot-te etc), ie with a dash followed by te. He is very careful about this point because Japanese grammar does not recognize a verbal te-form. If you read the whole discussion, it is nothing more than the adverbial form (which he calls infinitive here) with -te added.
Here is another point: "It is usually assumed the forms of the copula (such as da, na, no, ni, de etc.) and the various postnominal particles (such as ga, o, kara, made, gurai, dokoro, etc.) are attached to the noun to make a single phonological word" (page 34). So, shall we now add entries for 犬が, 犬を, 犬の, 犬に, 犬から etc? This is obviously nonsense, but not according to Latin grammar. But this is not Latin grammar.
The above Conjugation chart is ridiculous. Only the stem forms belong, and that could be improved. -te is not part of the verb. V-te is not a lemma and hence inappropriate. "But we do it in language X..." is fine for language X. Leave it to people who actually understand the language and do real work with it. Bendono 14:58, 11 October 2009 (UTC)
semi-strong KEEP. Okay, I can't speak Japanese but there are some things I would like to say. Regardless of whether we keep this or not some of the crap surfacing here is pissing me off :P
ex: 歩いたときでなくても - even though not the time that (I) walked
As people have said that is NOT a verb form >_> It's a sentence (fragment) and that should be blindingly obvious to anyone who can read Japanese as "とき" is a noun. Just because you could transliterate the whole string of characters as *aruitatokidenakutemo rather than something like aruita toki de nakutemo does not make it a single word. Therefore, saying that that is a verb form is absolute BS. 50 Xylophone Players talk 17:01, 28 October 2009 (UTC)

Kept. Mglovesfun (talk) 19:31, 9 May 2010 (UTC)