User talk:LA2/Archive


Hello, and welcome to Wiktionary!

If you have edited Wikipedia, you probably already know some basics, but Wiktionary does have a few conventions of its own. Please take a moment to learn our basics before jumping in.

First, all articles should be in our standard format, even if they are not yet complete. Please take a moment to familiarize yourself with it. You can use one of our pre-defined article templates by typing the name of a non-existent article into the search box and hitting 'Go'. You can link Wikipedia pages, including your user page, using [[w:pagename]], {{pedia}}, or {{wikipedia}}.

Notice that article titles are case-sensitive and are not capitalized unless, like proper nouns, they are ordinarily capitalized (Poland or January). Also, take a moment to familiarize yourself with our criteria for inclusion, since Wiktionary is not an encyclopedia. Don't go looking for a Village pump – we have a Beer parlour. Note that while Wikipedia likes redirects, Wiktionary deletes most redirects (especially spelling variations), in favor of short entries. Please do not copy entries here from Wikipedia if they are in wikipedia:Category:Copy to Wiktionary; they are moved by bot, and will appear presently in the Transwiki: namespace.

A further major caveat is that a "Citation" on Wiktionary is synonymous with a "Quotation"; we use these primary sources to construct dictionary definitions from evidence of the word being used. "References" (aka "Citations" on Wikipedia) are used predominantly for verifying Etymologies and usage notes, not the definitions themselves. This is partly to avoid copyright violation, and partly to ensure that we don't fall into the trap of adding "list words", or words that while often defined are never used in practice.

Note for experienced Wikipedians:
Wiktionary is run in a very different manner from Wikipedia and you will have a better experience if you do not assume the two are similar in culture. Please remember that despite your experience on Wikipedia, that experience may not always be applicable here. While you do not need to be an expert, or anything close to one, to contribute, please be as respectful of local policies and community practices as you can. Be aware that well-meaning Wikipedians have unfortunately found themselves blocked in the past for perceived disruption due to misunderstandings. To prevent a similar outcome, remember the maxim: be bold, but don't be reckless!
Having said that, we welcome Wikipedians, who have useful skills and experience to offer. The following are a couple of the most jarring differences between our projects that Wikipedians may want to learn up front, so things go smoothly for everyone. Changing policy pages on Wiktionary is very strongly discouraged. If you think something needs changing, please discuss it at the beer parlour, after which we may formally vote on the issue. You should also note that Wiktionary has very different user-space policies: we are here to build a dictionary, and your user page exists only to facilitate that. In particular, we have voted to explicitly ban all userboxes with the exception of {{Babel}}; please do not create or use them.

We hope you enjoy editing Wiktionary and being a Wiktionarian. Conrad.Irwin 20:34, 13 April 2008 (UTC)

IPA

Hi, we don't allow IPA pronunciations next to the head word. Can you please move them to the ===Pronunciation=== section, and use slashes (//) or square brackets ([]) as I've just done for bergets. Thanks. Mglovesfun (talk) 14:39, 18 August 2010 (UTC)

Sure, I just copied what I found on another page. See also Wiktionary:Beer parlour#Open bot request. --LA2 14:45, 18 August 2010 (UTC)

"new" in template names

The four sv-new-... templates are used by the "You can create an entry..." structure, i.e. "new" does not mean that the template is new, but that it is used for new entries. (The fifth of those templates, sv-adj-new, was however a temporary template that I used while juggling the structure, so that I could get rid of a couple of old templates without risking breaking entries in the meantime. I had only redirected it, though, and forgotten to delete it.) And yes, there are still too many templates. (I don't know, however, whether we can really get rid of the -form- templates.) \Mike 00:34, 19 August 2010 (UTC)

Conjugation and declension templates

I have created Wiktionary:Conjugation and declension templates in parallel with Wiktionary:Inflection templates, following an old discussion on the Beer Parlour here. I have also added some 'policy' on naming, taken from Category talk:Conjugation and declension templates, and also on the difference between the two. You should probably move the appropriate Swedish conjugation/declension table templates from the latter into the former. —CodeCat 10:25, 20 August 2010 (UTC)

Passive verb forms

I just noticed that Swedish wiktionary lists passive verb forms in its conjugation tables, but English wiktionary does not. See e.g. sv:kalla. Since you're already working on the templates, do you think you could add those forms to the tables while you're at it? —CodeCat 23:37, 20 August 2010 (UTC)

Yes, this is one of many things that need improvement here. First, we need to rename and graphically redesign our declension and conjugation table templates. --LA2 02:52, 21 August 2010 (UTC)
Also reciprocal (is this what they are called in English?) such as slåss - sure, that's an exception as it doesn't coincide with the passive, but as has been remarked elsewhere, "passive" may not be the best name for that column. \Mike 05:46, 21 August 2010 (UTC)
My brief mention of spelling reforms immediately led to more questions. After some thinking, I concluded that there needs to be a better article in the English Wikipedia, which en.wiktionary can refer to. There's little point in writing up detailed descriptions (e.g. Appendix:Swedish grammar, Appendix:History of Swedish orthography) here. The same goes for passive verbs, passive-only verbs, etc. They should be mentioned, named, and described, all with source references, in w:Swedish language or one of the more specialized Wikipedia articles. --LA2 06:46, 21 August 2010 (UTC)
You could create a page similar to Appendix:Dutch parts of speech to explain things that might not otherwise be obvious. Specifically, take a look at the section about the inflected form of adjectives on that page. —CodeCat 09:43, 21 August 2010 (UTC)

Inflected forms and other issues

In your 'diary' I noticed you were wondering about a few things, so I'd like to give my opinion on them. How do we create entries for all inflected forms? I think you might want to look at MewBot, which is a bot written by me that handles Dutch verb and adjective forms. If you know Python, you can re-use most of the code, and use the rest as an example to create your own formbot for Swedish. If not, then I can extend MewBot myself to do it instead. For that, it would be necessary to standardise the templates and reduce their number as much as possible.

How should templates be named? Is the -reg-/-irreg- part of the name really necessary? I don't think -reg- is really necessary. -irreg- could be shortened to -irr-.

Can we do with fewer templates and shorter names? To reduce the number of templates, I think the best course of action is to devise templates that have several parameters so that one template can be used in many situations. To that end, it makes sense that the parameters should have sensible defaults so that they can be left out. An easy approach would be to make one template each for the five types listed here: 1, 2, 3, 4, irregular. But it is also easily possible to unify types 1 to 3 into a single template. Something like this:

Should templates for Swedish words be standardized across languages of Wiktionary? I wouldn't try. There's too much work involved in standardising it, and while it is convenient, it's probably not worth the trouble. —CodeCat 20:47, 21 August 2010 (UTC)

Yes, I think something along these lines is what we should aim for. What the Wikipedia article describes is what Swedish primary schools teach as "the four conjugations", along with "the four grains" (wheat, oat, rye and bug), "the four math operators" (addition, subtraction, multiplication, division) and "the four rivers of Halland" (Lagan, Nissan, Ätran, w:Viskan), but it is an overly simplified pattern (Swedish farmers grow other crops too). Already in the next table you see that group 2 needs to be split into 2a and 2b. I would suggest stäng|er|de, läs|er|te, smält|er|e, and sy|r|dde as the parameters, which should be easy to learn and remember. However, already the supine form creates problems: stäng-t, läs-t, smält-, sy-tt, so maybe that ending needs a parameter of its own. And for smid|er|de (weak verb, stem ends in -d) the supine was more regular in the obsolete spelling smidt (before 1906) than in the current smitt. That supine needs to be an additional, optional parameter. (The 1906 dt/tt spelling reform introduced a new conjugation, 2c, for weak verbs with -d stem, but Swedish primary schools pretended like nothing changed.) From a template-design perspective (something Swedish primary schools never prioritized), I'd use the name sv-conj for the strong conjugation, since that template should actually be useful as a base template to be called by the one for the weak conjugations. The "strong template" too needs the present tense ending, to distinguish stryk|er|strök from se|r|såg. Even then, the irregular verb "vara" with the present tense "är" can't be constructed from an ending alone, so you might need the full present tense as the parameter, which leads up almost to the current {{sv-verb-irreg}}. As it happens, when I coloured the current templates blue, I already made that my base template, so renaming it to sv-conj is almost the next step. So, there are a few more details to sort out (subjunctives? passive forms? support for old -dt spelling?), but the main principle is sound. It would make sense to consider including Norwegian and Danish in any such redesign effort, so we end up with a common pattern of template design. --LA2 03:55, 22 August 2010 (UTC)
All very good points. I'll try making a start with a template for weak verbs, and then you can try breaking it with a verb I hadn't anticipated. :P —CodeCat 08:47, 22 August 2010 (UTC)
Ok, I've decided to settle for three templates for weak verbs: {{sv-conj-wk-ar}}, {{sv-conj-wk-er}}, {{sv-conj-wk-r}}. I've added various tests here. —CodeCat 09:20, 22 August 2010 (UTC)
These resemble the existing {{sv-verb-reg-ar}} and {{sv-verb-reg-er}}, so is there any major improvement? --LA2 09:24, 22 August 2010 (UTC)
I've changed the templates somewhat, {{sv-conj-wk-er}} now handles -ar verbs as well so the other one has become obsolete, and this probably should become just {{sv-conj-wk}}. -r verbs are still separate, wondering how I could unify them in a neat way. It does seem that the deciding factor for many forms is essentially the final letter of the stem: voiced (stänga), voiceless (läsa), r (höra, göra), t (smälta), vowel (sy). So perhaps there could be something like an end= parameter, which can be either vd, vl, t, r or vw. That would then decide most of the endings. Only the assimilation of the supine would need to be handled with a separate form. What do you think? —CodeCat 10:18, 22 August 2010 (UTC)
Everything would be so much easier if we had a substring function so we could determine the last letter of the stem, without requiring the template user to submit this as a separate parameter. --LA2 10:39, 22 August 2010 (UTC)
I have created {{sv-conj-st}} as well, and added a lot more test cases to my sandbox page. I believe the templates can cover all regular verbs now, and perhaps a few slightly irregular ones too. For the rest we have {{sv-conj-irreg}}. Mission accomplished, I would say, unless you find a serious flaw that I missed. —CodeCat 11:42, 22 August 2010 (UTC)
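
For illustration only: the idea discussed above, of deriving the proposed end= class (vd, vl, t, r or vw) from the last letter of the stem, can be sketched in Python, for instance inside a bot that fills in the parameter. This is not how the templates themselves work. The function name and the set of letters treated as voiceless are assumptions made for this sketch, and real verbs will have exceptions that it does not cover.

```python
# Hypothetical helper, not part of any existing template or bot.
VOWELS = set("aeiouyåäö")
# Simplifying assumption: only these final consonants count as voiceless.
VOICELESS = set("kpsx")


def stem_end_class(stem: str) -> str:
    """Return 'vw', 't', 'r', 'vl' or 'vd' based on the stem's last letter."""
    if not stem:
        raise ValueError("empty stem")
    last = stem[-1].lower()
    if last in VOWELS:
        return "vw"   # vowel stem, e.g. sy
    if last == "t":
        return "t"    # t stem, e.g. smält
    if last == "r":
        return "r"    # r stem, e.g. hör
    if last in VOICELESS:
        return "vl"   # voiceless stem, e.g. läs
    return "vd"       # everything else is treated as voiced, e.g. stäng


if __name__ == "__main__":
    for stem in ("stäng", "läs", "hör", "smält", "sy"):
        print(stem, stem_end_class(stem))
```

The five stems in the loop are the examples named in the discussion; irregular verbs such as göra would still need a template of their own.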

LA2-bot

Don't you think you might be going through the testing a bit too quickly? The recent changes is getting flooded... --Yair rand (talk) 06:03, 22 August 2010 (UTC)

blocked: too many edits, bad edits, e.g. egga. Robert Ullmann 06:38, 22 August 2010 (UTC)

You must manually check the "bot" edits. Every single one. And fix any problems. You would be well advised to slow down seriously, take a few weeks or months to get to know things. Robert Ullmann 06:47, 22 August 2010 (UTC)

I'm sorry, I thought I was taking it slow and easy and I did manually check each of the 185 edits made so far. I'm surprised I missed egga. I'll check the edits again. --LA2 07:52, 22 August 2010 (UTC)
There was apparently one other, and you've found it; thank you. I've removed the administrative block. Cheers, Robert Ullmann 08:26, 22 August 2010 (UTC)
There were three others: diskretisera, föra in and fäkta. My pywikipedia regex (shown at the top of user:LA2) missed the last line when it didn't end in a newline. Four errors in 185 edits is far too many. I should have noticed this already at diskretisera. --LA2 08:31, 22 August 2010 (UTC)
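
For illustration only: the kind of bug described above, where a pattern that requires a trailing newline skips the last line of a page, can be reproduced with Python's re module. The sample text and both patterns are made up for this sketch; the actual regex is the one shown at user:LA2 and is not reproduced here.

```python
import re

# Hypothetical sample: the last line has no trailing newline.
text = "# first\n# second\n# last"

# Requires a literal newline after each line, so the final line is skipped.
strict = re.compile(r"^# (.*)\n", re.MULTILINE)

# Anchors on end-of-line instead; with re.MULTILINE, "$" matches before
# each newline and also at the very end of the string.
tolerant = re.compile(r"^# (.*)$", re.MULTILINE)

strict_hits = strict.findall(text)
tolerant_hits = tolerant.findall(text)

print(strict_hits)    # ['first', 'second']
print(tolerant_hits)  # ['first', 'second', 'last']
```

Anchoring on "$" rather than on "\n" is one way to avoid the problem; appending a newline to the page text before matching would be another.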
However, would it be possible to stop the bot during the vote? It's standard operating procedure to make some test edits (less than 100) but it's way over that now. Mglovesfun (talk) 11:07, 4 September 2010 (UTC)
Sure, I have stopped it. Let's be clear, however, that the vote is not over whether those edits should be made. The vote is only over the bot status that would hide the edits from appearing on the Recent Changes page. I have made 3 manual edits per minute (as LA2) and you just made 6 manual edits in one minute (starting with realigeblaj) assisted by some JavaScript. The edits by LA2-bot were throttled back to 1 or 2 edits per minute. If you find errors in my edits, manual or automated, this is something we should of course discuss and correct. But that's not what the vote is about. --LA2 14:01, 4 September 2010 (UTC)
BTW if you have a way to convert * Swedish: [[link]] to * Swedish: {{t|sv|link}} then it would be nice to extend this to as many languages as possible. Mglovesfun (talk) 10:47, 6 September 2010 (UTC)
After some more testing, I have now expanded to the Scandinavian languages (Swedish, Danish, Norwegian, Icelandic, Faroese) which I can understand (more or less). I still have to sort out some cases manually, so I manually approve each edit. Some statistics can be found on Template talk:t#Creation of this template. --LA2 21:54, 16 September 2010 (UTC)
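
For illustration only: the conversion mentioned above, from * Swedish: [[link]] to * Swedish: {{t|sv|link}}, could look roughly like this in Python for the simplest case. The function name, the name-to-code table and the decision to leave every other line untouched are assumptions made for this sketch; it is not the bot's actual code, and lines with several links, piped links or transliterations would still need manual handling.

```python
import re

# Hypothetical name-to-code table for the languages mentioned above.
LANG_CODES = {
    "Swedish": "sv",
    "Danish": "da",
    "Norwegian": "no",
    "Icelandic": "is",
    "Faroese": "fo",
}

# Matches only the simplest form: one plain link and nothing else on the line.
LINE = re.compile(r"^\* (?P<lang>[^:]+): \[\[(?P<word>[^\[\]|#]+)\]\]$")


def convert_line(line: str) -> str:
    """Wrap a single plain translation link in {{t}}; otherwise return the line unchanged."""
    m = LINE.match(line)
    if m is None:
        return line
    code = LANG_CODES.get(m.group("lang"))
    if code is None:
        return line
    return "* " + m.group("lang") + ": {{t|" + code + "|" + m.group("word") + "}}"


if __name__ == "__main__":
    print(convert_line("* Swedish: [[glass]]"))          # * Swedish: {{t|sv|glass}}
    print(convert_line("* Swedish: [[glass]], [[is]]"))  # unchanged, needs manual review
```

Returning unmatched lines unchanged keeps the sketch conservative, in line with approving each edit manually.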

Can your bot add {{t}} to Armenian translations, like this? Note, Armenian has transliterations. --Vahag 14:49, 17 September 2010 (UTC)

In theory, yes, if all examples look exactly like that and if I had more time. In practice, however, variations are great and need manual intervention every so often. And I have other priorities, so I will not do this for Armenian. --LA2 14:53, 17 September 2010 (UTC)
OK, no problem. --Vahag 14:54, 17 September 2010 (UTC)

Translations

Hi, I just wanted to point out a couple things about translation.

  1. Translations into English go on the definition line (e.g. älva) or not at all if unnecessary (e.g. häradshövding, häradsrätt).
  2. Translations between foreign languages should go in the foreign-language Wiktionaries. Could you move the translations of glassbil#Swedish, jäätelöauto#Finnish, ere#Danish, äro#Swedish to the foreign Wiktionaries (sv:glassbil, fi:jäätelöauto, da:ere, sv:äro respectively)? Thanks.

Cheers. --Bequw τ 22:55, 25 August 2010 (UTC)

unna example sentence 2

Is this a quotation or an example sentence? If it is a quotation, the line should start with #*. Nadando 21:14, 27 August 2010 (UTC)

I've sent a friend request to quotations, but they haven't responded yet. Wrt quotations, I'm still in the "it's complicated" stage. --LA2 21:18, 27 August 2010 (UTC)
I think the requirements on a quotation are higher. It should be the first or a really representative use of the word. I'm just picking random examples, but I prefer to pick them from the Bible or well-known literature (here: Goethe), where I can find a ready translation instead of inventing my own. And so I postfix a parenthesis with the reference to the source. --LA2 21:44, 27 August 2010 (UTC)

Parts of speech

CGEL is the most modern (2002) comprehensive grammar of English. It costs about $150, but many better libraries have it in the Reference section. You should look at their remix of part of speech ("PoS") categories. There is usually not a major problem with nouns, verbs, and adjectives. Adverbs like those ending in -ly are OK too. After that, among the "closed classes of grammatical function words" (every element of that expression needs explanation), all bets are off. And there are even difficulties with forms of verbs like -ing forms and past participles and with proper nouns.

You may find the pathbreaking Comprehensive Grammar of the English Language (1985) more available and more congenial. Both this and CGEL have the important category of determiners. See Category:English determiners. We had an interesting (to me anyway) discussion about adding that to the list of allowed PoS ("L3") headers.

We try to retain all the traditional PoS headers because of their widespread usage and because they are an adequate starting point. You might take note of some of the grammatical subcategories in, say, Category:English adverbs and take a look at Wiktionary:English adjectives for an example of the type of criteria for determining whether something is a given PoS. The field is both fascinating and maddening.

It is important to have a realistic perspective about these things. Dictionaries throw things into these categories for their users. Grammarians get insight from reclassification. Most people speak correctly by any reasonable standard when they need to, without remembering the parts of speech from their schooling, let alone the more up-to-date grammatical classes grammarians may invent and use. DCDuring TALK 19:41, 20 September 2010 (UTC)

The writings of David Crystal (including his Encyclopedia of Language and Encyclopedia of the English Language) are ways of getting a handle on matters. DCDuring TALK 19:46, 20 September 2010 (UTC)

Why is this on my user talk page, and not in the Wikipedia article? --LA2 20:20, 20 September 2010 (UTC)
Because it sounds more authoritative than it is. It is an amateur's perspective. I'm not really even bi-lingual. My Babel box, which shows various -1 levels, is an exaggeration (as I disclosed at admin vote time). DCDuring TALK 20:26, 20 September 2010 (UTC)

The "l" template

In this edit, an incorrect substitution was made. The {{l}} template was created and should be used in lists, not within text. For that, we use the {{term}} template. --EncycloPetey 04:28, 23 September 2010 (UTC)

Really? What is the harm from using the l template in text? I have used it in text in my manual edits, so the bot edit is consistent with that. --LA2 20:54, 25 September 2010 (UTC)

erfarenheterna (and others)

Why are you generating plurals of words that we don't have in the singular? SemperBlotto 20:38, 16 October 2010 (UTC)

Is that a problem? I'm generating entries for common words, in no particular order. Sometimes the plural before the singular. The entry is correct, even if it contains a red link. --LA2 20:47, 16 October 2010 (UTC)
Indeed, you do the same, right, Semper? Mglovesfun (talk) 12:15, 30 October 2010 (UTC)

Article creation

Hello LA2, I saw on Special:RecentChanges that when you create a new article, you write "Swedish" in the Summary field. If you leave the Summary field empty, the beginning of the code is shown instead. With the beginning of the code, we can see in RecentChanges that the entry is Swedish, and more besides. So I think you could leave the Summary field empty when you create a new article; that gives more information. Thanks. Pamputt 22:08, 17 October 2010 (UTC)

Really, are you sure you want that? It will make things easier for me, but it will fill up the summary field with a lot of wiki codes that I consider ugly. --LA2 23:49, 17 October 2010 (UTC)
In fact, I prefer to see the code. However, I do not contribute a lot on en.wikt (I am on fr.wikt), so I think that if en.wikt contributors are not disturbed by that, then you can do as you want :) Pamputt 13:13, 18 October 2010 (UTC)

Large POS categories

Hi, MglovesfunBot using AutoWikiBrowser can only convert lang=Italian to lang=it for categories with up to 25 000 members, because that is AWB's limit. So categories like Italian, Spanish and French verb forms (for example) are too big. Can LA2bot do it instead? In fact I only managed to do about 12 000 of the members of Category:Italian plurals yesterday as I built up so much lag. Mglovesfun (talk) 11:37, 22 October 2010 (UTC)
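
For illustration only: the text replacement requested above, turning lang=Italian into lang=it, can be sketched in Python as a function applied to each page's wikitext. The function name, the three-entry table and the example template call are assumptions made for this sketch; fetching and saving the members of a large category is left out.

```python
import re

# Hypothetical table; a real run would need every language name in use.
NAME_TO_CODE = {"Italian": "it", "Spanish": "es", "French": "fr"}

# A "lang=" template parameter whose value is a single word, followed by
# the next parameter ("|") or the end of the template ("}").
LANG_PARAM = re.compile(r"\|\s*lang\s*=\s*(?P<name>[A-Za-z]+)\s*(?=[|}])")


def normalize_lang_params(wikitext: str) -> str:
    """Replace known language names in lang= parameters with their codes."""
    def repl(match: re.Match) -> str:
        code = NAME_TO_CODE.get(match.group("name"))
        # Unknown values, including ones that are already codes, stay as they are.
        return match.group(0) if code is None else "|lang=" + code
    return LANG_PARAM.sub(repl, wikitext)


if __name__ == "__main__":
    print(normalize_lang_params("{{plural of|gatto|lang=Italian}}"))
    # {{plural of|gatto|lang=it}}
```

Because unknown values are left alone, running the function twice on the same page gives the same result as running it once.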

Swedish declension templates

If you know a way to do it, Swedish declension templates shouldn't go under the ===Noun=== header, but rather the ====Declension==== header. I've been fixing some by hand, but there are quite a lot of them! Mglovesfun (talk) 12:10, 30 October 2010 (UTC)

I tried to make a bot replacement, but there are too many exceptions, so for now I'm just fixing them manually as I stumble upon them. Maybe later I will find some way to automate it. --LA2 12:12, 30 October 2010 (UTC)
I have gone through the obvious cases among the Swedish verbs and started with the nouns. The remaining cases require a fresh XML dump, which I'm eagerly waiting for. --LA2 05:37, 4 December 2010 (UTC)

Adding words correctly

Hej! Yes, that was quite embarrassing to see those mistakes I made. Did I get it right this time? plasttallrik --dezzie 20:59, 3 December 2010 (UTC)

Looks fine! --LA2 21:05, 3 December 2010 (UTC)

Interwicket userpage links

Dear LA2,

May I add an Interwicket link to the Swedish version of your user page? If so, thank you. - Lo Ximiendo 05:32, 4 December 2010 (UTC)
I did so now. Thanks for reminding me. --LA2 05:35, 4 December 2010 (UTC)