Wiktionary talk:Translations/"Translations to be checked" - a proposal

From Wiktionary, the free dictionary
Latest comment: 16 years ago by Rodasmith
From Wiktionary talk:Translations#"Translations to be checked" - a proposal. Rod (A. Smith) 22:00, 3 November 2007 (UTC)

The list of pages with translations that need to be checked is building up, and no doubt there are many, many more that need to be added.

User:Eclecticology posted to me wondering whether any of these translations ever are actually checked, and I wonder the same thing.

I would like to propose that we find contributors who are experts in (or very familiar with) each of the various languages and assign to them the translations that need to be checked in those languages. If we got a large enough team together, this work could be done relatively quickly and could be kept up to date, as those involved could check the list of pages from time to time to see what new pages had been recently added.

When definitions change substantially, when they are split and reworded, we cannot be sure which translation still applies. But in many cases I think we throw too much away. A minor clarification does not invalidate a translation.
Moreover, many of the definitions have not changed at all, yet we throw them into checktrans and our arms in the air. If this is really a problem, why not just go into the history? I hate tracking down changes on large pages—the software doesn't provide any easy way to do it—but the majority of pages have very few edits.

For starters, I'm prepared to take on French and Italian. At a push, I could do some of the other European languages too. I know there are people on here who specialise in some languages (such as Finnish) so it would be great if we could drum up a team that covered all of the languages that need to be checked.

Eclecticology proposes that translations not checked within 90 days of the page being added to the "checktrans" category be deleted. I'm in two minds about this - deleting unchecked translations would help keep Wiktionary accurate, but would result in possibly correct and hard-to-come-by information being lost.

So who else would be prepared to get involved with these suggestions?

Paul G 10:06, 18 July 2005 (UTC)

I can help with the Japanese.
Sally Ku 12:07, 18 July 2005 (UTC)
And I'm ready to take a look at Swedish. \Mike 12:37, 18 July 2005 (UTC)
I can comfortably do Tok Pisin and Ga, but those are somewhat rare, if I may be permitted to understate the fact. I could probably do a decent job of German and French, and, as with Paul G, if pushed, some of the other European languages (Dutch and Welsh come to mind), though I imagine there are better candidates out there. --Wytukaze 16:15, 18 July 2005 (UTC)
Great, it's great to see people willing to contribute to this project. At the moment the list at Category:Check_translations stands at 264 entries, and by the time I have added the words on my user page, there will be almost 350. No doubt there are hundreds more that could be added.
A few questions come to mind:
  • How can we keep track of which languages have been checked on which pages? Perhaps the people volunteering to do a particular language could take a copy of the list and then work through it, ticking items off as they check the languages (or if their language(s) don't require attention).
  • How can we track down other pages that have multiple senses where the translations are misaligned with the senses, or where there are no translation tables at all? One way to find the latter is to search on Google, restricted to the wiktionary.org domain, for pages that don't contain the components of translation tables. No doubt this would generate a very large number of hits.
  • How can we encourage people to prevent the list from becoming longer, that is, to add translation tables as senses are added? Ideally, there would be a script that would add a translation table whenever a new sense was added to a page, but I think we are still a long way off from that sort of automation. - Paul G 17:41, 18 July 2005 (UTC)
  • I'm trying to create a script to parse the Wiktionaries. Like everything I do, it is taking a lot of time, but I guess I will get there sometime. One thing that is a bit problematic with the translation tables is that it is hard to make the link between the definitions and the tables. The clues given next to the tables are very clear for humans to interpret, but it will be extremely hard to parse them with a program. Therefore I would like to propose adding something to both the definitions and the translations (and maybe also the synonyms). Something like <!--defA-->, <!--defB-->. It doesn't really matter what is in the HTML comment, as long as it is the same between a definition and a translation and as long as it is never changed afterwards. So if the definitions are reordered during cleanup, the labels should stick with their original definitions. It doesn't matter if they aren't in order. Polyglot 22:39, 18 July 2005 (UTC)
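
A minimal sketch of how a script might use such labels, assuming they are written exactly as HTML comments of the form <!--defA-->, that definition lines start with "#", and that any other labelled line belongs to a translation group. The function name group_by_label and the sample entry are hypothetical and only for illustration; this is not the script Polyglot describes.

```python
import re
from collections import defaultdict

# Assumed label format: an HTML comment such as <!--defA--> (hypothetical convention).
LABEL = re.compile(r"<!--\s*(def\w+)\s*-->")


def group_by_label(wikitext: str) -> dict[str, dict[str, list[str]]]:
    """Collect the definition and translation lines that share a label."""
    groups = defaultdict(lambda: {"definitions": [], "translations": []})
    for line in wikitext.splitlines():
        match = LABEL.search(line)
        if match is None:
            continue  # unlabelled lines are ignored in this sketch
        # Drop the label and collapse the whitespace it leaves behind.
        text = " ".join(LABEL.sub("", line).split())
        # Assumption: '#' marks a definition; anything else is a translation line.
        kind = "definitions" if text.startswith("#") else "translations"
        groups[match.group(1)][kind].append(text)
    return dict(groups)


# Invented sample entry; the definitions appear in a different order
# from the translation groups, which should not matter.
sample = """\
# <!--defB--> An amount obtained by adding smaller amounts.
# <!--defA--> The sum of money spent.
* <!--defA--> French: total
* <!--defB--> French: somme
"""

result = group_by_label(sample)
for label, parts in sorted(result.items()):
    print(label, parts["definitions"], parts["translations"])
```

Because the matching depends only on the label text, reordering the definitions leaves every translation group attached to its original sense, which is the property the proposal relies on.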
It's encouraging when a comment that I made draws our most co-operative into seeking a solution. Bravo!
What makes the corrections so difficult is that these translation lists to be checked cover so many different languages, and no one person is able to do them all. It will be a big help if individuals take responsibility for systematically going through the entries for specific languages. The difficulties that Polyglot raises have plagued us from the beginning. I've even considered giving each definition a separate heading below the POS level. This could allow derivative forms and translations to be allocated accordingly. I'm not about to promote the idea yet because I can't convince myself that it will work. Perhaps, Polyglot, you could do a test page to illustrate your idea. Eclecticology 06:26, July 20, 2005 (UTC)
Hi folks,

I adapted total to illustrate what I mean by adding tags to synonyms and translations (and antonyms too, of course). For the sake of demonstration, I first assigned labels and then decided the order of the two noun senses should be reversed. It makes more sense to me to talk first about the mathematical meaning and then the economic one. Anyway, it illustrates that the order of the letters inside the tags doesn't matter. If people feel like putting in numbers or some random Chinese characters, that's fine too, as long as there is a clear match between the definition and the translation group. I must say that as far as scripting a bot goes, it would be easier to have the translations and the synonyms as subheadings in between the definitions themselves. For presenting to the user it would get really messy though, especially when there are many translations. OK, now I'll go on hitting random pages to solve the capitalization and some other issues. I kind of feel guilty about the capitalization change since I started the last vote, so I have become a bit more active again to help clean up the mess. Polyglot 08:18, 20 July 2005 (UTC)

It seems like a sound idea. (By the way, total needed a lot of cleaning up.) If these are to be added by a bot, then great. I think adding them by hand would be tedious and most people would not see the point in doing it. - Paul G 16:15, 20 July 2005 (UTC)