Wiktionary:Project – Unified Wiktionary outreach
This page is no longer active. See http://omegawiki.org for the result of ideas mentioned here.
No discussion is needed to revive this page; simply remove the
A unified Wiktionary containing content from all languages (and interface / comments in all languages) has been considered for a while. Software and scripts for this, and for converting and migrating existing Wiktionary content in all languages to such a unified database, are (as of August 2005) under active development. This is a good time for enthusiastic community feedback, both positive and negative.
Some core ideas and an FAQ about this should be gathered and distributed widely to the community. This project aims to help collect and disseminate this information.
History of the idea
People have been talking about making Wiktionary's data more structured since its inception, and about how to include many languages in one database since it was split into multiple language databases. Some older ideas along these lines can be found drifting about the Meta-Wikimedia wiki.
In early 2005, GerardM found outside funding for the development of software for an "Ultimate Wiktionary", pursuing a project both to provide structured metadata for words and to combine content from the many different language Wiktionaries to remove redundancy. You can see some of the design goals for that effort here: (meta link) and the proposed data schema here: (tables link).
Early design goals
A few early design decisions have specifically influenced the current proposed implementation; comments on those decisions are also welcome.
Please comment on the current design and timeline on Meta, as these issues affect all Wiktionaries and some developers, here: (design | timeline)
If there are other active discussions about this or other plans for unifying languages, please link to them from this page. Contributors to all Wiktionary languages should be made aware of these proposals; help outside the en:, nl:, and it: Wiktionaries is particularly needed. See Interested people, below, if you would like to help get more community feedback on this matter.
In particular, many projects have asked explicitly for separate databases and wikis for each language; the idea of a unified wiktionary seems to move in the opposite direction. Editors should think about how they wish to see recentchanges and comments, what this means for interface options for a unified database, &c.
Wiktionary talk:Policy Think Tank – English Wiktionary, Foreign Words & Translations
A big argument, developed there, in favour of a single database is that several Wiktionaries, including this English one, have met the need for foreign words. But these articles have many mistakes and omissions, because they are not checked by native speakers. So I support the idea "1 word, 1 entry". A good project would display only the languages of interest to the reader.--Henri de Solages 18:07, 27 August 2005 (UTC)
I (Connel MacKenzie) have gotten the impression that UW intends/threatens to replace all Wiktionaries. At this point I think that is absurd. I can see perhaps it proceeding as a fascinating experiment. If it exceeds my (and many others') expectations, then it possibly will.
One thing I think has not been considered carefully is resistance to change. In this case, there are many facets. Humans dislike change and will probably get very emotionally charged about any such proposal. Also ignored is the daily-increasing amount of external software that depends on certain Wiki configuration features. Software does not "like" change. That may be an extraordinary understatement though. Software is astronomically less adaptable than people are. At this point, it is unknown how many external forks there are. To propose that they all wither and die is (I think) counter to the Wikimedia Foundation's stated goals.
- This is an important thing to consider. Can you list some of the external software packages you know of that depend on such features? +sj +
I also believe that the UW proposals and discussions that I have seen to date have not even attempted to address the issue of how to interface, for languages that are not fully defined.
I would love to see a read-only UW project that continually accepted updates from the various language wiktionaries. But at this point in time, I don't see how a coherent interface can exist, so that people can see information that they understand, in their language. From what I've heard about it so far, one will need to be fluent in multiple languages to navigate the user interface. That's a pretty scary concept to this American.
I also would like to see the UW experiment proceed. If it *is* a software pie-in-the-sky pipe-dream then it will certainly collapse of its own weight. If it turns out to be way better than any of us expected, more the better. --Connel MacKenzie 17:59, 24 August 2005 (UTC)
- I'm with you on this last point. The designs I see are much better than I expected to see six months ago; and there are many standards bodies and existing repositories that seem interested in helping out.
- I believe the interface will be monolingual, set according to user preferences, as they currently are in Wiktionary itself. Can you explain what you see as potential interface problems? +sj +
- It was never meant as a threat that the existing Wiktionaries would wither. It was rather a way to express how profoundly Gerard and others, like me, believe that UW will be a success. The interface will be entirely in the language of the user. Editing, however, will be a lot more granular. I suppose it makes sense to do that, since the data is also stored in a very granular way. For a mere user, it should look more or less like the current Wiktionaries look. Of course, not all Wiktionaries look the same, so it will necessarily be a cross cut. That's why it's important that people from the English Wiktionary have their say and be interested. Otherwise it might look rather different and maybe important details are not taken into account. Polyglot 00:16, 8 September 2005 (UTC)
I share many of the concerns raised by Connel. Polyglot, what do you mean by granular editing?
From the very beginning (when the English Wiktionary was the only Wiktionary) I have supported the vision of its housing every word in every language. But that is the language of vision rather than practicality. Practicality tends to impose limitations on vision, and I have understood that even a result with unlimited growth would fall far short of the vision. I was glad to see the other Wiktionaries (beginning with French and Polish) activated because it presented an opportunity for this same vision to be applied in the respective psycho-linguistic contexts of those languages.
I see two major perspectives from which one may approach a dictionary: that of the person interested in his own language, and that of the translator. A single language based Wiktionary reflects the first perspective, while a UW reflects the second. Amazingly, these two divergent visions reflect a dynamic interpersonal tension that we have experienced on the wikis throughout all of their existences, and probably long before in other social structures. There are those of us who can function well in a world of vaguely defined guidelines, and those of us who require clearly stated rules for our universe to function. The analog world has a place for fuzzy thinking; the digital world cannot conceive of anything being placed between the digits.
The current English Wiktionary (I do not have enough experience with the other language projects to comment about them) functions very well with the existing software. How far it will scale up remains to be seen, but we are still far away from hitting that wall. Despite being the biggest of the Wiktionaries it is still small compared to what it could become. For now our structure is still flexible enough that we can adapt to change. When someone complains that an idiom is not a part of speech we still have the option of putting that idiom somewhere else or modifying our working definition of "part of speech". Working definitions can vary considerably from dictionary definitions.
There are areas where the single language dictionary works very poorly. The translation lists are a notable example. I have no idea where many of these were pulled from. The twenty or so different bilingual dictionaries that I have could serve as a basis for checking entries in those twenty languages. But even if I take an hour to go through that mechanical task of checking those entries, with all the pitfalls connected to a dictionary that gives me more than one option, if the article purports to give translations in 100 languages there will still be 80 others that are unchecked. This certainly points to an area where some kind of UW as meta-Wiktionary could be very valuable. This still does not detract from the vision of a single language based Wiktionary to include all words in all languages.
A good translator is not satisfied with simply looking up words in a bilingual dictionary; that is only a starting point for unfamiliar words. If he translates into his native language his understanding of the source language must be sufficiently sound to allow him to consider the term as it is presented in a dictionary or other context that is entirely in that source language. He must be in a position to carefully weigh those options. Someone who translates sonnets from French to English needs to grasp why the French hexameter becomes the more concise English pentameter. The individual Wiktionaries should be capable of presenting that depth in their own language. They should provide rudimentary information about words in other languages, enough to bring the reader to the point where he can extract some meaning from any term he encounters. A reader of Ezra Pound should not be dissuaded by Pound's randomly (?:-)) inserted Chinese characters. English speakers in particular have a desperate need to grasp what's happening in the rest of the world.
All that being said, I will be happy to cede the translation lists to the UW, and let them co-ordinate the interwiki links that will become necessary. Beyond that it's hard to know what to expect. This idea in various forms has been talked about for some time, and we have yet to see software that we can poke and prod to see if it will stand up under real world usage. Some comments have proposed various policies that must be adopted for the site before the software is even available. That strikes me as very unwiki. My response to those policy proposals can never be stronger than "Maybe" because those policy proposals are completely out of context.
Let's see what the software, as a real rather than a hypothetical entity, can do first. We will then all have an opportunity to comment, and send it back for revisions with clearer policy implications. Eclecticology 16:52:35, 2005-09-08 (UTC)
Please list yourself below if you are interested in joining this project.
- +sj + (I don't know what the right implementation is, but I'd like to see all language interfaces use the same database of terms, definitions, and translations)
- Polyglot 00:16, 8 September 2005 (UTC), interested? Of course I'm interested. I once had the idea of doing something like that myself. I hope some alpha version will be available for testing soon.
- Dvortygirl 04:58, 8 October 2005 (UTC). I suspect we will have a tremendous mess on our hands when we first open the doors and let the world into the infant project, but I also think the project has tremendous potential and deserves at least enough support from the community to prove the concept.
- \Mike 08:47, 19 October 2005 (UTC) Yes, I do think it's necessary in the long run, in order to really make Wiktionary flourish in *all* languages. Let's face it: MediaWiki was written with the needs of Wikipedia in mind; what we need – as a dictionary – has never really been taken into account in the development of the software.
- -- 21:07, 11 November 2005 (UTC)
- Wietse Zuyderwijk 15:00, 28 December 2005 (GMT) I think that we should join efforts to make one multilingual dictionary. Having the exact same headwords in Wikipedia and Wiktionary is bad as it is!
- Sergio.ballestrero 09:08, 7 May 2006 (UTC)