Wiktionary:Big implementation problems with Wiktionary
This page is no longer active. It is being kept for historical interest.
Big implementation problems with Wiktionary
Please read this carefully - maybe even twice. I don't mean to offend anyone - these are serious issues and concerns regarding my first impressions of Wiktionary.
Having come from Wikipedia, which I regularly use and contribute to, I have just decided to check out this sister project, and have found it to be next to useless! Selecting "Random Page" gave me 5 oriental character definitions in a row, none of which had a page title (my browser doesn't display the font). The articles that I did find were pretty much unreadable (e.g. hold), appearing more like a list than a definition and full of irrelevant material (translations etc.).
For the Wiktionary to be at all useful, I would suggest the following changes as essential:
- All article names in English. If non-roman script is used it should be solely as a parenthetical illustration within the article of how it would be written in its native language.
- The formatting of articles has got to change. A dictionary doesn't suit the heading and sub-heading approach used by Wikipedia. It should use a different (or expanded) mark up, ideally flagging different components of a definition (word, pronunciation, type, definition, translations, etc.) Also, the word being defined is rarely the most prominent item on the page. For example, I was randomly presented with donner, and it took a while to figure out what I was reading. I thought the word I had been given was French!
- Allow people who actually use the dictionary to disable certain sections of the article. I find it very hard to read articles that contain long lists of the word in other languages after each definition. These should be optional (and in my opinion default to disabled) or replaced with a 'translations' link. An expanded mark-up should make this very easy to implement. Other items that could be optional are pronunciation guides and synonyms/antonyms.
- All foreign words should be moved to a separate namespace. When I look up a random word in an English dictionary I expect to get an English word. I don't object to the usefulness of having definitions of foreign words, but without some kind of separation it makes the dictionary seem like a joke.
I think this is a minimum list of things that need to be implemented before the project can be taken seriously. I am very much behind the idea, and I am fully appreciative of the effort that people are putting into it, but until these issues are resolved I kind of feel that everyone's wasting their time.
--HappyDog 03:05, 15 Feb 2004 (UTC)
Response to HappyDog
(Further responses from HappyDog marked -HD).
Random page & Oriental character definitions
- The reason you get many Chinese characters when selecting "Random Page" is simply because many many Chinese characters have (bare bones) articles here compared to the relatively few Western European words. This is because a lot of work has been put into the Chinese characters. If you want to change the ratio, you are welcome to add lots of Western European words.
- I understand this. It's a little like Wikipedia's US census information stubs (for which there has also been a lot of argument for and against the removal). I guess it's a problem when developing something like this from scratch in that natural biases are bound to occur. Hopefully it will sort itself out as the Wiktionary develops, but at the moment the Random Page function is nearly useless (I just tried it again and 6/10 were Chinese characters). -HD
- The Chinese character articles do have page titles. Every article is the word (or phrase) the article is about. When the article is about a Chinese character, the page title will be a Chinese character. If you want to be able to read the Chinese character titles, install a Chinese character font. If you do not care to read them, then do not read articles on Chinese characters. If you want to read a random Western European article then be patient and select "Random Page" a few more times.
- The point is, it's only because I understand (a) how the wiki works and (b) how computers work that I realised that this is a problem. If I didn't I would have either thought the wiki was broken (it looks pretty broken in these situations) or just been terribly confused by it. I would suggest at the least that they are renamed to something like Chinese character: 'x'. It might even be sensible to add a msg: tag saying 'Characters not displaying properly? Click here for advice.' or whatever. -HD
- Wiktionary is not simply a definition dictionary but also a translating dictionary. The reason we seem to have many more translations than definitions at this point is because we currently have more people who are interested in translations than are interested in or good at making definitions.
- That's fine. I respect the huge scope of what you are trying to achieve. However perhaps some sort of 'current status' on the front page with a little paragraph to that effect would allow visitors to know exactly what to expect. The important thing is not to make new visitors turn away because the Wiktionary doesn't live up to their expectations. -HD
- On the extremely subjective subject of relevance, obviously what is relevant to one person is not what is relevant to all. The translations are very relevant to many of the current contributors. If you think good definitions or good pronunciation fields or something else is more relevant, then please please add more of them.
Article names in English
If a language is in a non-roman script, then each word of that language is only correctly spelled in its native script. For many languages there are several different ways to represent words in roman script. No serious study can be made of foreign words if you only look at English representations of them. Now English translations and transliterations are added but sometimes it's easier to find the correct native way to write a word than to find the correct transliteration.
- This is true. This suggestion was primarily in response to the Chinese character issues. My feeling is that if a page's title can't be read on a default browser set up then it is not a good title. See my suggestion for the Chinese characters. -HD
- I have a browser, set up in a default way on Mandriva Linux with all the fonts for all the languages installed and everything displays just fine. I still can't read it though, but that's another problem. If Wiktionary wants to describe all words in all languages, it would be stupid not to do so in the native scripts. Polyglot 22:57, 18 July 2005 (UTC)
Formatting of articles
I completely agree. But I think we have to be very pragmatic in the early stages because we haven't been able to "feel" exactly what the better format would be. A general idea for a new format is easy but a good dictionary has many intricacies that need to be planned for well and to make a good plan you need experience in building a dictionary. I think when Wiktionary is more mature it will become easier to design a new format. Doing so at this stage would mean redoing it again in the future.
- And not doing it now will mean doing it in the future, with articles that don't have any kind of structure at all. I know that dictionary needs are complex, but even just to add a <translation> tag round the translation section, a <pronunciation> tag round the pronunciation section would immediately give a useful benefit, and help automate future changes as well. The tags I would suggest are: <translation>, <pronunciation>, <rel_words>, <definition>, <type>. -HD
- I'm not sure who made the observation about needing experience, but I agree. I think it's pretty obvious that the Wiktionary needs more structure, but being a dictionary of all languages, it's not at all clear to me what that structure is. Languages are messy things that tend to defy any rigid structures. Is there some consensus that we would like more "dictionary" specific macros to help standardize representation of the data? If so, then I'm happy. The details will work themselves out as we really learn what we're doing... I hope. -- CoryCohen 03:52, 9 Sep 2004 (UTC)
Disabling sections of the article
I agree here too. The online OED has a version of this but I think it could be better. This requires non-trivial code changes to the Wiki engine as well as a new format and I don't think Wiktionary has any hackers of her own yet.
- I know it will need changes to the code, and I will happily volunteer to help make those changes (I am in the process of setting up my own wiki anyway). Don't know how quickly they would happen as I am pretty busy at the moment, but if people can decide on the implementation I would give it a go. If the above tags are enabled, then it would simply be a matter of showing or ignoring sections based on a user option. -HD
- I agree completely that disabling sections should be a priority for feature development. In particular, translations, meanings in multiple languages, and quotations all seem to be likely to overwhelm the reader if they grow too large. Do we still lack a developer? What language is Wiki in? Perl? -- CoryCohen 03:46, 9 Sep 2004 (UTC)
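To make the proposal above concrete, here is a minimal sketch of showing or ignoring tagged sections based on a user option. It is only an illustration under stated assumptions: the tag names are the ones HD suggests (<translation>, <pronunciation>, <rel_words>, <definition>, <type>), which are not existing wiki markup, and the function name render and the sample entry are made up for this example. It is written in Python for brevity and is not code from the wiki engine.

```python
import re

# Tag names as proposed by HD in this discussion. They are hypothetical
# and do not exist in the wiki's actual markup.
SECTION_TAGS = ("translation", "pronunciation", "rel_words", "definition", "type")


def render(entry_text, hidden):
    """Drop every tagged section whose tag is in `hidden`; for the other
    sections keep the content and strip only the tags themselves."""
    for tag in SECTION_TAGS:
        # Non-greedy match so that several sections with the same tag
        # are handled one by one; DOTALL lets a section span lines.
        pattern = re.compile(r"<{0}>(.*?)</{0}>".format(tag), re.DOTALL)
        if tag in hidden:
            entry_text = pattern.sub("", entry_text)
        else:
            entry_text = pattern.sub(r"\1", entry_text)
    return entry_text


# Made-up sample entry, loosely modelled on the "hold" article.
entry = (
    "<type>verb</type>\n"
    "<definition>To grasp or grip.</definition>\n"
    "<translation>French: tenir</translation>\n"
)

# A reader who has switched translations off in their preferences.
print(render(entry, hidden={"translation"}))
```

The set passed as hidden stands in for the per-user preference. A real implementation would also have to decide what to do with nested or unclosed tags, which this sketch ignores.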
Foreign words & namespaces
Firstly, Wiktionary is not an English dictionary but a dictionary of all languages, written in English. I do however think the various languages are too tightly connected currently. Namespaces are one proposed solution to this but there may be others and I think it may have been discussed before.
- If you could point me to the discussion that would be useful. I am not proposing a namespace for each language, just a single foreign: namespace. However this may be too limited to be of full use. -HD
Hippietrail 06:14, 15 Feb 2004 (UTC)
- I totally agree with Hippietrail. We do what we can with the means that were really created for Wikipedia. Translations are what I look for when I open a dictionary, so they are extremely relevant to me. I also don't expect things to be 'intuitive'. Most things take a bit of getting used to and a bit of study of how they are set up. It's worth it to invest some time into them. Of course, since this is a project where volunteers contribute, you will find that it will be shaped along the interests of those volunteers. Please be welcome to give it your touch as well and contribute what has your interest. Cheers, Polyglot 10:26, 15 Feb 2004 (UTC)
I too generally agree with the position taken by Hippietrail and Polyglot. Of course there is room for improvement. Perhaps the idea of tags might work; it seems as though it would replace what are now headings. The Chinese character problem is a reflection of a much broader issue across the entire net. We agreed to Unicode at a very early stage of Wiktionary. It attempts to deal with a large range of issues in a multilingual world. It would be wrong to completely dumb down the system while waiting for people to update their browsers. People with old browsers should have a right to see what they could always see; the Unicode additions would mean that there is a lot more material that would now become available.
Personally I tend to work more on the English definitions than on the translations, but if others prefer working on the translations that's fine too. I admit that there are times when I consider some of the contributions by my colleagues perfectly useless, but rather than wasting my time complaining about their efforts I can benefit the project much more by stressing my own efforts. Eclecticology 00:59, 16 Feb 2004 (UTC)
Views of another (primarily) Wikipedian
- I understand the issues of Chinese chars showing up via the random page, similar to the predominance on Wikipedia of small towns (someone there actually listed all small town pages on VfD!), and that the long term solution is to expand the English defs. There has been talk on WP of adding code to suppress the small town entries from random page--if it happens there, hopefully we can leverage the code to suppress the Chinese chars here (at least for the time being).
- I have no problem with Wikt being a translating dict, but agree that should be made more clear on the front page. As for unicode/special chars, I don't agree with the newer vs. older argument--my newer non-IE browsers (Opera, Mozilla, Netscape) can display the chars used in the IPA/SAMPA uses, for example, just fine, unlike my older IE 5.5. However, even my newer browsers can't display the Chinese chars. If this is to be an English dictionary, page titles, and initial descriptions/usages should be in some format a default English-language browser installation can display. I generally resent having to install additions (especially things like Flash or Shockwave) to view English Internet content. It also raises accessibility concerns.
- One thing that wasn't mentioned that I think makes Wikt look bad is the Search page--instead of Google and Yahoo, with the search term in both (like WP), Google is listed twice, with $1 in one search box. Somehow that needs to get cleaned up. Also, the page should have the create this page if you can shortcut link.
- I don't know a good solution to the pronunciation system alternatives. I find the AHD (probably not the best term to use here) and m-w.com forms MUCH more understandable, and easier to learn/remember, as they use more letters, and accents like I was taught in grade school, instead of (to me) random symbols. (That's aside from the display problems.) If extended ASCII included the correct accented vowels for the "long" forms, I would strongly support that type of system. However, since it doesn't, I can't get as strongly behind it. The suggestion of optionally always displaying the guide does have merit, especially since SAMPA/IPA is so cryptic. Using gifs or jpgs for the symbols might be another alternative. Realistically, assuming I stay (I don't know if I can take how much slower Wikt seems to be), I'll probably just ignore this area, as it is gibberish to me, and focus on what I know (spelling and grammar), and what I am interested in (antonyms and synonyms), and let others backfill.
- Wait a minute, I just looked at hold, and the AHD version uses the correct symbol, and is immediately understandable to me, unlike the other two. Assuming the other long vowel symbols are similarly available, I'm back to strongly supporting using that system, at least in addition to IPA/SAMPA--it gets around the display problems, and is immediately understandable to those of us that were taught that system as children (is it really only in the US?)--I would face a steep learning curve to be able to decipher the IPA and SAMPA versions.
- I would also like to see the template, and cited leap sample, fleshed out in a couple areas.
- US vs UK (and I assume usually the rest of the Commonwealth/former colonies) spelling and pronunciation. Currently there's no guidance how to handle that. Of course, it gets dicier with -t vs -ed than -or vs -our: US is pretty much exclusively -or, but has partial adoption of -t: slept is fully adopted (I don't know of anyone that would use sleeped). But, say, dreamt and dreamed are closer to 50/50, and, further complicating things, are sometimes considered to be slightly different. For example, the same person might be more likely to say "Martin Luther King dreamt of equality", but would also be more likely to say "Last night I dreamed I was on a desert island".
- Other forms of the word: I know verbs and nouns and past and present tense, and kinda know adjectives and adverbs, but once you get into past participle, intransitive, etc. I am completely lost. I can write very well, I just don't know the terminology for what I am doing--I couldn't diagram a moderately complex sentence to save my life. E.g., where would "leaping" go? Niteowlneils 16:56, 7 Apr 2004 (UTC)