Wiktionary talk:Project - Cleanup of basic English entries

Some Present Comments

  • I've been working on some F's. There are a lot that need to be surgically cleaned and reworked. Maybe someone could look at one or two of the ones I've done, like fly, and comment. Mostly I clean out old Webster quotes etc etc. But then the word categories seem to replicate in the Translations area. So, for instance, there are two ADVERB entries on the page, each exactly the same. One for the main, and one for Translations. What say? I deleted 1 of several.
  • Just a suggestion here--take some of those clumsy entries and dump everything that bogs down the page, all the way to the translation tables (keep those). Sand off the rust, right down to bare metal. I can't see any other way of tackling this project. The clutter. Oh, the clutter. --HiFlyer 18:41, 25 Feb 2005 (UTC)
I would be happy to get a word or two approved for actual use. Maybe someone could look at a few of the "f"s I've cleaned...I would like to whittle the list down for the team, but as of now some input would do. Also, as mentioned in my previous post above, some of these words need to be sanded down to bare vowels. Any suggestions? --HiFlyer 00:03, 7 Mar 2005 (UTC)

Your Past Comments

  • This seems like a useful idea. Go ahead and start with it. The big difficulty will be in getting people to properly report what they have done. Eclecticology 01:20, 20 Dec 2004 (UTC)
  • Yes, this is worth pursuing. Several people have commented that Wiktionary is concentrating too much on newer words to the detriment of long-standing ones (although, of course, words in both of these categories need to be added at some point).
Initially, I understood "NOTS" to mean "the word does not have an entry yet", but does it actually mean "this word has not been reviewed yet"? Perhaps this title could be renamed, and another added, such as "DNE" (for "does not exist") or "NYW" ("not yet written") or somesuch.
NOTS means "Not Started" (the checking of the word has not yet started). --Richardb 08:48, 23 Dec 2004 (UTC)
The use of categories is a good one. Richard, could you link the phrase "Basic English word list" to the Basic English word list (which I will refer to as BEWL), so that users can see which words need to be added?
Could the table easily be generated by copying and pasting the BEWL and having a text editor add the tabulation and initial "not started" status of all the words?
How do you plan to coordinate this? I think this ought to go to the beer parlour for wider discussion. Is this something you only want the most active contributors to work on? --Paul G 09:57, 20 Dec 2004 (UTC)
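The table-generation question above could probably be handled by a short script rather than a text editor. What follows is only a rough sketch in Python, not anything the project actually uses: the function name build_status_table, the two column headings, and the choice of a standard wikitable are all assumptions made for illustration. It takes the pasted word list, drops repeats, and gives every word the initial "NOTS" status.

```python
def build_status_table(word_list_text, initial_status="NOTS"):
    """Turn a pasted word list into MediaWiki table markup.

    Hypothetical helper: the column names and default status label
    are assumptions, not part of any agreed project format.
    """
    # Accept words separated by commas, spaces or newlines;
    # drop duplicates while keeping the original order.
    words = []
    seen = set()
    for token in word_list_text.replace(",", " ").split():
        if token not in seen:
            seen.add(token)
            words.append(token)

    # Standard MediaWiki table syntax: one header row, then one row per word.
    lines = ['{| class="wikitable"', "! Word !! Status"]
    for word in words:
        lines.append("|-")
        lines.append(f"| [[{word}]] || {initial_status}")
    lines.append("|}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example with a handful of words; paste the full list in practice.
    print(build_status_table("a, able, about, account"))
```

The output can be pasted straight into the table page. Linking each word means a red link still shows at a glance which entries do not exist yet.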
At this stage I didn't want the "general public" alerted to the shortcomings of Wiktionary too easily. And didn't want to get the vandals on board too easily. Please just spread the word to whoever needs to know.
Also, I prefer to organise activities into some sort of project, rather than just a log of a discussion--Richardb 08:48, 23 Dec 2004 (UTC)
I wholeheartedly support the notion of cleaning up the basic words. I'm not sure we need anything like this much process to do it. First, all the words on the Basic English list proper at least have entries. I can't speak for the "extended" versions, but I personally added entries for all the missing items a while ago (see the history for trick -- the comment indicates that trick was the last such entry).
That's not to say there aren't interesting basic words left undefined. Again, I've put up a frequency-ordered list of 1,000 words on User:dmh/playpen. In any case, "Not Started" as I understand it is redundant with "wikified link is red". The next level would be "entry is there but useless", and I would mark that as rfc just like anything else. Beyond that, there isn't really a well-defined ending point. It might be nice to know that three people had looked over an entry and approved it, but until such time as this is dead easy to do — that is, the system is helping us out on this — I'm not enthusiastic about keeping a detailed manual status list on some page.
Wikimedia are fundamentally biased against this sort of process-heavy approach to begin with. That's not to say that there's no room for process. I like the new rfd/rfc apparatus. It's a good example of using the system to our advantage: categories allow us to leave the centralized list to the system. However, I don't think the approach will scale particularly well to multiple categories, much less to the sort of lifecycle-based process we're discussing here.
In an alternate universe, it might be nice to have a common data model, with, say "part of speech" and "definition" as primitive objects, and with life cycles defined for various entities. With the right tools, this could still be wiki. There's no reason a data model can't be edited collaboratively just like anything else. However, we don't have these tools, and without them, I don't think it's a good idea to go too far down that path.
On the other hand, it would be nice to at least do the initial review of sorting out the rfc entries in a somewhat coordinated manner. Here's a suggestion:
  • I'll take the A's. Anything that's blatantly bad, I'll mark rfc. I may or may not tweak the others as I go. By blatantly bad I mean malformed, badly formatted, or missing obvious definitions. When I'm done, I'll note that here and grab another chunk. Everyone else can do likewise.
  • This will leave untouched some articles that really should get closer attention. I would welcome everyone to go down the frequency list in order and apply close scrutiny. At any time. When I last did this, flushed with success from finishing Basic English, I believe I got as far as of. Doing a proper job of many of these is a lot of work (see the discussion page for of for the raw material I gathered). I'm not willing to say anything on the top 100 by frequency is "done", but basics like "the", "a(n)" and "of" are at least closer than they were.
I mention the short words specifically because I don't think they fit the process above well at all. There is just too much space between "minimum usable definition" and "really done and dusted." Indeed, "done" may not even be possible, though at least people don't often try to invent new senses of of. -dmh 18:14, 20 Dec 2004 (UTC)
I'm through the A's. I marked 7 as stub, 2 as musty and 1 for cleanup (see the respective categories for which ones). Several others could stand work, but I'd encourage people to scan the whole list for that at leisure, and make notes or adjustments as they see fit.
I'm currently working on the B's. It may take me a while, but I'd suggest that anyone else joining in start at C's or later.
----- Please now report progress in the project table Wiktionary:Project - Cleanup of Basic English Words/Table-----
All in all, the basic words are not in such bad shape as I'd feared. Many could use more definitions etc., but most at least have the primary senses. I've made a lot of small formatting changes, but even there the most frequent problems are things like a missing English heading, or excessive wikification. I'm starting to think that the idea that the basic vocabulary is a shambles is outdated, and that a more accurate view is that while there are some conspicuous gaps, most entries just need to be visited and fine-tuned from time to time, just like any other entry.
That said, it would be good to make a systematic sweep through and ensure that every entry
  • Is properly formatted, as we understand it.
  • Has definitions for every major familiar sense of the word.
  • Is tagged with the appropriate categories.
  • Has examples for all senses unless there is clearly no need.
  • Has a reasonably complete set of related terms and "see also" items.
I've been paying particular attention to format, missing senses and related/see also terms. I've been filling in examples if they come to mind or seem to be conspicuously missing, and I've been trying to tag idioms where appropriate.
I haven't put much emphasis at all on quotations, because I don't think they're as important for common words. I don't really learn anything new from knowing that Shakespeare used "when" or "if". As I've tried to point out with rather, quotations can get in the way as much as helping.
If an entry seems to need more than just quick work, I've tagged it with a category like "cleanup", "stub" etc. I've also tried to link together the various flavors of "cleanup needed" so they all show up under Category:Requests_for_Cleanup.
For my money, the next big projects of this type after finishing the sweep through Basic English are making the same sweep through the top 1000 by frequency, and making sure that all the related/derived/see also terms mentioned on the Basic English list are defined (right now almost none are). Another idea is to fill in the topical index. -dmh 18:10, 22 Dec 2004 (UTC)

Translations of basic English words

I encourage people to give translations of basic English words into other languages. Some complicated words have translations, but the most useful words may have no translation, or one that is incomplete or missing transliteration or grammar info. Anatoli 09:17, 23 August 2009 (UTC)