User talk:Brettz9

From Wiktionary, the free dictionary

testing here too...

Thanks for adding Dutch to the Swadesh list. For your information, my question was related to where Dutch should go, not how it should be done. D.D. 09:32 Mar 11, 2003 (UTC)


Could you please have a second look at what you did with the Swadesh list? With the second of your recent changes to improve the Chinese, it seems that some of the French and German accented words got mangled. This may have happened if you downloaded the entire article to edit offline before re-uploading, and some step along the way was not compatible with UTF-8 encoding.

Some Dutch words ended up in the wrong column too. D.D. 20:33 Mar 16, 2003 (UTC)
Go ahead and revert to an older version if there is a problem. It was taking forever for Wiktionary to load that day, so I didn't think I had even fixed it, but maybe I did cut and paste back and forth if that's what you mean.
I'll take a look again to see what can be done. - Ec
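
The kind of mangling described above can be reproduced in a few lines. This is only a sketch of one plausible cause: it assumes some tool in the offline round trip read the UTF-8 bytes as Latin-1, which is not confirmed anywhere in this thread. The sample words are illustrative and are not taken from the Swadesh list page, and the names `mangle` and `repair` are hypothetical.

```python
def mangle(text: str) -> str:
    """Simulate a tool that reads UTF-8 bytes as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(text: str) -> str:
    """Undo the damage, assuming nothing else altered the text in between."""
    return text.encode("latin-1").decode("utf-8")


original = "forêt, Mädchen"      # illustrative French and German words
broken = mangle(original)

print(broken)                    # each accented letter becomes two characters
print(repair(broken) == original)
```

If the mangled text was saved and edited again, the repair step may no longer work, which is why reverting to an older version is usually the safer fix.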

On a completely different issue, there may be a particular challenge in linking romanized Chinese with tone indications. At this point I'm only putting this forward as a "heads-up" for something to think about. Eclecticology 19:54 Mar 16, 2003 (UTC)

Do you mean a problem with using the numbers as tones or the actual tone accents? It'd also be nice to put in the Chinese characters.
Putting the simple numbers after the word (wo3) is certainly the easiest way to represent these. Putting these numbers as superscripts (wo3 or wo³) is more work or not always supported. Using the tone mark (wǒ) isn't recognized by all browsers, nor is the Chinese character (我). Should the entry be for wo3 or wo³, etc.? This was the kind of problem I had in mind. The link to the Chinese character should be additional. Eclecticology 05:48 Mar 18, 2003 (UTC)
Well, since representing just the sound without the tone (such as wo) would involve definition pages of some 25+ definitions, it may be better to just do wo3. That is not as specific as the character, but at least it narrows things down somewhat (though I admit that while learning the language, it can be helpful to compare all of the similar-sounding words together, regardless of tone). I've wondered whether anyone has devised a system to represent the character more exactly, such as by wo3a (the letter following the number allowing distinctions to be made with other characters with the same tone). In cases where characters had two pronunciations, they would be represented distinctly and one might lose their association (though they could be cross-referenced), but I think this makes sense, especially since the separate sounds may also correlate with different meanings and they need to be separately memorized.

It is in these kinds of areas (where a person might like the option of different groupings) that a database would come in handy. Of course, if someone wanted to do all the work, a person could make wo and wo3 and wo3a/b/c, etc., but that would get pretty unwieldy.
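
One way to sidestep part of the wo3 versus wǒ question is to store the numbered form and derive the tone-marked form when needed. The sketch below assumes the usual pinyin placement convention (mark "a" or "e" if present, the "o" in "ou", otherwise the last vowel) and treats 5 as the neutral tone. The function name `numbered_to_marked` is hypothetical, and nothing here reflects how Wiktionary actually handles entry titles.

```python
import re
import unicodedata

# Combining diacritics for tones 1 to 4; tone 5 (neutral) carries no mark.
COMBINING = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiou\u00fc"  # includes u-umlaut


def numbered_to_marked(syllable: str) -> str:
    """Convert one numbered pinyin syllable such as 'wo3' to its tone-marked form."""
    match = re.fullmatch(r"([a-z\u00fc]+)([1-5])", syllable)
    if not match:
        raise ValueError(f"not a numbered pinyin syllable: {syllable!r}")
    letters, tone = match.group(1), int(match.group(2))
    if tone == 5:
        return letters
    if "a" in letters:
        target = letters.index("a")
    elif "e" in letters:
        target = letters.index("e")
    elif "ou" in letters:
        target = letters.index("o")
    else:
        positions = [i for i, ch in enumerate(letters) if ch in VOWELS]
        if not positions:
            raise ValueError(f"no vowel to carry the tone mark: {syllable!r}")
        target = positions[-1]
    # Attach the combining mark after the chosen vowel, then compose to one character.
    marked = letters[: target + 1] + COMBINING[tone] + letters[target + 1 :]
    return unicodedata.normalize("NFC", marked)


print(numbered_to_marked("wo3"))
print(numbered_to_marked("hao3"))
```

Going the other way (stripping the number to get plain "wo") gives the tone-independent grouping mentioned above, so both views could be generated from a single stored form.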

By the way, rather than reinventing the wheel on all of this, I've seen (at least for Chinese) that there are a lot of (perhaps even public) dictionaries and databases.

Also, I haven't posted a suggestion (or looked around) at Metapedia, but I'm curious if anybody here knows anything about opportunities for creating interactive databases here (particularly, but not exclusively, if they can be customizably viewed and sorted). I don't know how easy it is to get information out of an html table into a database, but in my mind, there could be a lot of effort wasted (or at least our resources not used at optimal efficiency) if we spent all this energy on this table and then an interactive database was just around the corner (or already possible), given that a database would have many other advantages.

I guess cutting and pasting out of an html table does not take that much work to convert it into a database (especially if there is already software which does it automatically), but having an interactive database would still be nice for other reasons. - Brettz9 04:39 Mar 18, 2003 (UTC)
It would be nice. -Ec
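
On the question of getting data out of an html table: a rough sketch using only Python's standard library is below. It assumes a simple table with no nested tables or merged cells, the sample markup is made up for illustration, and the names `TableRows` and `table_to_rows` are hypothetical.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_rows(html: str) -> list:
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


sample = (
    "<table>"
    "<tr><th>English</th><th>Dutch</th></tr>"
    "<tr><td>I</td><td>ik</td></tr>"
    "</table>"
)
print(table_to_rows(sample))
```

The resulting rows are plain lists of strings, so they could be loaded into whatever database ends up being available.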

Also, I know this is only a 500 [Only 207 -Ec] word list here, but it would seem to be logically a subset of a larger database cross-referencing an array of languages if that is technically possible. - Brettz9 04:23 Mar 18, 2003 (UTC)

With the present structure the number of languages is a bigger challenge than the number of words on the list. Eclecticology 05:48 Mar 18, 2003 (UTC)
Oops, sorry, I didn't double-check the number before writing it. I confused it with word frequency lists I've been looking into.
What I meant by a larger database was one single integrated database of all words with some marked as Swadesh words (i.e., an auto-generated Swadesh page based on words being flagged as such). A person could just specify which languages they wanted to see Swadesh words for (or see which languages had some, etc.).
There are some incredible resources out there already (varying as to the extent of their being available to the public) for using or generating language lists (e.g., the Swadesh words could be put through a word-by-word parsing translator or we could also represent word frequency lists by putting them through as well). Imagine a database which could calculate and automatically update word frequency. - Brettz9 03:36 Mar 19, 2003 (UTC)
Yes, there are incredible resources, as in the Rosetta project, and I can't imagine that the lists themselves could be copyrighted. I would be suspicious of the machine translators, since many of the lists in question will not only distinguish clearly different languages, but also differences in dialects that would enable the drawing of dialect maps. Word frequency lists can be useful in the forensic applications of linguistics when trying to identify forged or wrongly attributed works of literature and how they differ stylistically from an author's genuine works. Eclecticology 20:04 Mar 19, 2003 (UTC)
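
The "single integrated database with some words flagged as Swadesh words" idea from this thread could look something like the following. It is a minimal sketch using an in-memory SQLite database; the table layout, the column names, and the function `swadesh_list` are all assumptions made for illustration, and the handful of rows are sample data, not the contents of the actual list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE word (
           concept    TEXT,     -- English gloss used to line up translations
           language   TEXT,
           form       TEXT,
           is_swadesh INTEGER   -- 1 if the concept is on the Swadesh list
       )"""
)
conn.executemany(
    "INSERT INTO word VALUES (?, ?, ?, ?)",
    [
        ("I", "English", "I", 1),
        ("I", "Dutch", "ik", 1),
        ("I", "Mandarin", "wo3", 1),
        ("water", "English", "water", 1),
        ("water", "Dutch", "water", 1),
        ("computer", "English", "computer", 0),
    ],
)


def swadesh_list(connection, languages):
    """Return (concept, language, form) rows for flagged words in the chosen languages."""
    placeholders = ",".join("?" for _ in languages)
    query = (
        "SELECT concept, language, form FROM word "
        f"WHERE is_swadesh = 1 AND language IN ({placeholders}) "
        "ORDER BY concept, language"
    )
    return connection.execute(query, list(languages)).fetchall()


for row in swadesh_list(conn, ["Dutch", "Mandarin"]):
    print(row)
```

With this shape, adding a language means adding rows rather than another column, which speaks to the point above that the number of languages is the bigger challenge for the present table.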

I notice that you set up a link for a Persian Swadesh list. May I suggest the title Swadesh list (Indo-Iranian)? As part of an expansion of the concept, what is now in the Hindi column of the original article could also be moved there. At some later stage, what remains on the original article could be reallocated to Germanic, Romance and Far Eastern pages. Eclecticology 10:52 Apr 12, 2003 (UTC)

Yes, that sounds like a good idea (I couldn't remember the family offhand, besides Indo-European). I started the link, but I realized I should verify something before posting it. - Brettz9 21:15 Apr 12, 2003 (UTC)


Reply to questions about regionalisms and chronologies

A chronological list (and to a somewhat lesser extent a regionally based one) could rapidly become useless because of its eventual length and wide range of topics. As it grows, manual searching also becomes more difficult. There is, of course, some interest in the sort of points that you raise, but I don't see that manually developed lists will solve the problem. Manual lists tend to be very hit and miss, and we can't impose any kind of disciplined treatment that would ensure that a given word is placed on every list where it may belong.
The single most important thing in this project is the words themselves. Unfortunately, we have a significant number of Wiktionarians who tend to be obsessed in one way or another with creating lists. To some extent these are useful in identifying the work that needs to be done, but we can never rely on them as an indication of what Wiktionary contains. The article for a word should eventually show when it was first used or where it tends to be used in preference to another word. The word practitioner is a common enough word, but in the 1860s it was also used as slang to mean a thief. If we put this word in a chronological list of 1860s slang terms, it may show up in blue as a word that already has an article, but we have no way of knowing without seeing the actual article whether it includes anything but the ordinary meaning. I would very much prefer to see an appropriate indexing and search system whereby entering the desired search terms creates any such list on demand. Eclecticology 04:37, 25 Sep 2003 (UTC)

Hello! Did you see this?

Sub-Project -Template renaming

I propose to move/rename the page Wiktionary:Basic English template (and all other related Template pages), as this is not a template at all. I will rename it to "Basic English list", which is what it is.

It confused the hell out of me for a while when I started. I looked for a template to create an English entry, and I got this strange so-called "template".

It's going to take me some time to rename all those pages and fix all the links. But unless basic concepts are upheld, Wiktionary will remain confused.

I'm giving notice now, and will do it in the next few weeks unless anyone has a good argument against it.

I've set up a page Wiktionary:Sub-Project -Template renaming to conduct any discussion around this sub-project.--Richardb 05:47, 12 Dec 2004 (UTC)

Idea for a project to get basic words defined.

Idea for a project to get basic words defined. Check my idea page. User:Richardb/Project - Basic English Word Cleanup. See what you think. I've not publicised this in the Beer Parlour because I don't want just anyone to test out the idea, only the currently active players--Richardb 11:37, 18 Dec 2004 (UTC)

"Self-promotion"

At Stranger's request I have looked at your recent exchange with him. I think you managed to step on a rake. :-) The words "which I just started" were bound to get that kind of reaction. I do think that you were acting in good faith.

I've never been very kind with protologisms, but a cooperative arrangement could have benefits for both sides in that ongoing debate. I would suggest that you open a discussion on the matter in the Beer Parlour to see what kind of reaction there would be to such an outside site that siphons off a topic which would arguably violate the "no original research" restriction. Eclecticology 18:54:35, 2005-09-09 (UTC)

self-admitted neo

Re: Intolerista

This is a self-admitted neologism, but I don't know what to do with it. You seem to have knowledge in this category, so I respectfully request that you file this appropriately (or fix it, if possible). Is there a neologism category, or should it just be listed under RFC? I can't find a neo category at the moment. Cheers, --Stranger 02:54, 10 September 2005 (UTC)