Wiktionary talk:Thesaurus

See Wiktionary:Thesaurus considerations for the debate about how a Thesaurus should be implemented in Wiktionary.

Why Wikisaurus

Perhaps this is addressed somewhere, but I haven't found it yet. I find myself wondering what the need for a separate wiki-thesaurus is. The separation expressly produces duplicate entries on any given word and complicates the effort of finding a good word for a given subject. The current construction is difficult to navigate; I don't know if a given word has a Wikisaurus entry even if it has a Wiktionary entry. Dictionaries (at least the dictionaries I'm accustomed to) generally include synonyms/antonyms as part of their entries, and the use of Semantic wiki should make that even easier. Even Wikisaurus itself, with proper template formatting, doesn't contain anything that couldn't be neatly added to the mainspace entries. I don't mean to denigrate or minimize the work being done at this project, but it seems like an unnecessary division of projects that creates unnecessary work for editors (you don't need to specify that you use the same inclusion criteria if you're not forking the project) and makes finding good information more difficult for users (it took me a while to realize that Wikisaurus even exists). Darker Dreams 07:20, 7 December 2011 (UTC)

My reasons why it is worthwhile to have Wikisaurus are spelled out here: User:Dan_Polansky/Wikisaurus#Benefits_over_the_mainspace. --Dan Polansky 10:14, 7 December 2011 (UTC)
I started to try a point-by-point reply, but that huge block of unformatted text is not particularly easy to parse, so I'll just leave it that I disagree. As a user, having the thesaurus (synonym) functions separated out makes Wiktionary less useful and usable for me. I'm also disinclined to put energy into working on what looks like a restart of the mainspace project. Honestly, the wiktionary articles are more akin to what I'm accustomed to from dictionaries (a definition and some related synonyms; add a pronunciation and some example uses and you're there) than the mainspace's articles, which are almost entirely focused on the word's association with other languages. Darker Dreams 21:14, 7 December 2011 (UTC)
What you are referring to as a "huge block of unformatted text" is a paragraph that has 640 words. The only way in which it is unformatted is in that it does not use typographical bullet points; instead, it uses natural language keywords such as "First", "Second", and "Third". If you are serious about discussing why a thesaurus is better hosted in the main namespace, you can select just one of the 7 reasons that I have listed, and discuss that one.
You are not being asked to put energy into Wikisaurus; you are putting close to no energy into Wiktionary anyway with your 16 mainspace edits since 26 April 2006.
What you are accustomed to from other dictionaries are lists of synonyms associated with entries rather than a full-fledged thesaurus. For an example of a full-fledged thesaurus, see Thesaurus of English Words and Phrases by Peter Mark Roget from 1911, linked to from Wiktionary:Public_domain_sources#Roget's_thesaurus_1911. Wikisaurus does not prevent lists of synonyms from being listed in the main namespace, though. --Dan Polansky 12:32, 8 December 2011 (UTC)
I agree. Any advantages that exist in having separate Wikisaurus entries for words are outweighed by the advantages of making the thesaurus entries part of the main entries. Each main entry lists translations of the word; why shouldn't it also list synonyms, etc.? Furthermore, I had no idea Wikisaurus existed until I came to this page, and probably never would have had I not gone here. Kraŭs (talk) 02:44, 25 June 2012 (UTC)
I agree that it should be in the main term, just like translations. I'll format the unformatted text so we can have a good discussion, and then we should bring this to a vote. Pashute (talk) 16:04, 11 July 2013 (UTC)
It seems it's been formatted, but it is still lengthy. Here are the claims in short; let's discuss them in the next section. Pashute (talk) 16:51, 11 July 2013 (UTC)

Inline Wiktionary thesaurus (inside terms) vs. Wikisaurus

As listed by User:Dan_Polansky/Wikisaurus#Benefits_over_the_mainspace

  • 1. Need for separation from synonyms, antonyms, etc. - The terms as they are defined with synonyms, antonyms, etc. are not enough. A separate section or namespace is needed.
  • 2. Easier on the editor, without dictionary context.
  • 3. Ease of maintenance - automatically linking from one ws:term to another ws:term, as in translations, is easy.
  • 4. Ease-of-use problem with collapsible sections - in order to get to the thesaurus you'll have to scroll down and open the collapsed thesaurus section. Not intuitive or easy.
  • 5. Scattered thesaurus leads to out-of-date info - Better to have everything in "one place".
  • 6. Easy moving of semantic clusters.
  • 7. Linking to non-Wiktionary terms.

My reply to these is:

  • 1. Separation from the linguistic and Wiktionary sections - granted. But that is no reason for Wikisaurus. Just a Thesaurus section.
  • 2. Easier on the editor? Why? Give the editor the possibility to add, remove, and sort thesaurus terms at will, in their own section. A user interface could be developed similar to that for the translation interlinks, and users (not editors) could even vote the usage of a thesaurus term up or down, moving it forward or backward in the current context. This will create a true mapping between all entries.
  • 3. Ease of maintenance - only if there are fewer terms, and all connected terms are kept in the same order of importance. But when you link from animal back to cat, I don't think it will be in the same order as from cat to animal.
  • 4. Ease of use - It can show up on the search engines just the same. Probably. Maybe I'm wrong. So when you follow a search engine link to Wiktionary, if you chose the thesaurus section, you're taken to it, open already.
  • 5. You mean that if I update Cat, the Animal thesaurus may be missing something? I don't think so.
  • 6. Easy moving of semantic clusters - copy and paste?
  • 7. Linking to non-Wiktionary terms - Well, you're right. But since you're not going to explain the term, leave those terms out altogether as links, and keep them as 'thesaurus terms', or as Wikipedia terms, or even as plain text. When searching for "alcoholic beverage" you'll find it in the thesaurus definitions of wine, beer and methanol. So what's the problem?

It would immediately make Wiktionary move up in web searches, help millions of users, and cause more terms to be entered. Most importantly, a Wikisaurus can automatically be created from that information, so any benefits of the Wikisaurus are preserved. Pashute (talk) 16:57, 11 July 2013 (UTC)


Dan noted that I misrepresented his discussion in my summary, that the vote is too preliminary, and that it would be preferable to discuss his claims one by one.
The proposal here is to have a new 'Thesaurus' section in Wiktionary for each definition of each term. It will be open to terms and phrases not necessarily linked to Wiktionary, and will have a "user vote" option in order to create a "web cloud" for the thesaurus links.
Following this, as Dan requested on my talk page, the proper location for this discussion should be the Wiktionary Beer Parlour.
I ask Dan to please revoke the vote in the next section and we'll continue the discussion in the main Wiktionary Beer Parlour. Pashute (talk) 10:10, 14 July 2013 (UTC)

Vote for inline thesaurus section on each term

  1. Support - as stated above Pashute (talk) 16:57, 11 July 2013 (UTC)
    • You have 1 edit in the mainspace. Your responses above to my reasoning are substandard, ignoring most of the substance of my reasoning. Your boldfaced summaries of what my reasoning is are misrepresentations. If you want to have a serious discussion, pick one or two items of my list, and have a serious look at what they say. Then try to come up with a reasoned and substantive refutation. Then I will be able to respond to it. In your response, there really is close to nothing to respond to. --Dan Polansky (talk) 18:09, 11 July 2013 (UTC)
    • Wikisaurus is not preventing you from expanding lists of synonyms in the mainspace. There are lists of synonyms in the mainspace; check e.g. cat#Synonyms. I have never removed synonyms from the mainspace only because they were already in Wikisaurus or because I was about to add them to Wikisaurus. I think I have actually rarely ever removed any synonyms from the mainspace at all.
    • Discoverability of Wikisaurus: Wikisaurus is linked from synonyms sections of the mainspace, a natural place to look for synonyms. --Dan Polansky (talk) 18:17, 11 July 2013 (UTC)
  2. Vote cancelled, as per diff. --Dan Polansky (talk) 19:01, 14 July 2013 (UTC)

Coordinate terms

I find the section "coordinate terms" rather useless for most Wikisaurus pages. Listing "fish", "amphibian", "reptile", and "bird" as coordinate terms in WS:mammal is fairly pointless, as they are already listed at WS:vertebrate (one click away from WS:mammal, in its "hypernyms" section), and as the practice leads to repetition of the five coordinate terms (including "mammal") on each term's page. The repetition gets much worse with larger lists of coordinate terms. Some more explanation follows.

The section "coordinate terms" is occasionally used in the mainspace. "Coordinate terms" are those terms that are siblings in the hyponymy tree (or directed acyclic graph); they are on approximately the same level of detail or specificity and they share a near hypernym. Thus, "mammal" has "fish", "amphibian", "reptile", and "bird" as coordinate terms with respect to "vertebrate". The term "arthropod" is not a coordinate term of "mammal", as it is separated from it by the vertebrate-invertebrate division. However, "arthropod" can be seen as a coordinate term of "mammal", if the vertebrate-invertebrate distinction is ignored, and both are seen as hyponymic descendants of "animal" that have approximately the same level of specificity. --Dan Polansky (talk) 07:45, 9 April 2012 (UTC)
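
To make the definition above concrete, here is a minimal Python sketch of how coordinate terms could be derived from hypernym links alone. It is only an illustration: the toy HYPERNYMS data and the function names (build_hyponyms, coordinate_terms, coordinate_terms_under) are hypothetical and do not correspond to any existing Wiktionary template or tool.

```python
from collections import defaultdict

# Hypothetical toy data: each term maps to its direct hypernyms.
# A term may have several hypernyms, so this is a DAG, not strictly a tree.
HYPERNYMS = {
    "vertebrate": ["animal"],
    "invertebrate": ["animal"],
    "mammal": ["vertebrate"],
    "fish": ["vertebrate"],
    "amphibian": ["vertebrate"],
    "reptile": ["vertebrate"],
    "bird": ["vertebrate"],
    "arthropod": ["invertebrate"],
}


def build_hyponyms(hypernyms):
    """Invert the hypernym map: hypernym -> set of direct hyponyms."""
    hyponyms = defaultdict(set)
    for term, parents in hypernyms.items():
        for parent in parents:
            hyponyms[parent].add(term)
    return hyponyms


def coordinate_terms(term, hypernyms=HYPERNYMS):
    """Strict reading: terms that share a direct hypernym with `term`."""
    hyponyms = build_hyponyms(hypernyms)
    siblings = set()
    for parent in hypernyms.get(term, []):
        siblings |= hyponyms[parent]
    siblings.discard(term)
    return sorted(siblings)


def coordinate_terms_under(term, ancestor, hypernyms=HYPERNYMS):
    """Looser reading: descendants of `ancestor` at the same depth as `term`.

    Intermediate divisions (such as vertebrate/invertebrate) are ignored.
    Returns an empty list if `term` is not a descendant of `ancestor`.
    """
    hyponyms = build_hyponyms(hypernyms)
    level = {ancestor}
    # Walk down one level at a time until the level containing `term`.
    while level and term not in level:
        level = {child for node in level for child in hyponyms.get(node, ())}
    return sorted(level - {term})


print(coordinate_terms("mammal"))
print(coordinate_terms_under("mammal", "animal"))
```

With this toy data, the strict function gives the four siblings under "vertebrate", while the looser one also includes "arthropod". Either list can be computed from the hypernym links, which is the point made above about the section being redundant when the hypernym page is one click away.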

save raw data, into this thesaurus?

Some "incomplete", "dirty" raw information is being swiftly deleted on Wikipedia. Data of that kind, such as the partly fuzzy knowledge of farmers and housewives, could be integrated into a thesaurus like this one. What do you think? Save raw data at all costs. Simon Jäkle (talk) 13:13, 7 July 2013 (UTC)

Name Change:

The name should be changed to "Wiksaurus" [ˈwik.saʊ.ɹʌs]. I have to admit, when I first read "Wikisaurus," I was expecting a dinosaur. If Wiktionary's name were "Wikitionary" [wiki̩ː.ʃʌn.ɛɹi], as I first thought, then "Wikisaurus" would be acceptable.

StarDict format

Hi all,

Do we have Wikisaurus available in StarDict format for offline use?

Regards,
--Gryllida 01:41, 22 June 2017 (UTC)

404 links

Hello,

Of the links in this section [https://en.wiktionary.org/w/index.php?title=Wiktionary:Wikisaurus&action=edit&section=10], the external ones return 404.

Regards,
--Gryllida 01:41, 22 June 2017 (UTC)

Beer parlour convo

--Barytonesis (talk) 16:24, 3 July 2017 (UTC)

Vote page

Could someone please redirect this page to Wiktionary:Thesaurus (currently a redirection itself)? --Barytonesis (talk) 18:12, 12 October 2017 (UTC)

Done, see this discussion. --Barytonesis (talk) 10:01, 11 November 2017 (UTC)

Adding other languages

Languages such as Chinese, specifically Mandarin, have lately improved their entries a great deal, yet the Thesaurus project seems to be of little interest, so I'd like to know whether there's a specific reason for this. --Backinstadiums (talk) 15:50, 9 February 2018 (UTC)

@Backinstadiums There is not, except that no one has bothered to create them yet. Also, there are several models for what title to give the pages, e.g. WS:Danish/furthermore, WS:da:beautiful. Most entries currently existing do not specify the language in the title, and for non-English entries the title is usually a native word, which was discussed here. Don't let that dissuade you from contributing to the thesaurus, though.__Gamren (talk) 20:48, 12 May 2018 (UTC)

Language categorization

{{ws header}} now accepts a lang parameter, which categorizes into [Category:XX thesaurus entries]. Does anyone disagree with the implementation (e.g. the category name)?__Gamren (talk) 20:54, 12 May 2018 (UTC)

Grouping *nyms - text labels vs. generic horizontal rule separators

This page links to Thesaurus:beverage as an example entry that organizes its hyponyms into groupings. It does so using the template {{ws ----}}, which generates a bullet with a horizontal rule. I've seen other pages that do something similar but with text labels for each grouping, e.g.

I personally like this style a lot better, both as a reader and an editor. It's easier to take in at a glance, and if I want to add a new term, I don't need to guess at what the different groupings are meant to represent. (Why are punch and cordial grouped with tea and coffee rather than with soda and juice? I'm guessing a confused editor added them in the wrong place, but maybe there's some deeper principle I'm not understanding.)

Would anyone object to my refactoring entries like Thesaurus:beverage to use text labels for groupings? I'd like to also add a template for generating the labels. The three examples above each use subtly different formatting (indented, unindented, indented with colon). It would be nice to be able to define a consistent style in one place. I would lean toward the 'indented no colon' style (as in Thesaurus:animal sound), but don't feel strongly about the particulars of the format so much as having some consistency. Colin M (talk) 20:50, 17 March 2021 (UTC)

I agree that the labeled groupings are both more intuitive and more useful. Thesaurus:bird is another example that uses labeled separators for grouping the hyponyms.
On a similar note, I have suggested a change of the {{ws ----}} template, to make it render as a horizontal rule (as I've experimented with here), which IMO would alleviate the issues described here (though an explicit label would still be preferable). I'd appreciate any comments in that discussion. --Waldyrious (talk) 23:41, 4 January 2023 (UTC)
I agree that textual labels shall be used going forward, for ease of use. I oppose the use of a horizontal rule, which would break the long-term tradition; eventually, neither the current {{ws ----}} solution nor the horizontal rule shall be needed, as textual quasi-headings are going to be used instead. --Dan Polansky (talk) 13:10, 6 January 2023 (UTC)
It has been my position that editors with zero or very little substantive contribution to the Thesaurus have no business introducing deviations from the predominant practice. One ought to first do some real work and prove one's mettle, then rule. --Dan Polansky (talk) 13:11, 6 January 2023 (UTC)

Make multilingual thesaurus at Wikidata:Lexicographical data

Hi all, please get involved with Wikidata:Lexicographical data and create a mechanism for the Thesaurus as well. Because the Lexicographical data project is intended to provide structured support for Wiktionaries, I think a multilingual thesaurus page would be a very nice feature.

Right away, you can contribute by adding the Synonym (P5973) property to all the 600,000+ lexeme entries. Thank you! Vis M (talk) 04:40, 22 October 2021 (UTC)

Deprecating "translations" subpages

I boldly went ahead and deprecated the practice of translation subpages. The practice has not caught on for over a decade. Two main impacted entries:

Revision histories still show the translation lists, likely uncurated and full of errors.

I created French and Spanish Thesaurus:pénis. I created a Portuguese section in Thesaurus:vagina. I expanded Thesaurus:Penis. I moved Thesaurus:beer/translations to Thesaurus:olut. I relinked entries from the mainspace. I did not do much else.

The only significant possible loss that I see is of the translations that were in the two main impacted pages. Interested parties can create thesaurus entries for them using the overwhelming practice. --Dan Polansky (talk) 19:33, 9 October 2022 (UTC)

I now also deprecated Thesaurus:idiot/translations, which had only Portuguese synonyms: these are now at Thesaurus:idiota.

Rationale: Reduce the number of different practices; this practice is a tiny minority one. Reduce the number of languages per thesaurus entry.

Consensus: Many editors consider the prevalent practice of using native headwords without language code to be an acceptable practice. Those who disagree and want to use the language code in the entry name do so so that there is only one language in an entry. /translations subpages go in the opposite direction, hosting all languages in a single entry. The editors who want to use /translations subpages are a tiny minority.

--Dan Polansky (talk) 06:46, 10 October 2022 (UTC)

Key players

Here is some history of who the key players in the Thesaurus were, from memory:

  • Richardb was a great champion of the thesaurus back in 2006. He was working in the spirit of absence of regulation, wanting to cover as much as possible.
  • TheDaveRoss advanced the project by bringing other semantic relations to it, not just synonymy and antonymy. (Needs double check.)
  • Amina Sack36 designed some neat templates and warned that thesauri are made by people who are not thinking out of the box, and that users may be looking for words related in ways that are not rigidly prespecified. 2008.
  • Dan Polansky (me) did a lot of cleanup and expansion in 2008, and later, insisting on using all semantic relations and on the thesaurus being a cluster thesaurus, not having a dedicated page for each mainspace word but rather a page per word cluster.
  • Daniel Carero did significant expansion later.
  • Adam B. Morgan made a lot of expansion over multiple years.

This would need proper double-checking to see whether the details are correct. Dan Polansky (talk) 07:30, 4 January 2023 (UTC)

Deletion debate for "All (5) "Thesaurus:*/translations" pages"

The following information has failed Wiktionary's deletion process.

It should not be re-entered without careful consideration.


A really redundant and ultimately not all that useful format to present this information. The content should be distributed to various Thesaurus:* pages where * is the respective "default" word in that language, whatever that is. If this RFD passes, I'll take the liberty of adding some text to WT:Thesaurus making it clear that we don't want to have such "Thesaurus:*/translations" pages. -- Fytcha T | L | C 13:38, 6 January 2023 (UTC)

Support, I see no reason for these few pages to use a format inconsistent with everything else in the Thesaurus namespace. Make sure all the backlinks are fixed though. 70.172.194.25 06:35, 11 January 2023 (UTC)
  • Virtually delete by redirection, to keep the revision history available to all editors. As a second best option, truly delete. In any case, abandon this practice of using /translations pages. This could have been done without a formal process as the obviously right thing to do, but using a formal process (RFDO) confirms that this is the will of multiple editors, which is fine and prevents future squabbles with trolls and/or incompetents. --Dan Polansky (talk) 08:36, 16 January 2023 (UTC)
    I agree that the history should ideally be preserved for the sake of attribution. 70.172.194.25 19:07, 22 January 2023 (UTC)

RFD-delete(d) - I'm gradually moving the content to more suitable locations. This, that and the other (talk) 01:14, 3 November 2023 (UTC)

The only remaining page is Thesaurus:penis/translations. This has a lot of content and it will take some time to move it to the correct locations. I also discovered Thesaurus:sound/fi, which has the same problem as the pages mentioned here, and needs the attention of our Finnish editors to sort out. This, that and the other (talk) 11:50, 7 November 2023 (UTC)