MediaWiki talk:Edittools/Archive 1



This feature does not work on many setups, even though the former "pure javascript" version before the recent upgrade did work on those setups. (Some setups apparently didn't work with either.) See the Wiktionary:Beer parlour. Hippietrail 11:54, 4 Jan 2005 (UTC)

WTF? Paul G, can this be condensed some?! --Connel MacKenzie 07:16, 19 Mar 2005 (UTC)

I wonder if we can change how <charinsert> works so it can enter UTF-8 directly instead of HTML entities. For one thing, entities do not work the same for search purposes, even with the new search engine. I'll try to find out where to report a bug... — Hippietrail 02:11, 7 Jun 2005 (UTC)
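A minimal sketch of why entity text and literal characters behave differently for searching. It is written in Python purely for illustration; `entities_to_text` is a hypothetical helper and is not part of <charinsert> or MediaWiki.

```python
import html


def entities_to_text(wikitext: str) -> str:
    """Replace HTML character references with the characters they name."""
    return html.unescape(wikitext)


# Text as it might be stored when entities are inserted instead of UTF-8.
stored = "&iquest;Qu&eacute;?"
# The same text typed directly as characters (written with escapes here).
typed = "\u00bfQu\u00e9?"

# A plain string comparison, which is roughly what a naive search does,
# treats the two forms as different until the entities are decoded.
print(stored == typed)
print(entities_to_text(stored) == typed)
```

Inserting the characters directly would avoid the need for any such decoding step at search time.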

Bug has now been reported. Watch here if you're interested:

More characters

Any chance of getting ¿, ¡, «, and » into the list? - TheDaveRoss 02:16, 7 Jun 2005 (UTC)

Done! Thanks for asking. — Hippietrail 03:11, 7 Jun 2005 (UTC)

IPA characters not in the "character palette" below the edit box: ɝ ɾ --Denelson83 05:30, 21 July 2005 (UTC)

Done! What Hippietrail said. --Connel MacKenzie 05:59, 21 July 2005 (UTC)

Bogus Esperanto characters?

There are some characters here which are probably not needed and I'm guessing were assumed to be for Esperanto. One edit comment said "u macron for Esperanto" for a character which didn't have a macron and isn't used in Esperanto. Meanwhile, the character ŭ (u breve) which is used in Esperanto is not here at all. Since there are no other letters using the breve I didn't want to step on anybody's toes by removing some and adding this without consultation. — Hippietrail 11:56, 21 July 2005 (UTC)

Menu-driven special character feature

I've been working on a feature I wanted to do a couple of months ago but at that time I didn't have the HTML/CSS/Javascript skill needed. I have it working in standalone HTML and just need to integrate it with Wiktionary. I'll post the source here so anybody who wishes can take a look. Note that the standalone version doesn't actually insert any text since the function insertTags() is part of Monobook, not Javascript. — Hippietrail 23:58, 21 July 2005 (UTC)

I've gone ahead and added code to the global Javascript file (MediaWiki:Monobook.js) to make subsets of the special characters selectable rather than seeing the whole lot at once.
I think it might be better with this functionality to make a subset for each language but I'm open to suggestions. Of course let me know here or on my talk page if there are any bugs, if you would like any features added, or if you think I'm barging in and taking over without proper consultation (-; — Hippietrail 08:09, 22 July 2005 (UTC)

It's only a problem when one needs the characters that are now missing. I just needed the "æ" with a macron, and it was no longer there. What I see now is a reduced set of special characters and no obvious way to get at the others. The feature seems like a fine idea if it worked. If it doesn't work I'm willing to put up with the long box. As you know I do not hesitate to complain when you do something that I feel is strange, but I do not think that in this case. My complaint now is not about the idea, but about the implementation. Eclecticology 05:00, July 23, 2005 (UTC)

Hi Ec, let's try to diagnose your problem. In which section was this character? I can't see it anywhere but then again I'm on somebody else's computer without a full set of fonts. You should see a dropdown menu between the three buttons (Save page, Show preview, Show changes) and the "Characters:" section. The default setting is "Latin/Roman". If you don't see it, try pressing Ctrl+F5 to refresh your browser's stale cache. Another possibility is that you might be using a skin other than Monobook. I haven't added support for other skins. I can do two things if this is the case: a) add support also to your skin b) make it work with the new way for monobook and also the old way for other skins. Let me know which you prefer. In the meantime I'll investigate the code. Unfortunately these Wiktionary-only features only have us for beta testers. The Mediawiki devs get a lot more exposure for their code to help get it right for everybody. — Hippietrail 07:01, 23 July 2005 (UTC)

It looks like the problem is the skin. I use the Classic skin. And you're right the ǣ character was not in the list before. I was using it for an Old English word in the Etymology section of wapentake. I think that the idea of having support for this idea in other skins is preferable. I didn't mind the long character box, but it could get even longer as we try to meet the various needs that people could have. Once I can see the new approach in operation I should be able to have more comments. Thanks for working on this. Eclecticology 08:13:11, 2005-07-23 (UTC)

While I was waiting for your reply I've gone ahead and made the long character box the default for skins which do not explicitly have the new functionality. Let me know if it works again for you. I'll now see if I can add the javascript functions to the classic skin for you... — Hippietrail 08:29, 23 July 2005 (UTC)
Sorry Ec, I tried to add the functions but in the end I found out that the classic skin lacks the global javascript file which is what Monobook uses. It might be possible to put the function in your user javascript page. If you try the Monobook skin you'll see I've also added an Old English section by the way. — Hippietrail 09:38, 23 July 2005 (UTC)
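The fallback described in this thread (a subset menu where the skin supports it, the full long character box everywhere else) can be sketched roughly as follows. This is Python for illustration only, not the code in MediaWiki:Monobook.js; `SUBSETS`, `MENU_SKINS` and `visible_characters` are hypothetical names, and the subset contents are placeholders.

```python
# Hypothetical subset data: menu label -> characters shown for that label.
SUBSETS = {
    "Latin/Roman": ["\u00e1", "\u00e9", "\u00df"],
    "Old English": ["\u01e3", "\u00fe", "\u00f0"],
}

# Assumption: only skins listed here load the menu script.
MENU_SKINS = {"monobook"}


def visible_characters(skin: str, selected: str = "Latin/Roman") -> list[str]:
    """Return the characters a user would see below the edit box."""
    if skin in MENU_SKINS:
        # Menu available: show only the chosen subset.
        return list(SUBSETS[selected])
    # No menu support: fall back to the long box with every subset.
    return [ch for chars in SUBSETS.values() for ch in chars]


print(visible_characters("monobook", "Old English"))
print(visible_characters("classic"))
```

With this shape, a skin without the script never loses access to a character; it only loses the grouping.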

"Menu-driven special character feature" request for Serbian Cyrillic characters

Could somebody please add a separate option for Serbian Cyrillic characters? Under the "Cyrillic" option, only Russian characters are displayed. The Serbian Cyrillic characters are the following: А, а, Б, б, В, в, Г, г, Д, д, Ђ, ђ, Е, е, Ж, ж, З, з, И, и, Ј, ј, К, к, Л, л, Љ, љ, М, м, Н, н, Њ, њ, О, о, П, п, Р, р, С, с, Т, т, Ћ, ћ, У, у, Ф, ф, Х, х, Ц, ц, Ч, ч, Џ, џ, Ш, ш. I also wanted to ask you if you could change the name of the "Croatian" option to "Serbo-Croatian". Thank you. --Dijan 07:56, August 7, 2005 (UTC)

Adding of the Serbian Cyrillic characters is done. I do not know how MediaWiki:Monobook.js is supposed to be modified, so the renaming will have to wait until Hippietrail gets to it. --Connel MacKenzie 09:19, 7 August 2005 (UTC)

I tried copying this over into the Anglo-Saxon wiktionary, but it didn't work. All the characters showed up at once, and no menu option was present. Any ideas how to make it work? --James

This is basically the data, you'll also need the code which is on MediaWiki:Monobook.js - but it will probably need editing since there is other code there too and the Anglo-Saxon Wiktionary might already have code that would need to be merged. — Hippietrail 07:09, 6 September 2005 (UTC)

Ndash & mdash

IMO, these should stay as HTML. The reason is, it is virtually impossible to separate -, – and — in edit mode. They look just the same. Jon Harald Søby 19:46, 2 December 2005 (UTC)
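For anyone who needs to tell the three characters apart outside the edit box, a small Python sketch (illustrative only; `describe` is a hypothetical helper) prints each one's code point and Unicode name:

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Hyphen-minus, en dash and em dash, written as escapes so that they
# cannot be confused in this listing either.
for ch in ("-", "\u2013", "\u2014"):
    print(describe(ch))
```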

Spanish exclamation and interrogation marks ¿ ¡

Could some administrator add the Spanish exclamation and interrogation marks ¿ ¡ to the Edit tool? Thank you :O) --Javier Carro 10:55, 7 January 2006 (UTC)


Devanāgarī script (which can be used for Sanskrit, Hindi, Marathi and other languages that use it):

क़ ख़ ग़ ज़ ड़ ढ़ फ़ य़ ि

--Connel MacKenzie T C 05:19, 3 April 2006 (UTC)

You know, what would really help if added to the Devanagari edittools are the transliteration characters. It seems most people want to use the w:IAST system, and that it is most common, with similar being used for Hindi. I need all the characters for those, with the dots under, over, etc, but I don't know how to get them in. Thanks - Taxman 19:22, 12 April 2006 (UTC)
I think the best bet is cut-n-paste. I have no idea how you'd like them ordered/laid out. --Connel MacKenzie T C 19:36, 12 April 2006 (UTC)
I'll set them up here like above when I get some time on a computer I have the fonts set up right. But it should basically be the same order that is above. - Taxman 19:56, 13 April 2006 (UTC)

Ok like this would be great, added to the devanagari ā Ā ī Ī ū Ū ñ Ñ ś Ś Thanks - Taxman 02:36, 14 April 2006 (UTC)

I wouldn't add them to the Devanagari section. The same characters are used in transliteration of all the Indic and Dravidian languages of India, Pakistan, Sri Lanka, and I guess Bangladesh, as far as I know. It would make more sense to just include all the characters needed for all of them in one place. — Hippietrail 19:02, 14 April 2006 (UTC)
Oh, that's fine too, I forgot about that. Could call them Indic or Indic transliteration or just transliteration in case they are even more widely used. - Taxman 20:39, 14 April 2006 (UTC)

Community portal?

Could someone more organizationally inclined than myself please put some helpful links to this talk page as the best place to request additional custom scripts for the edit box? The Devanāgarī example above would probably be the best way to format an edit box request. --Connel MacKenzie T C 05:18, 3 April 2006 (UTC)

Perhaps there should be a whole separate requests page thing for just this, with archives, etc.? --Connel MacKenzie T C 05:20, 3 April 2006 (UTC)
Kind of a late answer, but the Grease pit seems fit to deal with this. — Vildricianus 16:53, 15 June 2006 (UTC)


I couldn't find any Polish characters; therefore, a request. Perhaps it would be more convenient to list all Slavic languages that use Latin script together?

The Polish characters are the following:

Ą ą Ę ę Ć ć Ł ł Ń ń ó Ó Ś ś Ź ź Ż ż

Vildricianus 09:44, 4 April 2006 (UTC)

Done. Merged with Croatian and Czech under "Slavic Roman" section. This way there's not much repetition. --Dijan 19:28, 4 April 2006 (UTC)
But who says repetition is a bad thing when most people will just want to find the language they're using? — Hippietrail 23:56, 4 April 2006 (UTC)
So you think that we should include a separate listing for all the languages? (that's about how many?) --Dijan 00:06, 5 April 2006 (UTC)
I didn't say that but I can be as clever as you:
So you think that we should merge all the languages into a single listing? (that's about how confusing?) — Hippietrail 20:55, 5 April 2006 (UTC)
I'm sorry Hippietrail. I was not being "clever". And yes, it's exactly what you said. "...most people will just want to find the language they're using" - does that not imply that we should list each language as a separate section? Originally, I was planning on listing those characters under the already existing "Latin/Roman" section because that's exactly what they are (and most are already there). But then again, if you think it's better to do it your way, feel free to change it. --Dijan 21:38, 5 April 2006 (UTC)
It's reasonable to list these characters together as they're of the same nature and used by related languages. These characters already take long enough to load each time. — Vildricianus 20:57, 5 April 2006 (UTC)
Sorry for my cheeky response to what was ambiguously interpretable as cheeky or not. My point is that there are several valid ways to go about this, each having their own pluses and minuses. We could do what codepages do and have "Eastern Europe" or "Central Europe", we could lump all variations of Latin together, we could have one entry for each language. In fact in the last couple of hours I needed the German "ß" which I couldn't find on the Latin/Roman section though now I can, so I went to the German section and found it easily. Another question is the arrangement of the letters in each section. Currently Latin/Roman sorts by accent then by letter. To my mind it would have made more sense to order by letter and then by accent. This way "ß" would be with the other "s" variants. Anyway there's many ways to skin this cat. The trade-off is mostly: "an unmanageable number of entries with a manageable number of letters each" or "a manageable number of entries with an unmanageable number of letters each". — Hippietrail 21:58, 5 April 2006 (UTC)
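The two orderings mentioned here (by accent then letter, versus by letter then accent) can be compared with a short Python sketch. It assumes each character is split into base letter and accents using Unicode canonical decomposition; `BASE_OVERRIDES` is a hypothetical table for letters such as "ß" that do not decompose, and none of this is the actual Edittools layout.

```python
import unicodedata

# Hypothetical override: file eszett with the "s" variants.
BASE_OVERRIDES = {"\u00df": "s"}


def split_letter(ch: str) -> tuple[str, str]:
    """Split a character into (base letter, combining accents)."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = BASE_OVERRIDES.get(ch, decomposed[0]).lower()
    return base, decomposed[1:]


def by_letter_then_accent(chars):
    return sorted(chars, key=split_letter)


def by_accent_then_letter(chars):
    return sorted(chars, key=lambda ch: split_letter(ch)[::-1])


# a-acute, e-acute, a-grave, e-grave, a-circumflex, e-circumflex
sample = "\u00e1\u00e9\u00e0\u00e8\u00e2\u00ea"
print("".join(by_letter_then_accent(sample)))
print("".join(by_accent_then_letter(sample)))
```

Ordering by letter first keeps all variants of one letter adjacent, which is the property asked for in the comment above.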
So, what are we going to do? I recently added Arabic to the list. I was not able to add the accents for Arabic (for some reason I was not able to properly paste the Unicode characters for those). I do agree that lumping everything together is not a good idea. When I added Arabic, I also wanted to include Persic script (Persian, Urdu, Pashto, etc.), but I ran into trouble when I tried to lump them together because of different code points for Persic letters that look like Arabic letters. I do think that some need to be separated, in this case Arabic and Persian. Should we have specific language sections (but limit them to certain languages)? Should we have language family sections? Or should we do what you suggested...something like "Eastern European" or "Central European"? What would we call other potential groups? I really do agree that listing each language is the best choice for users, however it is not good because it would take up too much space and would take too long to load. I want to make this my current priority, because I do mostly translations (and I'm glad that I finally added Devanagari and Arabic). I really do want to work with you on this, so do get back to me on this. Maybe we should move this to a new heading (it's getting too long).  :) --Dijan 03:50, 6 April 2006 (UTC)
Perhaps to the BP, where more people can see it. Yay, even longer! Vildricianus 08:07, 6 April 2006 (UTC)

Language names don’t match the scripts

I tried to use Czech and got Devanagari. Catalan is Arabic. Misc. seems to be the last one that is correct. —Stephen 22:18, 4 April 2006 (UTC)

I think you clicked on it right after I included Arabic. Restart your system and clear the cache. It works fine on my home computer and on the one at school. --Dijan 00:06, 5 April 2006 (UTC)

Romaji script

I see that the Romaji script includes āēīōū. Usually we only use āēōū, and long i is written ii instead. Glancing through my Sanseido’s Japanese-English dictionary, I do not find a single case of long ī anywhere in the book. My old grammar book even remarks: "In the Hepburn Romanization of Japanese, which this book uses, the double vowels are usually written with a macron (-) over the simple vowel, except in the case of i, which is written double: ii". —Stephen 20:51, 23 September 2006 (UTC)

I imagine there are a ton of similar errors. Those of us who have edited this file usually are not experts on those languages. Well, I suppose I shouldn't speak for others, but I certainly am no expert on those scripts. Please be bold in correcting them (within any given section.) --Connel MacKenzie 20:57, 23 September 2006 (UTC)

Ancient Greek

It would be nice to have Greek (Ancient) too: α ε η ι ο υ ω ά έ ή ί ό ύ ώ β γ δ ζ θ κ λ μ ν ξ π ρ σ ς τ φ χ ψ ϝ ϙ ϟ ϡ ϛ ϗ ·. For simplicity's (and usage's) sake, I neglected to include majuscules as modern orthographical conventions dictate the use of lowercase letters. (If someone really wants to deal with polytonic capitals, be my guest.) I likewise disregarded "sho" (as it is used for Greek transliteration of Bactrian and not standard Greek) and "san" (as my computer can't handle the character!). Thank you! Medellia 20:57, 6 December 2006 (UTC)

Please nag someone, if these requests don't get completed. A reminder somewhere on WT:GP is usually best. I'll try to do this in the next few minutes. --Connel MacKenzie 02:18, 1 January 2007 (UTC)
Please Ctrl-shift-R a page to refresh your javascript, then give it a try. Did I get it right? --Connel MacKenzie 03:53, 1 January 2007 (UTC)
I added in the capital letters. We have folks entering place names and such, so they are necessary. ArielGlenn 23:19, 20 December 2007 (UTC)

illegal XHTML on edit pages

could someone please replace all <br> on this page with <br />? without this change, every page with action=edit generates non-wellformed XHTML. -- 02:10, 1 January 2007 (UTC)

Let me double and triple-check the Javascript that refers to it first. Wait, why are you validating edit pages? --Connel MacKenzie 02:20, 1 January 2007 (UTC)
using the XMLParser in firefox seems to be the easiest way to get at the fields of an editform when using XMLHttpRequest to load it. sadly, i didn't find a way to make it parse non-wellformed XHTML. -- 02:17, 6 January 2007 (UTC)
Well, Dvortygirl did it, but I removed them all when I added Greek (Ancient). The secondary use of this page is to allow special characters to be input into the search box (via one of the WT:PREFS options.) The <br>s seem to have been causing grief there as well. They should all be gone now. --Connel MacKenzie 03:52, 1 January 2007 (UTC)
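The well-formedness problem raised in this thread can be reproduced with any strict XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` rather than Firefox's parser, and `is_well_formed` is a hypothetical helper; the fragments are made-up stand-ins for the edit page markup.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# An unclosed <br> leaves the parser expecting a </br> that never comes.
print(is_well_formed("<div>a<br>b</div>"))
# The self-closing form is accepted.
print(is_well_formed("<div>a<br />b</div>"))
```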

reminder to sync

I had the line:

<noinclude>::''Don't forget to keep [[MediaWiki:Monobook.js#addCharSubsetMenu]] in sync!''

included, completely forgetting that the MW software is grabbing this page as part of the UI (not a regular page, by any stretch of the imagination.) If I make this mistake again, and that line appears in edit boxes, someone PLEASE correctly remove it. (With the new javascript caching stuff in place, I did not see this mistake on preview, nor did I notice it at all, right away.) I've yanked it again, for now, but am mentioning it here because I've made this mistake twice now. --Connel MacKenzie 19:00, 1 January 2007 (UTC)

Proposal to format a bit differently

A few thoughts:

  1. Instead of using ··· to separate groups of characters, it seems like it would present a clearer image if we used whitespace. This has the advantage that we can use linear whitespace (a few &nbsp;s) for relatively minor separations, and line-breaks (<br />) for more major separations. (The top-level separation would still be the categories identified in the dropdown.)
  2. Ideally, most or all of the characters should be documented, so that editors don't have to rely entirely on a glyph's appearance to determine if it's that of the right character. (For example, I often used ’ instead of ’ before seeing a comment by Raifʻhār Doremítzwr that made me realize they were different.) I think the best way to do this is to use <span title="description"> elements, so when you hover over a character, you see a description. (That said, I don't think it's necessary for every character to be documented separately; I think it's fine, say, for all the characters with acute accents to share a single <span title="letters with acute accent"> element or something.)
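As a rough sketch of the second point, grouped tooltips could be generated along these lines. This is Python for illustration only; `titled_group` is a hypothetical helper, the output is plain HTML, and any <charinsert> wrapping the real page needs is left out.

```python
from html import escape


def titled_group(description: str, characters: list[str]) -> str:
    """Wrap a group of characters in one <span> with a shared tooltip."""
    body = " ".join(escape(ch) for ch in characters)
    return f'<span title="{escape(description, quote=True)}">{body}</span>'


# One tooltip for the whole acute-accent group rather than one per letter.
print(titled_group("letters with acute accent", ["\u00e1", "\u00e9", "\u00ed"]))
```

Sharing one element per group keeps the markup short, at the cost of less specific tooltips.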

Anyone have any thoughts on either of these? Should I be bold?

RuakhTALK 03:44, 9 June 2007 (UTC)

Blank space may not render uniformly, but I have no personal opinion there. As for line breaks, I know that some earlier versions used them but that editors have removed them. Possibly, they did this because different monitor resolutions and sizes (and different font sizes) led to ugly wrapping as a result. --EncycloPetey 15:16, 10 June 2007 (UTC)
If the white space doesn’t render properly, and we end up keeping the ···s, then I suggest substituting some with ⋯s (midline horizontal ellipses), and some with individually-linked ·s (interpuncts) — thereby killing two birds with one stone in turning character group separators into useful, insertable symbols. As for documenting characters, that sounds great — go for it. † Raifʻhār Doremítzwr 15:34, 10 June 2007 (UTC)
The only potential problem I see with the edit of the Latin section is labelling the "edh" as such. The "capital edh" is only called that in English. It has a different name (and a different lowercase form) in Croatian. I do not think that there is a separate Unicode form since the two "letters" look identical. --EncycloPetey 16:59, 10 June 2007 (UTC)
Unicode has a policy of distinguishing characters, not glyphs. This means both that one character can have more than one glyph, like how the character U+0034 DIGIT FOUR (pardon the all-caps; that's how Unicode names its characters) can be closed or open at the top, and that multiple characters can have the same glyph in a given font, like U+00D0 LATIN CAPITAL LETTER ETH and U+0110 LATIN CAPITAL LETTER D WITH STROKE. (Note: In practice, it deviates from this policy as often as we deviate from WT:CFI.) Our character is the former. The good news is that this proves my point about the need to document these characters. :-)
So, should I add U+0110 LATIN CAPITAL LETTER D WITH STROKE and U+0111 LATIN SMALL LETTER D WITH STROKE to the Latin/Roman section?
(If you're interested, by the way, the entire Unicode specification, including the character charts, is available in PDF form from the Unicode Consortium's Web site, Links to the various character charts are arranged conveniently at
RuakhTALK 17:54, 10 June 2007 (UTC)
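The eth versus d-with-stroke example can be checked with Python's standard `unicodedata` module. This is only an illustration of the character/glyph distinction; `describe` is a hypothetical helper.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


ETH = "\u00d0"
D_STROKE = "\u0110"

# Two characters that many fonts draw with the same capital glyph.
print(describe(ETH))
print(describe(D_STROKE))

# They are distinct code points, and their lowercase forms differ too.
print(ETH == D_STROKE)
print(describe(ETH.lower()))
print(describe(D_STROKE.lower()))
```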
We do very much need to document the glyphs/characters. As for labelling the batch of acute letters rather than labelling each letter, the only reason I see not to label each acute letter is that it would be time-consuming — but that is no reason not to do it; we can batch label them to start and go back and label them individually whenever we find time. — Beobach972 19:39, 10 June 2007 (UTC)
Unicode does reduce some characters (two different U+#### codes will each display the exact same glyph), but it preserves the code distinction, as far as I can tell, as we ought to do as well. — Beobach972 19:39, 10 June 2007 (UTC)
As for the specific example, D WITH STROKE is used in Vietnamese, Skolt Sami, etc (whereas ETH is familiar to everybody from Old Norse, etc). ETH should go in the Latin set, and D-stroke in the Vietnamese set. That may not be the best example, by the way, as the characters are quite distinct, as they have different minuscule forms. — Beobach972 19:39, 10 June 2007 (UTC)
Well, as I see it there are four reasons not to document every single character:
  • Would be time-consuming.
  • Would make this page harder to edit.
  • Would increase the bandwidth of every edit page on the site.
  • Would cause the tooltips to appear only when hovering over a specific character. (Currently hovering anywhere over the "acute" section, for example, gives you the relevant tooltip.)
(That said, none of these is a big deal, so if you think it's worthwhile to document each one separately, I will.)
I'm not sure what you mean when you say "Unicode does reduce some characters". There are sequences of code-points that identify the same character (for example, U+00C0 LATIN CAPITAL LETTER A WITH GRAVE is exactly the same as U+0041 LATIN CAPITAL LETTER A plus U+0300 COMBINING GRAVE ACCENT), but in theory two distinct code-points always identify distinct characters, even if many, most, or even all fonts will use the same glyph for both. (In practice, there are some exceptions; most notably, Unicode has a policy of "round-trip compatibility" with existing character sets, such that if an existing character set mistakenly includes a given character twice, then Unicode will offer a "compatibility character" for one of the inclusions, and recommend that it not be used. Even then, there's no guarantee that an implementation will use the same glyph for both; in particular, it's quite likely that an implementation won't support the compatibility character at all, and will display a box or question mark or whatnot.)
Incidentally, U+0111 LATIN SMALL LETTER D WITH STROKE is a good example of the glyph/character distinction: the character charts mention that Croatian, Vietnamese, and Sami orthographies use a glyph with the stroke through the stem, while Americanist orthographies use one with the stroke through the bowl. BTW, we already have eth in the Latin/Roman section and d-with-stroke in the Vietnamese section. :-)
That said, there are some cases where I really think Unicode is just wrong. Umlauts and diaereses look identical nowadays, but they have different origins and I really don't think they can be considered the same character in any sense. *shrug*
RuakhTALK 20:18, 10 June 2007 (UTC)
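The equivalence between U+00C0 and the sequence U+0041 U+0300 mentioned above can be demonstrated with Unicode normalization. A minimal Python sketch follows; `canonically_equal` is a hypothetical helper.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


precomposed = "\u00c0"   # LATIN CAPITAL LETTER A WITH GRAVE
sequence = "A\u0300"     # LATIN CAPITAL LETTER A + COMBINING GRAVE ACCENT

# The raw code point sequences differ...
print(precomposed == sequence)
# ...but they identify the same character once normalized.
print(canonically_equal(precomposed, sequence))
```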
Regarding 'reducing' characters: I may be wrong, then; I was under the impression that Unicode itself 'degenerated' (I cannot think of the word they used) certain glyphs to one if they were (like the majuscule forms of ETH and D WITH STROKE) identical. On further consideration, that may indeed be a language-support-software move or even a font move.
Regarding documenting each letter separately, I had not thought of the bandwidth concerns (though in labelling the letters, I did notice the clutter made). It is probably best not to identify each separately after all, in most cases. However, it may be worthwhile to document specific letters if Unicode does not distinguish letters that do have different values in different languages. Czech 'Ř' / 'ř' vs Upper Sorbian 'Ř' / 'ř', for example — we could label the characters as 'R with x/y' (or 'x/y') and 'r with x/y', as separate from 'letters from carons/háčky/whatever', where 'x' is the Czech name (either for the character, or the whole letter, if it has one) and 'y' the Sorbian one. (Erm, I think I made that as clear as mud...)
PS- That's because the Czech represents /r̝/ and the Upper Sorbian /ʃ/, by the way. After further consideration, that may not be worth distinguishing, either. I don't know. — Beobach972 21:06, 10 June 2007 (UTC)
I'll undo the clarifications I made to the acute section in a moment.
Regarding diaereses and umlauts: they do not look the same (I assume you mean only that they often use the same glyph on computers, which they do) — the umlaut is, perhaps ironically, more fit to be combined with the double acute than the double-dot diaeresis. — Beobach972 21:03, 10 June 2007 (UTC)

I suggest that Æ & æ be labelled “Æshes (AE ligatures)” (“Ashes” is a more common spelling, but lacks the exemplifying quality of the ligatural spelling), Œ & œ be labelled “Œthels (OE ligatures)”, ß be labelled “Eszett (ſz ligature)”, the letters labelled “with rings above” be relabelled “letters with kroužek diacritics”, Ð & Þ be labelled with initial capitals (“Eth” & “Thorn”, not “eth” & “thorn”), the IPA characters be individually labelled — especially the suprasegmentals (yes, a gruelling job; I’d do it myself if I could), and the “Hawaiian” category in the drop-down menu be renamed “Hawaiʻian” (that is, with the more correct ʻokina added) or “ʻŌlelo Hawaiʻi” (the native name). Are these suggestions of mine OK? † Raifʻhār Doremítzwr 14:33, 12 June 2007 (UTC)

I think we ought to leave Hawaiian sans ʻokina, but labelling the AE and OE ligatures is an excellent idea. I agree with you on eth as well, and generally I agree with you on thorn — although I wonder if it is proper to write Thorn (as opposed to THorn). Lastly, yes, each IPA character will be labelled by the time we're through with this; they're the glyphs that need it most. — Beobach972 20:49, 12 June 2007 (UTC)
After further consideration, Æ and Œ should be left as they are now in the Latin menu; we can give their Old English names in the Old English section. — Beobach972 20:53, 12 June 2007 (UTC)
I now see that the IPA is already documented as well. Click on the words 'Vowels', 'Suprasegmentals', and 'Pulmonic consonants' — they link to the IPA chart on Wikipedia. :-) — Beobach972 21:57, 12 June 2007 (UTC)
I agree that “ʻŌlelo Hawaiʻi” would be pushing things a little far; however, for the sake of correctness, I still think the category title should be “Hawaiʻian” — though it doesn’t matter very much. I reckon it’s best we stick to labelling Þ as Thorn for the time being until / unless someone complains (for I’ve never seen it written THorn). Umm… by documenting the IPA symbols, I meant each one individually (so that ‘ʃ’ would be labelled as “voiceless postalveolar fricative”, and so on) — hence my recognising that it would be a gruelling job. By the way “æsh” and “œthel” are the Modern English names of Æ / æ and Œ / œ — the Old English names therefor are “æsc” and “eðel” (which would, indeed, be more appropriate in the Old English symbols category). † Raifʻhār Doremítzwr 14:00, 14 June 2007 (UTC)
Oh, yes, I know what you meant; but that would increase bandwidth considerably (which, as Ruakh and others have said, should be avoided); the way it is now, users can click the links to get to the documentation. However, we probably do need to distinguish any symbols that are similar and might be confused. — Beobach972 16:26, 14 June 2007 (UTC)
We could do that by labelling the IPA characters by chart column (so that they’re labelled by group as “retroflex”/“retroflected”, “pharyngeal”, et cetera). That would disambiguate most of them. Also, some characters have short names with which they could be labelled, such as ‘θ’ = “theta”, ‘ʃ’ = “esh”, and ‘ŋ’ = “agma”. Would that help? † Raifʻhār Doremítzwr 14:11, 15 June 2007 (UTC)
I could see some very lengthy descriptions making their way into the IPA section. Imagine each IPA character expressed with the IPA formal name, the SAMPA equivalent, the three major enPR equivalents, a link to an audio sample, plus an example word or two! Such links would unfortunately need to be HTML, not wikilinks, IIRC. But if all that were done, even I would be able to enter IPA pronunciations. --Connel MacKenzie 06:44, 15 June 2007 (UTC)
P.S. Why wasn't this major change announced in WT:BP or even in WT:GP? --Connel MacKenzie 06:44, 15 June 2007 (UTC)
Using the IPA involves a pretty steep learning curve — my idea was to just give the long formal name, as, once one understands the jargon, which character ought to be used can be deduced from where and how one is producing a particular speech sound. However, I think the idea which you (jocularly) propose would be useful for Translingual (or IPA?) entries for each of the IPA characters, linking from a chart with a walkthrough (and, ideally, a tutorial). † Raifʻhār Doremítzwr 14:11, 15 June 2007 (UTC)
Or we could put all that on WT:IPA, where it would indeed be useful. I'll look into that... — Beobach972 17:40, 26 June 2007 (UTC)
We should also put that information on the entry page for each symbol. That is, ʌ, ʊ, ʒ, etc. should each have an entry and that entry can give the English name for each symbol. Of course, this does not take the place of having a concise table summarizing everything at WT:IPA. --EncycloPetey 19:11, 26 June 2007 (UTC)
Come to think of it, we do have an English pronunciation key, linked to from WT:IPA. — Beobach972 20:38, 26 June 2007 (UTC)

Requested IPA characters

Thank you to whoever recently added the IPA half-long marker («ˑ») to the IPA symbols list. I make use of a few other symbols which I request be added thereto. They are: the suprasegmentals the IPA extra-short marker — the combining breve («  ̆»), the extra-stress marker, and the syllabic linker — the character tie («⁀»); the diacritics denoting labialisation («ʷ»), glottal onset («ˀ»), a lowered phone — the combining down tack («  ̞»), a raised phone — the combining up tack («  ̝»), advanced tongue root — the combining left tack («  ̘»), and retracted tongue root — the combining right tack («  ̙»); and the pulmonic consonantal ligatures ‘tesh’ («ʧ»), ‘dezh’ («ʤ»), and ‘ts’ («ʦ»). Nota bene that the combining characters will appear about one–two spaces to the left of where one would need to click in order to add them — how this can be solved, I don’t know — perhaps the link can be moved leftwards a little too? † Raifʻhār Doremítzwr 13:08, 12 June 2007 (UTC) ⋯ It may be worthwhile adding a few other symbols as given here too. † Raifʻhār Doremítzwr 13:17, 12 June 2007 (UTC)

As for the half-long mark, you're welcome; as for the rest, I'll look into them. Tesh and dezh won't be added, as doing so might encourage editors to use them (which we strongly discourage, in favour of using the individual glyphs). 'ʦ' is probably excluded on similar grounds. As for the combining letters displaying improperly, I'm sure that there's a way to override that — perhaps by enclosing them in brackets as you have done above. — Beobach972 20:45, 12 June 2007 (UTC)
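Replacing the ligatures with individual glyphs can be sketched as a simple character mapping. This is Python for illustration, not an existing Wiktionary tool; `LIGATURES` and `expand_ligatures` are hypothetical names, and the optional tie bar (U+0361) is an assumption added for cases where an editor wants to mark the pair as a single affricate.

```python
TIE = "\u0361"  # COMBINING DOUBLE INVERTED BREVE, the IPA tie bar

# Ligature -> the two individual glyphs it combines.
LIGATURES = {
    "\u02a7": ("t", "\u0283"),  # tesh -> t + esh
    "\u02a4": ("d", "\u0292"),  # dezh -> d + ezh
    "\u02a6": ("t", "s"),       # ts ligature -> t + s
}


def expand_ligatures(ipa: str, tie: bool = False) -> str:
    """Replace each IPA ligature with its component glyphs."""
    joiner = TIE if tie else ""
    parts = []
    for ch in ipa:
        if ch in LIGATURES:
            first, second = LIGATURES[ch]
            parts.append(first + joiner + second)
        else:
            parts.append(ch)
    return "".join(parts)


print(expand_ligatures("\u02a7"))            # two separate glyphs
print(expand_ligatures("\u02a7", tie=True))  # same glyphs joined by a tie bar
```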
‘Tesh’ and ‘dezh’ do not denote the same phonological clusters as ‘tee + esh’ & ‘dee + ezh’: whereas the two ligatures denote the English ‘tch’ (as in ‘church’) and ‘j’ (as in ‘judge’) sounds, the two digraphs denote neighbouring but separate ‘t + sh’ (as in ‘hat-shop’) & ‘d + ĵ’ (as in ‘hard-genre’) sounds. A minor distinction perhaps, but if we differentiate between /e/ & /ɛ/ in our transcriptions, then so ought we to differentiate between /ʧ/ & /tʃ/ and between /ʤ/ & /dʒ/. Since the phonological cluster denoted by ‘ts’ (/ʦ/) is not native to English, it’s not as necessary that it be added (I’ve used it a few times in the transcriptions of “non-naturalised” borrowings only). There are three other IPA ligatures, /ʣ/, /ʥ/, & /ʨ/; however, I’ve never had cause to type them, so I don’t know how useful they’d be.
Notice that I’ve not only enclosed them between guillemets, I’ve also preceded each combining character with two spaces each. The problem is that users would have to click a bit to the left of a combining character in order to insert it (this is a particular problem if the symbols are listed without being bracketed, as clicking on a character where it is visible would probably insert the neighbouring character on the left, or even the character left thereof). A solution may be to make a click anywhere inside the entire area enclosed by the parentheses / guillemets / whatever cause the character therein to be inserted; another is to move the “click-to-insert” point a little to the left so that the point and the visible character match. † Raifʻhār Doremítzwr 13:08, 14 June 2007 (UTC)
I think we should be using tesh and dezh, but this isn't the forum to discuss that. As for the combining characters, I think a proper solution would require a software change, either to the MediaWiki software, or to Common.js. Similarly for some other things that should be included, such as KH in the enPR section. (I'll look into this and see if I can find a good way to do it.) —RuakhTALK 15:51, 14 June 2007 (UTC)
Yes, this should be brought up for discussion. I suppose WT:BP would be the place, because even though there are software concerns, this is mostly a policy matter. I do very much agree that we ought to switch to using tesh and dezh. (I know that might not have been obvious from my comments (heh); I'm basing my present opposition off of EP's comments on the matter and the consensus the last time this was discussed.) Frankly, I think that this is a different matter from IJ and such — those would mess with page names, but these would only be used in pronunciation sections. — Beobach972 16:17, 14 June 2007 (UTC)
PS- as long as you're looking through IPA characters, if you see any that aren't on the WT:IPA page, we ought to list them. Only the useful ones go in Edittools, but we can put them all there, so we'll have them to copy-and-paste for those rare entries that use them. — Beobach972 16:17, 14 June 2007 (UTC)
We could do with all the symbols (including the unofficial ones) given hereon. † Raifʻhār Doremítzwr 14:13, 15 June 2007 (UTC)
Why do we strongly discourage using tesh, dezh, etc. where appropriate? And where is this written down? I don't see anything on WT:IPA... I've been using these quite freely in the depiction of Korean pronunciation; should I go back and change those entries? Is this an English thing, or something to do with general principles of IPA usage on Wiktionary? Using English as the guide for Edittools IPA seems a little silly, if that's what's going on, since the vast majority of entries here should eventually be non-English. -- Visviva 17:02, 8 July 2007 (UTC)
It's a technical character issue that I don't understand all the ins and outs of. Dijan seems to understand the issue though, and could probably explain it. The problem and issue are not specific to the English Wiktionary, to English, or to IPA. They have to do with digraphs in general across several languages and projects. Yes, you should probably go back and change them. --EncycloPetey 17:42, 8 July 2007 (UTC)
I thought that the problem was with using ligatures in entry names; I can’t see how there’s a problem with using IPA ligatures in pronunciation sections. Furthermore, problems with encoding don’t apply either: whereas basic ASCII would fail to display the Dutch ligature “ij”, meaning that writing the digraph “i + j” would be preferable in such situations, any such display problems with [ʧ] and [ʤ] would be shared by [ʃ] and [ʒ], meaning that the technically more correct character would be preferable in all situations. Would you mind asking Dijan whether the use of the IPA ligatures is verboten? † Raifʻhār Doremítzwr 17:58, 8 July 2007 (UTC)
But it does affect page names. We have "Rhymes" pages whose names are coded with IPA characters. So, no, we shouldn't use those characters. --EncycloPetey 19:25, 9 July 2007 (UTC)
Point taken, though its validity is dependent upon whether there exist any display problems suffered by [ʧ] and [ʤ] which are not shared by [ʃ] and [ʒ]. If no such particular problems exist, then there is no reason not to use those ligatures (unless you were to argue against the use of [ʃ], [ʒ], and any other character thus afflicted). † Raifʻhār Doremítzwr 19:38, 9 July 2007 (UTC)
Perhaps I need to give an example of where this sort of issue has already caused problems. When I tried to link pettifogger to its corresponding Rhymes page, I was surprised to see it come up as a red link. I knew the Rhymes page existed because I had just been there to check and to add pettifogger to the list of Rhymes. It turns out that the Rhymes page in question was named Rhymes:English:-ɒɡə(r), but it should have been named Rhymes:English:-ɒgə(r). See the difference? I couldn't. I figured that although the characters looked identical on my screen, there had to be a Unicode difference in there somewhere. The result? I had to go through character by character, copying from the IPA and using my browser search function in the page to find out which character didn't match. It was the g. So, if we allow the digraphs to be used, we're going to run into this problem often. I would rather be consistent and avoid driving people insane. --EncycloPetey 20:10, 9 July 2007 (UTC)
Example accepted. That shows that there is a problem with [ɡ], but you still haven’t shown whether there are problems with [ʧ] or [ʤ]. Could you experiment to find out please? † Raifʻhār Doremítzwr 20:21, 9 July 2007 (UTC)
Good lord, man! If the Rhymes page is named Rhymes:English:-ʤi then a link to Rhymes:English:dʒi won't function properly. I had thought the implications of my comments were obvious, but apparently not. If we end up with multiple ways to write the same IPA pronunciation, we're in for a world of hurt. --EncycloPetey 20:24, 9 July 2007 (UTC)
Sorry, that was rather obtuse of me. In that case we ought to prescribe a standard for IPA transcriptions (using [ʧ], [ʤ], et cetera). It seems as if rhymes links are already different from what is written for the transcriptions in the entries (e.g., [r] is used in rhymes, but the character used in transcriptions is either [ɹ] (for UK pronunciation) or [ɻ] (for US pronunciation); furthermore, rhymes links often contain parentheses, which are never used in actual transcriptions), so what is the harm in using the ligatures tesh, dezh, et cetera in transcriptions, but using the equivalent digraphs in rhymes links? † Raifʻhār Doremítzwr 21:14, 9 July 2007 (UTC)
I believe you've answered your own question. --EncycloPetey 23:20, 9 July 2007 (UTC)
Sorry to be obtuse again, but I just don’t see it. You’ll have to spell it out for me. Perhaps I need to get some shut-eye… † Raifʻhār Doremítzwr 23:37, 9 July 2007 (UTC)
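For anyone who hits the same red-link puzzle as in the pettifogger example above, a short script can do the character-by-character comparison. This is only a sketch: it assumes the two page titles differ by U+0261 (the IPA script g) versus U+0067 (the ordinary keyboard g), and the helper names first_difference and describe are made up for illustration.

```python
import unicodedata


def first_difference(a: str, b: str):
    """Return (index, char_a, char_b) for the first position where a and b differ, or None."""
    for i, (ca, cb) in enumerate(zip(a, b)):
        if ca != cb:
            return i, ca, cb
    if len(a) != len(b):
        # One string is a prefix of the other; report where the shorter one ends.
        i = min(len(a), len(b))
        return i, a[i:i + 1], b[i:i + 1]
    return None


def describe(ch: str) -> str:
    """Render a character as its code point and Unicode name."""
    if not ch:
        return "(end of string)"
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"


# Assumed titles: identical on screen, different in one code point.
title_ipa = "Rhymes:English:-\u0252\u0261\u0259(r)"   # uses U+0261
title_ascii = "Rhymes:English:-\u0252g\u0259(r)"      # uses U+0067

diff = first_difference(title_ipa, title_ascii)
if diff is not None:
    index, left, right = diff
    print(f"position {index}: {describe(left)} vs {describe(right)}")
```

The same check would flag a ligature such as ʤ against the digraph dʒ, since those are different code point sequences too.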

Is there a reason why the Rhymes issue (and any other issues that might crop up) couldn't be solved with redirects? I know they're frowned upon in the mainspace, but rhymespace would seem like a useful place for them. Aside from that... are Rhymes: pages intended to cover all languages, or only those in which end-rhyme is a recognized concept? It would seem like a strange thing to have for Korean. -- Visviva 01:05, 10 July 2007 (UTC)

Arriving at this conversation months late, I'd just like to point out that the IPA withdrew approval of ligatures like [ʧ] and [ʤ] years ago. In the rare case of needing to distinguish between the affricate [tʃ] and the cluster [t]+[ʃ], the affricate can be shown using a tie bar thus: [t͡ʃ]. Angr 21:30, 4 November 2007 (UTC)
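If existing transcriptions ever need converting from the ligatures to the tie-bar notation mentioned above, the substitution is mechanical. The following is a rough sketch, not an existing bot or gadget: the table LIGATURE_TO_TIED and the function replace_ligatures are hypothetical names, and the mapping covers only the six ligatures discussed in this thread.

```python
# Map each IPA ligature to base letter + U+0361 (combining tie bar) + base letter.
LIGATURE_TO_TIED = {
    "\u02A7": "t\u0361\u0283",  # tesh  -> t + tie bar + esh
    "\u02A4": "d\u0361\u0292",  # dezh  -> d + tie bar + ezh
    "\u02A6": "t\u0361s",       # ts digraph
    "\u02A3": "d\u0361z",       # dz digraph
    "\u02A5": "d\u0361\u0291",  # dz digraph with curl
    "\u02A8": "t\u0361\u0255",  # tc digraph with curl
}

_TABLE = str.maketrans(LIGATURE_TO_TIED)


def replace_ligatures(transcription: str) -> str:
    """Replace each ligature in the transcription with its tie-bar sequence."""
    return transcription.translate(_TABLE)


converted = replace_ligatures("\u02A4\u028C\u02A4")  # 'judge' written with dezh
print([f"U+{ord(ch):04X}" for ch in converted])
```

Whether the tie bar should be used everywhere or only where the affricate must be distinguished from the cluster is a policy question this sketch does not settle.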

Germanic instead of Scand, Icel, and German

What does everybody think of combining Scandinavian, Icelandic, and German into a section called 'Germanic'? See [1] and [2]. It would reduce the size of the page (and, thus, load time etc). — Beobach972 17:20, 14 June 2007 (UTC)

It would still be ~12KB either way? Isn't this page cached on browsers (only loaded once, in most cases?) --Connel MacKenzie 06:56, 15 June 2007 (UTC)
No, the results of this page actually go into the HTML of every single edit page: 60+2n bytes for each insertable character (or sequence of characters, in the case of e.g. “”), where n is the number of bytes in the character itself (1–3 for characters in the BMP, but the Gothic characters take 4, and character sequences can obviously take more), plus all the HTML we include. It currently adds about 108 KB to every single edit page; on the section-edit page I'm currently typing in, it's about 86.6% of the page's total HTML file-size. —RuakhTALK 16:25, 15 June 2007 (UTC)
Wow. I can see, then, that cutting down the size even a little bit helps. Shall I make the combination? — Beobach972 20:37, 15 June 2007 (UTC)
Really, I think it might be best to jettison the current system of <charinsert> tags, and to home-roll our own system. (It would be better if the MediaWiki software handled <charinsert> tags less bloatfully, but it doesn't, and I don't see why we should wait for the developers to fix it.) —RuakhTALK 22:08, 15 June 2007 (UTC)
Hmm, well, if somebody can design such a system, that sounds ideal. The problem will be designing it, I assume. — Beobach972 16:07, 16 June 2007 (UTC)
Ohh, a thought... we could, perhaps, make a template that would be called and would do that? Or would that make the page larger? — Beobach972 16:07, 16 June 2007 (UTC)
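To get a feel for Ruakh's 60+2n figure above, here is a small sketch that applies it to a handful of entries. The formula is taken from the comment as stated; the function names and the sample entries are only illustrative, and the result is an estimate, not a measurement of the real page.

```python
def charinsert_overhead(entry: str) -> int:
    """Estimated HTML bytes for one insertable entry: 60 + 2n, with n the UTF-8 byte length."""
    n = len(entry.encode("utf-8"))
    return 60 + 2 * n


def total_overhead(entries) -> int:
    """Sum the estimate over a list of insertable entries."""
    return sum(charinsert_overhead(e) for e in entries)


samples = ["#", "\u0175", "\u1E81", "\U00010330"]  # 1-, 2-, 3- and 4-byte characters
for s in samples:
    print(f"U+{ord(s):04X}: {len(s.encode('utf-8'))} byte(s) -> {charinsert_overhead(s)}")
print("total:", total_overhead(samples))
```

Because the fixed 60 bytes dominates, merging sections saves little per character; the savings come from dropping entries altogether.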

If we did that we could then also include symbols used in writing Proto-Germanic such as đ, ƀ and ǥ which are not available anywhere else (well all right, đ is there). Widsith 07:45, 15 June 2007 (UTC)

Oohh, good thought. By the way, I kept Old English separate because it probably sees as much usage as all of those combined in its own right, and it has enough symbols to merit (a) its own entry, and (b) not swamping the Germanic section. — Beobach972 14:57, 15 June 2007 (UTC)

Middle Korean characters...

Any chance of adding something like User:Visviva/Yethangul here? Just curious; it would help me, and probably others, to avoid the cardinal Unicode sin of using the Private Use Area for these. Happiness, -- Visviva 17:06, 8 July 2007 (UTC)

Welsh accents

The following characters used in Welsh need to be added to the Latin/Roman section: ŵ Ŵ ẁ Ẁ ẃ Ẃ ẅ Ẅ ŷ Ŷ ỳ Ỳ (of these only the ŵ/Ŵ and ŷ/Ŷ are used frequently) Thryduulf 11:30, 20 July 2007 (UTC)

Not only that — ‘Ẃ’ & ‘ẃ’ need to be added to the Welsh section too. † Raifʻhār Doremítzwr 14:02, 20 July 2007 (UTC)


א ב גדהוזחטיכלמנסעפצקרשת

In addition to the ending characters: ךםןףץ

And that should be it. I'm going to slap myself for taking those individually out of the main page and an article on the he wikt, when the regular letters were all lined up in front of me :-/ --Rory096 06:06, 25 March 2006 (UTC)

The points for niqqud are all missing though. They don't go in article titles, but they should be used in headwords, translations, etc. —Muke Tever 14:56, 25 March 2006 (UTC)
I moved this from further up on this page down to here because Muke's issue has not yet been dealt with (and I'd like it to be). I think that the added points should if possible either be in a large font or listed by name (but added as points only of course), as some of the points are hard to distinguish from one another in small-type.—msh210 17:22, 28 August 2007 (UTC)

These are the characters that need to be included. אבּבגדהוזחטיײײַכּכךלמםנןסעפּפףצץקרשׁשׂתּת


--Neskaya talk 18:00, 28 August 2007 (UTC)

Optionally, also aleph-with-kamatz and -patach, and, if Yiddish uses it (I don't know Yiddish), yod-with-chirik. I don't think we need beged-kefet with dagesh; but if we have it, then include also gimel, dalet, and final kaf, and maybe also he.
Oh, and definitely vav-with-dagesh a/k/a shuruk.—msh210 18:06, 28 August 2007 (UTC)
Point. Kometz-aleph:אָ
I'll add more later, when I have more time to type in Hebrew (with the slightly unfamiliar keyboard layout that it entails). --Neskaya talk 18:09, 28 August 2007 (UTC)

Unfortunately, I don't think the edit-tools really work with diacritics … I have a thought how to handle this, that will also solve some of the other problems with edit-tools. Let me work on it, and I'll post back here in a day or two. (Also, this is neither here nor there, but final kaf never takes a dagesh kal; when it takes a dagesh, which is pretty much only in weird Hebrew, it's always a dagesh khazak.) —RuakhTALK 19:08, 28 August 2007 (UTC)

How about this? (Names of characters are taken from Unicode.) <p class="speciallang" id="Hebrew" style="visibility:hidden; display:none"> '''Letters''': <charinsert>א ב ג ד ה ו ז ח ט י כ ך ל מ ם נ ן ס ע פ ף צ ץ ק ר ש ת װ ױ ײ</charinsert> ··· '''Vowels''': <small>qamats:</small> <charinsert>ָ</charinsert> <small>patah:</small> <charinsert>ַ</charinsert> <small>tsere:</small> <charinsert>ֵ</charinsert> <small>segol:</small> <charinsert>ֶ</charinsert> <small>hiriq:</small> <charinsert>ִ</charinsert> <small>holam:</small> <charinsert>ֹ</charinsert> <small>qubuts:</small> <charinsert>ֻ</charinsert> <small>sheva:</small> <charinsert>ְ</charinsert> <small>hatef qamats:</small> <charinsert>ֳ</charinsert> <small>hatef patah:</small> <charinsert>ֲ</charinsert> <small>hatef segol:</small> <charinsert>ֱ</charinsert> ··· '''Other:''' <small>dagesh:</small> <charinsert>ּ</charinsert> <small>shin dot:</small> <charinsert>ׁ</charinsert> <small>sin dot:</small> <charinsert>ׂ</charinsert> <small>geresh:</small> <charinsert>׳</charinsert> <small>gershayim:</small> <charinsert>״</charinsert> <small>maqaf:</small> <charinsert>־</charinsert> <small>rafe:</small> <charinsert>ֿ</charinsert> <small>meteg:</small> <charinsert>ֽ</charinsert> </p> —msh210 21:30, 18 December 2007 (UTC)

Two questions: 1) wasn't this already rejected previously as unwanted? 2) If they are going to be included, why just the individual diacritics, instead of the whole alphabet repeated with each diacritic combination? --Connel MacKenzie 22:15, 28 December 2007 (UTC)
Two answers: 1) Not so far as I know. 2) There are just too many; there are hundreds of possible combinations. At 80-some bytes per combination, we're talking maybe 30K of HTML on every edit-page on the entire site.
Since you don't seem to be actually objecting, I've gone ahead and made the change (I'm visiting my Internet-having parents until New Year's, so can keep tabs on complaints); but if I've misunderstood and you do oppose this change, feel free to revert.
RuakhTALK 00:38, 30 December 2007 (UTC)
Thanks, Ruakh. To respond further to Connel's questions: (1) Not as far as I know either. (2) What Ruakh said, but also, would it work the way you suggest? Note that a letter+diacritic is two characters, not one. (But Ruakh's reason is sufficient.) And it's easy enough to click the diacritics separately.—msh210 18:00, 2 January 2008 (UTC)
Multi-character charinserts are supported; check out the "templates" section, for example. —RuakhTALK 23:47, 2 January 2008 (UTC)
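To illustrate the two points made above (a letter plus a point is two characters, and listing every combination gets expensive), here is a quick sketch. The counts of 30 letters and 11 vowel points are taken from the markup proposed earlier in this section, and 80 bytes per entry is a rounded-down reading of "80-some", so the total is a ballpark only.

```python
import unicodedata


def code_points(text: str):
    """List each code point in text with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


kometz_aleph = "\u05D0\u05B8"  # alef followed by the point qamats
print(len(kometz_aleph))       # two code points, although it displays as one glyph
for line in code_points(kometz_aleph):
    print(line)

# Rough cost of listing every letter with every vowel point (assumed figures).
letters = 30
vowel_points = 11
bytes_per_entry = 80
estimated_bytes = letters * vowel_points * bytes_per_entry
print(estimated_bytes)
```

Adding dagesh, shin dot and sin dot to the mix would multiply the count further, which is why inserting the points separately is the cheaper design.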

Adding templates to Non-English sections

Can we add templates to the non-English sections? If so, could we add these to the Spanish section:

Thanks. --Bequw¢τ 21:19, 21 December 2007 (UTC)

For French:
Circeus 17:15, 26 December 2007 (UTC)

For Latin:

Participles and infinitives
Nouns and adjectives

...and there are probably more that I missed. This is why Connel balks at including all the useful templates; there are just too many. --EncycloPetey 19:13, 29 December 2007 (UTC)

Old Norse

Should we add this under "Scandinavian", or should this be a separate list? It does include quite a number of symbols that seem not to exist in modern Scandinavian languages. See w:Old Norse alphabet. --EncycloPetey 12:22, 28 April 2008 (UTC)

Ancient Greek

I can't seem to find capital omega with smooth breathing. In the capital smooth breathing section we only have , which doesn't seem right. Widsith 06:58, 5 July 2008 (UTC)

I believe I have fixed it (but then again, it's difficult to say for sure). However, the unicode converter seems to agree, as well as a linked version. Sorry about that. -Atelaes λάλει ἐμοί 07:19, 5 July 2008 (UTC)

Ewe characters (Eʋegbe)

Is it possible to add the following Ewe characters as a separate section for Ewe or in another section such as Latin/Roman?

Ɖ ɖ Ɛ ɛ ɛ̃ Ƒ ƒ Ɣ ɣ Ŋ ŋ Ɔ ɔ ɔ̃ Ʋ ʋ  |  Á á É é Í í Ó ó Ú ú  |  À à È è Ì ì Ò ò Ù ù  |  Ã ã Ĩ ĩ Õ õ Ũ ũ   --Natsubee 00:32, 11 September 2008 (UTC)

CSS class

Some paragraphs (i.e. buttons on the drop down menu) have class="speciallang", whereas others have class="specialbasic". Which is which and what should be where? H. (talk) 15:53, 20 November 2008 (UTC)

Latin characters in Hebrew section

Any objection to my adding the characters áéíóú to the Hebrew section? They're used in transliterating Hebrew on Wiktionary, and it's a pain to go back and forth between the Hebrew and Latin sections.—msh210 23:07, 1 April 2009 (UTC)

Done (and tested: works).—msh210 20:40, 8 April 2009 (UTC)


The Middle English and Old English letter yogh (Ȝ, ȝ) needs to be added to the Latin/Roman section and/or the Old English section.  (u):Raifʻhār (t):Doremítzwr﴿ 22:32, 19 April 2009 (UTC)

Where exactly should it go? (Sorry, I don't know the order of letters in the OE alphabet.) —RuakhTALK 23:18, 29 April 2009 (UTC)
I’m not sure myself, but since it derives from the Anglo-Saxon form of the letter <g>, your best bet is to place it between the dotted ‘g’s and the ‘i’s with macra.  (u):Raifʻhār (t):Doremítzwr﴿ 22:30, 31 May 2009 (UTC)
I'd put it near the end, since it's an odd character and because it survived in Scots as a "z". --EncycloPetey 22:36, 31 May 2009 (UTC)
  • Yogh wasn't used in OE, by the way. Insular-G yes, yogh no. It's a Middle English thang. Ƿidsiþ 07:26, 1 June 2009 (UTC)

Typo in Latin/Roman

One of the emboldened tags is spelt Diareses — it should be Diæreses.  (u):Raifʻhār (t):Doremítzwr﴿ 22:32, 31 May 2009 (UTC)

Why? The spelling diareses is not incorrect. --EncycloPetey 22:34, 31 May 2009 (UTC)
Umlauts looks distorted... Are you sure Umlaut does not accept the established plural suffix - Umlaute in English as well? The uſer hight Bogorm converſation 07:10, 1 June 2009 (UTC)
I'm sure. (BTW, in English it's umlaut, not Umlaut.) —RuakhTALK 12:38, 1 June 2009 (UTC)
Doremítzwr, I have no problem with you adding and using (possibly peculiar) orthographic variants and alternative spellings but please, for goodness sake, realise that in standard "Western" English ligatures (whatever about diacritics) are almost never used. I, personally for one, have never read any texts of any kind (while I may not have read a lot) which are written in English and use ligatures. I don't like this idea because it smacks of superimposition of rules of one language onto another like other crap (that stupidly has come to be accepted) such as cactuses. Only in this case English isn't "raping" words from another language, other languages might as well be "raping" English words (even though they are not borrowing the words) to make them look (IMHO) horribly non-English. 50 Xylophone Players talk 13:02, 1 June 2009 (UTC)

Dutch ij

In Dutch stress is usually not represented, but the spelling rules do allow its rendition if clarity is a problem. It is done by adding an acute, as in múúr or, for diphthongs, kóúde. The digraph ij presents a problem. The official rules require acutes on both the i and the j. Any way of adding such a symbol? Jcwf 13:18, 1 June 2009 (UTC)

Well, since we don't use the actual ij ligature that Unicode provides, I guess the question is how to provide a j with acute accent? Unicode does provide a combining acute accent, so theoretically we can use that after the j, but it doesn't come out right: < j́ >. Apparently it doesn't replace the dot, but rather gets added in addition to the dot, which depending on the font can produce any number of weird results — even the spitten image of a circumflex, which (while kind of cool) is not really desirable. Do you know of any specific words that people do this for? We can Google those words, and see how other Internauts have addressed this … —RuakhTALK 13:54, 1 June 2009 (UTC)
Unfortunately I cannot find any web examples where it is done correctly. Even the TaalUnie that regulates spelling throws in the towel "for technical reasons the stress marker on j of the long ij often drops out" Jcwf 14:19, 1 June 2009 (UTC)
Then for now, I think it's probably best to follow the TaalUnie's lead and use < íj >. If a better solution ever comes along, a bot can fix existing uses. Sorry! —RuakhTALK 14:31, 1 June 2009 (UTC)
What The Not So Short Introduction to LaTeX2e suggests for a j-acute is to use the dotless j and put an acute mark over it. Unicode can do that too: &#x237;&#x301; = ȷ́. But see [3], which says not to do that, and suggests that a regular j with combining acute accent should come out without the dot, despite what your or my browser may do.—msh210 16:51, 1 June 2009 (UTC)
For what it's worth, the combining does come out without a dot (when I view it, but not when editing), as it is dependent on which font you use (maybe someone can find a Windows one for which it works and insert that into Template:Unicode?). <insert long rant about how Unicode could have been so much easier for everyone if only they decomposed everything always...>. I think Ruakh's solution is better if we can't get < íj́ > to display for everyone. Conrad.Irwin 17:13, 1 June 2009 (UTC)
Arial, as of Windows 7, does exhibit the intended behavior (without a dot). The problem is that it probably doesn't for pre-XP versions, and we can't differentiate between the versions. -- Prince Kassad 12:31, 5 June 2009 (UTC)
Re: "a regular j with combining acute accent should come out without the dot": Yes, you're right: as of Unicode 3.2 (which came out in March 2002), there's a boolean "SOFT_DOTTED" property to control this (and < j > has said property set to TRUE). Nonetheless, it doesn't seem to be well supported. :-/   BTW, on this computer I can't see < ȷ >. There are some languages where we just have to accept that not all users will see everything OK, but Modern Dutch? Really? —RuakhTALK 17:30, 1 June 2009 (UTC)
I agree Ruakh, it is rather odd. Anyway, thanks for your trouble. Jcwf 13:36, 3 June 2009 (UTC)
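The asymmetry behind the íj problem can be seen at the code point level. This sketch relies only on Python's standard unicodedata module: an i with combining acute composes to a single precomposed character under NFC, while a j with combining acute stays as two code points because Unicode has no precomposed j-acute. Rendering (dot or no dot) is still up to the font and is not something this code can show.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"

# i + combining acute has a precomposed form (U+00ED), so NFC yields one code point.
i_acute = unicodedata.normalize("NFC", "i" + COMBINING_ACUTE)

# j + combining acute has no precomposed form, so NFC leaves it as two code points.
j_acute = unicodedata.normalize("NFC", "j" + COMBINING_ACUTE)

# The dotless-j workaround mentioned above: U+0237 followed by the combining acute.
dotless_j_acute = "\u0237" + COMBINING_ACUTE

for label, text in [("i-acute", i_acute), ("j-acute", j_acute), ("dotless j-acute", dotless_j_acute)]:
    names = ", ".join(unicodedata.name(ch) for ch in text)
    print(f"{label}: {len(text)} code point(s): {names}")
```

Note that the dotless-j sequence is a different string from j plus acute, so mixing the two would bring back the page-name matching problem discussed elsewhere on this page.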

+ & to the German section

and — the single-stroke forms of and — need to be added to the German section. Thanks.  (u):Raifʻhār (t):Doremítzwr﴿ 10:22, 18 June 2009 (UTC)

add і to Cyrillic

і should be added to the Cyrillic section. I'd do it myself, but don't know where in the list to put it.​—msh210 18:33, 6 August 2009 (UTC)

Note that this is not that Latn character.—msh210 18:34, 6 August 2009 (UTC)
Added it to a hopefully sensible location. Though it might make sense to add the other Ukrainian characters too, while we're at it. -- Prince Kassad 18:58, 6 August 2009 (UTC)

number symbol

Please can we have the symbol # included, in the Misc. section - some keyboards include this, others not. --Volants 12:21, 9 October 2009 (UTC)

Done, thanks! —RuakhTALK 20:16, 9 October 2009 (UTC)

Tie bar

I can't for the hell of me get the character ͡    to appear in the IPA/Suprasegmentals section. I add it and it disappears when I use the preview. It doesn't like being inside another element or something. —Internoob (DiscCont) 23:36, 3 June 2010 (UTC)

Ooh, it's a combining character. Those are evil. Even if you can get it to show up, there's no way you'll be able to click on it because it has zero width...
With all the other combining characters, we simply resorted to using precomposed combinations. Surely it should be possible in this case too, since there are not that many combinations using the tie bar (ts, tʃ, tɕ, dz, dʒ, dʑ, pf, kp, gb). -- Prince Kassad 08:37, 4 June 2010 (UTC)
Probably not worth adding them then. If someone else wants to though, I won't stop them. —Internoob (DiscCont) 00:54, 25 June 2010 (UTC)
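Should anyone want to add the ready-made combinations suggested above, they can be generated rather than typed by hand. This is just a sketch: the pair list follows the nine combinations named in this thread (with a plain keyboard g in "gb", as written there), and the function name tie is made up.

```python
import unicodedata

TIE_BAR = "\u0361"  # the combining tie bar; it has no width of its own

# The nine two-letter combinations listed in the thread.
PAIRS = ["ts", "t\u0283", "t\u0255", "dz", "d\u0292", "d\u0291", "pf", "kp", "gb"]


def tie(pair: str) -> str:
    """Insert the tie bar between the two letters of a pair."""
    first, second = pair
    return first + TIE_BAR + second


tied = [tie(p) for p in PAIRS]
print(unicodedata.name(TIE_BAR))
print(" ".join(tied))
```

Each generated entry is three code points long, so by the earlier size estimates every one of them costs more than a single-character entry.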

Transliterations section

I'm unsure whether to include transliteration characters in non-Latin script section. I did remove the Armenian one, but there's still groups in the Devanagari and Hebrew sections. The transliteration characters are duplicates of those in the "Latin/Roman" section. Having a transliteration part in all non-Latin scripts would make Edittools too big. Therefore, I'm leaning towards wanting all of them removed, but maybe the convenience value outweighs the duplication cost? But I think whatever decision, it should be applicable to all non-Latin scripts. --Bequw τ 02:29, 9 September 2010 (UTC)

I imagine that whenever one edits in a language using a non-Latin script, he will almost always need the transliteration characters as well, so I'd favour retaining the transliteration characters in those non-Latin sections, for convenience. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 11:26, 9 September 2010 (UTC)
It will become obsolete once we finally get the Transliterator. -- Prince Kassad 14:19, 9 September 2010 (UTC)
At which time they can be removed, but until then, they're useful, wouldn't you agree? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:03, 9 September 2010 (UTC)
No, it won't: I don't know about Devanagari, but the Transliterator certainly can't support Hebrew. —RuakhTALK 19:27, 9 September 2010 (UTC)
I don't see much problem with them. They don't add much to the file-size or the memory requirements, they don't make those sections harder to navigate IMHO, and they don't make the edit-tools as a whole harder to navigate. I'm all for merging sections, because the edit-tools are harder to use when the drop-down has too many options, but I don't see much point in removing a few characters from a well-organized section. —RuakhTALK 19:27, 9 September 2010 (UTC)