Template talk:Han char


Template for Han character section

This is used under the Han character header, in the Translingual section of the entry for a single Han character, which precedes the language sections. It adds the character to Category:Han characters in radical/stroke sort order (then by Unicode order).

Usage: {{Han char|parameters}}

All parameters are named:

  • alt= alternate form(s)
  • rad= single character radical
  • rn= radical number
  • as= additional strokes. This must be a two-digit number for the radical/stroke sorting to work in Category:Han characters! So 3 additional strokes is "03".
  • asj= additional strokes in Japanese
  • sn= total strokes. Three here is "3", not "03".
  • snj= total strokes in Japanese
  • so= stroke order, image width should be 40px times the number of characters shown, no caption[1]
  • four= four corner system (format 1234, 12345 or 1234.5, multivalue separated by comma)
  • canj= Cangjie input (A-Y only, multivalue separated by comma)
  • ids= IDS composition sequence (multivalue separated by comma, regional annotation allowed)
  • gso= graphical significance and origin[1]
  • cmean= common meaning[1]
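
The value formats listed above for four= and canj= can be illustrated with a small validity check. This is only a sketch in Python: the function names and the regular expressions are assumptions derived from the parameter list, not code taken from the template.

```python
import re

# Assumed patterns, based only on the parameter list above:
# four= accepts 1234, 12345 or 1234.5; canj= accepts the letters A-Y only.
FOUR_RE = re.compile(r"[0-9]{4}(?:\.?[0-9])?")
CANJ_RE = re.compile(r"[A-Y]+")


def split_multi(value):
    """Split a comma-separated multivalue parameter into trimmed items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def valid_four(value):
    """True if every item looks like a four-corner code."""
    items = split_multi(value)
    return bool(items) and all(FOUR_RE.fullmatch(i) for i in items)


def valid_canj(value):
    """True if every item consists of the letters A-Y only."""
    items = split_multi(value)
    return bool(items) and all(CANJ_RE.fullmatch(i) for i in items)
```

Under these assumptions, "3040.7" and "3040,30407" pass the four= check, while a value containing markup such as <sub> does not.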

Notes

These are all from the information loaded by "NanshuBot". The idea is that some of them will change/go away/be represented differently. These changes can be made with the template in some cases; even when a bot is required, it can find the template instance and operate on it.

  • alt= is usually, but not always, used for simplified/traditional Chinese; fixing it might take manual edits.[1]
  • gso= should become Etymology, but this probably takes manual edits in all cases[1]

The template uses 2-column format; this was just done to make the first pass easy.

  1. parameter proposed but not used

Regional annotation in the ids parameter indicates the shape of the character that is commonly used in particular regions:

(G) PRC and Singapore, (C) only PRC, (S) only Singapore,
(H) Hong Kong, (T) Taiwan, (M) Macau, (J) Japan, (K) both Koreas, (V) Vietnam,
(X) font-variation that does not appear in the UCS specification but matches the same codepoint

Example

This shows using the seal and bronze script images from Commons, as well as the stroke order image from Commons; these are available for a number of characters.

==Translingual==
 
[[Image:字-bronze.svg|thumb|70px|[[w:bronze script|Bronze]]]]
[[Image:字-seal.svg|thumb|70px|[[w:seal script|Seal]]]]
===Etymology===
From [[宀]] roof + [[子]] child.  Root meaning: to give birth.  [[子]] also serves as a phonetic element.
 
===Han character===
{{Han char|alt=|rad=子|rn=39|as=03|sn=6|so=[[Image:字-bw.png|280px]]|four=3040<sub>7</sub>|canj=十弓木 (JND)
|gso=|cmean=[[letter]], [[character]], [[word]]}}

This renders as follows:

Translingual

Bronze
Seal

Etymology

From 宀 roof + 子 child. Root meaning: to give birth. 子 also serves as a phonetic element.

Han character

  • Radical 39 + stroke: 子+03
  • Stroke count: 6
  • Four-corner system: 30407
  • Cangjie input: 十弓木 (JND)
  • Common meaning: letter, character, word

Template:mid2

Stroke order

visual appearance of template

Can this template be worked on? The layout, in two columns, is confusing and unattractive. Badagnani 09:57, 24 October 2006 (UTC)

Well, I kind of like it the way it is ... ;-) But yes, of course; the layout is just a simple first pass; the idea is that the template can be improved. I just used top2/mid2 etc as a starting point. What did you have in mind? Robert Ullmann 12:00, 24 October 2006 (UTC)

Well, good job creating the template (though I'm not sure how it improves on the previous system, which I'm now quite used to). The main thing is that one has to look over to the right for the normal meaning. It seems that everything should be in one column unless it's a single section that's being broken up (as with "translations", which are broken into columns to save space). Also, the columns are showing up in different widths, especially if there's a long URL in the left column. Badagnani 22:37, 24 October 2006 (UTC)

This new arrangement seems better, with important info kept on the left. The only thing I see on the right is the Unihan, which seems to tie in with what's on the left. Badagnani 04:25, 27 October 2006 (UTC)

Oh, I notice that the "Graphical Significance and Origin" field doesn't appear. I think it's important that this remain as a blank field in all the hanzi entries so that knowledgeable editors may add this eventually for each character. If the blank field isn't there, editors may not know to add it. Badagnani 04:28, 27 October 2006 (UTC)

(I was going to ask what you thought ...) I hadn't played with the Han ref template again yet; I'm wondering if there are other sources besides Unihan we could point to. (The problem being that most aren't as comprehensive.)
gso is moved to Etymology (where it belongs) if not blank. Someone adding this information should be adding an Etymology section, not adding gso to the template. And I think you underestimate editors; anyone knowledgeable about etymologies who has worked with the wikt even a little bit will see what is missing! Robert Ullmann 11:35, 27 October 2006 (UTC)

Personally, I think the etymology should be moved to be a subsection of the han character section. Is that okay? Habemus 17:38, 22 February 2009 (UTC)

Expansions

May I open a discussion about expansions of the template? I suggest the following:

  1. categorization: Category:Han characters comprises 21461 elements, which is a bit crowded. From my work with Chinese characters, I would prefer a separate category for the 214 radicals; with all radical variants it would hold about 270 entries, an amount that gives a better overview. Either the radicals are only in category:Han character radicals (a subcategory of category:Han characters), or they occur redundantly in both (which I think is better).
  2. variations: When a radical also has variations, the main radical and its variations could be distinguished by a new parameter, e.g. var=0 for one of the 214 main radicals. Variations would get var=1, var=2 and so on, numbered in ascending order of Unicode codepoint.
  3. commonscat: On Commons, any image of a Chinese character either has its own category, which is linked to the Wiktionary character entry, or has no category of its own but is itself linked to Wiktionary. This is a good way to look up a character's specifications. It would also be useful to link each Wiktionary radical to its Commons category (since not every Wiktionary character has an image on Commons, I restrict this to radicals for the moment).
    This link was recently established from the 214 pages Index:Chinese radical/...

Because such a link might need frequent maintenance – at least during its development stage – the best solution would be a sub-template that is invoked only in the ~270 cases where the additional strokes are zero, e.g. {{#ifeq:{{{as|}}}|00|{{Han rad|{{{rad}}}|{{{rn}}}|{{{var|}}}}}}} (passing the character, the radical number and the variation distinguisher var, whether or not it exists). Of course, that template can also take care of category:Han character radicals.

Your opinion? -- sarang사랑 14:05, 17 April 2012 (UTC)

This is a good idea. :) I will let one of our more template-savvy editors comment on the technical details. - -sche (discuss) 20:54, 17 April 2012 (UTC)
I'm glad you agree. Once everything has been discussed, I can go ahead and create an empty template {{Han rad}} (or another accepted name), so that an authorized template-savvy editor can insert the string above. Once that is done, further discussion may come up about the sub-template's layout. I have my ideas for that and will then discuss them on the sub-template's talk page. -- sarang사랑 05:45, 18 April 2012 (UTC)
Well, no-one has commented here with any concerns, so I've unprotected this template, so you can improve it. :) If you need me to unprotect any other templates so you can lower them, let me know. - -sche (discuss) 02:28, 23 April 2012 (UTC)

Expanded

The suggested expansion, for linking to the Commons category, is now inserted. For the details, see the documentation of {{Han rad}}.

A large number of transclusions happen with insufficient parameters. To find and repair them, the maintenance category Category:Chinese terms needing attention has been expanded too; it now lists the pages in error together with their error type (a, n, q, s). For the details, see the category. Of course, the category should always be kept empty. -- sarang사랑 16:47, 24 April 2012 (UTC)

Characters with multiple possible IDS descriptions

There seem to be characters that can have different representations in IDS due to Han unification (like (G:⿰虫单 J, K:⿰虫単) or (G, K:⿰甬攴 T:⿰甬攵)). Is there a way to handle this? (Is this the right place to discuss this?) —umbreon126 06:59, 8 November 2014 (UTC)

Multiple values are now supported in the four-corner and composition parameters. You can write them like
ids=⿰虫单<sub>(G)</sub>,⿰虫単<sub>(JK)</sub>
ids=⿰虫单<small>(G)</small>,⿰虫単<small>(JK)</small>
in one parameter (as I did earlier). Links are generated only for component characters; everything else is ignored. --Octahedron80 (talk) 23:36, 11 December 2015 (UTC)
@Octahedron80 Neat. However, maybe it would be better to use small instead of sub, because sub is for subscript text, and (G) has no reason to be subscript. —suzukaze (tc) 07:25, 12 December 2015 (UTC)
Let's follow that. Use small instead of sub. --Octahedron80 (talk) 07:28, 12 December 2015 (UTC)

I wrote about regional annotation in the section above. --Octahedron80 (talk) 04:11, 8 January 2016 (UTC)
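
To make the agreed ids= format concrete, here is a minimal Python sketch that splits a multivalue ids= string into IDS sequences and their regional annotations. The function name, the regular expression and the assumption that every annotation is a run of capital letters inside <small>(...)</small> are illustrative only; the template's real parsing may differ.

```python
import re

# Region letters as listed in the documentation notes on this page.
REGION_NAMES = {
    "G": "PRC and Singapore", "C": "only PRC", "S": "only Singapore",
    "H": "Hong Kong", "T": "Taiwan", "M": "Macau", "J": "Japan",
    "K": "both Koreas", "V": "Vietnam", "X": "font variation",
}

# Assumed item shape: an IDS string, optionally followed by <small>(LETTERS)</small>.
ITEM_RE = re.compile(
    r"^(?P<ids>[^<]+?)\s*(?:<small>\((?P<regions>[A-Z]+)\)</small>)?$"
)


def parse_ids(value):
    """Return a list of (ids_sequence, [region letters]) pairs."""
    result = []
    for item in value.split(","):
        match = ITEM_RE.match(item.strip())
        if match is None:
            continue  # skip items that do not follow the assumed shape
        regions = list(match.group("regions") or "")
        result.append((match.group("ids"), regions))
    return result
```

With the example from this thread, parse_ids("⿰虫单<small>(G)</small>,⿰虫単<small>(JK)</small>") yields one pair for the G form and one pair for the form shared by J and K.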

Invalid parameter

𠕇 has been put in the hidden [[Category:Invalid parameter in Han char]]. What's wrong with it? Justinrleung (talk) 06:11, 7 October 2015 (UTC)

|as= is picky; if there are fewer than ten strokes then the number must start with zero. (IMO I find it bizarre that the template can detect this but it can't automatically add the zero itself) —suzukaze (tc) 06:16, 7 October 2015 (UTC)
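
The automatic zero-padding wished for above could look roughly like this. It is a Python sketch of the idea only, not the template's code; the function name normalize_as is hypothetical.

```python
def normalize_as(value):
    """Return the additional-stroke count as a two-digit string ("3" -> "03")."""
    text = value.strip()
    # Accept plain ASCII digits only; anything else is reported to the caller.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a non-negative integer: {value!r}")
    return text.zfill(2)
```

The padding matters because the category sorts the values as text: sorted(["10", "3"]) puts "10" first, whereas sorted(["10", "03"]) gives the expected order.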

Add Zhengma to the template?

I noticed that this template already has parameters for Cangjie and the four-corner method. But could we and should we add a parameter for the Zhengma method? - VulpesVulpes42 (talk) 18:48, 4 November 2015 (UTC)

There is no Zhengma data (or anything similar) in the Unihan database. If it's possible, where could we get the data from? I don't recommend entering it manually, because there are tens of thousands of Han characters. --Octahedron80 (talk) 09:44, 7 December 2015 (UTC)
@Octahedron80 zdic.net has zhengma data for most characters. In fact, if you look up a character at zdic.net, you are more likely to find how to input it using zhengma than using cangjie, most likely because cangjie does not have an input for those characters. VulpesVulpes42 (talk) 16:24, 28 December 2015 (UTC)
@VulpesVulpes42 I am looking for a raw database of zhengma, such as tabular data, if one is available to be imported by bot. It could be used in the index pages too. Editing this template is the easy part. --Octahedron80 (talk) 06:25, 7 January 2016 (UTC)
@Octahedron80 There are multiple input tables available online that can be rather easily "mined" for codes for most characters (rime input tables are tab-separated). The only challenge might be dealing with alternative codes where more than one is available, as well as the Simplified First/Traditional First division. For example, in the Traditional First tables, 見 has the code lr, and 见 has the code lra. Those codes are reversed in Simplified First tables (i.e. lr = 见 and lra = 見). --Mea Gratia (talk) 21:00, 5 September 2020 (UTC)
If possible, it would be nice to reignite this discussion and see if it is possible to go forwards with adding zhengma codes. Zgw3kszo (talk) 21:16, 25 February 2024 (UTC)
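
As a rough illustration of how a tab-separated input table of the kind mentioned in this thread might be mined, the Python sketch below collects every code per character. It assumes each data line has the form "character<TAB>code" and that any file header has already been removed; the function name load_codes and the sample lines are illustrative, using the Traditional First codes quoted above.

```python
from collections import defaultdict


def load_codes(lines):
    """Map each character to the list of codes found for it, in file order."""
    codes = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a character/code pair
        char, code = fields[0], fields[1]
        if code not in codes[char]:
            codes[char].append(code)  # keep alternative codes, drop duplicates
    return dict(codes)


sample = ["見\tlr", "见\tlra", "# a comment line", "", "見\tlr"]
table = load_codes(sample)
```

Keeping a list per character leaves room for the alternative codes mentioned above; deciding between Simplified First and Traditional First tables would still be an editorial choice outside this sketch.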

I have a question: if more input method references are added in the future, would it be suitable to include them all on one line? --Octahedron80 (talk) 06:00, 1 February 2016 (UTC)

Font (language) used in translingual section

Hello. Which font (language) is implied in the translingual section? Ideally I would say it should be the Kangxi form, but I do not even know if there is a technical means to display the Kangxi form. Thank you!! --Maidodo (talk) 12:34, 8 May 2016 (UTC)

It seems like it is PRC-style Chinese. I like the idea of using Kangxi forms, but I don't think it is feasible. —suzukaze (tc) 22:58, 8 May 2016 (UTC)

Multiple Cangjie support

There can also be multiple Cangjie versions for one character, see here. --Sarefo (talk) 12:04, 11 June 2016 (UTC)

|rn= (radical number)

We don't need this, do we? It can be determined from |rad=. @KevinUp Suzukaze-c 01:45, 15 March 2020 (UTC)

Additional/total strokes options

I'm wondering why there isn't an option for |as= (traditional Chinese and Japanese) and |asck= (simplified Chinese and Korean). Or if there is, it's not listed.

Also, the table for this section was a bit unclear to me at first, as I had thought that there were two columns each describing separate parameters. But in reality, there is only one column describing the two parameters in tandem. Perhaps it could be clarified that |as= is always present but changes depending on the second parameter. There could be darker borders between the rows to help signify that each row is a pair together, but I'm not sure how to edit the table to show that. ChromeGames923 (talk) 06:26, 20 April 2021 (UTC)

Support for negative values for as?

Some Han characters consist of a deletion of stroke from a radical, for example w.r.t. and the variant radical w.r.t. . Could support for negative values for the additional strokes parameter be added? On the page for 王, you can see that someone has entered -1 for as, but it displays as "玉+-1"

173.72.124.197 21:27, 30 July 2021 (UTC)
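
The display problem described in this request ("玉+-1") comes from always inserting a plus sign before the value. A sign-aware formatter could look like the following Python sketch; the function name is hypothetical and this is not how the template is actually implemented.

```python
def format_radical_strokes(radical, as_value):
    """Join a radical and its additional-stroke value without doubling the sign."""
    text = as_value.strip()
    if text.startswith("-"):
        return f"{radical}{text}"  # the value already carries its own sign
    return f"{radical}+{text}"
```

Under this sketch, a value of "-1" for 玉 would display as "玉-1", while ordinary values such as "03" keep the existing "子+03" form.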

Three or more stroke parameters

Discussion moved from User talk:Erutuon#Help with Template:Han char.

Hi, I notice you've worked on this template in the past so I'm wondering if you can help with the formatting of the stroke counts. Currently it somewhat breaks when there are more than two different stroke count parameters. For example, produces "12 strokes in Chinese in traditional Chinese" when it should just say "12 strokes in traditional Chinese".

Even worse is the page I just edited, . With the parameters "|sn=21|snm=20|snj=19|snk=21", it produces "21 strokes in Chinese in Chinese in traditional Chinese, 21 strokes in Korean..." Alternatively with the parameters "|sn=21|snm=20|snj+=19" it produces "21 strokes in Chinese and Korean in traditional Chinese..." Ideally it would say "21 strokes in traditional Chinese and Korean" but an acceptable middle ground would be "21 strokes in traditional Chinese, 21 strokes in Korean". As long as it doesn't keep saying "in Chinese" or mix up Korean with (traditional) Chinese, that would be much better. Is there any solution to this or a fix that can be done to the template? Thanks, ChromeGames923 (talk) 08:11, 3 September 2021 (UTC)

@ChromeGames923: The system of stroke number parameters described in Template:Han char/documentation § Variations in additional strokes and total strokes and the accompanying code in Module:zh-han (which I rewrote at one point) was designed for cases where there are one or two stroke numbers, but it's not clear to me how it's supposed to work with three or more stroke numbers. For the cases you give it needs two steps: figure out which language system (Chinese, traditional Chinese, Japanese, etc.) has which stroke number, then format it. The first step requires interpreting what each of the explicit parameters, |snm=, |snj+= mean and then figuring out what |sn= means based on that. I'm not sure what |sn= means in each case you give, particularly in the case with |snj+=19 (though I could try to guess), because the template documentation only explains what |sn= means when there is just one other stroke parameter. It would be much clearer, and easier to write code for in Module:zh-han, if there were only |sns= (Simplified Chinese), |snt= (Traditional Chinese), |snj= (Japanese), |snk= (Korean), and whatever else, and you had to write all of them when they have the same stroke number. However, that would be less convenient.
Anyway, if you can clarify what |sn=21|snm=20|snj=19|snk=21 and |sn=21|snm=20|snj+=19 mean, in terms of "Traditional Chinese has 21 strokes, Simplified Chinese has 19 strokes, ...", then I or someone else could decide how Module:zh-han would best implement this. Or perhaps it would be better to use the clearer system of parameters (no plain |sn=) when there are three or more stroke numbers. — Eru·tuon 19:09, 3 September 2021 (UTC)
@Erutuon: The expected behavior isn't clear to me either in the case of three or more stroke parameters; I've seen a few pages that say "in Chinese in traditional Chinese" and at first I thought it was intentional. I'm not sure if there's a clean definition of |snj+=19 either, I only mentioned it because I was trying different combinations with +, and that one just happened to produce a close result. But |snj+=19 was confusing to me so I ultimately decided not to use it when I saved my edit.
Within the current system of using |sn= with four main script styles (traditional, simplified/mainland, Japanese, Korean), I can think of having |sn= automatically adapt to the other parameters when there are more than two parameters. For example, if there are three different parameters (|sn= plus two others) and:
  • the other two are Japanese and Korean, then |sn= could say "Chinese"
  • the others are simplified/mainland and Japanese/Korean, then |sn= could say "traditional Chinese"
  • the others are simplified/mainland and Japanese/Korean, and there is a "+" sign somewhere, then |sn= could say "traditional Chinese and [Korean/Japanese]", whichever of Korean or Japanese wasn't defined manually.
In the case of four numbers, |sn= should probably be "traditional Chinese" since everything else would already be specified. This is definitely messier than the current system and a bit confusing, but I think it probably wouldn't break anything that's currently working.
As for a system with no |sn=, this does seem like it would be clearer in behavior (actually that idea is kind of how I wrote right now). It might be helpful to have as an alternative to the current system, so that way everything doesn't have to be changed and the more convenient way is still available. But even with the current format, there are some options missing: earlier on this talk page I mentioned how I wanted to use |snck= or |snck+=, though I don't remember which character I was talking about. Those two could probably be added to the template without much difficulty, but having the alternative system would be able to handle everything flexibly without having to code in new combinations (for example if there are differences between Hong Kong and Taiwan that aren't already covered). I realize though that it's probably a large amount of work for something that'll be used quite rarely, so it might not be worthwhile. ChromeGames923 (talk) 20:25, 3 September 2021 (UTC)
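
As a sketch of the alternative discussed in this thread (explicit per-system parameters and no plain |sn=), the following Python code groups the systems that share a stroke count and formats them together. It is only an illustration of the idea: the parameter names follow the ones suggested above (|snt=, |sns=, |snj=, |snk=), while the function names and the output wording are assumptions and are not taken from Module:zh-han.

```python
# Hypothetical explicit parameters, one per writing system, in display order.
LABELS = {
    "snt": "traditional Chinese",
    "sns": "simplified Chinese",
    "snj": "Japanese",
    "snk": "Korean",
}


def join_labels(labels):
    """Join labels as 'A', 'A and B' or 'A, B and C'."""
    if len(labels) <= 2:
        return " and ".join(labels)
    return ", ".join(labels[:-1]) + " and " + labels[-1]


def format_strokes(params):
    """Format stroke counts, merging systems that share the same number."""
    groups = {}
    for name, label in LABELS.items():
        if name in params:
            groups.setdefault(params[name], []).append(label)
    parts = []
    for count, labels in groups.items():
        noun = "stroke" if count == 1 else "strokes"
        if len(groups) == 1:
            parts.append(f"{count} {noun}")  # all systems agree: no qualifier
        else:
            parts.append(f"{count} {noun} in {join_labels(labels)}")
    return ", ".join(parts)
```

For the four-number case raised above, format_strokes({"snt": 21, "sns": 20, "snj": 19, "snk": 21}) returns "21 strokes in traditional Chinese and Korean, 20 strokes in simplified Chinese, 19 strokes in Japanese". The cost, as noted in the thread, is that every system has to be written out even when the numbers coincide.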

idea for treatment of regional standard forms

instead of having convoluted parameters like "asm++", why not have multiple headword templates, marked with {{tlb|Mainland China}} or something, and additionally display the correct form using appropriate language codes like "zh-CN"? —Fish bowl (talk) 00:14, 7 March 2023 (UTC)