Appendix:Easily confused Chinese characters

From Wiktionary, the free dictionary

Some distinct CJK characters are easily confused, because they differ in subtle respects. On Wiktionary, graphically similar characters are listed at the top of entry pages, and are collected here for reference.

In practice, to recall distinctions between these characters one may use sound, meaning, or familiarity with compounds, in addition to purely graphical or etymological distinctions.

This page discusses characters or components which may be confused with each other, yielding the wrong character or an invalid character, not characters which are complicated but unlikely to be confused with other characters.

This page does not treat variant forms, subtle or otherwise, such as to , as these are very numerous and are more properly regarded as incorrect forms or simple errors than as confusions. Variant forms are listed at individual characters. An exception is made when such a variation illustrates a more general distinction, such as vs. , which illustrates vs. .

Issues

Beyond confusing distinct characters, easily confused components can yield writing errors when one substitutes an incorrect component into a compound character. For example, in Japanese, writing instead of in yields , which is not a Japanese character (it is a traditional Chinese one, however). Similarly, writing as + misses an stroke in the bottom part of the inner component, yielding an invalid character.

This is particularly an issue in complex characters, as the difference may be visually very small – a single small stroke.

Note also that forms of characters differ to some extent between scripts, which, besides being a separate complexity, means that in some cases one must specify the script (Traditional Chinese Hanzi vs. Japanese shinjitai Kanji, for instance) to clearly see the confusion.

Isolated characters

Some easily confused characters occur only or primarily in isolation:

    • Notes: extra dash on the second one; extra dash and a hook at the top of the vertical stroke on the third one.
    • As combining forms, the first and second ones (water and ice) occur as and respectively; the third one finds use as a phonetic, notably (swim, dive).
    • Notes: Extra stroke at top of the second one.
    • The first one is very common, the latter is specialized and relatively rare.
  • (and )
    • Notes: in the first one, the vertical stroke forms a T, rather than crossing the top horizontal; the bottom strokes form a point in the second one, but are separate in the third one (compare ).
    • The first one is a radical (), and is also used in compound characters even when not the radical, as in .
    • The second and third ones are primarily used in isolation, though they appear in a few compound characters, such as (for ) and (for ).
    • Also, the second and third ones both appear in the common Japanese phrase お先に失礼します (See you (I am leaving before you.))
    • Beware also of , which is used in a number of compound characters; see below.
    • Notes: angled top stroke in the first one, horizontal top stroke forming a T in the second one, horizontal top stroke forming a cross (+) in the third one, different bottom strokes (especially bottom right) in the fourth one (compare and ).
    • The first one () is used in some compound characters, such as , while the second and third ones ( ) are used in isolation, and the last is a radical, used in some characters – see Appendix:Chinese radical/无
    • The second one () is a very commonly used character, while the fourth one () is a Simplified Chinese character.
    • The second and fourth ones appear in the Chinese idiom 无法无天 (wúfǎwútiān, unruly, no respect for law and order) (traditional 無法無天).
    • Notes: the second one () has a longer curved stroke and an additional stroke at the top, the third one () is similar but has a on top.
    • The first and third ones are radicals, of which is far more common than , but is usually used on the right (as in and ), while 方 is usually used on the left (as in and ).
    • The second one is a Simplified Chinese and Japanese shinjitai form of .
    • Notes: added hook at top of the second one.
    • In some print fonts, characters have symmetric legs, and are easily confused; in handwriting, they are further distinguished by having asymmetric legs: 入 has a shorter left leg and looks like λ (lambda), while 人 has a shorter right leg and looks like a backwards λ.
    • Note the number of strokes (2, 3, 4, 3, 3), and whether the horizontal line intersects the leftmost vertical line.
    • Notes: the first one has at bottom right, while the second one has .
    • In some fonts the first one also uses 儿 rather than 几.

Characters used in compounds

Traditional characters

Some easily confused characters also occur in compounds, yielding easily confused compounds or potential errors in compounds.

    • Notes: The upper-left to lower-right stroke in crosses the main line; the strokes in are different, especially the lower right; , , and each have an extra dot; and are less easily confused, but similar.
    • Note the term 大丈夫, where several similar characters occur.
    • Compound characters:
  • (also , 退, and )
    • Notes: Extra stroke at top of the second one.
    • This distinction is particularly confusing because both components are widely used in compounds, where the distinction is often small; contrast and .
    • Neither is categorically preferred: (silver) has no mark on top ( is rare), while (daughter) and (wolf) have a mark on top ( is rare).
    • The top stroke is sometimes dropped in simplification; traditional Chinese and Japanese kyujitai use , while Japanese shinjitai uses (without the top stroke).
    • The common but unrelated character (food) is graphically 人+良, and hence when there is a on top, then there always is an extra stroke; ×人+艮 does not occur. This reflects distinct etymology, deriving from 𠊊 = + . This is written as a left radical – note different foot. Thus, for example, 館 must have a stroke at the top, as indeed it does.
    • Further, the left radical of is sometimes written slightly differently from , as in the Japanese form of , with two horizontal strokes at the bottom and a horizontal stroke at the top, rather than a short vertical; this is a traditional variant, found in Japanese kyujitai (used for hyōgaiji), and is closer to older forms.
    • The unrelated character 退 is written similarly, though in careful writing the character is distinguished, with the lower right stroke being unattached and convex, rather than attached and concave.
    • Lastly, is occasionally used in compounds, and should not be confused with . In Japanese, the most notable example is , which should not be confused with .
    • Examples:
    • Notes: Cross strokes differ – in 月 they are horizontal and attached to sides, while in ⺼ they are diagonal and unattached.
    • In compounds, these may be drawn identically, as ; in more careful usage is drawn distinctly.
    • Etymologically, is a form of (flesh, meat), where the diagonals and gaps are clearer, while is rather (moon).
    • This is especially an issue in looking up characters by radical; is more widely used, particularly for parts of the body, such as (back) – compare Appendix:Chinese radical/月 and Appendix:Chinese radical/肉.
    • Notes: Bottom stroke is long (longer than middle) in the first one, short (shorter than middle) in the second one.
    • Both used as radicals: and .
    • is much more widely used – it is one of the most common radicals – and when used as a left radical has a slanted bottom stroke, but 士 is also used in common characters.
    • In general, is used on the left or bottom (as in or ), while 士 is used on the top or right (as in or ), though there are exceptions: uses 士.
  • 广 (also )
    • Notes: Extra stroke on top of the second one (and 2 extra strokes on left of the third one).
    • Another subtle difference, more manifested in writing errors than confusions of characters.
    • Compare also the top component in and , a variant of (hand).
    • Compare: , , , ( ), , ,
    • In Japanese, compare (the first two use 厂 while the last one uses 广).
    • Notes: extra stroke on top of the second one, the third one is wider than the first one and the middle stroke does not go entirely across (in some fonts).
    • is one of the subtlest differences, particularly in looking up radicals.
    • The presence or absence of an extra stroke can be very subtle when these characters are used at the bottom: compare (no top stroke) to (top stroke).
    • Notes: extra stroke on the second, third and fourth ones, and differing forms of these characters.
    • Beyond the issue of including a top stroke or not, there are subtle differences in composition, notably whether the top stroke goes upper-right to lower-left, upper-left to lower-right, or horizontally left-to-right, and whether (in the first case) it connects with the left stroke or not. There are also stroke order differences: 丿 (at left) may be the second stroke or the last (fourth) stroke.
    • is more widely used as a radical, often connected with other strokes as in , , and – see Appendix:Chinese radical/尸 – and not used in isolation in common Japanese.
    • // is used alone (especially to mean “door”), and sometimes used as part of compounds, though generally not connected with other strokes – see Appendix:Chinese radical/戶.
    • In Japanese, (place) (on left) is the most common use, while are also found.
    • Notes: extra stroke on top of the second one, bottom stroke of the third one is long.
    • is a radical, widely used in complex characters (Appendix:Chinese radical/目), almost always on the left (as in ), or top or bottom, with the notable exception of , which itself is used in complex characters such as .
    • , by contrast, is primarily used as the right component in characters such as , and on the bottom in characters such as , , though it is also sometimes found on the left, as in .
    • is also a radical, though less-used (Appendix:Chinese radical/自).
    • Notes: the first one is a rectangle, while in the second one the bottom stroke extends past the sides.
    • Both are widely used radicals: 罒 is always used on the top and cannot be used in isolation, while 皿 is always used on the bottom and can be used in isolation.
    • Contrast:
    • Notes: the second one has 3 horizontal lines (), while the third one has 4 horizontal lines.
    • The first one has a 彳 radical. The first, second and fourth ones are only used in isolation, while the third one, (bird), is a radical and used in some compound characters, such as , , and .
    • The third and fourth ones start with the same 6 strokes in practically the same positions, which can trip up muscle memory.
    • Notes: the second one has an extra dot.
    • These are from and .
    • Another subtle difference, more manifested in writing errors than confusions of characters.
    • In Japanese, when used in isolation there is no dot – – but when used in compounds there often is a dot, as in and the similar .
    • Notes: Longer top stroke in the second one.
    • These are used both in isolation and in compounds.
    • Another subtle difference, more manifested in writing errors than confusions of characters.
    • In compounds, the shorter stroke appears in a number of compounds: , while the longer stroke appears in .
    • Notes: No top stroke in the first one, top stroke in the second one crosses (+) the hook, top stroke in the third one touches (T) the hook; in the first two, in the third one.
    • Compounds: 使 便
  • (also )
    • Notes: in the first one, vertical stroke touches (T) the top stroke of ; in the second one, the vertical stroke cuts through ; in the third one, the vertical stroke touches the bottom.
    • All occur in some compounds, but rarely ambiguously, hence they cause writing errors rather than confusions of characters. In Japanese, are the most common uses.
    • Examples:
  • 西
    • Notes: Extra stroke 一 in the second one.
    • 酉 is mostly used in compounds (it is not commonly used in isolation in Japanese, and is mostly used as the 10th zodiac symbol in China), and is particularly associated with (alcohol). 西 (west) is often used in isolation.
    • As parts of compounds, both are used as radicals: 西 is mostly used as a top component, as in , while is rather used as a left component, as in .
    • contrast:
  • (also )
    • Notes: Extra stroke 一 in the second one, and the top-right part is two strokes in the first one, one turning stroke in the second one.
    • Some use in more complex characters, as in and .
    • Some risk of confusion; beware also of composition and stroke order errors.
    • contrast:
    • Notes: Extra stroke on left half of the second one.
    • In Japanese, neither is used in isolation, but occur in compounds or similar characters: for the first one, and for the second one.
    • Notes: in the second one, in the third one (extra middle stroke).
    • In Japanese, is used in isolation and in compounds, such as , while 㐬 is used only in compounds, namely and .
    • Notes: 人 at bottom of the first one, at bottom of the second one (extra vertical line).
    • In Japanese, little risk of confusion; is primarily used in (I (male)), while 業 is primarily used in isolation.
    • Both are also used in a number of compounds.
    • Contrast:
    • Notes: longer vertical line in the second one.
    • In Japanese, is not used in isolation, but compare , while is used in isolation, and in compound characters such as .
    • Notes: the second one has an extra 一 stroke on the top, while the third and fourth ones are in two parts, and the fourth one has an extra 一 stroke in the bottom (the first three: 1 bottom stroke; the last one: 2 bottom strokes)
    • Compound characters:
    • In Japanese, is commonly used alone, and in and ; 幸 is used alone and in ; 羍 is not used alone, but is used in .
    • Notes: on the second one, on the third one
    • 巾 is a radical (index), and in almost all cases appears unmodified in compounds, either on the left or the bottom.
    • In rare cases, it will appear on the right, in which case it may appear as 帀, as in (compare ) (simplified ), or as 市, as in .
    • Contrast:
  • (and corresponding simplified: )
    • Notes: Extra horizontal 一 stroke in the first one, which is the more common character.
    • Notes: Center stroke goes all the way down in the first one, but only forms square in the second one (as in ), at least in Japanese (in some Chinese forms, has a long center stroke, identical to ).
    • Primarily a composition error, rather than confusion.
    • is a common radical – see Appendix:Chinese radical/角.
    • Beware also of vs. .
    • Also note similar , and distinguish from and the bottoms of (more below).
  • (also )
    • Notes: the first one has an extra 一 stroke on the top (first: 2 top strokes; second: 1 top stroke), and bottom differs: the first one is , while the second one looks like 貝 (with 田 instead of 目).
    • Compound characters:
    • Notes: the first one is on , the second one reverses order, and the third one has on the bottom.
    • is used in isolation and widely used as a phonetic. The others are more rarely used; the most prominent example in Japanese is .
    • Contrast:
    • Notes: the first one has on top, a longer vertical stroke (like ), and on the bottom; the second one is on
    • Not widely used, but a possible composition error, as in , , or .
    • Notes: different lower portion ( in the first one, in the second one); also the middle component of the second one differs between scripts.
    • This distinction is primarily a Japanese shinjitai issue; 売 is used in isolation, while 壳 is not, but forms of both are used in compound characters: compare with .
    • Notes: the first one has two vertical strokes, while the second one instead has diagonal strokes, and 3 strokes at top.
    • Not widely used, but a possible composition error, as in
  • / /
    • Notes: in the first group, extra dashes in the second and third ones; in the second group, reversed order of 口 and 一 and extra side stroke in the second one; in the third group, stroke on left differs.
    • There are several variants and derivatives of (halberd, spear) which differ in various respects.
    • In Japanese, the characters , , and are not commonly used in isolation, but they are used in compound characters, such as , , and .
    • Contrast:
    • Notes: extra vertical stroke in the second one.
    • 比 is a radical, and used in a number of compound characters (such as 皆), while 此 finds some use, as in .
    • Notes: bottom of the first one looks like 冊, bottom of the second one looks like bottom of (月 + 刂).
    • Contrast: , ,
    • Notes: the first one has a flat top stroke, and straight center stroke; the second one has a slanted top stroke, and center stroke hooks at bottom.
    • In Japanese, 平 is used in isolation, and in the compound character , while 乎 is not used in isolation but is used in the compound character . The confusion thus primarily occurs in composition errors in these characters, particularly for 呼, which is unlike the others.
    • Notes: the first one has hooked head, the vertical stroke starts at the hook, not the top, and bottom is hooked; the second one has flat top stroke, the vertical stroke starts at the top, and bottom is hooked; the third one has flat top and straight vertical; the fourth one has slanted top and straight vertical.
    • 子 is a very basic character, widely used in compounds; 于 is common in isolation in Chinese, but not Japanese; 干 is a basic graphic form, used in isolation and with variants found in many characters; 千 is a basic character, found in some compounds.
    • Contrast: , , ,
    • Notes: the second one has an extra diagonal stroke on right vertical.
    • In Japanese, both occur in isolation; 斤 is also widely used in compound characters, while 斥 occurs in .
    • Contrast:
    • Notes: the first one has a rectangular border, while the second one has strokes that pass the border.
    • Primarily a composition error: when the characters appear as small components, they can be difficult to distinguish, as in (), which uses a variant of 耳 at lower left.
    • Notes: the first one has open top, the second one has almost closed top, and the third one has closed loop at top.
    • Primarily a composition error, particularly between Japanese and traditional Chinese.
    • Notes: the first one has no 十 on top, the third one has no 八 below.
    • Beyond confusion, there are various typographical differences, such as whether the top 目 connects to the bottom or is separate – see discussion at and .
    • Contrast: ;
    • Notes: the first one has one stroke 一 at top and bottom, and middle strokes extend past verticals, while the second one has two strokes 二 at top and bottom, and middle strokes stop at verticals (form a 日).
    • Primarily a composition error; in Japanese 垂 is used in isolation, and in compounds in the common character and rare character , while 重 is used in isolation, and in compounds in the common characters (and ) and , and in the rare character .
    • Contrast:

Simplified characters

    • Notes: the second one has a 冖 radical and the third stroke extends above it, while the first one has a 丶 on top.
  • 𫠣
    • Notes: the first three are standalone characters, while the fourth one is only used as a component (e.g. ).
    • Notes: the first one has 𦣻, while the second one has 𭥍.
    • Notes: also pronounced the same.

Components

Certain strokes do not appear, or do not often appear, as a distinct character, but are used as a component in other characters.

  • (as left radical, as in )
    • Notes: extra stroke on the second one; the first and second ones have long vertical that crosses bottom stroke, while the third one has short vertical that T-junctions at bottom stroke.
    • and are both extremely common radicals, while in Japanese common shinjitai, 牜 is only used for five characters: .
    • In cases where both 扌 and 土 occur as a radical, particularly in a less-used character, confusion is very common.
    • Contrast:
  • (first one same form as katakana )
    • Notes: the second one has an extra diagonal stroke in the lower right; it derives from (clothes), where the stroke is clearer.
    • This is one of the subtlest differences, as both components are frequently used as left radicals, where the small stroke is easily missed.
    • In Japanese, a rough guide is that the first one (礻) is used in simpler characters such as , while the second one () is used in more complicated characters such as , with the notable exception of .
    • Contrast:
  • (and )
    • Notes: the first one has no side strokes, the third one has a single stroke from top to bottom left, and all but the fourth one have a hook stroke (乚) at bottom right.
    • Beware also that while is a common radical (Appendix:Chinese radical/穴), the variant is also found (no top stroke, hook (乚) at right), notably in (whence , ) and .
    • This should also not be confused with , as in (in Japan), which has no side strokes, or , whose top does not extend.
    • The variant is also found, notably in .
    • Contrast: , ,
  • (as in and ) vs. the top of (also used in various compounds) vs. the top of / (as in )
    • Notes: differences include (“1 horizontal + 1 diagonal stroke” is actually a combined hook stroke):
      • – in the first one, 1 diagonal stroke on the left, 2 diagonal strokes on the right, 1 horizontal stroke on the left;
      • – in the second one, 2 diagonal strokes on the left, 1 diagonal stroke on the right, 1 horizontal stroke and 1 parallel diagonal stroke on the left, 1 horizontal stroke on the right;
      • – in the third one, 1 diagonal stroke on the left, 1 diagonal stroke on the right, and 1 single horizontal stroke across the top of both.
  • 𰃮 (; as in ) vs. (; as in ) vs. (as in ).
    • Notes: the middle stroke differs: in the middle stroke is diagonal and does not touch, while in the middle stroke is vertical and does touch; () is like the first one, but with an added stroke across the top.
    • All three are used in several characters.
    • The stroke order differs between these, which aids distinguishing them: in the middle stroke is the first one, while in the left stroke is the first one.
  • vs. (as in ; also or as a bottom)
    • Notes: in the second one, the second stroke is vertical and long.
    • Mostly a writing issue, not a confusion one.
    • 灬 is very common, being the bottom form of (fire).
    • The other is unusual, being seen in Japanese common shinjitai only in and .
    • When 心 is drawn as a bottom radical, the second stroke from the left is drawn longer, passing under the third stroke from the left, to avoid confusion with 灬.
    • Compare:
    • While is not likely to be confused with other components, it is the combining form of two radicals: and . These are distinguished by whether it is used on the left (form of ) or on the right (form of ), which distinction is necessary when looking up by radical.
  • (also )
    • Notes: top horizontal stroke extends to right in the second one; relatively different, with one stroke on top-left to bottom-right.
    • Mostly a composition error; note that the first one has 3 strokes, while the second one has 4.
    • Both appear with some frequency as components – Appendix:Chinese radical/夂 and Appendix:Chinese radical/攴 – but the second one () is much more common.
    • As a simple rule, occurs on the top and bottom as in and , while 攵 appears on the right, as in .
    • As small components, these may be confused with , which occurs widely – for example, in , the bottom left is , while the right is .
    • The phonetic component in (寉/隺) is graphically very similar to the component in Japanese shinjitai (simplified from ). Both contain as the bottom, but the top is slightly different.
  • (as in and ) vs. 𡗗 (as in and )
    • Notes: the first one has 2 horizontal lines (二) and 2 top diagonal strokes, while the second one has 3 horizontal lines () and no top diagonals.
    • Note also that where the last stroke (top-left to bottom-right, in lower right of character) starts varies between fonts – may start at any of the horizontal lines, yielding minor differences.
    • Mostly a composition error.
  • (approximately)
    • Notes: the first one has a long bottom horizontal stroke, while in the second one the bottom stroke stops at the vertical stroke at right. (The second component does not appear in isolation; the katakana is being used to represent it.)
    • This is a subtle composition issue, related to etymology: 彐 derives from , where the bottom horizontal stroke is more evident, while ヨ derives either from , where the right vertical stroke is more evident, or top of ().
    • 彐 is particularly found in , used in common Japanese characters and and the less common .
    • ヨ is found in the Japanese form of , found in the common characters , , and , and the less common , , and 耀, as well as and .
  • /
    • Notes: the first one has 4 strokes (2 upper ones), while the second pair is a 2-stroke ‘X’ shape; the third one has a on top.
    • Various shapes of the approximate form + (an ‘X’ shape/) exist; these resemble / but are unrelated etymologically.
    • The character is used in the common characters () .
    • The Japanese shinjitai character has a simplified left component which is not (the 6-stroke) 交, but rather a 4-stroke 亠 + 乂 (not ).
    • Mostly a composition error.
  • 𠂤 (as in ) vs. (as in )
    • Notes: the first one has a top stroke, while the second one has no top stroke (the third one has middle stroke in the center, rather than at left)
    • This is a minor composition error – the form with the top stroke is common, due to this being a productive phonetic, but the related does not have a top stroke, and a number of characters are derived from , which is unrelated (it is a simplification of ).
    • Writing either form with or without the top stroke has existed as a variant form since Bronze script.
    • Compare (but and ) with ( ) .

Complex characters

Some complex characters differ in a small component; the components may not be similar, but play a minor part in the overall character.

    • Notes: in the first one, in the second one.
    • Not a very subtle distinction, but these are both common characters in Japanese (grades 3 and 4, respectively), hence a common confusion.
    • Note the extra vertical stroke in the second one.
    • In the first one, the lower left is ; in the second one, it is .
    • In the first one, the bottom is 木; in the second one, the bottom is 十.
  • Derivatives of
    • Various derivatives of are quite complicated and hard to distinguish, because the additional strokes are in a small area at the bottom; see Appendix:Chinese radical/虍 for examples.
    • For example, in Japanese and are all common kanji.
    • The first one has on top left, while the second one has .
    • The first one has at bottom, while the second one has .
    • The first one has at bottom, the second one has , and the third one has (in all cases 𡗗 on top).
    • The first one has at bottom, while the second one has (both have on top).
    • The first one has on right, while the second one has .
    • The first one has at bottom, while the second one has .

Spoofing variants

The Unicode database has a special set of characters for these, called "spoofing variants". It defines spoofing variants as variants that are potentially used in bad faith to direct users to unexpected URLs, evade email filters, or otherwise deceive end-users. Determining whether or not two ideographs are spoofing variants is based entirely on the glyph shape, without regard for semantics. Etymologically unrelated pairs such as U+571F 土 and U+58EB 士 or U+672A 未 and U+672B 末 are considered spoofing variants. A common source of spoofing variants is deliberate confusion between Radicals 74 (⽉) and 130 (⾁). These two radicals, when used in Han ideographs, look very similar or identical (for example, in U+3B35 㬵 and U+80F6 胶). Similarly, even if the visual appearance of two radicals is distinct, they may be similar enough that a user might overlook the distinction (for example, ⼎ and ⺡), especially in a spoofing context such as https://凊水.org/ versus https://清水.org/. Spoofing variants also include instances where two highly similar shapes are separately encoded because of source code separation, without regard to other considerations. Cases include the following pairs: U+672C 本 and U+5932 夲; U+520A 刊 and U+520B 刋.
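
The pairs quoted above can be inspected programmatically. The following is a minimal Python sketch using only the standard `unicodedata` module; the helper name `describe` and the selection of pairs are illustrative assumptions, not part of the Unicode report. Each character is built from the code point cited in the text, so the listing does not depend on how the glyphs render in a particular font.

```python
import unicodedata

# Pairs quoted in the paragraph above, given as code points.
PAIRS = [
    (0x571F, 0x58EB),  # 土 / 士
    (0x672A, 0x672B),  # 未 / 末
    (0x3B35, 0x80F6),  # 㬵 / 胶
    (0x672C, 0x5932),  # 本 / 夲
    (0x520A, 0x520B),  # 刊 / 刋
]


def describe(cp: int) -> str:
    """Format one code point as 'U+XXXX <char>' (hypothetical helper)."""
    return f"U+{cp:04X} {chr(cp)}"


for a, b in PAIRS:
    print(describe(a), "vs.", describe(b), "-> same character?", chr(a) == chr(b))

# Kangxi radical n is encoded at U+2F00 + (n - 1). These radical characters
# carry compatibility mappings to ordinary ideographs, visible through NFKC.
for n in (74, 130):
    radical = chr(0x2F00 + n - 1)
    unified = unicodedata.normalize("NFKC", radical)
    print(f"Radical {n}: {describe(ord(radical))} -> NFKC -> {describe(ord(unified))}")
```

However similar the glyphs look, every pair compares as unequal, which is why software treats them as entirely different strings.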

Some spoofing variants might be sufficiently dissimilar in shape that they can be distinguished at large point sizes. Others are dissimilar in meaning so that they can be distinguished in running text. They might also be visually distinct in one font but not another, due to the language or region that the font supports. These considerations are irrelevant to their status; even dissimilar pairs can be used to misdirect users (particularly when URLs are displayed at small point sizes).
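
As a rough illustration of why such pairs matter for URLs, the sketch below converts the two host names from the example above into their ASCII (Punycode) forms with Python's built-in `idna` codec. That codec implements an older version of IDNA, so its behaviour is an assumption to check against whatever browser or resolver is actually in use; the point is only that the two visually similar names are distinct at the protocol level.

```python
# The two look-alike host names from the example above.
hosts = ["凊水.org", "清水.org"]

# Python's built-in "idna" codec converts each label to its ASCII form.
ascii_forms = [h.encode("idna") for h in hosts]

for host, ascii_form in zip(hosts, ascii_forms):
    first_cp = f"U+{ord(host[0]):04X}"
    print(host, "first character", first_cp, "->", ascii_form.decode("ascii"))

# The names differ in their encoded form even if they look alike on screen.
print("Identical on the wire?", ascii_forms[0] == ascii_forms[1])
```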

The list contains about 300 pairs of spoofing variants. Further reading: https://www.unicode.org/reports/tr38/index.html#Spoofing
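
A list of such pairs lends itself to a simple look-alike check: map every character to one representative of its pair and compare the results. The following is a minimal sketch of that idea, not the procedure defined by Unicode. The table `SPOOF_PAIRS` is a hypothetical stand-in containing only pairs quoted in this section, and the function names are invented for illustration.

```python
# Hypothetical stand-in for the real list: only pairs quoted in this section.
SPOOF_PAIRS = [
    ("土", "士"),
    ("未", "末"),
    ("本", "夲"),
    ("刊", "刋"),
    ("凊", "清"),  # taken from the URL example above
]

# Map each character to the first member of its pair.
REPRESENTATIVE = {}
for first, second in SPOOF_PAIRS:
    REPRESENTATIVE[first] = first
    REPRESENTATIVE[second] = first


def skeleton(text: str) -> str:
    """Replace every listed character with its pair's representative."""
    return "".join(REPRESENTATIVE.get(ch, ch) for ch in text)


def looks_alike(a: str, b: str) -> bool:
    """True if two different strings collapse to the same skeleton."""
    return a != b and skeleton(a) == skeleton(b)


print(looks_alike("凊水.org", "清水.org"))
print(looks_alike("清水.org", "清水.org"))
print(looks_alike("清水.org", "清水.com"))
```

Only the first comparison is flagged: the second compares a name with itself, and the third differs in characters that are not in the table. A real check would load the full list rather than this handful of pairs.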
