Wrong pinyin pronunciation because of Tone Sandhi rule

I just showed my Chinese friend my progress in pronunciation, going through the first 100 words I learned. He was impressed with the progress I made! But he also pointed out a common mistake with the tones in HanziHero. Here are some examples:

  • 不大 bùdà should be bú dà
  • 不用 bùyòng should be bú yòng
  • 不太 bùtài should be bú tài
  • 一边 yībiān should be yìbiān

Online I read that this is called tone sandhi: when 不 is followed by a fourth tone (e.g., 大, 用, 太), it changes from bù (fourth tone) to bú (second tone).

As far as I understand, dictionaries may show bù, but when spoken it is bú, and it is incorrect to say bù there (it is also hard to say two fourth tones one after the other).

I don't know if there is a reason HanziHero writes it like this or if it is a mistake. Any thoughts? Perhaps it is worth adding a clarification for this on the related characters, like having a ‘dictionary’ pronunciation and a ‘spoken’ pronunciation. It seems there are some systematic rules to this that can be learned, too.

Some rules:

  • When two third-tone (T3) syllables occur consecutively, the first one changes to the second tone (T2) for smoother pronunciation.
  • 不 (bù, fourth tone) changes to the second tone (bú) when followed by another fourth tone (T4).
  • 一 (yī) changes to yí (second tone) before fourth tones and to yì (fourth tone) before first, second, and third tones.
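
As a rough illustration, the three rules above could be sketched in Python like this. The function name `apply_tone_sandhi` and the (hanzi, numbered pinyin) input format are assumptions made up for this example, not anything HanziHero is known to use. The sketch only looks at adjacent pairs and ignores exceptions (for instance, 一 keeps its first tone when used as a number or ordinal, and longer runs of third tones are more subtle), so treat it as a simplification.

```python
def apply_tone_sandhi(syllables):
    """Return the spoken tones for a word.

    syllables: list of (hanzi, numbered_pinyin) pairs holding the
    dictionary tones, e.g. [("不", "bu4"), ("用", "yong4")].
    This input format is hypothetical, chosen only for the example.
    """
    # Start from the dictionary (written) tones.
    spoken = [pinyin for _, pinyin in syllables]

    for i in range(len(syllables) - 1):
        hanzi, pinyin = syllables[i]
        base, tone = pinyin[:-1], pinyin[-1]
        # Dictionary tone of the following syllable.
        next_tone = syllables[i + 1][1][-1]

        if hanzi == "不" and tone == "4" and next_tone == "4":
            # 不 before a fourth tone: bù -> bú
            spoken[i] = base + "2"
        elif hanzi == "一" and tone == "1":
            if next_tone == "4":
                # 一 before a fourth tone: yī -> yí
                spoken[i] = base + "2"
            elif next_tone in ("1", "2", "3"):
                # 一 before a first, second, or third tone: yī -> yì
                spoken[i] = base + "4"
        elif tone == "3" and next_tone == "3":
            # Two third tones in a row: the first becomes a second tone.
            spoken[i] = base + "2"

    return spoken
```

For example, `apply_tone_sandhi([("不", "bu4"), ("用", "yong4")])` gives `["bu2", "yong4"]`, and `apply_tone_sandhi([("一", "yi1"), ("边", "bian1")])` gives `["yi4", "bian1"]`.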

This could perhaps be shown with a popup when hovering over the word, or with something like a :speaking_head: icon … just some ideas. I think it is helpful to point this out to the learner.

We have a documentation page that outlines our stance on this, which is in line with the way pinyin transcription is done in all Chinese->English dictionaries.

As you point out, in most cases of tone sandhi the actual pronunciation differs from the pinyin transcription. For example, if you look up 不用 in nearly any Chinese->English dictionary, the headword pronunciation will be “bu4yong4”. Likewise, in any sentences that also have pinyin, it will be transcribed as “bu4yong4”.

However, for the audio generated, we have a list of exceptions for these cases to ensure that it is generated in a way that reflects its actual pronunciation. E.g., for “bu4yong4” the audio that plays is indeed “bu2yong4”.

I think there are things we can add here to reduce confusion for those new to this concept:

  1. Block e.g. “bu2yong4” as an answer, with a pop-up indicating that one should not correct for tone sandhi in the transcription.
  2. Accept the tone sandhi as an answer with a note.
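
To make the difference between the two options concrete, here is a minimal sketch. Everything in it is hypothetical: the function name `check_pinyin_answer`, the `accept_sandhi` flag, the status strings, and the note text are invented for illustration and do not describe how the site actually checks answers.

```python
def check_pinyin_answer(answer, written, spoken, accept_sandhi=False):
    """Compare a typed answer against the written and spoken pinyin.

    written: dictionary transcription, e.g. "bu4yong4"
    spoken:  pronunciation after tone sandhi, e.g. "bu2yong4"
    accept_sandhi: False models option 1 (block and ask again),
                   True models option 2 (accept with a note).
    Returns a (status, note) tuple; all values are hypothetical.
    """
    # Ignore spacing and letter case in the typed answer.
    normalized = answer.replace(" ", "").lower()

    if normalized == written:
        return ("correct", "")

    if normalized == spoken and spoken != written:
        note = (
            "This is the spoken form after tone sandhi; "
            "the written pinyin is " + written + "."
        )
        # Option 2 accepts the answer, option 1 asks for another try.
        return ("correct_with_note" if accept_sandhi else "retry", note)

    return ("incorrect", "")
```

With the default setting, typing “bu2yong4” for 不用 returns a `"retry"` status together with the note, rather than being marked wrong outright.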

I think we have an item for the first one, which I can see us doing at some point for special cases like 不 and 一.

(edit: here is another thread of a user encountering the same issue: (Feature Request) "Phonetic" tones as alternatives to "written" tones)

Thanks for the answer! And I agree it would be nice to have this feedback, as recognizing tones by ear is not a beginner's best trait.