Will vocabulary with different readings ever be a part of the curriculum?

I noticed today while learning 睡 that 睡觉 is not in the curriculum, and also realized that the second character’s reading here (jiao4) is different from 觉 as in 听觉 (jue2). This sort of thing tripped me up when I saw 都 in a graded reader: the context was something like 我们都, and I realized it’s actually dou1 here, not du1.

What is the long term plan for cases like this? This seems like a pretty big gap in the curriculum if only same pinyin readings are allowed.


Right now there are no immediate plans to include different readings as part of the curriculum.

The main purpose of HanziHero, as we see it, is to get users familiar with the main meaning and pronunciation of each character. The components we teach, the sound mnemonics we have, and the vocabulary we add are all toward that end. While the vocabulary we choose is from the HSK to maximize its usefulness, there is no intent on our end to teach all vocabulary in the HSK (~10k), as doing so would detract from the purpose of the application.

For 都 specifically, I was mulling over changing it to dou1 as that is the more common pronunciation/usage.

Teach more than one pronunciation per character?

Now, to play devil’s advocate, let’s assume we do add this to the application. Once we do, the line becomes blurred as to when we teach it and don’t.

For something like 都, do we teach both du1 and dou1 and the related meanings? Now when we do, the user would need to remember both (and their divergent meanings) when seeing any new or existing vocabulary that includes that character.

Additionally, the way the application works currently would have to be changed, as there is no way to reliably quiz a user on both possible meanings/pronunciations without some sort of indicator as to which is which.

But what about the case of 沒 mei2 which actually can also be mo4 with a different meaning? Should that not be taught as well? I believe it is buried in the HSK somewhere, after all.

Here’s a particularly devilish case: 著 is mostly zhe5 but can also be: zhao2 zhuo2 zhu4 zhao1 depending on context.

So we can see that teaching multiple meanings/pronunciations is not great for individual characters. But what if we just do it within our vocabulary instead?

One character pronunciation, but vocabulary contains exceptions?

As in the case you point out in 睡覺/睡觉, we could teach this word and thus the different pronunciation/meaning of this character as well. But I don’t think this helps as much as one would think.

To begin with, each vocabulary word in HanziHero currently:

  1. Has a meaning that is mnemonically-related to the meaning of the character.
  2. Has pinyin that contains the pinyin of each individual character learned, with the exception of when it is neutralized.
  3. The mnemonic included relies on both of these facts to make understanding and remembering the vocabulary easy.
  4. The system relies on this fact to make it possible to remember the pronunciation of the character if one can remember the pronunciation of a word that contains a non-neutralized form of it (I rely on a similar method when reading Chinese every day).

Once we include exceptions as well, all of these nice attributes are lost, which is why we do not teach words that are exceptions to the rule.
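
To make the rule in point 2 concrete, here is a rough Python sketch of the kind of check it implies. This is not HanziHero's actual code: the function names, the numbered-pinyin format with 5 for the neutral tone, and the assumption that jue2 is the taught reading of 觉 are all illustrative.

```python
# Hypothetical sketch: none of these names come from HanziHero itself.
NEUTRAL_TONE = "5"  # assumed notation, as in "zhe5"


def strip_tone(syllable: str) -> str:
    """Drop a trailing tone digit, e.g. 'jiao4' -> 'jiao'."""
    return syllable.rstrip("12345")


def follows_rule(word_pinyin: list[str], taught_pinyin: list[str]) -> bool:
    """Return True if each syllable of the word matches the taught reading
    of its character, allowing a syllable to be neutralized (tone 5)."""
    if len(word_pinyin) != len(taught_pinyin):
        return False
    for in_word, taught in zip(word_pinyin, taught_pinyin):
        if in_word == taught:
            continue
        neutralized = (
            in_word.endswith(NEUTRAL_TONE)
            and strip_tone(in_word) == strip_tone(taught)
        )
        if not neutralized:
            return False
    return True


# 听觉: both syllables match the taught readings.
print(follows_rule(["ting1", "jue2"], ["ting1", "jue2"]))
# 东西: second syllable is only neutralized, which the rule allows.
print(follows_rule(["dong1", "xi5"], ["dong1", "xi1"]))
# 睡觉: jiao4 is a different reading of 觉 (assumed taught as jue2).
print(follows_rule(["shui4", "jiao4"], ["shui4", "jue2"]))
```

Under this check 听觉 and 东西 pass while 睡觉 does not, since jiao4 is a different reading rather than a neutralized one.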

Shouldn’t everything be taught, though?

We would certainly like to teach everything, but simply can’t. In fact, for earlier versions of HanziHero we scoped out the idea of being able to add custom vocabulary, have custom prioritization lists, etc. But we found that the more we expanded the scope of HanziHero, the more complicated and thus less useful it became.

Learning Chinese characters is the most laborious aspect of learning Chinese, and we take great care to make it as easy as possible - even enjoyable - and remove avoidable diversions along the way.

Any language curriculum or application will only ever be able to cover a subset of the language it teaches. The depth of any language is nearly infinite, after all. At HanziHero, we are intentional about the subset we teach, to maximize the value provided by what we do teach. We do not include some common vocabulary, but on the flip side we plan to include nearly 5k characters in the final scope of things (for traditional we currently “only” have 3.8k), whereas the structured curricula I’m aware of almost never go beyond the ballpark of 3k (despite the average literate Chinese knowing 4-5k).


I guess I never understood why some apps are characters-first, e.g. Heisig; it seems like a strange choice language-learning-wise. At the end of the day, surely people use this app to learn to read Mandarin. Drilling through reviews every day ingrains the reading reflex into you so it becomes second nature. I could be woken up half asleep at 4am and if I saw 好看, I would immediately and involuntarily see hao3kan4, hear robot lady saying it in my mind, and think of “good looking”. With characters only, you get decently far, but the complexity of multisyllabic word morphology is a huge obstacle. E.g. 左右: I don’t think anybody would guess “approximately”.

You could argue that vocab doesn’t help you actually acquire usage conventions or grammar, but those are completely outside of the model.

How so? What better way to reinforce character readings/meaning than through vocab.

Why not take the WK approach of just having an alternative reading? In the HH model, you could have an accepted alternative pronunciation that you don’t worry about teaching with mnemonics. I don’t know about other people, but when doing WK I would alternate the answers on purpose.
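
Roughly what I have in mind, as a sketch only (the function and its return labels are made up, not how WK or HH actually grade answers):

```python
# Hypothetical answer check: one primary reading that is taught with
# mnemonics, plus optional alternatives that are accepted but not taught.
def check_reading(
    answer: str,
    primary: str,
    accepted_alternatives: frozenset[str] = frozenset(),
) -> str:
    """Classify a typed reading as 'correct', 'accepted', or 'wrong'."""
    normalized = answer.strip().lower()
    if normalized == primary:
        return "correct"
    if normalized in accepted_alternatives:
        return "accepted"
    return "wrong"


# 觉 with jue2 as the (assumed) primary reading and jiao4 as an alternative.
print(check_reading("jue2", "jue2", frozenset({"jiao4"})))
print(check_reading("jiao4", "jue2", frozenset({"jiao4"})))
print(check_reading("jiu4", "jue2", frozenset({"jiao4"})))
```

The alternative is never quizzed directly here; it is only not marked wrong when typed.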

But I don’t think this helps as much as one would think

Well the help would be that the learner would know the word 睡觉. Those points seem to be about the loss of simplicity/neatness of the system.

I agree that including more than one reading on a given card should be in place as on WaniKani (same deal as the meanings)… don’t think it hurts anything at all :slight_smile:

That being said, oftentimes the first one is the one that gets learnt and the others go by the wayside for me, or the one with the best mnemonic or the one that is easiest takes priority (same with the meanings too), which is one of the downsides of the WaniKani approach.

Here is what I did in my Spanish deck. I’ll use the word “como” as an example, as it has multiple meanings and all of them are super common (all forms are probably in the top 500 spoken words). I’ll show it from the meaning perspective, but it would of course be the inverse for the “reading”.

Card one:

Front of the card: “Reading”
Como (verb)

Back of the card: Meaning
I Eat (Verb)

Card two:

Front of the card: “Reading”
Como (Comparison)

Back of the card: Meaning
Like (comparison)

Card 3:

Front of the card: “Reading”
Cómo (question)

Back of the card: Meaning
How (Question)

All of them very much need to be learned, which is why I broke them up instead of bundling them all into one card like WaniKani would. Maybe not the best example (given my deck is Spanish and based on word frequency, all forms), but this is how I came to realize how hopeless Duolingo was: even up to level 20 on the Spanish course they only ever taught the “I eat” meaning, which is the least common of the three above, and I was none the wiser. That made me rethink WaniKani’s route on the Japanese front too.
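
For what it’s worth, the split looks something like this as data. A sketch only: the `Card` class and `make_cards` helper are hypothetical and not tied to Anki or any other SRS tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    front: str  # what is shown, including a hint naming the sense
    back: str   # the meaning expected for that sense


# One entry per sense: (written form, hint, meaning).
SENSES = [
    ("Como", "verb", "I eat"),
    ("Como", "comparison", "Like"),
    ("Cómo", "question", "How"),
]


def make_cards(senses: list[tuple[str, str, str]]) -> list[Card]:
    """Build one card per sense instead of one card listing every meaning."""
    return [
        Card(front=f"{form} ({hint})", back=f"{meaning} ({hint})")
        for form, hint, meaning in senses
    ]


for card in make_cards(SENSES):
    print(f"{card.front} -> {card.back}")
```

The hint on the front is what makes each sense separately quizzable; without it, the two “Como” cards would be indistinguishable.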

Which is why you should learn words in context from comprehensible input sources such as TV shows and podcasts. Now the next time you see the characters.

Remember, we’re talking about the scope of this application. You cannot learn characters this way. At all.

I don’t disagree with the value of learning vocab. I’m just saying this application is not the place for it.

This is a false dichotomy. The choice is “many characters, little vocab” vs. “few characters, much vocab”. We’re trying to solve a problem given less than infinite time.

If you were to flip the comparison completely on its head, how would that even work? Vocab with no characters? I remember those people. Bless their hearts.

I’m not advocating against vocabulary, but you do realize that flashcarding all the HSK vocabulary would easily amount to hundreds of additional hours of numbingly boring studying time, and easily 1 hour+ flashcard sessions a day. I just struggle to see how that’s realistic.

The brain is very good with words, and I know many words just from hearing/speaking them, not from drilling them with flashcards. The problem with characters is that the language “expects” you to know all of them ahead of time, because that’s how it works. It’s pretty much impossible to reliably infer pronunciation and meaning from an unknown character.

This is an overly broad statement about Chinese morphology. While it may be true that most two-character words are in some way related to the morphemes from each character, it’s not always obvious what the meaning of the word is. You also have the problem of not knowing which morpheme(s) of the character to use. For example, in 男子, 子 is playing more of a nominalizing grammatical role than conveying the meaning “child, son”. Hence the word means “a man”, not “a boy”.

It’s more likely to help you learn the meanings of words by studying the roots than to actually learn the meanings of the words themselves?

  1. I seriously doubt the marginal effort to learn a word is the same as to learn a character.

  2. Learning vocab reinforces meaning. You may have seen the experiment where someone is given a list of 20 random words to memorize vs. given 10 pairs of antonyms.

  3. If you were to only learn characters, text would be incomprehensible.

  4. If you really think it’s better to learn characters, why should we include vocabulary at all in the application?

Yes, if you learn a strict subset of something, it will be trivially faster.

Personally I really agree with @lorentz. I remember this topic coming up in this post too: Definitions for functional (grammar) Hanzi - #10 by damia

While I understand limiting the scope and the focus on characters (especially when the whole vocabulary part is still very underdeveloped, I feel I’m learning words as they get added already :sweat_smile:), I think learning more common words in which the hanzi appears is very useful, especially if it involves different meaning and pronunciations.
I don’t consider that I have really learned a hanzi if I don’t know some of its most common meanings and pronunciations, so to me not learning that feels like a pretty big gap in the curriculum too.

Other than that, I do think that the ultimate goal of learning hanzi is learning how to read and for that you need to learn as many words as possible. I understand this is about characters, but the process of learning new words in these steps (component->hanzi->word) is so efficient that it’s a bit of a shame to leave it half-way. Even if it’s more efficient to learn vocabulary through reading/watching tv, having it here wouldn’t hurt at all.

It also makes the studying of new items more palatable: learning a new character takes a lot of effort, while learning a new vocabulary out of characters you already know not so much. If the proportion is similar to the one in WK, in which you have 3x vocabulary compared to characters, new lessons are less overwhelming because it’s not all completely new stuff.

My experience with WK changed from “This is really great” at lvl 30 to “So many words I never use…” at lvl 55, at which point I decided to downgrade (I mean, it’s useful to know how to say “urinary organs” in Japanese but I don’t encounter the term much). I find that I need to start using new words right away to ensure they feel useful. If I lived in Japan where I’d be forced to actively use the new vocabulary, I am sure I’d feel differently, but I was not, and my daily consumption of Japanese wasn’t up to the task.

Same with Chinese as I learn better by finding new words while reading: memorizing them separately doesn’t work very well for me. Actually, I found it rather refreshing that I could scoot to hanzi #1000+ without a burden of learning the associated 3x1000 words, unlike in WK. Reviewing doesn’t feel that onerous - so far. (Now, of course, good people of HH throw lots of new words at me :sweat_smile: but most of them I already know anyway.) For me right now, HH is a tool to quickly learn as many hanzi as possible to move into the media consumption ASAP, and learn everything in context. But I can see how some people will find it efficient to use HH as a learning tool for words as well; makes sense.


I don’t think I agree with this. To me this indicates that, with your daily consumption of Japanese, there was no need for you to reach that WK level at that point.

If the system works correctly, it should teach you the most relevant words for a specific character. If you find that you’re not running into these words, it means you didn’t really need to learn that character.

I definitely agree with moving to media consumption ASAP, but that’s not incompatible with also getting a complete understanding of the character’s most important words here.
And I feel the issue would be the same: if you learn tons of characters in HH and move to media consumption but these characters are not coming up because they appear in words like “urinary organs”, it will be as useless as having learnt the word, won’t it?

Oh yes, absolutely. There is a lot of merit in moving a bit slower and learning how to use those learned hanzi. However, in Japanese, one can start reading something interesting early on thanks to an active use of hiragana in versions for children. 魔女の宅急便 series of books, for example, are very readable with ~1k Kanji. But my limited experience with Chinese shows that I can’t find much interesting to read if my hanzi count is below 2k. Hence my plan to get above that, and then slow down. Which comes down to personal preferences.

I’m not sure what your overall study plan is, but let me put it this way: imagine a learner learned 5 characters a day for 4 months (~600 characters) and 0 multisyllable words, and attempted to read a graded reader. I think a reasonable text, at least in terms of character coverage, is Mandarin Companion’s adaptation of H.G. Wells’s “The Country of the Blind”. It’s ~10,000 characters long but only uses 300 unique characters, presumably the vast majority of which are very common.

I would expect the learner to struggle pretty greatly here if they legit never studied vocab. Looking at the first paragraph of the sample, sure, there are a lot of words that are self-explanatory, like 女人. But many words would require some thought, like 往前. And there’d be other words like 好像 which I doubt anyone would understand without looking them up or mass input. In effect, this text is not really comprehensible. Now, people have tried and succeeded at learning second languages with a lot of incomprehensible input that became comprehensible over a long period of time. But that’s horribly inefficient.

Fair point! I did study vocab before HH. I went through a few basic textbooks, and learned ~500 (800?) words before starting on graded readers. It was enough to start with the MC breakthrough level (150 characters) readers. They were pretty readable. From that point on, moving to the rest of their readers, and beyond, it was all comprehensible input for me.

So, I agree that one needs to learn words rather extensively in the beginning, to jumpstart the learning process. The question is whether one should use HH for that. If someone uses HH as a primary resource for learning Chinese, then yes. In my opinion, it’s OK to do that in the beginning. However, after the number of words to learn exceeds a critical mark (that can be different for different people, but for me it would be ~ 500-1000 words), learning them in (relative) isolation is no longer efficient. One should start reading/watching/listening and finding new words in the wild, according to their interests.

Yeah good point. I was thinking that the characters aren’t very useful without words, but you’re right, words aren’t very useful without context. Honestly maybe it’s just up to preference - I was hoping to learn vocab in HH so that I could more easily acquire them in the wild. It feels nice to open a graded reader and have the vocab (mostly) down pat. That way I can focus on the phrase and sentence level structures.

A month later and I have a different take on this.

  • The SRS almost perfectly covers your needs for reading individual characters
  • Vocab is completely different, the reading part is only a minority of the difficulty compared to how to use the words, and the SRS will never teach you that
  • So it’s okay that Vocab cards mostly serve as a “hey this word exists, please vaguely understand it so you can pick it up in immersion”.

That said, I still think custom vocab + pronunciations would be a huge game changer. And then just don’t worry about the mnemonics. Basically I want the vocab equivalent of “prioritize character”, and because it’s infeasible to add in every word/phrase in the system, just let us decide. Basically, I want this to replace Anki completely, because the interface is nice + typed input.


Thanks for the additional thoughts.

Yeah, right now the vocab mainly serves the end of helping users better remember the primary meaning/pronunciation of characters. I can definitely see the utility in people being able to add their own custom vocab on top of that. I myself don’t really like Anki (I only used it briefly years ago during an ill-fated attempt at going through Heisig’s Remembering the Hanzi), so I understand the appeal of having one SRS system to do it all. :+1:


I was going to write something similar.

At this point, my biggest pain-point is that when reading or listening to stuff I run into many words that I already know the characters for, which is a great feeling, but without adding them into some sort of constant practice (like SRS) I forget them.

I currently have an Anki list to which I add all these words (the ones I’ve already learned the characters for), but it’s not a system I’m enjoying: I really prefer typed input, and I’d definitely prefer to have all my SRS practice in the same place.

The only other solution I was thinking about was getting Hack Chinese, but it still doesn’t solve having two SRS systems and, to be honest, it’s quite expensive.

A custom vocab list here would be the perfect solution to this. I’d happily pay a bit more for this option too, since it saves using another system.