Card splitting and other feature requests

FilmLilyDawnRoot · July 24, 2024, 2:51am

I’ve been using Hanzi Hero for a month. Overall I am very happy! It’s much more engaging than Anki flashcards. Here are a few suggestions for minor improvements.

Split the SRS stage for the sound and meaning versions of a word. This is considered best practice in the SRS community.
Allow the user to turn off learning one or the other.
Sound and meaning cards are still not distinct enough. Answer validation on the meaning card should at least block any valid pinyin response (not just the correct pinyin). On the sound card block any answer without a number. Using a different color could help.
Once cards are split it immediately becomes clear that there are a few redundant cards where the component and character have the same keyword.
When an HSK level is skipped, only use that to mark items skipped. Continue to use the earliest unlearned components/characters/words in lessons if they get unskipped.
Add pictures for each actor, set, and object. E.g. rather than a description “It looks just like a person walking, from the side”, overlay 人 on such an image.
To take this a step further, you can overlay a character on top of the images for each of its components.

Some more speculative stuff

Many words are AB with meaning A or meaning B. Can you just teach this rule? Is it possible to create a mnemonic for this?
There are a decent number of exact sound components. Can you avoid creating extra mnemonics for these somehow?
To take the images another step further, you can arrange the images for actor, set, and components to show the full mnemonic.
Can you include part of speech?

Also a bug

The first word in a lesson doesn’t play sound automatically.

kevin · July 24, 2024, 11:08pm

Thanks for the detailed feedback!

Split the SRS stage for the sound and meaning versions of a word.

I understand why some people would have this view. It comes from the view that the “pronunciation” and “meaning” are two different things (they are separate ‘questions’ in our application after all!), so they should be scheduled differently as a result.

However, I think there is utility to grouping them together. In other words, defining the “smallest meaningful unit of information learned” when it comes to words and characters as BOTH the pronunciation and meaning together.

I think this is best illustrated by drawing a parallel to learning English, by supposing a Chinese speaker is learning the word “though”. The meaning in Chinese is 雖然, and the English pronunciation using the phonetic approximation found in some dictionaries is “thoh”. I’d argue that someone learning English doesn’t really know “though” if they pronounce it “thouf”, as that is completely unintelligible. They also don’t know it if they don’t know the meaning is 雖然.

I hope this demonstrates the benefits of keeping the two tied together. The fact that getting one wrong will make you have to answer the other one a bit more than is ‘necessary’ actually has a benefit of making sure you internalize both fully.
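To make the trade-off concrete, here is a minimal sketch of the two scheduling models being discussed. The class names, the one-stage-up / one-stage-down rule, and the stage numbers are illustrative assumptions, not HanziHero's actual scheduler.

```python
from dataclasses import dataclass


def _step(stage: int, correct: bool) -> int:
    """Hypothetical stage rule: up one on success, down one on failure."""
    return stage + 1 if correct else max(0, stage - 1)


@dataclass
class JointItem:
    """Meaning and sound share a single SRS stage (the grouped model)."""
    stage: int = 0

    def review(self, meaning_ok: bool, sound_ok: bool) -> None:
        # The item only advances when BOTH questions are answered correctly.
        self.stage = _step(self.stage, meaning_ok and sound_ok)


@dataclass
class SplitItem:
    """Meaning and sound are scheduled independently (the requested model)."""
    meaning_stage: int = 0
    sound_stage: int = 0

    def review(self, meaning_ok: bool, sound_ok: bool) -> None:
        # Each question moves its own stage, regardless of the other.
        self.meaning_stage = _step(self.meaning_stage, meaning_ok)
        self.sound_stage = _step(self.sound_stage, sound_ok)


joint = JointItem(stage=3)
joint.review(meaning_ok=True, sound_ok=False)

split = SplitItem(meaning_stage=3, sound_stage=3)
split.review(meaning_ok=True, sound_ok=False)
```

With the same review (meaning right, sound wrong), the joint item drops back as a whole, while the split item advances the meaning and drops only the sound. That difference is exactly the extra repetition described above.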

Sound and meaning cards are not distinct.

Yeah, I hear you. One thing we’ve had in the backlog is to give the pronunciation question the placeholder text pin1yin1, which should make the difference between the two clearer. We’ll look into adding this soon.

(meaning should block on any pinyin at all)

I can see us adding that as an advanced setting that can be enabled, similar to what we do for the “Enable pinyin answer validation” setting (see the HanziHero Docs).
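As a rough sketch of what such input checks could look like: the function names and the regular expression below are assumptions for illustration only, and the pattern is a shape heuristic for numbered pinyin rather than a lookup against the real syllable table.

```python
import re

# Heuristic only: one or more runs of letters, each followed by a tone digit 1-5,
# optionally separated by spaces (e.g. "pin1yin1", "ni3 hao3").
# It does NOT verify that each syllable is real pinyin.
NUMBERED_PINYIN = re.compile(r"(?:[a-zü]+[1-5]\s*)+", re.IGNORECASE)


def looks_like_numbered_pinyin(answer: str) -> bool:
    """True if the whole answer has the shape of tone-numbered pinyin."""
    return NUMBERED_PINYIN.fullmatch(answer.strip()) is not None


def accept_meaning_input(answer: str) -> bool:
    """Meaning question: refuse anything shaped like numbered pinyin."""
    return not looks_like_numbered_pinyin(answer)


def accept_sound_input(answer: str) -> bool:
    """Sound question: refuse any answer that contains no tone number."""
    return any(ch in "12345" for ch in answer)
```

A refused answer would be bounced back for re-entry instead of being graded as wrong. Note that this only catches numbered pinyin on the meaning question; blocking unnumbered pinyin is harder, because strings such as “man” or “he” are both English words and valid pinyin syllables.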

When an HSK level is skipped, only use that to mark items skipped. Continue to use the earliest unlearned components/characters/words in lessons if they get unskipped.

Can you expand on this a bit more? The way it should work is that when an HSK level is skipped, it marks all of the words/characters in that level as skipped, but one can still unskip any words/characters within the levels skipped by navigating to its subject page and clicking the “skip” button to make it unskipped. To ensure that it shows up as soon as possible, one can go one step further and “prioritize” it as well.

I’m not sure if this is not clear in our documentation, or if you have encountered a bug that leads to behavior contrary to what I described above.

You have some other great questions/feedback that I also plan to answer, but to keep my response a reasonable size I’ll save that for another time.

FilmLilyDawnRoot · July 25, 2024, 1:25am

Ever heard someone pronounce “epitome” as /epitoum/? It’s not that strange to learn meaning without sound by natural reading. Learning sound without meaning has a less clear analog but e.g. kana characters wouldn’t be treated as having a meaning. There are also multiple senses of a word, in some cases quite distinct. Why choose to put only one sense on a card?

The goal in my mind is simply to most effectively get the user to fluency. I think splitting cards would be a tiny bit more efficient. This isn’t a big deal if it’s a lot of engineering work, but if you think it’s worth investigating then you could always AB test.

Also an idea I forgot earlier - Characters are pronounced identically in most words where they appear. Why have a card for the pronunciation of a word at all? Or if I fail that card, you should at least trigger failure on the character card as well.

I did one lesson, then skipped HSK levels 1 and 2, then switched to Traditional, then unskipped over 200 individual characters. Despite unskipping, I receive characters exactly in lesson ordering from number 600. Currently I’ve learned all characters from 600-705. Words behave similarly. I thought this could be due to generating the items for a rank all at once and only reevaluating at the end, but I’ve increased my rank and still get this.

kevin · July 25, 2024, 2:22am

Thanks for the additional details about the unskipping behavior. I filed a bug for this, and I think I’ve found the root cause which we will look into.

Yeah, part of it is definitely the amount of engineering work involved to split the cards entirely. I can see the rationale for wanting to split them, but we don’t have any plans to do so currently.

As to the rest of your questions, I hope to answer them in a reply later this week.

kevin · July 25, 2024, 4:57am

We’ve found and issued a fix for the skipping issue you outlined. Items that are “unskipped” which have an HSK level that is below the configured “HSK skipped level” will now properly show up in lessons in the expected order. Thanks again for the report!
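For anyone following along, the intended behavior can be sketched roughly as below. The `Item` fields, the `skipped_through` parameter, and the "level at or below the threshold" comparison are all assumptions made for illustration; this is not the actual fix.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Item:
    text: str
    order: int                 # position in the default lesson ordering
    hsk_level: Optional[int]   # None for items outside the HSK lists
    unskipped: bool = False    # user manually unskipped it on its subject page
    learned: bool = False


def is_skipped(item: Item, skipped_through: int) -> bool:
    """An item is skipped if its HSK level is covered by the skip setting,
    unless the user has explicitly unskipped it."""
    if item.unskipped:
        return False
    return item.hsk_level is not None and item.hsk_level <= skipped_through


def next_lessons(items: List[Item], skipped_through: int, count: int) -> List[Item]:
    """Earliest unlearned, non-skipped items in the default lesson order."""
    pending = [i for i in items if not i.learned and not is_skipped(i, skipped_through)]
    pending.sort(key=lambda i: i.order)
    return pending[:count]


items = [
    Item("一", order=1, hsk_level=1),
    Item("人", order=2, hsk_level=1, unskipped=True),
    Item("龍", order=600, hsk_level=None),
]
queue = [i.text for i in next_lessons(items, skipped_through=2, count=2)]
```

In this sketch the manually unskipped 人 comes up before item 600, even though its HSK level is covered by the skip setting, while 一 stays skipped.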

kevin · August 4, 2024, 12:39am

Back again to answer some other things I did not have time to reply to last week.

I don’t think there is any straightforward rule here to teach, unfortunately. While many words in Chinese are roughly the intersection or union of the main meanings of their characters (like 朋友 being “friend”, which is more or less the same as either 朋 “companion” or 友 “friend”), there are others where this is not the case. One example is 東西, which nearly always refers to “thing” and not 東 “east” or 西 “west”, though it can also literally mean “east and west” in some rare cases.

Can you include part of speech?

I hope to make changes in the next months to make the difference between nouns and verbs more clear at least, as that is a common form of confusion. However, one difficulty of parts of speech in Chinese (and English and other languages to lesser degrees too!) is that a single word can cover multiple of them. I’m trying to find a balance between teaching enough to help with clarity without overloading the user or inadvertently narrowing the meaning we convey to the user by making it seem that a word only corresponds to a single part of speech.

There are a decent number of exact sound components. Can you avoid creating extra mnemonics for these somehow?

There are about 74 sound components, each corresponding to an initial/final/tone of Chinese which we then combine in character stories to help make them more memorable. I don’t think there is any way for us to reduce them, but you can skip them if you don’t find them useful.

FilmLilyDawnRoot · August 4, 2024, 10:28am

I’m by no means saying that you should teach every word as the left or right part. I’m just saying that for e.g. 朋友, 希望, 咖啡, 車子, 跑步, 房屋 you gave a name to one of the two characters exactly equal to the name for the pair. Is there a way to not have this card at all, or use a special mnemonic for this situation that could be easier to remember? Ditto for compound words that translate directly into English e.g. 人工. Well, there might not be much to be gained here but something to think about.

On the other hand, when there’s a separate component card and character card with identical names, it’s a clear error of the spaced repetition system. There are many of these, e.g. 人, 大, 子, 山, 車 just in the first row of components. Reviewing the card twice as often will both waste time and actually slow learning, according to the theory of spaced repetition. In addition, the visual subcomponents are missing in the mnemonic story for the sound. For example, there’s a story for 玉 talking about Yugi and a separate story talking about 丶 and 王. There’s a story with Goofy and 瓜 and a separate story with 爪 and 厶. Is that a good thing or a bad thing? Well, if it’s a good thing then why not split mnemonics in this way for all cards?

I’m not referring to these. I’m referring to the 422 exact phonetic sets at HanziCraft - Chinese Character Phonetic Sets. E.g. 鯉, 哩, 理, and 里 all begin “[li-] Link is in the [3] basement of the [-_] house,”. Is there a way to create a single mnemonic for this, like “A line of 鯉 carp swim around in a perfect circle that is nearly a 里 Chinese mile long. As they swim they sing [say 里] a fish song about the Chinese mile.” Similar to your mnemonic for weakened tones ([5] silent). Idk, creating mnemonics is hard but sound components are useful.

FilmLilyDawnRoot · August 4, 2024, 10:55am

Regarding part of speech - yeah, I agree it’s quite messy. Maybe you could start by dealing with the rarer and clearer categories. E.g. I think these categories tend to be the only part of speech for a word:

Adverb
Measure
Preposition
Conjunction