Alternate Pronunciations & Typo Tolerance ⌨

phil · August 27, 2023, 8:57pm

Since there are varying pronunciations for words, we’ve added support for those pronunciations.

You can see these alternate pronunciations on the respective page for the word. Check out 朋友 friend for example.

Second, we’ve added some typo tolerance to meaning questions. Previously we were doing some sanitizing and other things to clean up the input as much as possible, but this still required you to type things in perfectly.

With the new typo tolerance, you can be off by a bit and it’ll still be accepted.

As a last bonus, you may have noticed that the answer is now displayed above the input after you get the item correct.

This is to help reinforce the official label of an item while still using our own synonyms or alternatives. As the curriculum is built off of using these official labels within higher up mnemonics, knowing the official label helps when learning new items.

That’s all for now. Happy studying!

lorentz · August 27, 2023, 10:07pm

Awesome! I noticed this today.

damia · August 29, 2023, 4:06pm

Relating to this topic, it would be great if it could recognize when you’ve typed the correct pronunciation but forgot to add the tone (e.g., typing “pengyou” instead of “peng2you3”). In that case, it could give you a message like the one you get when you put the meaning instead of the pronunciation, and force you to add the tones instead of marking it as a mistake.

Maybe it’s just me, but I’ve had it happen that, if I know a word very well, I type fast and forget to put the tone even though I know it. It’s a bit frustrating to lose the SRS rank for that.

kevin · August 29, 2023, 11:45pm

Great idea, and we’ve been thinking of doing something similar. We were literally just discussing this yesterday! We are trying to find the best way to implement it.

There are two versions of this.

The one you mentioned, where we protect against toneless pinyin. This would trigger if and only if the pinyin is valid but simply lacks tones. If we did something like that, we would probably get rid of the shortcut where one can omit “5” for the neutral tone. In fact, I think it would be required.

The “stronger” version of this is to prevent incorrect pinyin entirely. I can see the use of this, but I can also see how it may be too hand-holdy. It would work by only allowing you to submit the answer if the pinyin is completely valid. That is, each syllable has a tone and is a valid pinyin syllable. The one downside is that this may be too strong a guardrail for those less familiar with pinyin. One can always press the “don’t know answer” button if they are completely unable to produce any valid pinyin. But I think most users are familiar enough with pinyin that their typos are just that, typos, and not a sign that they can’t type valid pinyin.

We’ll think about it some more, but happy to hear your thoughts on these two approaches.

damia · August 30, 2023, 10:14am

I wasn’t even aware you can omit writing the “5”

I can see the second option being slightly more annoying if someone (like me) is used to just typing whatever when I’ve forgotten a word to get the wrong result. It would still be easy to get used to it, and it would prevent accidental typos with the pinyin, so I think ultimately both implementations would be a great improvement!

lorentz · August 30, 2023, 10:40am

> This would trigger if and only if the pinyin is valid but simply lacks tones. If we did something like that, we would probably get rid of the shortcut where one can omit “5” for the neutral tone. In fact, I think it would be required.

Wait, why so? The cases seem to be:

- all syllables either had the correct tone or had a null tone when expecting 5 [correct]
- all syllables had a null tone but at least one expected a non-5 [warning] (this is an error today)
- all other cases [error]

I think in practice it would be intuitive
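
The three cases above can be sketched in a few lines. This is only an illustration, not the site's grader: the function name `classify`, the result strings, and the assumption that the expected pinyin is stored with an explicit tone digit on every syllable (e.g. `xue2sheng5`) are all hypothetical.

```python
import re


def classify(answer: str, expected: str) -> str:
    """Return 'correct', 'warning', or 'error' for a typed pinyin answer.

    Assumes `expected` is canonical: lowercase letters followed by a tone
    digit 1-5 for every syllable, e.g. 'xue2sheng5'.
    """
    syllables = re.findall(r"([a-z]+)([1-5])", expected)
    if not syllables or "".join(l + t for l, t in syllables) != expected:
        raise ValueError(f"expected pinyin is not canonical: {expected!r}")

    # Build a pattern that requires the same letters but makes each tone optional.
    pattern = "".join(re.escape(letters) + "([1-5]?)" for letters, _ in syllables)
    match = re.fullmatch(pattern, answer.strip().lower())
    if match is None:
        return "error"  # the letters themselves are wrong

    typed_tones = match.groups()  # '' where the tone was omitted
    expected_tones = [tone for _, tone in syllables]

    # Case 1: every tone is right, or it was omitted where tone 5 is expected.
    if all(t == e or (t == "" and e == "5")
           for t, e in zip(typed_tones, expected_tones)):
        return "correct"
    # Case 2: no tones typed at all, and at least one non-5 tone was expected.
    if all(t == "" for t in typed_tones):
        return "warning"
    # Case 3: everything else.
    return "error"
```

With this sketch, `classify("xue2sheng", "xue2sheng5")` is accepted, `classify("xuesheng", "xue2sheng5")` produces the warning, and `classify("xue2sheng1", "xue2sheng5")` is an error, so the “omit 5” shortcut can coexist with the toneless warning.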

Youyujuan · August 30, 2023, 7:51pm

Is there a chance you could update your typo tolerance to guard against fat fingers?

kevin · August 30, 2023, 10:52pm

BTW you can press CMD+Enter on Mac (or, I believe, CTRL+Enter) to reveal the answer and auto-expand the item info if you don’t know it. This way you don’t have to type anything in at all when you know you don’t know it.

kevin · August 30, 2023, 10:56pm

Gotcha. We’ll have to think about it some more. Right now we don’t do any in-depth parsing of the input but just a simple string comparison for pronunciation. For these more intricate guards - both of the ones I mentioned above - we will have to add some parsing logic to our front-end grader.

kevin · August 30, 2023, 10:57pm

I think this case could be prevented by simply blocking any non-alphanumeric input for pinyin, since it should only have a-z and 1-5. Thanks for the example.
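
A minimal sketch of that character check, assuming the input is trimmed and lowercased before it is tested. The function name `is_allowed_pinyin_input` is hypothetical, and the real grader runs in the front end rather than in Python.

```python
import re

# Only the letters a-z and the tone digits 1-5 are allowed, at least one character.
ALLOWED_PINYIN = re.compile(r"[a-z1-5]+")


def is_allowed_pinyin_input(text: str) -> bool:
    """Return True if the normalized input contains only a-z and 1-5."""
    normalized = text.strip().lower()  # assumption: case and outer spaces are ignored
    return ALLOWED_PINYIN.fullmatch(normalized) is not None
```

A stray key such as `]` or `\` next to Enter would fail this check, so the quiz could refuse to submit instead of marking the answer wrong.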

lorentz · October 2, 2023, 3:20pm

Could we have wrong answer category protection for alternate pronunciations?

In the CN course, I was supposed to type the meaning for 学生 but accidentally typed xue2sheng1. This was intentional because I recognize 学生 to be one of the many words where different speakers may or may not use the neutral tone.

phil · October 2, 2023, 5:28pm

Good idea

I just updated the quiz to support this

tuobiyasi · October 8, 2023, 1:39am

Could there be an option for a pop-up asking if I forgot to add an accent when it’s missing? I see that sometimes I write too fast and hit enter before I notice that I forgot to add the accent (1-5).
I also see that I sometimes add the wrong accent by writing too fast, and see my mistake immediately after getting an error, but I’m uncertain if those cases should be “savable”…

phil · October 8, 2023, 1:07pm

Yes, I think a current pain point is the lack of typo-safety when inputting pinyin.

Certain parts of pinyin are critical, and it’s hard to differentiate between some of them, i.e. chan2 and chang2 or si4 and shi4 – these types are technically “one-off” as typos. The tone is especially important, and so we would have to at least parse the tone out, otherwise shi3 and shi4 could both be accepted for a character with pinyin shi4, when the shi3 shouldn’t be accepted at all.

The “accent-forget” is a good middle-ground at least! We do have the special case of tone 5 being omitted, i.e. xue2sheng and xue2sheng5 are both valid

I think another possible middle ground is to have a list of “other” pinyin that can never be accepted as an answer, while still helping with typos. So, for example, if a character with the pronunciation si4 is answered with shi4, that would be marked as wrong. But if one instead typed sio4, that would be “one-off” and possibly accepted.
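
Here is a rough sketch of how such a list could combine with one-off tolerance for a single syllable. Everything in it is hypothetical: the function names, the idea of passing the confusable pinyin as a set, and the rule that the tone must always match exactly (taken from the point above about shi3 vs. shi4).

```python
def split_tone(pinyin: str) -> tuple[str, str]:
    """Split 'shi4' into ('shi', '4'); the tone is '' when no digit is present."""
    if pinyin and pinyin[-1] in "12345":
        return pinyin[:-1], pinyin[-1]
    return pinyin, ""


def within_one_edit(a: str, b: str) -> bool:
    """True if a and b are equal or differ by one insert, delete, or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make `a` the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # at most one substituted character
    return a[i:] == b[i + 1:]          # at most one extra character in b


def grade_syllable(answer: str, expected: str, never_accept: set[str]) -> bool:
    """Accept exact answers and near-misses, but never a listed confusable."""
    if answer == expected:
        return True
    if answer in never_accept:
        return False  # a real, different pinyin such as shi4 for si4
    answer_letters, answer_tone = split_tone(answer)
    expected_letters, expected_tone = split_tone(expected)
    if answer_tone != expected_tone:
        return False  # tones are never forgiven as typos
    return within_one_edit(answer_letters, expected_letters)
```

For example, `grade_syllable("sio4", "si4", {"shi4"})` is accepted as a typo, while `grade_syllable("shi4", "si4", {"shi4"})` and `grade_syllable("si3", "si4", {"shi4"})` are both rejected.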

It would be nice to see some of the typos people are making, if anyone would like to share

edit: there was already one example above–non-alphanumeric input could be blocked

tuobiyasi · October 9, 2023, 12:00am

Maybe the easiest solution would be self-policing when something is possibly a typo. Meaning that if some algorithm (diff(answer, solution) < X) says that the typed answer could be a typo, then you can ask the user if they made a true mistake or if it was a typo, and let them try again if so.

This could lead to people doing themselves a false favor by trying again even when they made a true mistake though, but it would be their own problem. But I see myself potentially retrying if I’m tired and doing stuff too fast, when I actually made a mistake because I just wanted to do stuff fast.
So I’m on the fence with this solution, but I think it could be nice to experiment with if it’s an opt-in solution. Then if you keep on shooting yourself in the foot you might understand that you should turn the option off.
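
As a sketch of the `diff(answer, solution) < X` idea, the snippet below uses Python's standard `difflib.SequenceMatcher` as the diff. That choice, the threshold value `0.25`, the `retry_enabled` opt-in flag, and the result strings are all assumptions for illustration.

```python
from difflib import SequenceMatcher


def review_outcome(answer: str, solution: str,
                   threshold: float = 0.25,
                   retry_enabled: bool = True) -> str:
    """Return 'correct', 'ask_user', or 'wrong'.

    'ask_user' means the answer is close enough to be a possible typo, so the
    quiz could ask whether it was a typo and allow another attempt.
    """
    if answer == solution:
        return "correct"
    # ratio() is 1.0 for identical strings and 0.0 for nothing in common,
    # so 1 - ratio() behaves like a normalized difference.
    difference = 1.0 - SequenceMatcher(None, answer, solution).ratio()
    if retry_enabled and difference < threshold:
        return "ask_user"
    return "wrong"
```

With the opt-in turned off (`retry_enabled=False`), every non-exact answer is simply wrong, which matches the “turn the option off” escape hatch.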

tuobiyasi · October 16, 2023, 10:58pm

I would’ve liked this typo to be accepted as a typo (writing too fast).

kevin · October 17, 2023, 12:07am

Yeah, I think one thing we may want to consider doing at some point is expanding the acceptable “edit distance” for longer words.

Right now we have it at one, and this is because even a distance of one can be huge for smaller words.
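
To make the idea concrete, here is a small sketch of a length-dependent limit using plain Levenshtein distance. The cutoff (one edit for answers under 8 characters, two for longer ones) is a made-up example, not a planned value, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


def allowed_distance(expected: str) -> int:
    """Hypothetical rule: keep today's limit of 1, allow 2 for longer answers."""
    return 1 if len(expected) < 8 else 2


def is_close_enough(answer: str, expected: str) -> bool:
    """True if the answer is within the allowed edit distance of the expected one."""
    return edit_distance(answer, expected) <= allowed_distance(expected)
```

Note that a swapped pair of letters counts as two edits under plain Levenshtein distance, so `is_close_enough("freind", "friend")` is still rejected here while `is_close_enough("vocabluary", "vocabulary")` passes.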