One thing is pronunciation help. There is this wonderful website Forvo, but it can be tedious to look up every word. I wish there were some kind of browser integration. I love using the Zhongwen Chrome extension that shows dictionary listings on hover; I wish there were something like that, but which also let you quickly navigate to https://forvo.com/word/[hovered-over-word]/#zh
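Even without full browser integration, the URL pattern itself is easy to script. A minimal sketch in Python, assuming Forvo's `/word/<word>/#zh` pattern stays stable (the hanzi just need percent-encoding):

```python
from urllib.parse import quote

def forvo_url(word: str) -> str:
    """Build a Forvo pronunciation URL for a Mandarin word.

    The hanzi must be percent-encoded for use in a URL path.
    """
    return f"https://forvo.com/word/{quote(word)}/#zh"

print(forvo_url("你好"))  # → https://forvo.com/word/%E4%BD%A0%E5%A5%BD/#zh
```

Something like this could sit behind a context-menu entry or a tiny userscript, though wiring it into a hover extension is the harder part.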
Pronunciation is definitely a big pain point. If you ever read epubs, I recommend using the Pleco reader (it is a paid add-on, but I think the entire add-on bundle is essential). You can just tap on any character/word and it will show the Pleco entry and auto-play the pronunciation.
I believe the Android version of it can work with websites somehow, but I’ve never tried it as I primarily use iOS.
I like your thinking!
For me, learning from sentences is supremely helpful for learning to read, listen, write, and speak.
That said, if the sentences aren’t translated/curated in advance, it is extremely difficult and time-consuming to
1) gather the sentences in Chinese (Mandarin) and
2) ACCURATELY translate them into English.
The most robust source of new sentences for me is subtitles, especially on popular, well-known streaming services that offer subtitle support for all the major languages. This is something that was certainly not available on a wide scale say 20, even 10 years ago, but it certainly is now.
The beauty of subtitles is that the exact same sentences are available in a completely matched set in Chinese (Mandarin) and English or your target native language.
These subtitles are curated for broadcast to subscribers and therefore accurate and reliable enough for learning.
I know how to save screenshots of the subtitles and copy and paste the text from the images, but this is extremely time-consuming and inefficient. I’ve stubbornly tried, and that’s definitely not the way to go.
As I’ve mentioned before, I’m a beginning computer science/IT student at university, but with my current basic knowledge/skill set I do not know how to go about finding where in streaming service platforms the subtitles are stored or generated from.
The ability to intercept and capture that data in text format would be a monumental, force-multiplying game changer.
It could easily be imported into a text file or spreadsheet and manipulated rapidly for import into an SRS platform.
All these steps can certainly be programmed.
Imagine being able to efficiently mine sentences en masse from your absolute favorite television shows and movies while you watch them and then efficiently learn them with the power of SRS!
Any guidance or direction would be appreciated. I’ll do the work. Just point. Thank you!
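For what it’s worth, once subtitles are extracted from wherever a service stores them, they very often end up in the SubRip (.srt) format, which is plain text and easy to parse with nothing but the standard library. A minimal sketch (the sample subtitles here are made up for illustration):

```python
import re

def parse_srt(srt_text: str) -> list[tuple[str, str, str]]:
    """Parse SubRip (.srt) text into (start, end, line) tuples."""
    entries = []
    # Blocks are separated by blank lines: index, timing line, then text.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue
        m = re.match(r"(\S+)\s*-->\s*(\S+)", lines[1])
        if not m:
            continue
        text = " ".join(lines[2:])  # multi-line cues joined into one line
        entries.append((m.group(1), m.group(2), text))
    return entries

sample = """1
00:00:01,000 --> 00:00:03,000
你好！

2
00:00:04,000 --> 00:00:06,500
你叫什么名字？
"""
for start, end, line in parse_srt(sample):
    print(start, end, line)
```

From there, pairing the Chinese .srt with the English .srt by timestamp gives you the matched sentence set, ready for a spreadsheet or SRS import.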
I’ll chime in here.
Pronunciation has two components:
1) Listening to quality input from native speakers (this sets the goal for step 2).
2) Speaking with quality output to sound more like native speakers.
As I’ve read in other posts, it seems there is somewhat of a consensus about the importance of quality input first before doing ANY speaking (output). That’s great. I certainly agree.
There is an opportunity, though, to ease the transition between steps 1 and 2 by giving your mind concrete building blocks to take what you gained in step 1 and, with some persistence, execute step 2.
The quality listening and input provides the “goal” for the mind. The mind, however, needs a framework for physically executing step 2. It comes down to motor skills and the area of linguistics called phonetics. Even a superficial understanding of phonetics and how to use the parts of your mouth, tongue, and accessory anatomical muscles (all covered in phonetics) will make it significantly easier to achieve accurate pronunciation. Every language relies on a very specific set of muscle movements and muscle memory. This is why it is so difficult to learn the pronunciation of foreign languages that use muscle movements that do not exist in your native language: your body has never moved those muscles in that way before. Sometimes all it takes to dramatically improve your pronunciation is, for example, knowing that you need to round your lips to make a sound. This is something we only do occasionally in English, but in Chinese (Mandarin) it is a key motor movement required to verbally reproduce a huge proportion of sounds and words. If no one ever tells you the exact motor movement and it’s never learned (either actively or passively), you’ll always struggle. Before I ramble on, I recommend just starting with a simple Google search for phonetics, or even go here.
I highly recommend you start with Production and Acoustics. It will take a bit of time investment to learn, internalize, and apply but such knowledge will pay massive dividends in pronunciation effectiveness and efficiency with Chinese (Mandarin) or any other foreign language you learn.
I’m sure we’ve all encountered that situation where we listen carefully to high quality input from a native speaker recording but cannot for the life of us even begin to emulate speaking with quality output. The improvements from incorporating knowledge of phonetics into your arsenal won’t be instant, but they will be very robust earlier on which will give you the persistence to practice them and see sustained results. This is what we want! Imagine how awesome it would be to blow away native speakers with stellar pronunciation and how proud you’ll feel!
I think Migaku might be the kind of tool you’re looking for. I had used it before for Japanese, but they also have support for Mandarin. I’m still working on improving my Chinese a bit before I start using it, but it should work fine! Migaku Website
I agree with the importance of input. For a language like Chinese, it really becomes a matter of getting over the elementary bump as quickly as possible, then setting up your time so that you get some quality listening in every day for years.
In terms of pronunciation, just listening a bunch will help, as you note. I think listening is still criminally underrated, despite studies and anecdotal evidence pointing towards massive amounts of listening input being perhaps the least painful and most effective way to build familiarity with a language.
For pronunciation training, shadowing is pretty good. I hope to make some videos about my learnings in this area.
Word of warning though: I think using things like Glossika or other forms of shadowing are WAY less useful until one gets to an intermediate or above listening level. When I first started with Chinese I spent so much time on pronunciation, but in retrospect I should’ve focused on listening more. The reason for this is quite simple: it does not matter how good your pronunciation is if you cannot even understand what the person talking to you is saying! Haha
I think it is more useful to find audio that is intrinsically slow instead of applying a slow-down effect in a media player or YouTube or whatever.
I hope to make a video at some point describing a good process for getting to a level where one can understand just barely enough of Chinese that watching a show is somewhat useful and enjoyable (e.g., understand ~50%).
How I did this is as follows:
First, I listened to the dialogue audio provided with any textbook series. These recordings have three important attributes that will help you:
They are much slower and simpler than native speech.
They are associated with a chapter in the book, which will have a “new vocabulary” list, identifying words you may not know yet.
They have a full transcription of the audio you can read along to.
However, the key is how you use this valuable resource! Like so:
Listen to audio alone on repeat. Do this as many times as you can bear, trying to parse the language phonetically (recognize which sounds are being said) and a bit semantically (try to map those sounds to the words/meanings you know). You will not understand most of it, but that’s fine!
Listen to audio on repeat while looking at the associated “new vocab list”. Try to better understand the sounds and words you hear in the audio by mapping them to the words in the list. That list will have pinyin, allowing you to also better understand the sound->pinyin mapping. You will understand a bit more, now.
Listen to the audio on repeat while reading along with the transcript. At this point you will be able to “understand” all of it… because there is a character-by-character transcript with direct English translation in front of you! This will help you cover any remaining gaps.
(Over the next couple of days) Carve out some time to listen to that same audio on repeat once more while walking around or doing chores. You will already have it memorized by now, but just listening to it will let you work with it a bit more with “no training wheels”.
This method is how I got from “can’t understand ANYTHING” to “able to dive into children’s cartoons” within a couple of months. Of course, I could understand very little of the children’s cartoons, but the key was getting to the point where immersing in native material was bearable and marginally useful. After that, I would try other techniques like watching episodes multiple times, sometimes with subtitles, sometimes without, and so on. But this post is getting long, so I’ll save those strategies for another day.
I was trying to keep studying Japanese while starting out with Chinese, but for the past year I felt I couldn’t improve as much as I wanted with Chinese.
Because Chinese is my priority now (after all, my wife and I are going to Taiwan at least once a year, and I’d like to follow the conversation whenever we visit her parents), I’m allocating almost all my study time to Chinese. I only do basic stuff with Japanese to make sure I don’t lose the level I have now: watching some anime, keeping up with reviews in WK, etc.
Hopefully once I get to a level with Chinese in which I can get a lot more comprehensible input I can put more time to JP again!
Depending on your respective language ability levels over time, you can try to learn Japanese through Chinese or vice-versa at some point.
One downside of this is that the amount of language-learning resources/dictionaries/etc in English for learning nearly any language far outweighs the resources within other source-languages. But I’ve flirted with the idea of learning Japanese through Chinese at some point, to practice both at once.
Glad to hear that. I think especially for people with family, social, or cultural ties to Taiwan, learning both Chinese and Japanese is indeed supremely useful. Love your motivation(s). Learning new languages really is so key for connecting with new people and experiencing new cultures in more than just a superficial way, and it’s a great way to strengthen mental faculties as well. Keep up the strong work!
I finally got a chance to sign up and give Migaku a thorough test run.
I am extremely excited and impressed with the UI and functionality!
I have been searching for a solution to this problem of efficiently mining sentences from TV show/movie subtitles, processing them, and incorporating them into my language learning workflow and systems for so so long.
Migaku has enabled me to accomplish all this more effectively and efficiently by several orders of magnitude. I’ll run some numbers on some data I’ve collected and post again later.
Harnessing technology has equipped us with far better tools to learn foreign languages (and everything, really) than previous generations would even dare dream of, and this is a simple but prime example.
Once the learning habits and systems are in place with the power of SRS tools, the sky is truly the limit!
What an exciting prospect.
Thank you so so so much!
Approaching language learning using data can really streamline the process.
I think what would be also helpful to supplement a most common words list would be a most common sentences list.
I also think the vast majority of use cases involve social settings so that’s where the focus should be.
I’ve found, for myself at least, that mining sentences is far more efficient with web-based or digital resources (vs. physical textbooks, if that’s even a thing anymore) due to the copy-and-paste functionality and the ability to efficiently import in large volume into Anki with a spreadsheet or simple programming. For all of us, I feel the work of getting content INTO an SRS tool like Anki needs to be streamlined as much as possible using digital tools and programming, so that we can focus most of our time on actually USING the SRS tool.
For those resources, basic language learning websites that have example sentences grouped by level are a good starting place if you are a beginner or intermediate.
Other great sources are streaming services, as nearly all offer comprehensive subtitles in all the major languages of the world. There are tools like Migaku (that @damia made me aware of) that can efficiently capture all the audio, subtitles, and their translations into your native language. The additional processing steps with this approach would be obtaining the pronunciations and noting the timestamps, for reference and for matching up the audio.
Again, most of this process can be greatly streamlined using spreadsheets and basic programming. Digital Audio Workstation software, like Ableton Live especially, might also be able to help corral and organize the audio efficiently.
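To make the “get content INTO Anki” step concrete: Anki’s desktop importer accepts plain tab-separated text, one note per line, so a few lines of stdlib Python can turn mined rows into an importable file. A minimal sketch, where the column layout (hanzi, pinyin, English, timestamp) is just one assumed arrangement:

```python
import csv

# Hypothetical mined rows: (hanzi, pinyin, English, audio timestamp).
rows = [
    ("你好！", "nǐ hǎo!", "Hello!", "00:00:01,000"),
    ("你叫什么名字？", "nǐ jiào shénme míngzì?", "What is your name?", "00:00:04,000"),
]

# Anki's text importer maps tab-separated columns to note fields.
with open("mined_sentences.tsv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerows(rows)
```

In Anki you would then use File → Import, pick the note type, and map each column to a field; the same file opens cleanly in any spreadsheet for manual cleanup first.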
Absolutely, this would be so helpful. From my point of view, one has to exist with the other. The sentences on their own, while useful, present some issues (in my experience): we tend to absorb the sentences as a whole, so when we are presented with anything that deviates, we lose track and/or meaning if a word is used out of place or out of order (for example). Whereas if we have the individual words, which were learned prior, then we have those to fall back on. The inverse is true also: the individual words don’t account for the speed (and other factors) of an actual spoken sentence.
Yeah, for sure
Hmm, this was a thought: can the platform recommended earlier find words in a sentence based on frequency? This would be so handy. Like, here are the most common words (up to a point/level); find a sentence which uses 90% of them, so we can get the audio too. Or maybe just the sentence as a whole, like you say.
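Whatever the platform supports, the underlying idea is easy to prototype: score each candidate sentence by the fraction of its words that appear in a top-N frequency list, and keep sentences above a threshold (the 90% you mention). A toy sketch with a hand-segmented, made-up corpus (real Chinese text would first need a word segmenter):

```python
from collections import Counter

def coverage(sentence_words: list[str], common: set[str]) -> float:
    """Fraction of a sentence's words that appear in the common-word set."""
    if not sentence_words:
        return 0.0
    known = sum(1 for w in sentence_words if w in common)
    return known / len(sentence_words)

# Build a toy frequency list from a (pre-segmented) corpus.
corpus = [["我", "喜欢", "吃", "苹果"],
          ["我", "喜欢", "看", "书"],
          ["他", "不", "喜欢", "吃", "香蕉"]]
freq = Counter(w for sent in corpus for w in sent)
common = {w for w, _ in freq.most_common(4)}  # top-4 words by frequency

# Keep only sentences where at least 90% of the words are common.
candidates = [s for s in corpus if coverage(s, common) >= 0.9]
```

Scaled up, the same scoring over a subtitle dump would surface exactly the “mostly known words, plus audio” sentences you describe.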
@kevin It’s saying I deleted my post earlier… I didn’t go anywhere near it(?) Surely not copyright?
Looking through the edit history, it says that the last revision was done by you(?). I think what may have happened is an accidental submit from the trash-bin button or something of that sort; it was probably a glitch due to keybindings or improper focus.