@lorentz @damia @tuobiyasi @Igor101
Great questions!
In my experience, the quality and preservation of “naturalness” is dependent on two factors.
1) The quality of the time stretching algorithm and program.
I know some teaching websites let you slow down the audio lines in their lessons, but the algorithms they use are, in my opinion, far from the best ones, and they don’t let you set the playback speed by a specific percentage. Net result: not that useful, despite the good intentions.
You can maintain very high speech and sound quality if you use a higher quality algorithm, like the ones found in nearly all Digital Audio Workstations (DAWs), such as Logic Pro X or Ableton LIVE, among many others.
This technology is nothing new or cutting edge, relatively speaking, but it is very robust and still fairly CPU intensive, so few if any websites out there offer this functionality and the robust features that go with it. The digital technology was first introduced in the late 1990s, has improved significantly in the decades since, and will continue to make leaps and bounds with AI. Seems like everything we talk about nowadays, AI has to be mentioned, so I’ve done my duty there for the day. LOL.
Audacity is a free, open source program that I believe also has higher quality time stretching algorithms and may work just as well, but I myself am only familiar with Logic Pro X and Ableton LIVE.
To fully explain, I’d have to go on a fairly lengthy tangent about the properties of sound and audio editing and processing, and right now might not be the best time for it. The downside is that using a DAW to slow down target audio does take a modest familiarity with the software and with basic recording and audio production concepts and techniques. If you have some familiarity with a DAW already, this will of course make it easier.
It’s really up to the person and their goals. While it sounds like a lot of time to invest, for me, it was worth it and has made listening comprehension so much easier. The other major benefit is that you can capture ANY audio source that’s available digitally and is output by your computer’s sound card. Movies, TV shows, YouTube, etc. If you can make it make a sound on your computer you can record it and edit it and time stretch it. You aren’t confined to “lessons” which is especially useful for those of you who are advanced beginners or intermediates or higher in my opinion. Additionally, you can even record, edit, and time stretch those “lessons” as well. The power is in your hands.
I will say this.
It’s crazy - once the audio is slow enough for your brain/mind to fully track and process the data just once - how quickly (almost instantly) you become able to catch and comprehend it in real time and in context in any situation. I’m exaggerating a bit for effect here, but not by much.
2) The multiple of time stretching being less than or equal to 2 (in other words, not slowing the audio down below 50% of its original speed). Playing at 50%-75% of the original speed seems to work really well for maintaining “naturalness” and allowing the beginner’s ear to easily capture and comprehend each syllable, as well as easily notice which syllables are sort of “mashed” together.
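
For anyone who wants the arithmetic behind point 2 spelled out, here is a minimal Python sketch. It assumes that "50%-75%" means playback speed as a percentage of the original, so that 50% speed corresponds to a stretch multiple of 2. The function names and the 2.0 cutoff parameter are just my own illustration, not settings taken from any particular DAW.

```python
def stretch_factor(speed_percent: float) -> float:
    """Convert playback speed (percent of original) into a time stretch multiple."""
    if speed_percent <= 0:
        raise ValueError("speed_percent must be positive")
    return 100.0 / speed_percent


def stretched_duration(seconds: float, speed_percent: float) -> float:
    """Length of a clip, in seconds, after slowing it to speed_percent."""
    return seconds * stretch_factor(speed_percent)


def within_natural_range(speed_percent: float, max_factor: float = 2.0) -> bool:
    """True if the stretch multiple stays at or under max_factor (assumed limit of 2)."""
    return stretch_factor(speed_percent) <= max_factor


if __name__ == "__main__":
    # A hypothetical 10 second clip at a few playback speeds.
    for pct in (100, 75, 50, 40):
        factor = stretch_factor(pct)
        length = stretched_duration(10.0, pct)
        ok = within_natural_range(pct)
        print(f"{pct}% speed: multiple {factor:.2f}, clip lasts {length:.1f}s, within limit: {ok}")
```

So a 10 second line played at 50% speed lasts 20 seconds, which is a multiple of exactly 2, while anything slower than 50% goes past that multiple.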