I am not sure where in the forum to put this question, but as it relates to video dialogue, let's start here.
Does anyone know which program YouTube (and others) is using to convert spoken audio into text for the subtitles you can now turn on?
Some time ago, when my father had a stroke and thought continuing to use his computer might be a good thing, a friend of mine told me about a program that let you talk into a microphone and converted your words into text.
You had to "teach" the program until it recognized your voice. But this new program seems to be more sophisticated than that, as it recognizes the voice of anyone speaking in the videos.
My wife also told me that X-ray machines working with computers let the doctor dictate the diagnosis into the computer's mic and have it converted to text.
I haven't googled this yet, which I will do, but maybe someone here is already familiar with, or has used, a program that does this.
Windows itself has voice-to-text conversion built in. You have to spend 30 minutes training it, but it seemed to work for me after I did the training. Not sure how good it is with languages other than English, though. It's called Windows Speech Recognition.
Sorry, I did not explain what I was going to use it for.
What I want is to feed in the audio from a video, not a microphone, and have the program convert it into text, exactly like YouTube apparently does.
Of course it would be even better to convert that audio into timed text, like a subtitle, but that is probably too much to ask.
But if the subtitle can show on the screen, I might get an OCR program like SubtitleEdit to recognize it.
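For what it's worth, the "timed text" part is the easy half once a recognizer gives you start and end times for each phrase. Here is a minimal sketch in Python that turns such timed phrases into SubRip (.srt) subtitle text. The function names `format_timestamp` and `segments_to_srt` and the (start, end, text) input shape are my own assumptions for illustration, not the output format of any particular program.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples.

    The tuple layout is an assumption; adapt it to whatever the
    speech recognizer you end up using actually returns.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [(0.0, 2.5, "Hello"), (2.5, 4.0, "world")]
    print(segments_to_srt(demo))
```

Saving the returned string to a file ending in .srt should give something a subtitle editor can open for correction.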
Dragon NaturallySpeaking is a popular payware speech-to-text converter that has been around for a long time.
Speech recognition software has improved quite a bit in the last few years.
Even my car has speech recognition and it works surprisingly well. I haven't tried the W10 speech recognition program.
Generally, speech recognition still makes a fair number of errors, so don't expect perfectly correct text. No surprise, as we all talk differently.
Training/teaching the interface does help. Some languages are even harder to convert. Good luck with that.
I often do better listening and just typing. But I would try some of the programs available to see what may work for you.
These types of programs may save some time when transcribing audio to text.
But be sure to check and correct any text for errors.
Humans are still smarter than machines.
Last edited by redwudz; 9th Aug 2018 at 23:09.
Correcting the recognized text is certainly a must.
But I am amazed, lately, at how accurate the dialogue on YouTube videos is becoming, with very few wrong words. That was not the case until fairly recently.
The question still remains, for the videos whose audio I want recognized, of the timings for each line of speech. There probably isn't a way to do that automatically, unless the text shows on the video. I wonder how YouTube does that.
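I don't know how YouTube gets its timings, but one simple idea for rough timings is to look for stretches of the audio that are louder than the background and treat each one as a line of speech. Below is a minimal Python sketch of that idea. It assumes the audio has already been decoded into a list of float samples between -1.0 and 1.0; the function name `find_speech_segments` and the threshold values are guesses for illustration and would need tuning on real recordings. It only finds when someone is probably speaking, not what they say.

```python
def find_speech_segments(samples, sample_rate, frame_ms=20,
                         threshold=0.02, min_silence_ms=300):
    """Return (start_seconds, end_seconds) spans that are louder than threshold.

    samples: sequence of floats in [-1.0, 1.0] (assumed already decoded).
    A span ends once the audio stays quiet for at least min_silence_ms.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    min_silence_frames = max(1, min_silence_ms // frame_ms)
    n_frames = len(samples) // frame_len  # a trailing partial frame is ignored

    segments = []
    start = None      # index of the frame where the current span began
    silent_run = 0    # consecutive quiet frames seen inside the current span

    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        # Root-mean-square level of this frame as a crude loudness measure.
        rms = (sum(s * s for s in frame) / len(frame)) ** 0.5
        if rms >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                end_frame = i - silent_run + 1  # first quiet frame of the pause
                segments.append((start * frame_len / sample_rate,
                                 end_frame * frame_len / sample_rate))
                start = None
                silent_run = 0

    if start is not None:
        # Audio ended while a span was still open; drop any trailing quiet frames.
        end_frame = n_frames - silent_run
        segments.append((start * frame_len / sample_rate,
                         end_frame * frame_len / sample_rate))
    return segments


if __name__ == "__main__":
    rate = 1000  # tiny made-up sample rate, just for the demo
    audio = [0.0] * rate + [0.5] * rate + [0.0] * rate
    print(find_speech_segments(audio, rate))
```

Music or background noise will fool a loudness check like this, so the timings would still need checking by hand in a subtitle editor.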