VideoHelp Forum
  1. Member
    Join Date
    May 2004
    Location
    Brazil
    I am not sure where in the forum to put this question, but as it relates to video dialogue, let's start here.

    Does anyone know which program YouTube (and others) is using to convert spoken words into text for the subtitles you can now turn on?

    Some time ago, when my father had a stroke and it seemed that continuing to use his computer might be a good thing, a friend of mine told me about a program that let you talk into a microphone and converted your words into text.

    You had to "teach" the program until it recognized your voice. But this new program seems to be more sophisticated than that, as it recognizes any person's voice speaking in the videos.

    My wife also told me that X-ray machines connected to computers let you dictate the diagnosis into the computer's mic and have it converted to text.

    I haven't googled this yet, which I will do, but maybe someone here is already familiar with, or has used, a program that does this.
  2. Dinosaur Supervisor KarMa's Avatar
    Join Date
    Jul 2015
    Location
    US
    Windows itself has voice-to-text conversion built in. You have to spend 30 minutes training it, but it seemed to work for me after I did the training. Not sure how good it is with languages other than English, though. It's called Windows Speech Recognition.
  3. Member
    Join Date
    May 2004
    Location
    Brazil
    Sorry, but I did not explain what I was going to use it for.

    What I want is to feed in the audio from a video, not a microphone, and have the program convert it into text, exactly like YouTube apparently does.

    Of course it would be even better to convert that audio into timed text, like a subtitle, but that is probably too much to ask.

    But if the subtitle can be shown on the screen, I might get an OCR program like SubtitleEdit to recognize it.
  4. Mod Neophyte Super Moderator redwudz's Avatar
    Join Date
    Sep 2002
    Location
    USA
    Dragon NaturallySpeaking is a popular payware speech-to-text converter that has been around for a long time.

    Speech recognition software has improved quite a bit in the last few years.
    Even my car has speech recognition and it works surprisingly well. Haven't tried the W10 speech recognition program.

    Generally, speech recognition makes a fair number of errors, so don't expect perfectly correct text. No surprise, as we all talk differently.
    Training/teaching the interface does help. Some languages are even harder to convert. Good luck with that.
    I often do better listening and just typing. But I would try some of the programs available to see what may work for you.

    These types of programs may save some time when transcribing audio to text.
    But be sure to check and correct any text for errors.

    Humans are still smarter than machines.
    Last edited by redwudz; 9th Aug 2018 at 23:09.
  5. Member
    Join Date
    May 2004
    Location
    Brazil
    Correcting the recognized text is certainly a must.

    But I am amazed, lately, at how accurate the dialogue on YouTube videos has become, with very few wrong words. That was not the case until fairly recently.

    The question still remains, for the videos whose audio I want to recognize, of the timing for each line of speech. There probably isn't a way to do that automatically, unless the text shows on the video. I wonder how YouTube does that.
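
    On the timing question: if a recognizer can report a start and end time for each phrase (an assumption here; nothing in this thread confirms which tools do), writing those phrases out as a subtitle file is the easy part. Below is a minimal Python sketch that turns such timed phrases into the plain-text SubRip (.srt) layout. The function names and the example segments are made up for illustration; they do not come from any speech recognition program.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Turn (start_seconds, end_seconds, text) tuples into SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        # Each SRT cue: a running number, a time range, then the text.
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by one blank line.
    return "\n".join(blocks)


# Hypothetical segments, standing in for real recognizer output.
example_segments = [
    (0.0, 2.5, "Hello and welcome."),
    (2.5, 6.04, "Today we talk about subtitles."),
]

if __name__ == "__main__":
    print(segments_to_srt(example_segments))
```

    Saving that output to a .srt file gives you something a subtitle editor can open, so you can correct the wrong words and nudge the timings by hand.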
  6. Member
    Join Date
    Mar 2011
    Location
    Nova Scotia, Canada
    Originally Posted by carlmart View Post
    Correcting the recognized text is certainly a must.

    But I am amazed, lately, at how accurate the dialogue on YouTube videos has become, with very few wrong words. That was not the case until fairly recently.

    The question still remains, for the videos whose audio I want to recognize, of the timing for each line of speech. There probably isn't a way to do that automatically, unless the text shows on the video. I wonder how YouTube does that.
    YT is owned by Google and uses their speech recognition tech. I suspect the reason it's become more accurate is the same reason Google Translate works so much better now ... it's running on newer billion-dollar server farms. Whether you can get the same functionality from software running on a personal computer is another question. I don't actually know the answer to that, but I'm not optimistic.
  7. Member p_l's Avatar
    Join Date
    Jun 2002
    Location
    Montreal, Canada
    YT uses machine learning with the largest data set you could wish for: YT videos.
  8. Member Cornucopia's Avatar
    Join Date
    Oct 2001
    Location
    Deep in the Heart of Texas
    IIRC, there are realtime transcription services (both human and AI), but they cost quite a bit for a subscription (and they often don't offer a one-time or two-time deal).
    Those will also still have errors, regardless.


    Scott


