I recently watched a YouTube tutorial on Subtitle Edit's ability to convert speech to text.
It requires a number of steps, and YouTube videos go too fast for me to follow.
Is there a printed how-to for this feature?
As the presenter says in the tutorial, it looks like a major advance, and it would save me from going
to YouTube Create to get my subtitles out of a video (MKV etc.) for fine-tuning.
I have to agree with you: every YouTube "how to" video, podcast and news report should come with a text transcript. Many people have hearing impairments, and comprehension suffers when no printed version is available. IMO this is pretty basic, like you learn in management school: if you are giving a technical lecture full of three-letter acronyms, every acronym should be spelled out in full and even written on a whiteboard behind or near the speaker.
Thanks for the supportive comments on the issue. Seeing mouse "twiddling" while some point of explanation is going on makes me want to tear something. That's a polite way of putting it. The YouTuber who does these, David M., seems to be the only one dedicated to Subtitle Edit YouTube tutorials. Simply put, he should be told to take care over how his presentation appears to others. I won't do it or try to, because he provides a valuable service.
I made some progress today by starting up David M.'s Speech to Text tutorial with a piece of cereal box over the upper portion of the screen while I read the CC captions of the instructions. I made it through a few steps, and this is something valuable for me to learn, so I'll keep at it. But still, screenshots in sequence with a text description are about the only thing I can _tolerate_. And I mean this twiddling crap is that bad, insufferably annoying to the point of turning it off.
My request is that someone step in to fill this void.
I'll make a separate post on my questions that come up during the tutorial. I have used SE a bit so it's not totally new, just this speech to text element.
[Note: I say David M. because I'm prone to misspelling his name.]
link is here: https://www.youtube.com/watch?v=jd19iOWpj_4&t=307s
Last edited by loninappleton; 3rd Aug 2022 at 23:56.
This command will download the closed captions only, without the video (saved as a .vtt file, e.g. SEdit-CCs.en.vtt).
yt-dlp.exe --skip-download --write-auto-subs --sub-langs en -o SEdit-CCs "https://www.youtube.com/watch?v=jd19iOWpj_4&t=307s"
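For anyone who wants a printable transcript out of that .vtt file, here is a minimal Python sketch. It is not part of yt-dlp or Subtitle Edit; the function name `vtt_to_text` is made up for illustration, and it assumes the usual shape of YouTube auto-captions, where each cue repeats the previous line and carries inline timing tags.

```python
import re

# Inline markup such as <00:00:00.500> or <c>...</c> inside caption text.
TAG = re.compile(r"<[^>]+>")

def vtt_to_text(vtt):
    """Return the caption text of a WebVTT file as plain lines (hypothetical helper)."""
    lines = []
    for raw in vtt.splitlines():
        if "-->" in raw:                      # cue timing line, not text
            continue
        line = TAG.sub("", raw).strip()
        if not line or line == "WEBVTT":
            continue
        if line.startswith(("Kind:", "Language:")):  # header fields YouTube writes
            continue
        if lines and lines[-1] == line:       # auto-captions repeat the previous line
            continue
        lines.append(line)
    return "\n".join(lines)
```

Reading the downloaded file with `open(..., encoding="utf-8").read()` and passing the result to `vtt_to_text` should give text that can be pasted into a word processor and printed. Only consecutive repeats are dropped, so some duplication may remain.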
Last edited by pcspeak; 4th Aug 2022 at 16:21.
This is a very good tip I wasn't aware of. Thanks for making it.
The Subtitle Edit feature could replace such a procedure.
I'll step away from David M. stuff for a while just to ask if anyone but he has ever worked with the speech to text setup in Subtitle Edit?
If so please respond.
There are questions that the David M. video can't answer. This is one: for two days I have been trying to download the English model needed for
VOSK (I think), and each time I just get a screen message that the download won't complete and to try again later.
So I'm wondering if anyone else has used this "game changer" feature with any success in even completing setup.
But I've just encountered barriers in the setup in SE I cannot get past.
Just some ideas. Adobe Premiere Pro has speech to text that works in multiple languages. There is a free one-week trial, and one month costs $32. I paid for one month and I think I can get all the videos I want translated within that time. I can always pay for another month in the future. Kdenlive also has speech to text and is free, but it requires Python for that feature. I have not been able to make the Python installation and Kdenlive work together. I have another thread asking for help with that, but no help yet. I tried the trial of Subtitle Edit and was not impressed. It does not let you try foreign languages with the trial.
Thanks for the reply.
I have no knowledge of advanced professional tools like Adobe nor the ability to pay for long term use. It would take me the free month to figure it out is my take on it.
Also I want to get the free tools working to show this can be done outside of Adobe, Sonix or the other pay for tools but within
What is needed, if SE is really the "game changer" it is touted to be for this process of audio to text, is a simpler, text-oriented explanation of the steps.
First, I created a WAV file from the video using this command (-vn and -sn drop the video and subtitle streams, leaving audio only).
ffmpeg -i "The Fifth Element.mp4" -vn -sn -threads 0 -y "The Fifth Element.wav"
Then I ran SubtitleEdit.exe and did this: (See attached video)
You have to wait.... and wait.... and wait... and.... for the process to complete.
The output .SRT needed some work. I copied 'The Fifth Element.srt' -> 'The Fifth Element.txt'
Then I opened 'The Fifth Element.txt' with a word processor and spellchecked it. That fixed a lot of the errors. I saved it as a .TXT file (not .DOC)
and renamed it to .SRT. Lots more work needs to be done on the file.
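Because a word processor can quietly alter the cue numbers or timecode lines, it may be worth checking the file before loading it back into Subtitle Edit. The following is only a sketch under the usual SubRip layout (number line, `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, then text); `check_srt` is a hypothetical helper, not something SE provides.

```python
import re

# One SubRip timecode line, e.g. 00:00:01,000 --> 00:00:03,500
TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")

def check_srt(text):
    """Return a list of human-readable problems found in SRT text (empty if none)."""
    text = text.lstrip("\ufeff").replace("\r\n", "\n")
    problems = []
    # Cues are separated by one or more blank lines.
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    for position, block in enumerate(blocks, start=1):
        lines = block.splitlines()
        if len(lines) < 3:
            problems.append(f"block {position}: expected number, timecode and text")
            continue
        if not lines[0].strip().isdigit():
            problems.append(f"block {position}: first line is not a cue number")
        if not TIMECODE.match(lines[1].strip()):
            problems.append(f"block {position}: bad timecode line {lines[1]!r}")
    return problems
```

An empty list means the structure survived the spellcheck; anything else points at the block to repair by hand.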
The diction on my test clip was quite clear, with no odd accents. The results were not great for me. ymmv.
Certainly better than nothing, but I'm wondering if going through Google would be the better way.
My screen cap is a bit ordinary.
Last edited by pcspeak; 5th Aug 2022 at 20:20. Reason: Doubled up on attachment. Oops!
Hello pcspeak and all.
I know you are trying to help, but I have no knowledge of command-line code and so cannot follow much of it. Second, I want to
stay in the SE program proper in order to make a guide for anyone I know who wants to make subtitles for (in my case)
classic plays in the public domain previously Proshot from major theatres. Some of these are on disc and some come
My question was why I cannot download the proper language template to SE. Could be 3.6.5 or 3.6.6 etc.
I don't see where this language template would be picked up by your code, but I can't use code. I'm just
trying to get a method for using SE in the way the program provides.
I have loaded a clip with its waveform.
I have gone to SE > Video > Video/audio to text.
There I've followed the prompt to download ffmpeg
and the prompt to download the VOSK 'models' in English. (English becomes visible under the three dots.)
That's where the error box pops up and says to try again later.
If I can get a good looking screen shot where everything is visible.... ok got it.
From this point I have made no progress. Things like VOSK and the Kaldi speech recognition engine are installed
as the setup goes on. They are just terms I do not know the meaning of yet.
Last edited by loninappleton; 5th Aug 2022 at 23:31.
I was wrong. There is more than one how-to for video/audio to text in SE. This one seems to be a bit more straightforward,
but I still follow the text rather than the screen. I mention this for a reason. This guy shows a prompt to
download libvosk. I don't know how to proceed.
But this tutorial is short and I'll give it one run-through.
The author of this tutorial is Yosef K and he lost me on this one after some initial steps. These Youtube tutes with the tiny screens and tinier mouse pointers need a different format for presentation. Some people can follow them. I have difficulty to the point of giving up. Anyway the download problem is still there. Yosef K mentioned the possibility of bugs in the version he had.
If I understand your question correctly, you're trying to figure out how to use the speech to text feature? If so, it's easy.
On a blank "new project", click the Video tab.
From there, click "audio/video to text".
In the models folder, click the 3 dots to download your language.
Under input files, click Add to choose your media file (note that you have 2 options: the default is video, but you can click the media type in the bottom right and switch it to audio, which allows for quicker upload/processing).
Processing will then start; how long it takes depends on the video length.
Ok, when I get on this again I'll look for the "Add" which I don't recall seeing.
Thanks for answering. I don't know how I'll get past the part where I select the English model for download,
but I'll try it out again.
I copied the steps to paper so I can view them at the same time.
Last edited by loninappleton; 9th Aug 2022 at 22:21.
up through seeing the three dots. Still the same. I am not getting that next set of menus. In using the steps from the other user, I tried to
get through this without adding the mkv first, and an error box pops up saying "No Video" and I can't continue. Or I get the message about
"try Again Later" in trying to get the model.
But just to try something else, I used the option at the top left of the screen to download VOSK from Alpha Cephei. It's a huge zip file of an accurate
English model. I unzipped that but don't know how to use it. It offers the option to open with a program. I selected Subtitle Edit. That opened
with some code in the subtitle window, but I don't know what to do with it.
To sum up, I've made no progress. But I'll run the vid you made all the way through. Something is preventing me from getting to
the procedure screen that the three dots should open. It just isn't there, with or without an MKV loaded.
From the video clip I see the full form open at three dots-- whatevs. That full form with the box below with Add etc I am not seeing. Perhaps some error checking is needed to see if I have things installed. I know I did the ffmpeg step and libvosk. Where should I find those things located to be in the right place? How can I error check this or start over?
What I saw in the vid was that the EN model appears on the form. It looks like the same name as what I downloaded from Alpha Cephei.
I did a reinstall and have seen some progress. Perhaps I missed a prompt at the libvosk download. I now have the Add box on screen, and since there was no
prompt for ffmpeg I'll assume that is installed correctly. I clearly missed something previously, so apologies for all the delays and circular questions.
I'll review the videoclip again and follow my notes from the instruction above.
On making those screen tutorials, can you change your screen resolution to Mr. Magoo size type to make following easier?
I've made a new screen shot. It shows that I was able to add my clip into the processing area. I hit "Generate" to avoid the errors at the three dots which fails to complete the download.
As you can see, a similar message appears in the Add area: the download doesn't complete and I'm just told to try again later.
I had downloaded a model from Alpha Cephei, which may be the wrong one. Where does Subtitle Edit look for the model, i.e. where is the model supposed to land if it downloads? And which is the right model? The one I got just by guessing is called vosk-model-en-us-0.22.zip and is something over 2 gigabytes in size unzipped.
That business with WAV files and such-- just too confusing to watch a mouse pointer.
I have a system that works through Google Create. I can deal with that without command line or WAV files etc. I'll come back to this later.
I'm still thinking about why the video/audio to text routine in Subtitle Edit fails to complete. The error is generated by SE, not by Windows 7 Pro, which is the only OS I have used for many years. Can the problem be traced to how SE handles the old Windows 7 Pro OS compared to, say, Win10? Also, in looking at Alpha Cephei there seems to be Linux-style code and refs to GitHub. But I cannot diagnose things like that.
In my defense, I am able to download Tesseract within the SE program with no problem.
For the future, since I have gone to Alpha Cephei several times to download the model, can that .zip file, once unpacked, be installed manually in the proper folder so that SE can find it? The 'download portion' of the routine is the part that fails and gives the error.
I went to the Subtitle Edit site and registered, then downloaded the newest SE, 3.6.7.
On this one I got a help message displayed in the screenshot attached. I don't know anything about secure server settings and the like but at least the error is explained a bit.
And I can put the 128 MB version of the English model I have where it needs to go, but haven't done that yet.
Success with initial manual install.
With the help of that message above I was able to manually install the lgraph version of the Vosk model. Once all of this is done, the program works fine. I've also experimented with selecting the larger Vosk model, which is over 2 GB after unzipping. I'm hoping that it is more accurate. Errors are common, but some of the things I do are in dialect, which may cause more errors.
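For anyone who would rather script the manual install than unzip by hand, here is a rough Python sketch. The destination folder is an assumption: I believe SE keeps Vosk models in a "Vosk" folder under its data directory, but the help message SE 3.6.7 displays is the place to confirm the real location. `install_model` is a hypothetical helper name.

```python
import zipfile
from pathlib import Path

def install_model(zip_path, models_dir):
    """Unpack a downloaded Vosk model zip into models_dir.

    Returns the top-level folders that were created, so you can see
    which model name should then show up in SE's model list.
    """
    models_dir = Path(models_dir)
    models_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(models_dir)
        # Zip entry names always use "/" as the separator.
        top_level = sorted(
            {name.split("/")[0] for name in archive.namelist() if name.strip("/")}
        )
    return [models_dir / name for name in top_level]

# Example call (paths are assumptions, adjust to your own setup):
# install_model("vosk-model-en-us-0.22.zip",
#               Path.home() / "AppData" / "Roaming" / "Subtitle Edit" / "Vosk")
```

After unpacking, restarting SE and reopening the video/audio to text dialog should show whether the model was picked up.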